
Re: possible mass bug filing: spamassassin 3



Tollef Fog Heen [u] wrote on 10/10/2004 13:01:

* Sven Mueller
From the front page of spamassassin.org:

: Flexible: SpamAssassin encapsulates its logic in a well-designed,
: abstract API so it can be integrated anywhere in the email
: stream. The Mail::SpamAssassin classes can be used on a wide variety
: of email systems including procmail, sendmail, Postfix, qmail, and
: many others.

You are right. This has changed since I last checked (two years ago or so, not sure when). I should have re-checked this before posting.

| > [splitting SA into libmail-spamassassin-perl and spamassassin]
|
| Well, in that case, libmail-spamassassin-perl would be the size of the
| current spamassassin package, and the new spamassassin package (which
| depends on the libmail-spamassassin-perl package) is about 2k in size,
| description and packaging overhead included. Sorry, that doesn't make
| much sense.

: tfheen@yiwaz ~ > for f in $(dpkg -L spamassassin | grep -v perl \
  |grep -v man3 ); do [ -f $f ] && echo $f; done | xargs du -shc  |
  tail -1
1,1M    totalt

SA currently ships nearly 600k of rules.

I don't understand what you are trying to say. If you are trying to say that libmail-spamassassin-perl wouldn't include the rules but spamassassin would, I would propose splitting into three packages: libmail-spamassassin-perl, libmail-spamassassin-perl-rules and spamassassin, so that programs relying on libmail-spamassassin-perl can also depend on the rules package without depending on the spamassassin package. Also note that we already have two packages: spamassassin and spamc.

But what I meant to say is that it doesn't make much sense to split the spamassassin package into several packages: neither the Perl modules nor spamassassin itself would be useful without the rules. So you would need to include the rules in the modules package (or in a third package, on which the lib package would probably depend). Once you have done that, the spamassassin executable doesn't add much to that package (29k, to be precise).

As a side note: SA3 ships with 452k of rules if you count /etc/spamassassin/local.cf and 448k if not:

mail2:/tmp# dpkg -L spamassassin| grep -E 'etc|usr/share/spamassassin' | grep .cf| xargs du -hc| tail -1
452K    total
mail2:/tmp# dpkg -L spamassassin| grep -E 'usr/share/spamassassin' | grep .cf| xargs du -hc| tail -1
448K    total

You didn't exclude all man pages and you didn't exclude start scripts and configuration ;-)
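
For what it's worth, a stricter way to count only rule files could look like the sketch below. It is only an illustration: rules_bytes is a made-up helper name, and it sums file sizes in bytes rather than disk usage, so its numbers will not match the du output above exactly. Note that `grep .cf` treats the dot as "any character", whereas this version keeps only paths that really end in ".cf" and skips anything that is not a regular file.

```shell
#!/bin/sh
# Sum the byte sizes of the rule files named on stdin, one path per line.
# Assumption: a "rule file" is any regular file whose name ends in ".cf".
rules_bytes() {
    total=0
    while IFS= read -r path; do
        case "$path" in
            *.cf) ;;                 # keep names ending in .cf
            *) continue ;;           # ignore everything else
        esac
        [ -f "$path" ] || continue   # skip directories and missing paths
        size=$(wc -c < "$path")
        total=$((total + size))
    done
    echo "$total"
}
```

On a Debian system it would be fed from the package file list, e.g. `dpkg -L spamassassin | rules_bytes`.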

soname is here used a bit loosely meaning «ABI/API version»; this is
technically not correct (as you point out), but it's shorter than
writing «ABI/API version» all over the place.

OK.

(And, given that perl modules can be normal shared objects, they
certainly _can_ have sonames proper, but I agree that's not the norm.)

In a way, yes. But this is only true for binary modules.

They can try to import Mail::SpamAssassin3 first, if that fails, try
Mail::SpamAssassin.  A nice thing with this is you actually know what
API you use.

Yes.

| spampd for example has a total of 10 lines which differentiate between
| versions v being < 2.7, 2.7 <= v < 3.0 and v >= 3.0 _and_ do what's
| needed to work with either of the three possible categories of
| SpamAssassin versions. If SpamAssassin v3 were renamed to
| Mail::SpamAssassin3, the changes would be more like 120 lines.

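BEGIN {
      # Prefer the proposed renamed v3 module; eval traps a failed require.
      eval {
           require Mail::SpamAssassin3;
           Mail::SpamAssassin3->import(qw(foo bar baz));
      };   # the semicolon is required: eval BLOCK is an expression
      if ($@) {
         # Fall back to the original module name.
         require Mail::SpamAssassin;
         Mail::SpamAssassin->import(qw(foo bar baz));
      }
}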

Doesn't look like 120 lines to me.

The problem is that SA doesn't work well with that sort of namespace mangling. Most of the programs I looked at use the SA modules in an object-oriented way (if you can call it that), so they have a multitude of lines referring to Mail::SpamAssassin.

| [SA3 API published half a year ago]

This is orthogonal to the discussion -- how much and when the API
changed doesn't mean it shouldn't be done right.

Well, yes.

This is Debian. We don't break stuff arbitrarily.

I.e. "We try not to break stuff arbitrarily." ;-)
The problem with SA3 is that renaming Mail::SpamAssassin to Mail::SpamAssassin3 makes it difficult for many programs to adjust, especially because it introduces a new module name that isn't used on any other platform, making it a Debian-only change to those programs. Not renaming it breaks some programs, but their upstreams had months to adjust to the new API, and the adjustment is a pretty small change that needs to be made in upstream versions anyway.

I certainly don't want to see SA3 enter the testing/stable archive as "spamassassin"/Mail::SpamAssassin before each and every program which uses it can cope with the change, conflicts with version >= 3 [1], or is removed from the archive. However, I would like to see SA3 enter the testing/stable archive as "spamassassin" as soon as no program in the archive is broken by that introduction _and_ the memory usage problems of SA 3.0 are fixed.

> If you have a package which breaks packages depending on you, it's a
> bug in your package (with exceptions if a package is tinkering with
> some private functionality in your package or taking advantage of a
> bug or undefined behaviour in your package).  It's very easy to fix
> that bug, though:  Conflict with the packages you break.

True. So:
If SA3 conflicts with every package it breaks [1],
has its memory consumption bug fixed
and has no other critical bug
I personally don't see much of a problem with it entering the archive. _But_ I don't think this has any chance to happen before Sarge is released (sad but true, but I still hope for volatile.d.o).

[1]: If I say SA should conflict with another package X, I actually mean that either SA declares a conflict with X, or X declares a conflict with SA, or both. These should be versioned conflicts.
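
To illustrate what I mean by a versioned conflict, here is roughly how it could look in debian/control. This is only a sketch: some-sa-client and all version numbers are made up for the example, and only one of the two stanzas would strictly be needed.

```
# In the spamassassin 3 package (hypothetical client name and version):
Package: spamassassin
Conflicts: some-sa-client (<< 1.2-3)

# Or in a client package that only works with the 2.x API:
Package: some-sa-client
Conflicts: spamassassin (>= 3)
```

In a versioned relationship, `<<` means "strictly earlier than" and `>=` means "this version or later", so a fixed client release would no longer be affected by the first conflict.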

Ciao,
Sven


