On Wed, Sep 22, 2004 at 04:45:34PM -0700, Jeff Chan wrote:
OK, we would like to proceed with the first part of this. We propose adding JP to multi on Monday September 27, but keeping the JP data in WS for now, and asking SA to add JP to SA 3.1 after that change next Monday. We would want to announce the new list so that other programs using multi.surbl.org would know that the return values had changed. That would give them some time if they need to make adjustments to their code. JP would get the 64 bitmask [...]
I'm going to assume a lack of comments means everyone agrees....
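For anyone wondering why the changed return values matter to client code, here is a minimal sketch. It assumes the multi.surbl.org answer is an address whose last octet is a bitmask, with JP on the 64 bit as proposed above. The WS bit value of 4 and the helper names are assumptions for illustration only, not something stated in this thread.

```python
JP_BIT = 64  # from the proposal: JP would get the 64 bitmask
WS_BIT = 4   # assumed value for illustration; not stated in this message


def last_octet(address: str) -> int:
    """Return the final octet of a dotted-quad answer as an integer."""
    return int(address.rsplit(".", 1)[1])


def is_listed(address: str, bit: int) -> bool:
    """True if the given list's bit is set in the answer's last octet."""
    return (last_octet(address) & bit) != 0


# A domain on both WS and JP would come back with both bits set.
answer = "127.0.0.68"  # 64 + 4 under the assumed bit values
print(is_listed(answer, WS_BIT), is_listed(answer, JP_BIT))
```

A client that tests individual bits keeps working when a new list is added. A client that compares the whole octet for equality (for example, expecting exactly 4 for WS) would stop matching once the JP bit appears alongside it, which is the kind of adjustment the announcement is meant to give people time for.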
I don't disagree, but do have a couple of comments.
First, when JP drops out of WS there will be a content change. One of the reasons for adding JP is to get it a higher SpamAssassin score. But since JP was part of WS before that, there will be a "decrease" in WS. And the folks doing the scoring won't have a way to anticipate the effect. Would it be worthwhile to phase JP out of WS slowly and/or put up a temporary WSONLY list that could be used for scoring trials?
The other is more about how people use scores. As we do a better job of spotting spam and reducing FPs, the SpamAssassin scores will go up. This is good, right? Well, maybe. There are six URIBLs in SpamAssassin 3.0 already. And as scored, a -single- feature in the text of the message can trigger a spam score of 9.9 (without bayes) or 12.4 (with). Now, this scares me, since some systems discard spam above a certain score.
If we assume that JP gets the same confidence that SC has, that inflates the score to 13.8 or 16.6. That's a lot of certainty to invest in one lone URI, especially given that evil URIs do wind up in legitimate mail, however rarely.
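To make the arithmetic concrete, here is a rough sketch using only the totals quoted above. The discard threshold of 10.0 is a made-up value for illustration; real sites pick their own, and the dictionary and function names are hypothetical.

```python
# Maximum score a single listed URI can trigger, per the figures above.
CURRENT_MAX = {"without_bayes": 9.9, "with_bayes": 12.4}
PROJECTED_MAX = {"without_bayes": 13.8, "with_bayes": 16.6}

# Hypothetical site policy: discard anything scoring above this value.
DISCARD_THRESHOLD = 10.0


def implied_jp_score(mode: str) -> float:
    """Score JP would have to contribute to reach the projected total."""
    return round(PROJECTED_MAX[mode] - CURRENT_MAX[mode], 1)


def would_discard(score: float, threshold: float = DISCARD_THRESHOLD) -> bool:
    """True if a message with this score is dropped under the assumed policy."""
    return score > threshold


for mode in ("without_bayes", "with_bayes"):
    print(
        mode,
        implied_jp_score(mode),
        would_discard(CURRENT_MAX[mode]),
        would_discard(PROJECTED_MAX[mode]),
    )
```

Under that assumed threshold, one URI without bayes stays just under the discard line today and clears it comfortably once JP is scored like SC.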
Which isn't directly SURBL's problem, of course.