On Thursday, September 23, 2004, 6:21:13 AM, John Lundin wrote:
On Wed, Sep 22, 2004 at 04:45:34PM -0700, Jeff Chan wrote:
OK, we would like to proceed with the first part of this. We propose adding JP to multi on Monday, September 27, but keeping the JP data in WS for now, and asking SA to add JP to SA 3.1 after that change next Monday. We would want to announce the new list so that other programs using multi.surbl.org would know that the return values had changed. That would give them some time if they need to make adjustments to their code. JP would get the 64 bitmask [...]
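To illustrate why a new bit in multi changes what clients see, here is a minimal sketch of decoding a multi.surbl.org answer. Only the value 64 for JP comes from the proposal above. The other bit values, the assumption that the bitmask is carried in the last octet of a 127.0.0.x answer, and the names SURBL_BITS and decode_multi are illustrative assumptions and should be checked against the published list documentation.

```python
# Bit values for the lists combined in multi.surbl.org.
# 64 = JP is from the proposal above; the remaining values are assumptions.
SURBL_BITS = {2: "SC", 4: "WS", 8: "PH", 16: "OB", 32: "AB", 64: "JP"}


def decode_multi(answer):
    """Return the names of the lists flagged in a 127.0.0.x answer.

    Assumes the bitmask is carried in the last octet of the address.
    """
    last_octet = int(answer.split(".")[-1])
    # Test each bit separately so that combined answers decode correctly.
    return [name for bit, name in sorted(SURBL_BITS.items()) if last_octet & bit]


if __name__ == "__main__":
    # A domain listed in both WS (4) and JP (64) would answer with 4 + 64 = 68.
    print(decode_multi("127.0.0.68"))
```

A client that compares the answer against an exact value such as 127.0.0.4, instead of testing individual bits, would stop matching a WS-listed domain once the JP bit is also set. That is the kind of adjustment the announcement is meant to give people time for.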
(Thanks for your feedback... :-)
First, when JP drops out of WS there will be a content change. One of the reasons for adding JP is to get it a higher SpamAssassin score. But since it was part of WS before that, there will be a "decrease" there. And the folk doing scoring won't have a way to anticipate the effect. Would it be worthwhile to phase JP out of WS slowly and/or put up a temporary WSONLY list that could be used for scoring trials?
Good point. Raymond has already been testing a version of WS with the JP data removed. Perhaps we should make one generally available for testing and scoring before the date JP comes out of WS, some months from now. I'm already dreading the support questions. LOL!
The other is more about how people use scores. As we do a better job of spotting spam and reducing FPs, the SpamAssassin scores will go up. This is good, right? Well, maybe. There are six URIBLs in SpamAssassin 3.0 already. And as scored, a -single- feature in the text of the message can trigger a spam score of 9.9 (without bayes) or 12.4 (with). Now, this scares me, since some systems discard spam above a certain score.
Are the scores cumulative like that? I thought I heard they are either/or, perhaps in the context of multi and urirhssub.
If we assume that JP gets the same confidence that SC has, that inflates the score to 13.8 or 16.6. That's a lot of certainty to invest in one lone URI. Especially given that evil URIs do wind up in legitimate mail, however rarely.
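The arithmetic behind those figures can be sketched as follows. The four totals are the ones quoted above. The per-list JP increment is only what those totals imply, the sketch assumes the list scores simply add up (which is the point in question), and DISCARD_THRESHOLD is a hypothetical site setting, not a SpamAssassin default.

```python
# Totals quoted in the message above, for one URI listed everywhere.
current_max = {"without_bayes": 9.9, "with_bayes": 12.4}
with_jp = {"without_bayes": 13.8, "with_bayes": 16.6}

# Score JP would contribute if it were scored like SC (implied by the totals).
implied_jp_score = {
    mode: round(with_jp[mode] - current_max[mode], 1) for mode in current_max
}

# Hypothetical site policy: silently discard mail scoring above this value.
DISCARD_THRESHOLD = 10.0


def discarded(score, threshold=DISCARD_THRESHOLD):
    """True if a message with this total score would be discarded."""
    return score > threshold


if __name__ == "__main__":
    for mode in current_max:
        print(mode, implied_jp_score[mode],
              discarded(current_max[mode]), discarded(with_jp[mode]))
```

With that hypothetical threshold, the no-bayes case moves from just under the discard line to well over it on the strength of one URI.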
JP should score about the same as OB since they have similar spam detection and FP rates. SC has a lower FP rate (good) and somewhat lower hit rates (less good) than JP or OB. The lower FP rate rightly counts more, so SC scores higher.
Jeff C.