On Thu, Sep 23, 2004 at 06:56:04AM -0700, Jeff Chan wrote:
On Thursday, September 23, 2004, 6:21:13 AM, John Lundin wrote:
The other is more about how people use scores. As we do a better job of spotting spam and reducing FPs, the SpamAssassin scores will go up. This is good, right? Well, maybe. There are six URIRLs in SpamAssassin 3.0 already. And as scored, a -single- feature in the text of the message can trigger a spam score of 9.9 (without bayes) or 12.4 (with). Now, this scares me, since some systems discard spam above a certain score.
Are the scores cumulative like that? I thought I heard they are either/or, perhaps in the context of multi and urirhssub.
Oooh, yeah. And they usually do go off in multiples.
Some percentages from a small ISP, last two months inbound mail:
Detected 4.616% as not spam (including FFP's):
  99.144%  (no URI_RBL found)
   0.488%  WS_URI_RBL
   0.226%  OB_URI_RBL
   0.051%  OB_URI_RBL WS_URI_RBL
   0.037%  SPAMCOP_URI_RBL
   0.017%  OB_URI_RBL SPAMCOP_URI_RBL
   0.012%  SPAMCOP_URI_RBL WS_URI_RBL
   0.012%  OB_URI_RBL SPAMCOP_URI_RBL WS_URI_RBL
   0.005%  AB_URI_RBL OB_URI_RBL SPAMCOP_URI_RBL
   0.005%  AB_URI_RBL OB_URI_RBL
   0.002%  AB_URI_RBL
Detected 95.384% as spam:
  34.538%  AB_URI_RBL OB_URI_RBL SPAMCOP_URI_RBL WS_URI_RBL
  14.623%  (no URI_RBL found)
  14.359%  OB_URI_RBL WS_URI_RBL
  10.442%  OB_URI_RBL SPAMCOP_URI_RBL WS_URI_RBL
   7.551%  WS_URI_RBL
   3.153%  AB_URI_RBL SPAMCOP_URI_RBL WS_URI_RBL
   3.031%  AB_URI_RBL OB_URI_RBL SPAMCOP_URI_RBL
   3.006%  OB_URI_RBL
   2.681%  AB_URI_RBL OB_URI_RBL WS_URI_RBL
   1.936%  SPAMCOP_URI_RBL WS_URI_RBL
   1.648%  OB_URI_RBL SPAMCOP_URI_RBL
   1.105%  AB_URI_RBL WS_URI_RBL
   1.055%  AB_URI_RBL OB_URI_RBL
   0.340%  SPAMCOP_URI_RBL
   0.340%  AB_URI_RBL SPAMCOP_URI_RBL
   0.172%  AB_URI_RBL
   0.010%  AB_URI_RBL PH_URI_RBL SPAMCOP_URI_RBL WS_URI_RBL
   0.005%  PH_URI_RBL WS_URI_RBL
   0.004%  OB_URI_RBL PH_URI_RBL WS_URI_RBL
   0.001%  AB_URI_RBL PH_URI_RBL WS_URI_RBL
   0.001%  PH_URI_RBL SPAMCOP_URI_RBL WS_URI_RBL
   0.000%  PH_URI_RBL
Over a third of all inbound spam hit all four URIRLs. Less than half of that number hit no URIRLs. But even fewer, only 11.069%, hit just one URIRL.
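For anyone who wants to check the arithmetic, the spam breakdown above can be tallied by how many lists each message hit. This is only a rough sketch: the rows are copied from the percentages above with the _URI_RBL suffix dropped for brevity, and the names SPAM_ROWS and tally_by_hit_count are made up for illustration.

```python
from collections import defaultdict

# Rows copied from the "detected as spam" figures above.
# Each entry: (space-separated lists hit, percent of detected spam).
# The _URI_RBL suffix is dropped; "" means no URI_RBL was found.
SPAM_ROWS = [
    ("AB OB SPAMCOP WS", 34.538),
    ("", 14.623),
    ("OB WS", 14.359),
    ("OB SPAMCOP WS", 10.442),
    ("WS", 7.551),
    ("AB SPAMCOP WS", 3.153),
    ("AB OB SPAMCOP", 3.031),
    ("OB", 3.006),
    ("AB OB WS", 2.681),
    ("SPAMCOP WS", 1.936),
    ("OB SPAMCOP", 1.648),
    ("AB WS", 1.105),
    ("AB OB", 1.055),
    ("SPAMCOP", 0.340),
    ("AB SPAMCOP", 0.340),
    ("AB", 0.172),
    ("AB PH SPAMCOP WS", 0.010),
    ("PH WS", 0.005),
    ("OB PH WS", 0.004),
    ("AB PH WS", 0.001),
    ("PH SPAMCOP WS", 0.001),
    ("PH", 0.000),
]


def tally_by_hit_count(rows):
    """Sum the percentages grouped by the number of lists hit."""
    totals = defaultdict(float)
    for lists, pct in rows:
        totals[len(lists.split())] += pct
    return dict(totals)


by_count = tally_by_hit_count(SPAM_ROWS)
for n in sorted(by_count):
    print(f"{n} list(s) hit: {by_count[n]:.3f}%")
```

The single-list bucket comes out to the 11.069% quoted above, and the buckets for two or more lists together cover roughly three quarters of the spam, which is why the overlapping scores matter.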
Under SA 2.6, I compensated by adding in second-order meta rules with negative scores, but as the number of URIRLs goes up, that becomes unwieldy fast.
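To put a rough number on "unwieldy": assuming one compensating meta rule per overlapping combination of lists (an assumption for illustration, not a description of the actual rules I used), the count grows like this. The function names are hypothetical.

```python
from math import comb


def pair_rules(n_lists):
    """Number of distinct pairs of lists (second-order overlaps only)."""
    return comb(n_lists, 2)


def all_overlap_rules(n_lists):
    """Number of combinations of two or more lists (every possible overlap)."""
    return sum(comb(n_lists, k) for k in range(2, n_lists + 1))


for n in (4, 5, 6):
    print(f"{n} lists: {pair_rules(n)} pairs, {all_overlap_rules(n)} overlaps of any order")
```

With four lists that is 6 pairs (11 overlaps of any order); with six lists it is 15 pairs (57 overlaps of any order).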
If we assume that JP gets the same confidence that SC has, that inflates the score to 13.8 or 16.6. That's a lot of certainty to invest in one lone URI. Especially given that evil URIs do [...]
JP should score about the same as OB since they have similar spam detection and FP rates. SC has a lower FP rate (good) and somewhat lower hit rates (less good) than JP or OB. The lower FP rate rightly counts more, so SC scores higher.
That would drop it to 11.9 or 15.6. :-)
I worry most about quoting and notification scenarios.