Re: [SURBL-Discuss] Proposal for moving forward with JP list

23 Sep 2004


On Thu, Sep 23, 2004 at 06:56:04AM -0700, Jeff Chan wrote:
...
> On Thursday, September 23, 2004, 6:21:13 AM, John Lundin wrote:
...
> > The other is more about how people use scores. As we do a better job
> > of spotting spam and reducing FPs, the SpamAssassin scores will go up.
> > This is good, right?  Well, maybe. There are already six URIRLs in
> > SpamAssassin 3.0. And as scored, a -single- feature in the text of the
> > message can trigger a spam score of 9.9 (without Bayes) or 12.4 (with).
> > Now, this scares me, since some systems discard spam above a certain score.
> Are the scores cumulative like that?  I thought I heard they
> are either/or, perhaps in the context of multi and urirhssub.
Oooh, yeah. And they usually do go off in multiples.
Some percentages from a small ISP, last two months inbound mail:
Detected 4.616% as not spam (including FFP's):
99.144%  (no URI_RBL found) 
 0.488%  WS_URI_RBL
 0.226%  OB_URI_RBL
 0.051%  OB_URI_RBL WS_URI_RBL
 0.037%  SPAMCOP_URI_RBL
 0.017%  OB_URI_RBL SPAMCOP_URI_RBL
 0.012%  SPAMCOP_URI_RBL WS_URI_RBL
 0.012%  OB_URI_RBL SPAMCOP_URI_RBL WS_URI_RBL
 0.005%  AB_URI_RBL OB_URI_RBL SPAMCOP_URI_RBL
 0.005%  AB_URI_RBL OB_URI_RBL
 0.002%  AB_URI_RBL
Detected 95.384% as spam:
34.538%  AB_URI_RBL OB_URI_RBL SPAMCOP_URI_RBL WS_URI_RBL
14.623%  (no URI_RBL found) 
14.359%  OB_URI_RBL WS_URI_RBL
10.442%  OB_URI_RBL SPAMCOP_URI_RBL WS_URI_RBL
 7.551%  WS_URI_RBL
 3.153%  AB_URI_RBL SPAMCOP_URI_RBL WS_URI_RBL
 3.031%  AB_URI_RBL OB_URI_RBL SPAMCOP_URI_RBL
 3.006%  OB_URI_RBL
 2.681%  AB_URI_RBL OB_URI_RBL WS_URI_RBL
 1.936%  SPAMCOP_URI_RBL WS_URI_RBL
 1.648%  OB_URI_RBL SPAMCOP_URI_RBL
 1.105%  AB_URI_RBL WS_URI_RBL
 1.055%  AB_URI_RBL OB_URI_RBL
 0.340%  SPAMCOP_URI_RBL
 0.340%  AB_URI_RBL SPAMCOP_URI_RBL
 0.172%  AB_URI_RBL
 0.010%  AB_URI_RBL PH_URI_RBL SPAMCOP_URI_RBL WS_URI_RBL
 0.005%  PH_URI_RBL WS_URI_RBL
 0.004%  OB_URI_RBL PH_URI_RBL WS_URI_RBL
 0.001%  AB_URI_RBL PH_URI_RBL WS_URI_RBL
 0.001%  PH_URI_RBL SPAMCOP_URI_RBL WS_URI_RBL
 0.000%  PH_URI_RBL
Over a third of all inbound spam hit all four URIRLs.
Less than half that number hit no URIRLs.
But even fewer, only 11.069%, hit just one URIRL.
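
The three observations above can be re-derived from the spam half of the table by grouping its rows by how many URIRLs fired. A minimal Python sketch follows; the percentages are copied from the table, the rule names are shortened (AB = AB_URI_RBL, SC = SPAMCOP_URI_RBL, and so on), and the function name share_by_hit_count is just an illustrative choice.

```python
from collections import defaultdict

# Spam rows from the table above: percent of spam, then the lists that hit.
# A row with no list names is the "(no URI_RBL found)" row.
SPAM_ROWS = """\
34.538 AB OB SC WS
14.623
14.359 OB WS
10.442 OB SC WS
7.551 WS
3.153 AB SC WS
3.031 AB OB SC
3.006 OB
2.681 AB OB WS
1.936 SC WS
1.648 OB SC
1.105 AB WS
1.055 AB OB
0.340 SC
0.340 AB SC
0.172 AB
0.010 AB PH SC WS
0.005 PH WS
0.004 OB PH WS
0.001 AB PH WS
0.001 PH SC WS
0.000 PH
"""

def share_by_hit_count(rows: str) -> dict[int, float]:
    """Sum the percentages of all rows that hit the same number of lists."""
    totals = defaultdict(float)
    for line in rows.splitlines():
        percent, *lists = line.split()
        totals[len(lists)] += float(percent)
    return {hits: round(share, 3) for hits, share in sorted(totals.items())}

for hits, share in share_by_hit_count(SPAM_ROWS).items():
    print(f"{hits} URIRL(s): {share:6.3f}%")
```

The one-list bucket comes to 11.069%, matching the figure quoted above, and the zero-list bucket is the 14.623% row on its own.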
Under SA 2.6, I compensated by adding second-order meta rules with
negative scores, but as the number of URIRLs goes up that becomes
unwieldy fast.
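
To put a number on "unwieldy", here is a rough Python sketch that generates one compensating SpamAssassin meta rule per pair of lists. The list rule names are the ones in the table above; the URIRL_PAIR_* names and the -1.0 score are hypothetical placeholders, not the rules or scores actually used.

```python
from itertools import combinations

# Rule names as they appear in the table above.
URIRL_RULES = ["AB_URI_RBL", "OB_URI_RBL", "PH_URI_RBL",
               "SPAMCOP_URI_RBL", "WS_URI_RBL"]

def pairwise_meta_rules(rules, score=-1.0):
    """Return one compensating meta rule (as config text) per pair of rules.

    The URIRL_PAIR_* names and the default score are hypothetical.
    """
    blocks = []
    for first, second in combinations(rules, 2):
        name = f"URIRL_PAIR_{first.split('_')[0]}_{second.split('_')[0]}"
        blocks.append(
            f"meta  {name}  ({first} && {second})\n"
            f"score {name}  {score}"
        )
    return blocks

generated = pairwise_meta_rules(URIRL_RULES)
print(f"{len(generated)} pairwise rules for {len(URIRL_RULES)} lists")
print(generated[0])
```

Five lists already need 10 pair rules and a sixth list brings that to 15, before any third-order corrections for triples.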
...
...
> > If we assume that JP gets the same confidence that SC has, that
> > inflates the score to 13.8 or 16.6. That's a lot of certainty to
> > invest in one lone URI. Especially given that evil URIs do [...]
> JP should score about the same as OB since they have similar
> spam detection and FP rates.  SC has a lower FP rate (good)
> and somewhat lower hit rates (less good) than JP or OB.  The
> lower FP rate rightly counts more, so SC scores higher.
That would drop it to 11.9 or 15.6. :-)
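
The totals in this exchange imply the per-list score JP would carry in each case: it is the new total minus the current 9.9 / 12.4 maximum. A small Python sketch of that subtraction, using only figures quoted in this thread (the variable and function names are illustrative):

```python
# Totals quoted in the thread, as (without Bayes, with Bayes).
CURRENT_MAX = (9.9, 12.4)      # maximum from a single URI today
WITH_JP_AS_SC = (13.8, 16.6)   # total if JP is scored like SC
WITH_JP_AS_OB = (11.9, 15.6)   # total if JP is scored like OB

def implied_jp_score(total, baseline):
    """Subtract the baseline from the new total, one score set at a time."""
    return tuple(round(t - b, 1) for t, b in zip(total, baseline))

print("JP scored like SC:", implied_jp_score(WITH_JP_AS_SC, CURRENT_MAX))
print("JP scored like OB:", implied_jp_score(WITH_JP_AS_OB, CURRENT_MAX))
```

Either way the total stays above 11 without Bayes, which is the concern for sites that discard mail above a fixed score.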
I worry most about quoting and notification scenarios.
-- 
  lundin@cavtel.net
"ASCII stupid question, get a stupid ANSI."
