[SURBL-Discuss] Re: SURBL scoring (fwd)

Raymond Dijkxhoorn raymond at prolocation.net
Wed Jul 14 18:08:46 CEST 2004


There has been some discussion of SURBL on the MailScanner list. This
might be interesting for some of you:

---------- Forwarded message ----------
Date: Tue, 13 Jul 2004 17:07:14 -0400
From: John Lundin <lundin at CAVTEL.NET>
Reply-To: MailScanner mailing list <MAILSCANNER at jiscmail.ac.uk>
To: MAILSCANNER at jiscmail.ac.uk
Subject: Re: SURBL scoring

On Mon, 12 Jul 2004, Raymond Dijkxhoorn wrote:
>> If the tests aren't very independent, should I reduce the scores
>> when using more than one test?  We delete mail that scores over 12;
>> with these cumulative scores, a false positive could result in lost
>> mail. Should I worry about that?
> They are completely independent. Think of them as 3 regular RBL
> checks: if you have an open proxy, it's also listed in all 3 (most
> likely). If it's listed in 3 and it scores 12, you are about as
> positive as you can be that it's spam...

(cough) Well, since no one else spoke up... IMO, you should worry.
And the problem is about to get worse; there's a new list in beta.

A few days after adding WS to spamcop_uri, I had a friend's letter
wind up in my spam folder. He was building a new computer and had sent
me a parts list for comment. One of his possible suppliers turned out
to be in SC and WS. (You can guess what one of my comments was.)

o Do you really want to lose every message containing the hot URI?
   And any followup that quotes it?

o They wouldn't be completely independent. The lists cover similar
   sets of spammers, and the same URI in the message is matched
   against each of them.

Personally, I do worry about forcing high-scoring spam status based
on any single content feature. I scored the URI_RBL checks fairly low
(3.0), and added a few meta-rules to soften the impact of multiple
hits. This was a guess by eyeball. I haven't gotten around to playing
with the math, but have started to keep statistics to base new scores
on.

FWIW, I maintain MS on one old spam-ridden site. About 95% of its
inbound mail currently scores as spam. 83% of that spam hits at least
one URI_RBL rule. 31% of spam (37% of the spam with URI_RBL hits) hits
all four of AB, OB, SC and WS, and 53% (63%) hits three or more! Of the
"non-spam", 1.4% still has at least one URI_RBL hit.

What I added to spamcop_uri.cf (first pass):

describe OB_SC_URI_RBL  Compensate if both spamcop and OB trigger
score OB_SC_URI_RBL     -1.5

describe AB_SC_URI_RBL  Compensate if both AB and SC trigger
score AB_SC_URI_RBL     -1.5

describe OB_WS_URI_RBL  Compensate if both WS and OB trigger
score OB_WS_URI_RBL     -1.0
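
To see what the compensation scores above do to the total, here is a
rough Python sketch. It assumes each of the four URI_RBL rules scores
3.0 on its own and that each compensating rule fires whenever both of
its named lists hit; the meta definitions are not shown in the post,
so that pairing logic and the function name net_uri_score are
illustrative guesses only.

```python
# Assumed base score per URI_RBL rule (the post says "fairly low (3.0)").
BASE_SCORE = 3.0

# Compensation scores copied from the spamcop_uri.cf snippet,
# keyed by the pair of lists assumed to trigger each one.
COMPENSATION = {
    frozenset({"OB", "SC"}): -1.5,
    frozenset({"AB", "SC"}): -1.5,
    frozenset({"OB", "WS"}): -1.0,
}


def net_uri_score(hits):
    """Sum the URI_RBL score for the set of lists that hit."""
    hits = set(hits)
    total = BASE_SCORE * len(hits)
    for pair, adjustment in COMPENSATION.items():
        # A compensation applies only if both lists in the pair hit.
        if pair <= hits:
            total += adjustment
    return total


if __name__ == "__main__":
    for combo in (["SC"], ["SC", "WS"], ["OB", "SC"],
                  ["AB", "OB", "SC", "WS"]):
        print(combo, net_uri_score(combo))
```

Under those assumptions a message hitting all four lists nets 8.0
rather than 12.0, while a pair with no compensating rule (SC plus WS,
for instance) still adds up to the full 6.0.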

I'd be interested to know what other people do to fix this.

   lundin at cavtel.net
  "By the time they had diminished from 50 to 8,
the other dwarves began to suspect 'Hungry' ..."

