[SURBL-Discuss] RFC: SURBL inclusion policy

Ryan Thompson ryan at sasknow.com
Sun Sep 26 23:22:02 CEST 2004


Ryan Thompson wrote to SURBL Discussion list:

>> XBL is an excellent list of spam senders, by far the biggest catcher
>> of spam senders in my regular RBLs, so it probably would be good as a
>> header check for GetURI also.  Ryan can we make this a feature
>> request?
>
> Sure. Now it's making sense. :-) Fortunately, adding header checks
> will be easy, because I'm already using the SpamAssassin engine.

OK, I've tried this, but it slows down the runs considerably. My 2K
test corpus had 54 RCVD_IN_XBL hits, but for some reason *none* of
those messages contained domains that were not already listed in
SURBL. The run took 26 minutes, instead of the usual 2-3 minutes for
the 2K corpus.

Then I used the new --surbl=hostname option to check against WS only
(instead of the default multi), and found that only 2/381 (0.5%) domains
were spamvertised by an XBL-listed host.

Hmm. Then I fed the --surbl option a local "dummy" SURBL list containing
only test entries, effectively disabling the SURBL filter in GetURI, and
found 52/3130 (1.7%) domains whose message was RCVD_IN_XBL.
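
For anyone who wants to check the arithmetic, here is a minimal Python
sketch that recomputes the hit rates from the raw counts quoted in this
message. The run labels and the hit_rate() helper are made up for
illustration; only the counts come from the runs above.

```python
# (hits, total domains) for the two runs described in this message.
runs = {
    "WS only": (2, 381),
    "SURBL filter disabled": (52, 3130),
}


def hit_rate(hits, total):
    """Return the hit rate as a percentage, rounded to one decimal place."""
    return round(100.0 * hits / total, 1)


for label, (hits, total) in runs.items():
    print(f"{label}: {hits}/{total} = {hit_rate(hits, total)}%")
```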

So, given the low hit rate (especially in the usual case of only looking
for new SURBL domains) and the tremendous amount of extra time required
to do the XBL header/net test (the last run took 48 minutes, compared to
~16 minutes without the header tests), I'm going to make GetURI default
to *not* doing the header checks, and let people enable them with the
new --header option.

With all of these new DNS tests, network delays are now definitely the
bottleneck in GetURI. Soon (not for 1.6, maybe 1.7), I think I'm going
to have to go to a forked or threaded model.
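
To illustrate why a threaded model should help when DNS latency is the
bottleneck, here is a rough sketch in Python. It is not GetURI code, and
every function name in it is hypothetical. It assumes the usual DNSBL
convention (reverse the IPv4 octets and append the zone, here
xbl.spamhaus.org) and treats any failed lookup, including a timeout, as
"not listed", which is a simplification. The resolver is passed in as a
parameter so the logic can be exercised without touching the network.

```python
import socket
from concurrent.futures import ThreadPoolExecutor


def xbl_query_name(ipv4, zone="xbl.spamhaus.org"):
    """Build the DNSBL query name: reversed octets followed by the zone."""
    return ".".join(reversed(ipv4.split("."))) + "." + zone


def is_listed(name, resolve=socket.gethostbyname):
    """Treat a name that resolves as listed and a failed lookup as not listed."""
    try:
        resolve(name)
        return True
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


def check_all(ips, resolve=socket.gethostbyname, workers=20):
    """Run the lookups in a thread pool so the network waits overlap."""
    names = [xbl_query_name(ip) for ip in ips]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda name: is_listed(name, resolve), names)
        return dict(zip(ips, results))
```

Since each lookup spends nearly all of its time waiting on the network,
the threads mostly sit idle, and total run time should be closer to the
slowest single lookup per batch than to the sum of all of them. A forked
model would follow the same shape, with processes in place of threads.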

- Ryan

-- 
   Ryan Thompson <ryan at sasknow.com>

   SaskNow Technologies - http://www.sasknow.com
   901-1st Avenue North - Saskatoon, SK - S7K 1Y4

         Tel: 306-664-3600   Fax: 306-244-7037   Saskatoon
   Toll-Free: 877-727-5669     (877-SASKNOW)     North America

