[SURBL-Discuss] Proposing a greylist

Jeff Chan jeffc at surbl.org
Fri Sep 3 08:31:54 CEST 2004

On Friday, September 3, 2004, 6:50:25 AM, John Lundin wrote:
> Maybe there should be a dark multi, with one bit for confirmed
> spammers with some ham, and another for early warning entries.
> It would be nice to be able to evaluate them separately.

> As an analog, SARE splits some rulesets (genlsubj, html, header) into
> categories of "hit ONLY spam", "have hit ham", and "hit a significant
> amount of ham." You can choose your level of safety and effectiveness.
> (If you want to get fancy, encode a confidence level. Two bits? ;-) )

SARE, and SpamAssassin in general, take a different approach to
detecting spam than SURBLs do.

SA is usually used with elaborate rules and technologies to
categorize spam based on multiple characteristics in headers
and message bodies.  SA was built to cut through some of the
obfuscation of content and sender information that spammers
shifted to when they stopped sending clear text messages from
known mail servers.  Zombies and compounding obfuscation make
that approach a constant challenge.

SURBLs attempt to identify spam by finding exactly those URI
domains that appear in spam.  They cut right to the
unavoidable core of what spammers usually do, which is to
advertise a web site.
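
As a rough illustration of that idea, here is a minimal Python sketch that
pulls host names out of the http(s) URIs in a message body and builds the DNS
name that would be queried for each one.  Everything in it is an assumption
made for illustration: the function names are hypothetical, the zone
"multi.surbl.org" is only a default argument, and reducing a host to its last
two labels is a simplification that is wrong for domains registered under
two-level country-code suffixes.

```python
import re
from urllib.parse import urlsplit

# Deliberately simple pattern: only http/https URIs, ended by whitespace,
# angle brackets, or quotes. Real mail needs far more robust extraction.
URI_PATTERN = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def uri_domains(body):
    """Return the set of lowercased host names found in URIs in body."""
    hosts = set()
    for match in URI_PATTERN.finditer(body):
        # Drop punctuation that commonly trails a URI in running text.
        uri = match.group(0).rstrip(".,;:!?)")
        try:
            host = urlsplit(uri).hostname
        except ValueError:
            continue  # malformed URI, skip it
        if host:
            hosts.add(host.lower())
    return hosts


def base_domain(host):
    """Simplification (assumption): keep only the last two labels."""
    labels = host.rstrip(".").split(".")
    return ".".join(labels[-2:])


def query_name(host, zone="multi.surbl.org"):
    """Build the DNS name to look up for host in the given list zone."""
    return base_domain(host) + "." + zone
```

Under those assumptions, a body containing http://www.spam-example.com/pills
would lead to a lookup of spam-example.com.multi.surbl.org.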

Because the focus of each technology is slightly different,
assumptions made from the perspective of one technology
may not fit the other perfectly.  For example, it's not
always the case that SURBLs will be used with programs
that can score messages with different weights for different
rules.  If the false positive rates were low enough, SURBLs
could be used to block messages with just URI parsing,
including in the MTA.  That would allow spam to be rejected at
the transport layer without sending it through SpamAssassin,
thus saving much processing time, CPU resources, etc.  MTA
uses of SURBL already exist, though we're still waiting for
sendmail milters and postfix filters.
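
To make the MTA case concrete, the following Python sketch shows what a bare
accept/reject decision might look like when there is no scoring engine behind
it.  It is not how any particular MTA integration works.  The function names,
the reply strings, and the default zone "multi.surbl.org" are all assumptions
for illustration, and the resolver is passed in as a parameter so the logic can
be exercised without touching DNS.  A domain is treated as listed when its
lookup returns an address in 127.0.0.0/8.

```python
import socket


def dns_resolve(name):
    """Return the IPv4 address for name, or None if it does not resolve."""
    try:
        return socket.gethostbyname(name)
    except socket.gaierror:
        return None


def listed_domains(domains, zone="multi.surbl.org", resolve=dns_resolve):
    """Return the sorted subset of domains that the list zone answers for."""
    hits = []
    for domain in sorted(domains):
        answer = resolve(domain + "." + zone)
        # Treat any 127.x.x.x answer as "listed"; no answer means not listed.
        if answer is not None and answer.startswith("127."):
            hits.append(domain)
    return hits


def smtp_verdict(domains, zone="multi.surbl.org", resolve=dns_resolve):
    """Hypothetical all-or-nothing decision: one listed domain rejects."""
    hits = listed_domains(domains, zone, resolve)
    if hits:
        return "550 Message rejected: URI domain listed (" + ", ".join(hits) + ")"
    return "250 OK"
```

Because a single hit rejects the message outright, there is no score to soften
a wrong listing: every false positive in the list becomes a legitimate message
that is refused.  Passing a fake resolve function, such as the get method of a
dict that maps query names to answers, is enough to try the logic offline.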

It was logical for SURBLs to be used with SpamAssassin because
SA provides a nice framework of message parsing, URI extraction,
mail program interfaces, etc., but SURBLs can also be used directly
with MTAs and other mail-handling, spam-blocking programs.  In
those cases the classifications need to be extremely accurate.
False positives are the largest obstacle to that use and so
they need to be reduced.

Instead of finding ways to collect greylists full of questionable
domains, we should be trying to find ways to improve the quality
of the existing lists.  That's where the most important and
valuable progress can be made. 

Jeff C.
