[SURBL-Discuss] Re: RFC: consensus list?

Jeff Chan jeffc at surbl.org
Fri Nov 12 14:07:41 CET 2004

On Friday, November 12, 2004, 4:00:47 AM, David Hooton wrote:
> On Fri, 12 Nov 2004 02:45:15 -0800, Jeff Chan <jeffc at surbl.org> wrote:
>> Pondering the question of how to make a "telco grade" SURBL that
>> had as close to zero false positives as possible, but would still
>> catch many spams, I remembered that many of the biggest spam
>> domains seem to appear in several different SURBL lists.
>> What does anyone think about creating a "consensus" list
>> that a telco or ISP might use to block at the MTA level?
>> For example a domain that appears on:
>>  ((SC or AB) and (JP or OB)) or PH
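
To make the example rule concrete, here is a minimal sketch in Python.
It assumes we already know which SURBL lists a domain appears on, given
as a set of list codes; the function name and the input format are
hypothetical and are not part of any existing SURBL tooling.

```python
def on_consensus_list(lists):
    """Apply the example rule ((SC or AB) and (JP or OB)) or PH.

    `lists` is the set of SURBL list codes a domain appears on,
    e.g. {"SC", "JP"}. This input format is an assumption.
    """
    first_pair = "SC" in lists or "AB" in lists
    second_pair = "JP" in lists or "OB" in lists
    return (first_pair and second_pair) or "PH" in lists


print(on_consensus_list({"SC", "JP"}))  # one list from each pair
print(on_consensus_list({"SC", "AB"}))  # both from the same pair
print(on_consensus_list({"PH"}))        # PH alone is enough
```

Under this rule a domain on SC and AB only would not qualify, since
both lists fall in the same pair, while a PH listing qualifies by itself.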

> I think the percentile based lists are probably the best way to go -
> ie. top 50% of all requested surbl listed domains or something like
> that?

Percentiles are good, but they're only possible when you have
frequencies of reports, queries, etc.  The only list I have
report frequencies for is SC, so it's not possible for me to
compare percentiles across other lists.

One thing we could take percentiles on is DNS queries, and
that could be useful, but it doesn't exclude FPs.  If we
didn't whitelist w3.org for example, it would have lots of
DNS query FPs.  Frequencies of DNS query hits against
blocklists could get us an approximation of the "top
spammers" with some possible FPs included among the most
frequent queries.
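
As a rough sketch of the DNS query idea, assuming we had per-domain
counts of query hits available as a simple mapping: drop whitelisted
domains, rank the rest by count, and keep the top fraction. The
function name, the sample domains and counts, and the choice to take
the fraction over domains rather than over query volume are all
assumptions made for illustration.

```python
import math


def top_queried(query_counts, whitelist, fraction=0.5):
    """Return the most frequently queried domains, excluding the whitelist.

    query_counts: dict of domain -> number of DNS query hits (hypothetical data).
    whitelist: set of domains that must never be listed.
    fraction: share of the remaining domains to keep, ranked by count.
    """
    candidates = {d: n for d, n in query_counts.items() if d not in whitelist}
    # Sort by descending count; break ties by name so the order is stable.
    ranked = sorted(candidates, key=lambda d: (-candidates[d], d))
    keep = math.ceil(len(ranked) * fraction)
    return ranked[:keep]


# Made-up counts: w3.org is queried heavily but is legitimate.
counts = {
    "spam-a.example": 900,
    "w3.org": 800,
    "spam-b.example": 300,
    "spam-c.example": 20,
    "spam-d.example": 5,
}
print(top_queried(counts, whitelist={"w3.org"}))
```

Without the whitelist, w3.org would rank second here. The whitelist
only removes FPs we already know about, so a list built this way could
still contain others.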

> We should probably work on developing some more diverse spamtrap
> feeds.  Quite a lot of ISPs have well established spamtraps that they
> are either not using or are completely underutilising.

> Lists like SC, AB and JP all seem to be good data sources, but if you
> were trying to be certain of zero FPs you'd need something to reliably
> and continuously rebuild your data against and from.

More traps and more data are definitely desirable, but we're
also interested in seeing if we can make smarter use of the
existing data, so thanks for your suggestions.

Jeff C.
"If it appears in hams, then don't list it."
