I probably should have introduced this second SURBL list
that can be used together with or in place of sc.surbl.org
before mentioning that its name was changing from sa.surbl.org
to ws.surbl.org. :-) Note that the two lists have different
data sources, so strictly speaking one is not a replacement for
the other. They're two different lists. sc uses URI domains
from SpamCop reports. The data source for ws data is described
below. Both lists have merits and we'd encourage you to consider
trying both.
Here's an announcement with the additional update that
we've changed the *sample rule names* for the ws list to use
"WS" instead of "SA":
__
http://www.surbl.org/ (with some live links)
More SURBL lists
In addition to the first SpamCop URI-derived SURBL sc.surbl.org, we
are pleased to host another RBL compatible with the SpamCopURI or
URIDNSBL SpamAssassin plugins, or any other software that can
check message body domains against a name-based RBL. Data for the
second SURBL ws.surbl.org comes from the domains in Bill Stearns'
SpamAssassin blacklist: sa-blacklist. This is a large list of
spam domains, including those found in spam message body URIs.
Both ws.surbl.org and sc.surbl.org SURBLs can be used in the same
SA installation by using two sets of rules.
An SA 2.63 rule and score using SpamCopURI (but not the SpamCop
data!) looks like this:
uri WS_URI_RBL eval:check_spamcop_uri_rbl('ws.surbl.org','127.0.0.2')
describe WS_URI_RBL URI's domain appears in sa-blacklist database at ws.surbl.org
tflags WS_URI_RBL net
score WS_URI_RBL 3.0
An SA 3.0 rule and score using URIBL's urirhsbl looks like this:
urirhsbl URIBL_WS_SURBL ws.surbl.org. A
header URIBL_WS_SURBL eval:check_uridnsbl('URIBL_WS_SURBL')
describe URIBL_WS_SURBL Contains a URL listed in the WS SURBL blocklist
tflags URIBL_WS_SURBL net
score URIBL_WS_SURBL 3.0
More details about ws.surbl.org are available in the section
"Additional SURBLs for spam URI testing" (copied below).
Please note that the name of this list is being changed from
sa.surbl.org to ws.surbl.org. If you were using the old name in
your rules please update them to the new name.
...
Additional SURBLs for spam URI testing
Additional SURBLs that list domains occurring in spam message
bodies may be used with the same routines that use the
sc.surbl.org RBL.
sa-blacklist available as RBL: ws.surbl.org
In cooperation with Bill Stearns, SURBL is making his
sa-blacklist SpamAssassin blacklist available as the RBL
ws.surbl.org. It can be used in the same way as sc.surbl.org, for
example by adding urirhsbl and SpamCopURI rules as described in
the Quick Start section at the top of this document. Like sc,
ws.surbl.org is available through DNS and, for large-volume mail
servers, as rsynced BIND and rbldns zone files. Raymond
Dijkxhoorn has graciously agreed to host the ws.surbl.org zone
files from his rsync server along with sc.surbl.org's. Please
contact him at rsync(a)surbl.org for rsync access.
Both sc and ws RBLs can be used in the same installation. The
choice of using either or both or none is yours. Their data
differs somewhat, and we'll try to briefly describe and link some
of the differences here. Bill's list is rather large at about
9600 domains. It consists of domains found in spam message body
URIs and some spam sender and spam operator domains. Given that
the former are more relevant to isolate these days, most of the
recent additions to Bill's list have been URI domains. Those are
also the domains most relevant for use with the message body
checking approach which we propose throughout this site.
The data in sa-blacklist and therefore ws.surbl.org differ from
the SpamCop URI report data described above in that the list is
about ten times larger, more stable, and may have a slightly
higher false positive rate. Bill's policy for inclusion and
cleaning of the sa-blacklist is quite sound, however, so folks
should feel comfortable giving this list a try in addition to the
sc list. ws may currently detect some spam that sc misses, and
vice versa, but it's worth mentioning that the current sc is a
working prototype and that we expect the performance of sc to
improve as we tune the sc data engine further. sc just got out of
the gate, yet it already has some worthy competition in ws.
Thanks Bill!
Because ws is larger and more stable, its zone file gets a
six-hour TTL, compared to 10 minutes for sc. Due to the
differences between the time scales, sizes, and data sources of
ws and sc, we probably won't be offering a combined ws plus sc
list. For example, it would be difficult to say what TTL a merged
list should get, and you probably would not want a megabyte-plus
BIND zone file refreshing every 10 minutes. For those using
rsynced zone files that would probably not be an issue, but for
those using BIND, the DNS traffic could well be.
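To put rough numbers on the TTL mismatch, here is a small back-of-the-envelope sketch. The 1 MB zone size is only an assumption standing in for "a megabyte plus", and it treats every TTL expiry as one full refresh, which is a simplification:

```python
# Rough refresh arithmetic for the two TTLs mentioned above.
SECONDS_PER_DAY = 24 * 60 * 60

WS_TTL = 6 * 60 * 60   # six hours, in seconds
SC_TTL = 10 * 60       # ten minutes, in seconds
ZONE_SIZE_MB = 1.0     # assumption: stands in for "a megabyte plus"


def refreshes_per_day(ttl_seconds):
    """How many times per day data with this TTL expires and is re-fetched."""
    return SECONDS_PER_DAY // ttl_seconds


ws_refreshes = refreshes_per_day(WS_TTL)
sc_refreshes = refreshes_per_day(SC_TTL)

# Daily transfer if a merged zone of ZONE_SIZE_MB were refreshed on sc's schedule.
merged_mb_per_day = sc_refreshes * ZONE_SIZE_MB

print(ws_refreshes, sc_refreshes, merged_mb_per_day)
```

Under those assumptions a merged list on sc's schedule would refresh 144 times a day instead of 4, which is the mismatch described above.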
We encourage you to give ws.surbl.org a try.
Please note that the name of this list is being changed from
sa.surbl.org to ws.surbl.org. If you were using the old name in
your rules please update them to the new name.
Jeff C.
--
Jeff Chan
mailto:jeffc@surbl.org
http://www.surbl.org/
Given the probable need to improve whitelisting, I've added a
log of domains that would go onto sc.surbl.org but are then
prevented from getting onto the list by the whitelist(s):
http://www.surbl.org/whitelist-hits.new.log
That goes along with the log of new additions to sc.surbl.org,
i.e., essentially a blacklisting log:
http://www.surbl.org/top-sites-domains.new.log
I've also grabbed a copy of 500 popular web site domains for
addition to the whitelist. A couple of the recent whitelist hits
have been from it. So far they seem reasonable.
Whitelisting will continue in the next version of the engine,
hopefully with some larger data sets.
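As a minimal sketch of the whitelist step described above (this is not the actual SURBL engine; the function name and the sample domains are hypothetical), candidate domains can be split into those that go on the list and those that get logged as whitelist hits:

```python
def apply_whitelist(candidates, whitelist):
    """Split candidate domains into (to_list, whitelist_hits), keeping order.

    Matching here is exact on the normalized domain; a real engine might
    also want to match subdomains, which this sketch does not do.
    """
    wl = {d.lower().rstrip(".") for d in whitelist}
    to_list, whitelist_hits = [], []
    for domain in candidates:
        normalized = domain.lower().rstrip(".")
        if normalized in wl:
            whitelist_hits.append(normalized)
        else:
            to_list.append(normalized)
    return to_list, whitelist_hits


# Hypothetical sample data, for illustration only.
listed, hits = apply_whitelist(
    ["spam-example.com", "Example.org", "pills-example.net"],
    ["example.org"],
)
print(listed)
print(hits)
```

The second list corresponds to what would be written to a whitelist-hits log, the first to the blacklisting log.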
Blacklisting based on SpamCop URI domain data will hopefully
be more stable and broader in the next version also. In other
words, there should be significantly less activity on the
blacklist log since the list itself will be more stable.
(For example under the current system you may see some domains
that come off the list then get back on it.... Pay no attention
to the man behind the curtain... :-) There should be a lot less
of that.)
Jeff C.
--
Jeff Chan
mailto:jeffc@surbl.org
http://www.surbl.org/
A question about whether SpamCopURI would support using the
alternative SURBL ws.surbl.org came up, so I thought I'd address
that for everyone. Any program that knows how to extract URIs
from message bodies, then domains from the URIs, then compare
those domains against an RBL can use any or all of the SURBL
lists. Therefore SpamCopURI will work with ws.surbl.org just
fine. (Noting of course that the ws results won't necessarily
be related to the SpamCop-derived data in the sc list.)
All you need to do is add a rule with the name of that list:
uri SA_URI_RBL eval:check_spamcop_uri_rbl('ws.surbl.org','127.0.0.2')
describe SA_URI_RBL URI's domain appears in sa-blacklist database at ws.surbl.org
tflags SA_URI_RBL net
score SA_URI_RBL 3.0
(Likewise in SpamAssassin 3.0 with urirhsbl:)
urirhsbl URIBL_SA_SURBL ws.surbl.org. A
header URIBL_SA_SURBL eval:check_uridnsbl('URIBL_SA_SURBL')
describe URIBL_SA_SURBL Contains a URL listed in the SA SURBL blocklist
tflags URIBL_SA_SURBL net
score URIBL_SA_SURBL 3.0
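For programs other than SpamAssassin, the general approach described above (extract URIs, take their domains, compare against the RBL) can be sketched roughly as follows. This is only an illustration, not how SpamCopURI or urirhsbl are implemented. All function names are hypothetical, the two-label domain reduction is a naive assumption that is wrong for names like example.co.uk, and the 127.0.0.2 answer is taken from the rule above:

```python
import re
import socket
from urllib.parse import urlsplit

# Loose pattern for http(s) URIs in plain text; real mail needs more care.
URI_RE = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def extract_hosts(body):
    """Return the distinct hostnames of http(s) URIs in a body, in order."""
    hosts = []
    for uri in URI_RE.findall(body):
        try:
            host = urlsplit(uri).hostname
        except ValueError:
            continue  # malformed URI, skip it
        if host:
            host = host.rstrip(".")
            if host not in hosts:
                hosts.append(host)
    return hosts


def registered_domain(host):
    """Naive assumption: keep only the last two labels of the hostname."""
    return ".".join(host.split(".")[-2:])


def query_name(domain, zone="ws.surbl.org"):
    """Name-based RBL lookup: the domain is prepended to the zone name."""
    return domain + "." + zone


def is_listed(domain, zone="ws.surbl.org", resolve=socket.gethostbyname):
    """True if the A record lookup for the domain in the zone gives 127.0.0.2."""
    try:
        return resolve(query_name(domain, zone)) == "127.0.0.2"
    except OSError:
        return False  # no such record (or lookup failure): treat as not listed
```

Passing a different zone name, for example "sc.surbl.org", checks the other list with the same code. The resolve parameter is there so the lookup can be replaced by a stub when testing without DNS.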
You can run either SURBL or both if you like. Note that
ws has a higher spam detection rate (currently) but also
a somewhat higher false positive rate than sc. Here's
a corpus check Dan Quinlan ran:
OVERALL%    SPAM%     HAM%    S/O   RANK  SCORE  NAME
   11189     1200     9989  0.107   0.00   0.00  (all messages)
 100.000  10.7248  89.2752  0.107   0.00   0.00  (all messages as %)
   6.095  56.2500   0.0701  0.999   1.00   1.00  URIBL_SC_SURBL
   6.855  59.7500   0.5006  0.992   0.98   1.00  URIBL_SBL
   9.545  72.8333   1.9421  0.974   0.95   0.01  T_URIBL_SA_SURBL
   0.116   0.5000   0.0701  0.877   0.58   0.01  T_URIBL_DSBL
SA_SURBL above reflects the old name for ws; SC_SURBL is
sc.surbl.org. ws detected ~73% of spams in the spam corpus
with a ~1.9% FP rate in the ham corpus. sc detected ~56%
with a <0.1% FP rate.
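The figures in that corpus check can be cross-checked with a little arithmetic. The sketch below assumes S/O is computed as SPAM% / (SPAM% + HAM%), which is not stated in the output itself but does reproduce the S/O column, and it converts the percentages back to approximate message counts using the corpus sizes in the first row:

```python
SPAM_TOTAL = 1200   # spam messages in the corpus (first row of the table)
HAM_TOTAL = 9989    # ham messages in the corpus (first row of the table)


def s_over_o(spam_pct, ham_pct):
    """Assumed definition: share of a rule's hit rate that comes from spam."""
    return spam_pct / (spam_pct + ham_pct)


def hit_count(pct, corpus_size):
    """Convert a percentage of a corpus back to a whole number of messages."""
    return round(pct / 100 * corpus_size)


# (SPAM%, HAM%) for the two SURBL rules, copied from the table.
rules = {
    "URIBL_SC_SURBL": (56.2500, 0.0701),
    "T_URIBL_SA_SURBL": (72.8333, 1.9421),
}

for name, (spam_pct, ham_pct) in rules.items():
    spam_hits = hit_count(spam_pct, SPAM_TOTAL)
    ham_hits = hit_count(ham_pct, HAM_TOTAL)
    print(name, spam_hits, ham_hits, round(s_over_o(spam_pct, ham_pct), 3))
```

By this arithmetic sc hit 675 spams and 7 hams, while ws (under its old name) hit 874 spams and 194 hams.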
We're still tuning how the SpamCop data is used, so hopefully
the sc hit rates will improve and the FPs decrease in the
next version of the sc data engine.
Cheers,
Jeff C.
--
Jeff Chan
mailto:jeffc@surbl.org
http://www.jeffchan.com/
Hello SURBL users,
Please note that the name of the SURBL derived from Bill Stearns'
sa-blacklist is being changed from sa.surbl.org to ws.surbl.org .
If you were using the old name in your rules or configs please
update them to the new name.
We will keep DNS queries up on the old name for a week or so but
will probably drop them after that. This is only a name change
for that list. Functionality should remain the same.
Cheers,
Jeff C.
--
Jeff Chan
mailto:jeffc@surbl.org
http://www.jeffchan.com/
Devin Carraway has written a plugin for the Perl-based MTA qpsmtpd
to compare domains from message body URIs to SURBL domain
lists. Here's his announcement of what I believe is the first
MTA use of SURBL. Congrats and thanks to Devin!
__
Date: Tue, 13 Apr 2004 02:07:15 -0700
From: Devin Carraway <qpsmtpd(a)devin.com>
Subject: qpsmtpd plugin
Saw today's slashdot article on SURBL -- glad to see someone's taken up
the idea. I had thought of something similar, but somehow hadn't
connected it with "oh yeah, they're already hostnames, make a DNSBL out
of it."
You commented that it'd be nice to see support for it in MTAs, so I
wrote a plugin for qpsmtpd to do it. Qpsmtpd, if you haven't
encountered it, is a replacement smtpd for qmail and postfix, with a
primary emphasis on detecting and declining spam during the initial SMTP
transaction.
http://www.nntp.perl.org/group/perl.qpsmtpd/1216
http://devin.com/qpsmtpd/uribl
--
Jeff Chan
mailto:jeffc@surbl.org-nospam
http://www.surbl.org/
SpamCop's Spamvertised sites page is up but not currently
serving data. I've taken this opportunity to make sure that
the SURBL engine does the right thing when there's no new data
coming in. When that happens the sc.surbl.org list stays
unchanged except for domains that may come off the list due
to expiration of old reports.
Once the data feed is up again, sc.surbl.org should pick up
where it left off and things should continue to operate normally.
As an aside, the next version of the data engine will have a much
longer memory, especially of spam domains and IP addresses so
there won't be nearly as much churn in the domains. There will
also be more domains on the list.
Cheers,
Jeff C.
--
Jeff Chan
mailto:jeffc@surbl.org
http://www.surbl.org/
Made Slashdot:
http://slashdot.org/
A New Type Of Realtime Blocklist: The SURBL
Posted by timothy on Monday April 12, @05:02PM
from the chicken-egg-spam dept.
Glamdrlng writes "The SURBL, or "Spam URI Realtime Blocklist",
represents a nexus of RBL's and content filtering that may bring
us one step closer to a spam magic bullet. While traditional
RBL's perform a DNS lookup on the connecting mail server, SURBL's
take this a step further by parsing the text of the email looking
for URI's and doing a lookup on those web servers. They also
prevent "joe jobs" by maintaining a whitelist of legitimate web
servers whose domain names may show up in spam messages, e.g.
EBay, Paypal, Microsoft, etc. The only requirement to implement
the SURBL is a plugin on your MTA such as spamassassin that can
parse the body of each email. While there is no MTA that directly
supports SURBL's without a plugin, the author hints at one being
in development."
http://yro.slashdot.org/yro/04/04/12/1956252.shtml?tid=111&tid=126&tid=95
Jeff C.
--
Jeff Chan
mailto:jeffc@surbl.org
http://www.surbl.org/
It may be worth mentioning that I fixed my typo in the text
record. Or not. ;-)
> On Sunday, April 11, 2004, 6:32:49 AM, William Stearns wrote:
>> On Sun, 11 Apr 2004, Jeff Chan wrote:
>>> "Message body contains domain in sa-backlist. See: http://www.stearns.org/sa-blacklist/"
>
>> Looks good, except sa-backlist needs another "l". *smile*
>
> Indeed it does. Fixed. Thanks! LOL!
It now reads:
"Message body contains domain in sa-blacklist. See: http://www.stearns.org/sa-blacklist/"
Jeff C.
--
Jeff Chan
mailto:jeffc@surbl.org
http://www.surbl.org/