On Friday, July 2, 2004, 6:06:16 AM, Don Newcomer wrote:
Here are my counts since 6:50 PM yesterday for all URI_RBL rules sorted by spam and ham:
URI_RBL spam counts:
3577 - AB_URI_RBL (5.0) - surbl.cf
2499 - DS_URI_RBL (0.33) - surbl.cf
7282 - OB_URI_RBL (4.0) - surbl.cf
4279 - SPAMCOP_URI_RBL (3.0) - surbl.cf
5458 - WS_URI_RBL (3.0) - surbl.cf
URI_RBL ham counts:
231 - DS_URI_RBL (0.33) - surbl.cf
18 - OB_URI_RBL (4.0) - surbl.cf
1 - SPAMCOP_URI_RBL (3.0) - surbl.cf
29 - WS_URI_RBL (3.0) - surbl.cf
Interesting that AB_URI_RBL has no false positives yet... Still, we haven't released spam filtering to our users yet so my Bayes training is based pretty much on all of the SA rulesets' interpretation of spam (which isn't necessarily a bad thing).
Thanks much for the data, Don, particularly the false positive hits. Does anyone else have any to share? If so, please post them here.
ab.surbl.org is based on SpamCop data plus some manual reports, as is sc.surbl.org, but ab has different inclusion criteria: it takes the top 500 most often reported domains (less www. duplicates and whitelist hits) over 7 days, whereas sc has an arbitrary inclusion threshold of 10 reports over 4 days. 1 FP for sc is pretty good, though zero is better. :-)
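To make the difference between the two inclusion rules concrete, here is a rough sketch in Python. It is not the actual list-building code: the function names, the (domain, day) report format with integer day numbers, and the whitelist handling are all assumptions made for illustration. Only the numbers (top 500 over 7 days, 10 reports over 4 days) come from the description above.

```python
from collections import Counter

def normalize(domain):
    # Assumption: "www." duplicates are folded into the bare domain.
    domain = domain.lower()
    return domain[4:] if domain.startswith("www.") else domain

def count_reports(reports, today, window_days, whitelist=frozenset()):
    # reports is a hypothetical iterable of (domain, day_number) pairs.
    counts = Counter()
    for domain, day in reports:
        name = normalize(domain)
        if 0 <= today - day < window_days and name not in whitelist:
            counts[name] += 1
    return counts

def ab_style(reports, today, top_n=500, window_days=7, whitelist=frozenset()):
    # Rank-based rule: keep the top_n most often reported domains.
    counts = count_reports(reports, today, window_days, whitelist)
    return [name for name, _ in counts.most_common(top_n)]

def sc_style(reports, today, min_reports=10, window_days=4, whitelist=frozenset()):
    # Threshold rule: keep every domain with at least min_reports reports.
    counts = count_reports(reports, today, window_days, whitelist)
    return sorted(name for name, n in counts.items() if n >= min_reports)
```

The practical difference is that the rank-based rule always yields a list of bounded size no matter how many reports arrive, while the threshold rule grows or shrinks with report volume.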
ob is pretty impressive in terms of hit rate and relatively low FP rate, at least as a percentage of hits.
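For anyone who wants to compare the lists on the same footing, here is a small Python sketch that works out ham hits as a percentage of all hits for each rule, using Don's counts above. Note that this is only the share of a rule's hits that landed on ham, not a true false positive rate, since the total number of ham messages isn't known. The variable and function names are just for illustration.

```python
# Hit counts copied from Don's post; AB_URI_RBL had no ham hits.
spam_hits = {
    "AB_URI_RBL": 3577,
    "DS_URI_RBL": 2499,
    "OB_URI_RBL": 7282,
    "SPAMCOP_URI_RBL": 4279,
    "WS_URI_RBL": 5458,
}
ham_hits = {
    "DS_URI_RBL": 231,
    "OB_URI_RBL": 18,
    "SPAMCOP_URI_RBL": 1,
    "WS_URI_RBL": 29,
}

def ham_share(rule):
    """Percentage of a rule's total hits that were on ham."""
    spam = spam_hits[rule]
    ham = ham_hits.get(rule, 0)
    return 100.0 * ham / (spam + ham)

# Print the rules from lowest to highest ham share.
for rule in sorted(spam_hits, key=ham_share):
    print(f"{rule:16s} {spam_hits[rule]:5d} spam "
          f"{ham_hits.get(rule, 0):4d} ham {ham_share(rule):6.2f}%")
```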
Note that ds.surbl.org (based on 6dos data) is now up on 5 name servers so it may be ok to use on production servers for beta testing.
Please note that I probably won't be able to check email for about a week, so hopefully others will help answer SURBL questions, etc.
Cheers,
Jeff C.