On Friday, June 25, 2004, 1:04:21 AM, Jeff Chan wrote:
On Friday, June 25, 2004, 12:16:17 AM, Justin Mason wrote:
My results, testing ph while I was at it:
: jm 1351...; grep -v " 0.500 " freqs
OVERALL%    SPAM%     HAM%     S/O   RANK  SCORE  NAME
  121405    22516    98889   0.185   0.00   0.00  (all messages)
 100.000  18.5462  81.4538   0.185   0.00   0.00  (all messages as %)
  13.453  70.3766   0.4925   0.993   1.00   1.00  SURBL_WS
   3.807  20.3811   0.0334   0.998   0.50   1.00  SURBL_SC
   2.650  14.2565   0.0071   1.000   0.50   1.00  SURBL_AB
   0.019   0.0933   0.0020   0.979   0.50   1.00  SURBL_PH
  12.624  67.6275   0.1001   0.999   0.50   1.00  SURBL_OB2
  13.295  68.5113   0.7230   0.990   0.00   1.00  SURBL_OB
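For anyone not used to reading mass-check freqs output: the S/O column is the spam/overlap ratio, i.e. the fraction of a rule's hits that land on spam rather than ham. A minimal sketch of that calculation, using the per-class hit percentages from the table above:

```python
def spam_overlap(spam_pct: float, ham_pct: float) -> float:
    """S/O ratio: the fraction of a rule's hits that are on spam,
    computed from the per-class hit percentages (SPAM% and HAM%)."""
    return spam_pct / (spam_pct + ham_pct)

# Figures taken from the freqs table above.
print(round(spam_overlap(67.6275, 0.1001), 3))  # SURBL_OB2 -> 0.999
print(round(spam_overlap(68.5113, 0.7230), 3))  # SURBL_OB  -> 0.990
```

An S/O near 1.000 with a non-trivial OVERALL% is what you want in a rule: nearly every hit is on spam.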
OB2 looks pretty promising ;)
Thanks much for the data. Yusuf from Outblaze said to use the ob2 data source, so I will switch ob over to it unless there are any results or comments to the contrary.
Judging by Justin's results above, the revised data source has nearly the same spam detection rate (67.63% vs. 68.51% of spam) but a much lower false positive rate (0.10% vs. 0.72% of ham), so I agree it's the better list.
My understanding is that the older Outblaze data source reflected in the original ob.surbl.org included spam sender domains used by Outblaze to block message envelopes (headers). The revised list should have only spamvertised site domains (domains from message body URIs). This is more appropriate for SURBL use.
Therefore, I've changed ob.surbl.org to use the same revised Outblaze data source as ob2.surbl.org, so ob should now be about half as large, at roughly 20k domains, and should perform better. The ob2 list will go away eventually since it was only meant for testing and now has the same content as ob.
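For anyone who wants to spot-check the renamed list directly: SURBL zones are queried like any DNSBL, by prepending the registrar-boundary domain to the zone name and doing an A-record lookup, with any 127.0.0.x answer meaning the domain is listed. A minimal sketch, assuming ob.surbl.org behaves this way (the domain shown is just a placeholder):

```python
import socket

def surbl_query_name(domain: str, zone: str = "ob.surbl.org") -> str:
    """Build the DNS name to look up for a SURBL-style check."""
    return f"{domain}.{zone}"

def is_listed(domain: str, zone: str = "ob.surbl.org") -> bool:
    """True if the zone returns a 127.0.0.x A record for the domain,
    i.e. the domain is on the list; NXDOMAIN means not listed."""
    try:
        addr = socket.gethostbyname(surbl_query_name(domain, zone))
        return addr.startswith("127.0.0.")
    except socket.gaierror:  # NXDOMAIN or lookup failure
        return False

# Example query name: "spamdomain.example.ob.surbl.org"
print(surbl_query_name("spamdomain.example"))
```

Note that you should query the registered domain (e.g. example.com, not www.example.com); subdomain handling is up to the client doing the URI extraction.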
If anyone else is able to share test results for ab, ob or ob2, please speak up.
Jeff C.