[SURBL-Discuss] RFC: pj.surbl.org - list from Joe Wein and Prolocation data

Raymond Dijkxhoorn raymond at prolocation.net
Sun Sep 19 11:11:20 CEST 2004


Hi!

>> Remember that the PJ records are already in multi, as
>> part of WS

> That's cheating.  If the WS bit is set I'd expect a WS
> entry, with the WS policy and whitelisting instructions.
>
> Sure, at the moment there are no different whitelisting
> instructions for the MULTI sets, but that's not obvious.
> And sooner or later it will change.

There is generic whitelisting on *ALL* SURBL lists, and that's done at a
central level. That will be the most important mask, since all lists pass
through it.
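
As a rough sketch of that central step (the list contents, the domains and
the function name are made up for illustration, this is not the real SURBL
data or tooling), the same whitelist is subtracted from every list before
anything is published:

```python
# Hypothetical sketch: one central whitelist applied to every list.
# All domains and list contents below are invented examples.

def apply_central_whitelist(lists, whitelist):
    """Return a copy of each list with whitelisted domains removed."""
    wl = {d.lower() for d in whitelist}
    return {
        name: {d.lower() for d in domains} - wl
        for name, domains in lists.items()
    }

lists = {
    "ws": {"spam-example.com", "example.org"},
    "pj": {"spam-example.com", "other-spam.example"},
}
whitelist = {"Example.org"}  # matching is case-insensitive

cleaned = apply_central_whitelist(lists, whitelist)
```

Because the mask sits in one place, a whitelisted domain drops out of every
list at once, whichever source it came from.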

>> I actually wanted the JW data to be separate in the
>> beginning because it was a distinctly different and new
>> data source with a different inclusion process, different
>> spamtrap feeds, etc.

> If it's really very different, then it's also good enough
> for its own MULTI bit.  But a different set of spamtraps
> is no real difference.  A different policy for inclusions,
> exclusions, or whitelisting is interesting.

The dataset is much smaller, yet it seems to have a lower FP rate. Theo (SA)
and some others, including myself, ran large checks and found the
same.

Today's stats, though they cover only 11 hours of real-life data:

SpamAssassin tag hits: (top 100)
#1	53053	URIBL_WS_SURBL
#2	51711	URIBL_PJ_SURBL
#3	51702	URIBL_SBL
#4	49008	BAYES_99
#5	48227	URIBL_OB_SURBL
#6	45620	RCVD_IN_BL_SPAMCOP_NET
#7	45489	HTML_MESSAGE
#8	35014	URIBL_SC_SURBL
#9	29758	URIBL_AB_SURBL
#10	27992	MIME_HTML_ONLY

The WS stats are still for the combined lists. I also ran tests with a special
zonefile, compiled for this test, in which the PJ data was taken out of WS.
There PJ performed better than the whole WS. That was my main reason to
propose a separate list. It's smaller, catches more than the combined list,
and has a lower FP rate than the combined list.
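
A minimal sketch of that kind of comparison, assuming each message is
reduced to the set of domains found in its URIs. The domains, the tiny
corpora and the `score` helper are invented for illustration and are not
the actual test setup or its results:

```python
# Hypothetical sketch: score a combined list, the PJ part, and the
# remainder against small hand-made spam and ham corpora.

def score(blocklist, spam_msgs, ham_msgs):
    """Return (spam messages hit, ham messages hit i.e. false positives)."""
    hits = sum(1 for domains in spam_msgs if blocklist & domains)
    fps = sum(1 for domains in ham_msgs if blocklist & domains)
    return hits, fps

ws_combined = {"a.example", "b.example", "c.example", "news.example"}
pj = {"a.example", "b.example"}
ws_without_pj = ws_combined - pj  # the "special zonefile" idea

spam_msgs = [{"a.example"}, {"b.example"}, {"a.example", "c.example"}]
ham_msgs = [{"news.example"}, {"fine.example"}]

results = {
    "ws_combined": score(ws_combined, spam_msgs, ham_msgs),
    "pj": score(pj, spam_msgs, ham_msgs),
    "ws_without_pj": score(ws_without_pj, spam_msgs, ham_msgs),
}
```

Splitting the zone this way keeps the hit and FP counts of each part apart,
which the combined WS numbers cannot show.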

Bye,
Raymond.

