[SURBL-Discuss] Re: quick poll on SURBL hit %

Chris Santerre csanterre at MerchantsOverseas.com
Tue Jan 18 18:21:51 CET 2005


I'm replying to both as I don't know if Terry is on the list.....

>-----Original Message-----
>From: Terry Sullivan [mailto:terry at pantos.org]
>Sent: Monday, January 10, 2005 11:55 AM
>To: discuss at lists.surbl.org
>Subject: [SURBL-Discuss] Re: quick poll on SURBL hit %
>
>
>Chris Santerre wrote:
>
>>Just curious as to what average percent of spam people see SURBL 
>>hitting. In a non-scientific manner, I average about 85% ...
>
>I've run multiple analyses on historical datasets, and get a
>consistent *average* of 82%-86%, so 84% is a decent estimate.
>
>The most noteworthy statistical characteristic of the SURBL hit rate
>over time is the large *variance* in hit rate.  Some days, the SURBL
>hit rate I observe in my data is in the 60s, while other days it's in
>the 90s.  The fluctuation appears to be at least somewhat periodic in
>nature (several "low" days in a row, followed by several "high" days).
>I've not actually run the numbers, but my totally informal, *purely
>gut* sense is that the magnitude of that variance may have diminished
>lately, but the periodic pattern persists.  These periodic
>fluctuations imply that there is probably some systematic cause
>underlying this variance, and that cause is itself almost certainly
>periodic in nature.

That is interesting! I wonder if the hit rate has become a metric for actual
spam traffic? Could it coincide with weekends? Don't suppose you could graph
that data over a 365-day period?
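
A rough sketch of how the weekend question could be checked without a graph:
compute a per-day hit rate, then look at the autocorrelation of that series at
a lag of 7 days. A strong positive value at lag 7 would be consistent with a
weekly cycle. The function names and the synthetic series in the usage note
are assumptions for illustration only, not Terry's data or method.

```python
from statistics import mean


def daily_hit_rates(records):
    """Turn (day, was_hit) pairs into a list of hit rates ordered by day.

    `day` can be any sortable key (a date, an integer day index, ...).
    """
    totals = {}
    for day, was_hit in records:
        hits, seen = totals.get(day, (0, 0))
        totals[day] = (hits + int(was_hit), seen + 1)
    return [totals[d][0] / totals[d][1] for d in sorted(totals)]


def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given lag (in days)."""
    if lag <= 0 or lag >= len(series):
        return 0.0
    m = mean(series)
    denom = sum((x - m) ** 2 for x in series)
    if denom == 0:
        return 0.0  # a constant series has no variance to correlate
    num = sum(
        (series[i] - m) * (series[i + lag] - m)
        for i in range(len(series) - lag)
    )
    return num / denom


def strongest_lag(series, max_lag=14):
    """Return the lag in 1..max_lag with the highest autocorrelation."""
    lags = range(1, min(max_lag, len(series) - 1) + 1)
    return max(lags, key=lambda k: autocorrelation(series, k))
```

For example, a made-up series with five "high" days followed by two "low"
days, repeated for eight weeks (`[0.9] * 5 + [0.65] * 2` times 8), gives its
strongest autocorrelation at lag 7. Real data would be noisier, so the peak
would be less clean.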


>
>>I have a feeling if I clean up my
>>results a bit, that number would be even higher. 
>
>I've talked about this with Jeff several times, and he's even shared
>some of my comments with this list.  No one in the anti-spam world
>likes hearing this, but there is very strong evidence of a "hard"
>statistical detection limit right around 85%.  This limit appears to
>be more or less independent of data set or detection method.

Actually Jeff and I have discussed this, and I finally understood it :) I
also agree with the 85% rule of yours. And we seem to be hitting it very
nicely! I'm not sure Bayes even hits that close to 85%! 

--Chris 

