[SURBL-Discuss] This ROCKS!

Chris Santerre csanterre at merchantsoverseas.com
Fri Apr 30 14:59:07 CEST 2004

>-----Original Message-----
>From: jm at jmason.org [mailto:jm at jmason.org]
>Sent: Friday, April 30, 2004 1:10 PM
>To: SURBL Discussion list
>Cc: SpamAssassin-dev at incubator.apache.org
>Subject: Re: [SURBL-Discuss] This ROCKS!
>Doc Schneider writes:
>>I am on my days off and hope to get some type of stats gathering ditty
>>going. But not sure what stats folks want to see. I had some
>>conversations with Chris "The Big Evil One" and he had some ideas.
>>Am just not sure what direction to pursue.
>>I did join the dev list for SA and sort of got talked into checking the
>>docs for SA 3.0 which I'm slowly working on. Just checking them for
>>proper API coverage in the different modules it is using.
>>So my question to you folks is what sort of statistics would you like to
>>see? Number of rules hit, even if a ham (I always get hungry when
>>talking about ham and spam but I digress) or just total number and the
>>actual rules that are being hit for all spam? And isn't there
>>already in SA that does these rules and hits? corporra(sic and too lazy
>>to look).
>Hi Doc -- cc'ing sa-dev, since it's really a SpamAssassin thing
>rather than a SURBL thing ;)
>I'm not sure if you mean measuring rule accuracy in advance to pick good
>scores, or reporting stats after the fact for sysadmins.
>For the first one, read:
>  http://wiki.apache.org/spamassassin/MassCheck
>  http://wiki.apache.org/spamassassin/HitFrequencies
>For the second:
>We recently added some additional stats output to spamd in SpamAssassin
>3.0.0.  This should improve the accessibility of info about rules
>being used during scanning for tools to summarise.
>--j.

Hey there guys! This was the crazy idea I was discussing with Doc. I wished
for a realtime DB or flat file that is updated continuously on rule
hits. No grepping thru logs or anything. Simply, when an email is sent thru
SA, whatever rules hit, increase a counter in a db or flat file for that
rule. Separate db or flat file for ham and spam. This gives live stats on a
system. No grep'n going on. Just a counter per rule.
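
A minimal sketch of the per-rule counter idea, in Python with SQLite. This
is only an illustration, not anything SpamAssassin ships: the function
names, the table layout, the file names and the EXAMPLE_* rule names are all
made up, and how the list of hit rules gets out of SA for each message is
left open.

```python
import sqlite3


def open_counter_db(path):
    """Open (or create) a SQLite file holding one hit counter per rule."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS rule_hits ("
        "rule TEXT PRIMARY KEY, hits INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def record_hits(conn, rules):
    """Add one hit for each distinct rule name that fired on one message."""
    with conn:  # one transaction per message
        for rule in set(rules):
            # Create the counter row on first sight, then bump it.
            conn.execute(
                "INSERT OR IGNORE INTO rule_hits (rule, hits) VALUES (?, 0)",
                (rule,),
            )
            conn.execute(
                "UPDATE rule_hits SET hits = hits + 1 WHERE rule = ?",
                (rule,),
            )


def hit_counts(conn):
    """Return the live counters as a {rule: hits} dict."""
    return dict(conn.execute("SELECT rule, hits FROM rule_hits"))


if __name__ == "__main__":
    # Separate stores for ham and spam (hypothetical file names).
    spam_db = open_counter_db("spam_hits.db")
    ham_db = open_counter_db("ham_hits.db")
    record_hits(spam_db, ["EXAMPLE_RULE_A", "EXAMPLE_RULE_B"])
    record_hits(ham_db, ["EXAMPLE_RULE_B"])
    print(hit_counts(spam_db))
    print(hit_counts(ham_db))
```

Reading the stats is then a single query against the counter table, with no
log parsing involved.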

This is to be used for some advanced rule writing we want to work on. It also
allows an admin to see which rules might not be worth keeping around, so they
can remove poor performers and increase system speed.
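
As a rough illustration of the pruning use, here is a Python sketch that
lists rules with few spam hits. It assumes the per-rule counts are already
available as plain dicts; the function name, the threshold and the EXAMPLE_*
rule names are hypothetical, and what counts as a "poor performer" is up to
the admin.

```python
def poor_performers(all_rules, spam_hits, ham_hits, min_spam_hits=10):
    """List (rule, spam_count, ham_count) for rules below the spam threshold.

    all_rules is every installed rule name, so rules that never hit
    (and therefore have no counter at all) still show up in the report.
    """
    report = []
    for rule in sorted(all_rules):
        spam = spam_hits.get(rule, 0)
        ham = ham_hits.get(rule, 0)
        if spam < min_spam_hits:
            report.append((rule, spam, ham))
    return report


if __name__ == "__main__":
    rules = ["EXAMPLE_RULE_A", "EXAMPLE_RULE_B", "EXAMPLE_RULE_C"]
    spam_counts = {"EXAMPLE_RULE_A": 250, "EXAMPLE_RULE_B": 3}
    ham_counts = {"EXAMPLE_RULE_B": 7}
    for rule, spam, ham in poor_performers(rules, spam_counts, ham_counts):
        print(rule, "spam:", spam, "ham:", ham)
```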


Chris Santerre 
System Admin and SARE Ninja
'It is not the strongest of the species that survives,
not the most intelligent, but the one most responsive to change.'
Charles Darwin
