[SURBL-Discuss] general questions.....

Chris Santerre csanterre at MerchantsOverseas.com
Tue Nov 23 20:01:31 CET 2004



>-----Original Message-----
>From: Jeff Chan [mailto:jeffc at surbl.org]
>Sent: Monday, November 22, 2004 8:41 PM
>To: SURBL Discussion list
>Subject: Re: [SURBL-Discuss] general questions.....
>
>
>On Monday, November 22, 2004, 5:25:14 PM, Justin Mason wrote:
>> Important to note that SURBL *can* increase its efficiency, by changing
>> its methods -- ie. adding more data sources, modifying the moderation
>> model, etc. can increase efficiency.
>
>I like to think so too, but one of Terry's hypotheses is that
>detecting spam in the remaining variance (the ~15% currently
>undetected) may require some "third dimension of spam" and that
>about half of that variance may be truly "noise" and therefore
>inherently undetectable (paraphrasing him from off-list
>discussions).  But he doesn't have data to support that claim
>yet, just empirical observations across different classification
>systems.
>
>It's good to hear that Henry Stern is getting a PhD for his
>work in this area, since it can be worthy of that honor.
>It's not a particularly easy problem.
>
>Jeff C.

Wow, that was a good email. It makes me think about things from a higher
level than the trenches. The whole thing has to be thought of in sections.
If we are thinking of JUST SURBL, then I agree that catching the remaining
15% requires more manpower thrown at the overall project. I say overall
because there are other antispam projects that support SURBL that would
also be MUCH better with more help.

Looking at it from another view, the 15% IS caught! The bigger picture is
antispam. You throw DNSRBL, SURBL, BAYES, SARE, and SA at the problem, and
classification jumps by the order of magnitude you wanted, which for most
end users can mean 99.99%. The remaining differences come down to taste in
how each person defines spam, which is a human trait that can't be removed.
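
A rough back-of-the-envelope sketch of why stacking layers helps so much,
in Python. It assumes each layer misses spam independently of the others,
which real filters do not strictly do, so treat the result as an optimistic
bound. Only the 0.85 figure (SURBL alone, about 15% undetected) comes from
this thread; the other catch rates and the function name are made up for
illustration.

```python
from math import prod


def combined_catch_rate(catch_rates):
    """Catch rate of layered filters, ASSUMING independent misses.

    A message slips through only if every layer misses it, so the
    combined miss rate is the product of the per-layer miss rates.
    """
    miss_rate = prod(1.0 - rate for rate in catch_rates)
    return 1.0 - miss_rate


# Hypothetical per-layer catch rates. Only SURBL's 0.85 is taken from
# the discussion; the rest are illustrative guesses, not measurements.
layers = {
    "SURBL": 0.85,
    "DNSRBL": 0.80,
    "Bayes": 0.95,
    "SARE rules": 0.70,
}

print(f"combined catch rate: {combined_catch_rate(layers.values()):.4%}")
```

Under that independence assumption the miss rates multiply, so even
mediocre layers get the combined total close to the numbers end users
report. Correlated misses (the same hard spam slipping past every layer)
would pull the real figure down.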

But I believe there is still a huge leap SURBL can make in classification,
with an increase in data mining, research, and a little more help from major
ISPs and registrars.

Thanks for that informative email Jeff!! You saved me a google ;)

--Chris 

