[SURBL-Discuss] general questions.....
jeffc at surbl.org
Tue Nov 23 02:40:43 CET 2004
On Monday, November 22, 2004, 5:25:14 PM, Justin Mason wrote:
> Important to note that SURBL *can* increase its efficiency, by changing
> its methods -- ie. adding more data sources, modifying the moderation
> model, etc. can increase efficiency.
I like to think so too, but one of Terry's hypotheses is that
detecting spam in the remaining variance (the ~15% currently
undetected) may require some "third dimension of spam" and that
about half of that variance may be truly "noise" and therefore
inherently undetectable (paraphrasing him from off-list
discussions). But he doesn't have data to support that claim
yet, just empirical observations across different classification
It's good to hear that Henry Stern is getting a PhD for his
work in this area, since the work is worthy of that honor.
It's not a particularly easy problem.
"If it appears in hams, then don't list it."
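As a rough illustration of that rule, here is a minimal sketch. It is not how any SURBL data source actually implements it; the function name and the sample domains are made up for the example, and it assumes you already have a set of domains extracted from a ham corpus.

```python
def filter_candidates(candidate_domains, ham_domains):
    """Drop any candidate domain that has been seen in ham.

    Hypothetical helper: both arguments are plain iterables of
    domain strings. Comparison is case-insensitive, and the order
    of the surviving candidates is preserved.
    """
    ham = {d.lower() for d in ham_domains}
    return [d for d in candidate_domains if d.lower() not in ham]


# Made-up sample data, for illustration only.
candidates = ["spammy-pills.example", "news.example", "cheap-loans.example"]
seen_in_ham = ["News.example", "lists.example"]

print(filter_candidates(candidates, seen_in_ham))
# -> ['spammy-pills.example', 'cheap-loans.example']
```

A real list would presumably need more than an exact match (subdomains, a ham corpus large enough to be representative), but the core check is just this set-membership test.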