On Tuesday, November 23, 2004, 2:18:09 PM, Rob McEwen wrote:
Jeff, I didn't mean to make you have to rehash the standards for SURBL. I totally understood these already and I didn't mean to imply differently in my original post. (But I suppose you have to always be on your guard to prevent misunderstandings. You can never be too careful...)
I appreciate seeing your examples and getting to discuss some of them. It's probably good to discuss some of the things we're all trying to do.
But your answers regarding the corpuses were exactly what I was questioning. Basically, 1 FP in 50,000 is not bad. But if most of these FPs are "white-hat marketer" advertisements (an oxymoron?) or newsletters, and few of them are actual human-typed correspondence, then this rate is even better than it sounds. If the opposite is true, then it might not be quite as good as it sounds.
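To make the point above concrete, here is a rough sketch of the arithmetic. The 1-in-50,000 figure is the one quoted above; the share of FPs that are human-typed correspondence is not known from this thread, so the two values used here (10% and 90%) are purely hypothetical, and the function name is made up for illustration.

```python
from fractions import Fraction


def fp_rate_breakdown(fp_count, ham_total, human_share):
    """Split an overall FP rate into human-typed and bulk/newsletter parts.

    human_share is the ASSUMED fraction of FPs that are real
    person-to-person mail; it is a guess, not a measured value.
    """
    overall = Fraction(fp_count, ham_total)
    human = overall * human_share   # FPs that hit human-typed mail
    bulk = overall - human          # FPs that hit ads or newsletters
    return overall, human, bulk


# Hypothetical shares: mostly newsletters vs. mostly personal mail.
for share in (Fraction(1, 10), Fraction(9, 10)):
    overall, human, bulk = fp_rate_breakdown(1, 50000, share)
    print(f"human share {float(share):.0%}: "
          f"overall {float(overall):.4%}, "
          f"human-typed {float(human):.5%}, "
          f"bulk {float(bulk):.5%}")
```

The same headline rate looks quite different depending on which kind of ham is being hit, which is why the makeup of the corpus matters.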
Yes, getting down into small fractions of a percent is a little like looking for subatomic particles. You never know exactly what you might find when you look there....
Interestingly, I've read some phenomenal and very specific stats from mail-filtering companies that don't get specific about the kinds of issues mentioned here, and I wonder, "Who are they kidding?"
Anyone who will believe them? ;-)
Jeff C. -- "If it appears in hams, then don't list it."