Jeff Chan jeffc@surbl.org writes:
Your arguments make perfect sense in terms of SA development and use, but hopefully my reasons for asking also make some sense.
Not really, once you consider that most of the listed URLs in our ham are actually spam domains being discussed in the context of anti-spam efforts. Looking at URLs in the body is never going to have a perfect S/O ratio -- even non-developers discuss spam, forward funny spam, innocently link to interesting content on primarily spamvertized sites, etc.
I guess it's time to stop asking. :-(
There are plenty of ways for you to get whitelist URLs without relying on our corpora.
On Friday, April 30, 2004, 7:00:40 PM, Daniel Quinlan wrote:
Jeff Chan jeffc@surbl.org writes:
Your arguments make perfect sense in terms of SA development and use, but hopefully my reasons for asking also make some sense.
Not really, once you consider that most of the listed URLs in our ham are actually spam domains being discussed in the context of anti-spam efforts. Looking at URLs in the body is never going to have a perfect S/O ratio -- even non-developers discuss spam, forward funny spam, innocently link to interesting content on primarily spamvertized sites, etc.
OK, I see your point that the data in the ham lists may not lend itself to some sort of mass whitelisting operation.
We'll work on other ways to develop whitelists.
Jeff C.