On Monday, December 6, 2004, 10:12:43 PM, Jeff Chan wrote:
On Sunday, December 5, 2004, 3:52:31 PM, Daniel Quinlan wrote:
A great source you probably already know about is wiki URL blacklists, especially the ones that are themselves edited on a wiki:
Apparently wiki abuse.
Wiki abuse, based on about 1.1k jayallen plus 2000 more updates.
Not quite sure how these are edited:
http://spammers.chongqed.org/ or http://blacklist.chongqed.org/
Blog and wiki abuse.
Blog abuse.
Open source web proxy filter ... maybe they would be willing to share their URL lists? Of course, this is not a spammer list as far as I know, but perhaps it could be used to amplify and verify the SURBL whitelist (to eliminate things) or the blacklists (to cross-check an addition).
GPLed Linux software for content filtering; no specific spam filtering as far as I can tell. They charge money for blocklist updates, but the filter data can probably be extracted from the 150 MB program to get daily updates. Unclear if it's worth pursuing.
I think new data sources can help the SURBL project if:
- They have spam URI domains. Some of the wiki or
blog blacklists may not actually come from spam.
These all pretty much fail my first criterion, that they should be about email spam, but then Daniel said they were mostly wiki and blog data....
While there may be some overlap between wiki and blog versus email spam, the last time I checked the jayallen data there wasn't much overlap and there were probably too many FPs for our purposes.
I'd rather concentrate on sources of spam-specific URI data.
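For anyone who wants to repeat the overlap check on a candidate list, here is a rough sketch of the idea, not the way SURBL actually processes data. It assumes you already have three plain collections of domains: the candidate list (for example a wiki or blog blacklist), domains known from email spam, and domains seen in ham. The function name overlap_report and the example domains are hypothetical.

```python
def overlap_report(candidate_domains, spam_domains, ham_domains):
    """Compare a candidate blacklist against known spam and ham domains.

    All three arguments are iterables of domain strings. This is an
    illustrative sketch; the inputs and their formats are assumptions.
    """
    # Normalize: lowercase, strip whitespace, drop empty entries.
    def normalize(domains):
        return {d.strip().lower() for d in domains if d.strip()}

    candidates = normalize(candidate_domains)
    spam = normalize(spam_domains)
    ham = normalize(ham_domains)

    confirmed = candidates & spam        # also seen in email spam
    false_positives = candidates & ham   # "If it appears in hams, then don't list it."
    total = len(candidates)

    return {
        "candidates": total,
        "confirmed": sorted(confirmed),
        "false_positives": sorted(false_positives),
        "overlap_ratio": len(confirmed) / total if total else 0.0,
        "fp_ratio": len(false_positives) / total if total else 0.0,
    }


if __name__ == "__main__":
    report = overlap_report(
        ["Spam-Pills.example", "wiki-only.example", "news.example"],
        ["spam-pills.example"],
        ["news.example"],
    )
    print(report)
```

A low overlap_ratio together with a nonzero fp_ratio would be the situation described above: little in common with email spam, and some entries that also show up in ham.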
Jeff C. -- "If it appears in hams, then don't list it."