On Sunday, December 5, 2004, 3:52:31 PM, Daniel Quinlan wrote:
A great source you probably already know about: wiki URL blacklists, especially the ones that are themselves edited on a wiki:
http://moinmaster.wikiwikiweb.de/BadContent
http://www.emacswiki.org/cgi-bin/wiki?BannedContent
Not quite sure how these are edited:
http://spammers.chongqed.org/ or http://blacklist.chongqed.org/
http://www.jayallen.org/blacklist.txt
Open source web proxy filter ... maybe willing to share their URL lists? Of course, this is not a spammer list as far as I know, but perhaps it could be used to extend and verify SURBL whitelists (to eliminate entries) or blacklists (to cross-check an addition).
Huge list of URL blacklists:
Worth trying something more elaborate?
Provide the Wiki folks with a better infrastructure for banning URLs used in Wiki spam (which I'm fairly confident will correlate well with email spam).
- Get multiple wikis to use a standard format for bad content lists, feed into a SURBL-based Wiki blacklist.
- All SURBL blacklists can be used on supporting Wikis.
So, SURBL gets a new blacklist (the best kind, one fed with a different type of source), Wikis get a much wider blacklist, etc.
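As a rough illustration of the "standard format" idea above, here is a minimal Python sketch that reduces entries from several wiki lists to bare host names and counts how many lists name each one. It assumes each entry is a plain URL or host name with `#` comments; real lists such as BadContent may hold regular expressions, which this does not handle. The function names and the sample domains are hypothetical.

```python
from collections import Counter
from urllib.parse import urlsplit


def entry_to_domain(entry):
    """Reduce one blacklist entry (URL or bare host) to a lowercase host name."""
    entry = entry.strip()
    if not entry or entry.startswith("#"):
        return None  # skip blank lines and comments
    if "://" not in entry:
        entry = "http://" + entry  # let urlsplit see a host part
    try:
        host = urlsplit(entry).hostname
    except ValueError:
        return None  # malformed entry, ignore it
    if not host:
        return None
    if host.startswith("www."):
        host = host[4:]
    return host


def merge_lists(lists):
    """Count, per domain, how many of the source lists mention it."""
    counts = Counter()
    for lines in lists:
        # Use a set so a domain repeated within one list counts once.
        domains = {d for d in map(entry_to_domain, lines) if d}
        counts.update(domains)
    return counts


if __name__ == "__main__":
    wiki_a = ["http://www.spam-example.test/buy", "# comment", "pills.example.test"]
    wiki_b = ["HTTP://Spam-Example.test/", ""]
    for domain, n in merge_lists([wiki_a, wiki_b]).most_common():
        print(n, domain)
```

Domains named by more than one independent wiki would be the stronger candidates for a combined list; single-source entries could be held back for review.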
I think new data sources can help the SURBL project if:
1. They have spam URI domains. Some of the wiki or block blacklists may not actually come from spams.
2. They are updated pretty frequently, preferably at least several times a day.
3. They have false positive rates at least as low as WS.
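One way to get a first impression of criterion 3 is to check a candidate list against domains seen in known-good mail. The Python sketch below is only an assumption about how that could be measured: it treats "false positive rate" as the share of candidate domains that also appear in a ham domain set, which is a crude proxy and not how WS is actually evaluated. The function names and sample domains are hypothetical.

```python
def ham_hits(candidate_domains, ham_domains):
    """Return the candidate domains that also occur in ham, sorted."""
    return sorted(set(candidate_domains) & set(ham_domains))


def false_positive_rate(candidate_domains, ham_domains):
    """Fraction of candidate domains that appear in the ham domain set."""
    candidates = set(candidate_domains)
    if not candidates:
        return 0.0  # an empty list cannot produce false positives
    return len(candidates & set(ham_domains)) / len(candidates)


if __name__ == "__main__":
    candidates = ["a.test", "b.test", "c.test", "d.test"]
    ham = ["b.test", "z.test"]
    print(ham_hits(candidates, ham))
    print(false_positive_rate(candidates, ham))
```

Any domain returned by `ham_hits` would need a manual look before the source is trusted.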
With these in mind, would anyone like to help us research some of the other possible sources that Daniel brings up? Multiple opinions could be useful.
Jeff C. -- "If it appears in hams, then don't list it."