On Fri, 13 Aug 2004, Jeff Chan wrote:
and URI parsing code, which can be non-trivial. Certainly some of the code in SpamAssassin or SpamCopURI could be of interest.
Good point, I'll have to look at those.
(SURBLs are built from data provided by third parties such as SpamCop, Outblaze, SARE, Bill Stearns, Raymond, Joe Wein, etc. As such we don't do any processing of actual spam messages, just the extracted URI contents.)
After expansion of the recipient lists, I get about 150k spamtrap mails per day. A bit much to check the URLs by hand, so I'm looking for a way to automatically extract them and make them available.
I'd be happy to put up a feed for SURBL and other interested parties. I'll take a look at the SpamAssassin code; if anybody has pointers to other software that may make my task easier, please let me know ;)
Once I have a working script to extract URLs from a spamtrap feed, I'll make it available as free software. Possibly even bundled with Spamikaze ;)
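
To give an idea of what such a script might look like, here is a rough sketch in Python. It is only an illustration, not the SpamAssassin or SpamCopURI approach: the names extract_hosts and URL_RE are made up for this example, the regular expression is deliberately naive, and it does none of the harder work (redirectors, obfuscated hosts, escaped characters) that makes real URI parsing non-trivial.

```python
import re
from email import message_from_string
from urllib.parse import urlsplit

# Hypothetical, deliberately simple pattern: an http/https URL runs until
# whitespace or a common delimiter. Real spam needs far more care than this.
URL_RE = re.compile(r"https?://[^\s<>\"')\]]+", re.IGNORECASE)


def extract_hosts(raw_message):
    """Return a sorted list of hostnames found in the text parts of a message."""
    msg = message_from_string(raw_message)
    hosts = set()
    for part in msg.walk():
        # Only look at text/plain, text/html and similar parts.
        if part.get_content_maintype() != "text":
            continue
        payload = part.get_payload(decode=True)  # undoes base64 / quoted-printable
        if payload is None:
            continue
        charset = part.get_content_charset() or "latin-1"
        try:
            text = payload.decode(charset, errors="replace")
        except LookupError:
            # Unknown charset name in the header: fall back to latin-1.
            text = payload.decode("latin-1", errors="replace")
        for url in URL_RE.findall(text):
            url = url.rstrip(".,;:!?")  # drop trailing sentence punctuation
            try:
                host = urlsplit(url).hostname
            except ValueError:
                continue  # malformed URL, e.g. a broken IPv6 literal
            if host:
                hosts.add(host.lower())
    return sorted(hosts)
```

Run over each message in the spamtrap feed, the resulting host lists could then be counted and published; how to weight or whitelist them is a separate question that this sketch does not address.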
cheers,
Rik