I was planning on giving Yahoo! more time to correct their "Geocities Spam" problem before I released my plugin to deal with it, but I've been noticing a decline in the scores these mails are getting.
I also just found out that I have copies of this sort of spam going back to at least December 28, 2004, and have been getting them in volume since May 2005. I had thought the problem only went back to September, not a full year, and the volume has been increasing over the last six months (10%+ of my spam is now "geocities spam"). In my opinion they've had sufficient time to act.
Further, while adding some documentation to the plugin, I tested some of the spam that I had used when writing the plugin back in September and found that some of the "member sites" are still active.
Conveniently, there are only a few versions of the pages linked to, so writing rules against them is pretty effective -- which is what this plugin is for.
A few words of caution if you do decide to use this plugin:
- While I believe there are no issues with the code, I'm not too familiar with LWP::UserAgent, so it's entirely possible that I have missed something. In the event your machine gets rooted, you've been warned.
- Querying the links found in an email inherently raises a number of privacy and technical issues you should be aware of. The plugin attempts to avoid them by stripping visible query strings and login credentials, but I encourage you to read the WARNING section of the plugin's perldoc before using it. Be sure to NEVER use this plugin to query links hosted on a server the sender may control.
- High-volume sites would be wise to run this behind a caching HTTP proxy such as Squid to cut down on the 0.3 to 1 second that it may take to query each link. The web query is blocking, but it takes place just after the DNS requests are kicked off. That gives the DNS queries more time to complete, which may result in DNSBL hits that would otherwise have been missed due to timeouts.
- The scores assigned to the rules are guesses on my part based on what they match. I have no legitimate email to compare hits against. I recommend monitoring the hits for some period of time and reassigning scores if they turn out to be inappropriate or simply not to your liking.
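To make the stripping of query strings and login credentials mentioned above a little more concrete, here is a rough sketch in Python. It is not the plugin's code (the plugin is Perl and its perldoc is the authority on what it actually does); the function name sanitize_url and the example URLs are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def sanitize_url(url):
    """Return url without login credentials, query string, or fragment.

    Hypothetical helper for illustration only; not taken from the plugin.
    """
    parts = urlsplit(url)

    # parts.hostname excludes any "user:password@" prefix and is lowercased.
    host = parts.hostname or ""
    if ":" in host:
        # IPv6 literals must be re-wrapped in brackets inside a URL.
        host = "[" + host + "]"
    if parts.port is not None:
        host = f"{host}:{parts.port}"

    # Rebuild the URL with an empty query string and an empty fragment.
    return urlunsplit((parts.scheme, host, parts.path, "", ""))


if __name__ == "__main__":
    print(sanitize_url("http://user:pass@www.example.com:8080/page.html?id=123#top"))
```

Note that this only removes the visible query string. Identifying tokens embedded in the path itself would survive, which is one reason the warning about sender-controlled servers still applies.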
The plugin is available at: http://wiki.apache.org/spamassassin/WebRedirectPlugin
Send me an email if you find the plugin useful or spot a flaw that should be corrected.
Best Regards,
Daryl C. W. O'Shea