On 9/22/2006 5:35 AM, opencomputing@gmail.com wrote:
Or should I look up the real target of the URL and look it up too? But this would lead to latencies of around 10-20 seconds depending on a 3rd party web server like geocities in the above example :(
You have to be careful in deciding which URLs found in mail are safe to query. Querying all URLs could have bad consequences, such as confirming a subscription to (or unsubscription from) a list, confirming a transaction that the intended recipient may or may not actually want confirmed, or just plain verifying an address.
In any case, Geocities URLs are usually pretty safe to query. I usually strip query parameters from the URLs though.
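Stripping the query parameters before fetching could look something like the sketch below. It uses Python's standard urllib.parse; the function name strip_query and the sample URL are made up for illustration, and dropping the fragment as well is my own assumption.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query(url: str) -> str:
    """Return the URL with its query string and fragment removed.

    Hypothetical helper: the idea is that parameters such as tracking or
    confirmation tokens never reach the remote server.
    """
    parts = urlsplit(url)
    # Rebuild the URL from scheme, host and path only.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


if __name__ == "__main__":
    # Illustrative URL, not taken from a real message.
    print(strip_query("http://www.geocities.com/example/page.html?id=123&confirm=yes"))
```

Note that this only removes what follows the "?"; a token embedded in the path itself would still be sent.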
As for then querying the URLs that these web pages redirect to, I wouldn't bother. More than half are already JavaScript encoded, so unless you're willing to run JavaScript in a sandbox to recover the URL, you can't safely do it. That, and it's not necessary. Content filtering of the web pages themselves works very well. Yahoo! has been good about shutting new ones down lately too, so even just getting a 403 is a big spam sign. If you're using SpamAssassin, there's a plugin to do all this.
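For anyone rolling their own, here is a rough sketch of the idea in Python: fetch the page itself, treat a 403 as a spam sign, and otherwise run a simple content check on the body. This is not how the SpamAssassin plugin works; the function names, the 64 KB read limit, and the phrase list are all placeholders I've assumed for illustration.

```python
import urllib.error
import urllib.request

# Hypothetical markers of a JavaScript redirect page; a real content
# filter would use a much richer rule set.
SUSPICIOUS_PHRASES = ("window.location", "document.write(unescape")


def fetch_status_and_body(url: str, timeout: float = 5.0):
    """Fetch url and return (HTTP status, body text).

    HTTP error statuses such as 403 are returned rather than raised.
    Network-level failures (urllib.error.URLError) are left to propagate.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            # Read at most 64 KB; latin-1 decoding never fails.
            return resp.status, resp.read(65536).decode("latin-1")
    except urllib.error.HTTPError as err:
        return err.code, ""


def looks_spammy(status: int, body: str) -> bool:
    """Crude verdict: 403 (page shut down) or a suspicious body."""
    if status == 403:
        return True
    lowered = body.lower()
    return any(phrase in lowered for phrase in SUSPICIOUS_PHRASES)
```

Usage would be along the lines of looks_spammy(*fetch_status_and_body(url)), with the query parameters stripped from the URL first and a short timeout so a slow third-party server doesn't hold up mail delivery.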
Daryl