Hi,
I'm new here, so sorry if this has been discussed before.
I have seen your discussion about blacklisting geocities. That would create so many false positives (FPs) that it's clearly impractical, but I wonder whether treating these domains as third- or fourth-level TLDs could be the way to go ...
Example:
This (-munged) real spammy address, http://uk.geocities-munged.com/Gonzalo_Freehling/, could be translated into gonzalo_freehling.uk.geocities-munged.com and then queried ...
The translation would require some specific code for these 'virtual TLD' domains in URIDNSBL.pm (for SA3), but it would allow catching otherwise undetected URIs while causing no FPs for non-spammy geocities sites.
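Roughly, the translation could look like this quick Perl sketch (the %virtual_tld table and the helper name are made up for illustration; this is not actual URIDNSBL.pm code):

use strict;
use warnings;
use URI;

# Hosts to treat as 'virtual TLDs', where the first path segment
# names the actual site.  Illustrative entries only.
my %virtual_tld = map { $_ => 1 }
    ('uk.geocities-munged.com', 'geocities-munged.com');

# Translate http://uk.geocities-munged.com/Gonzalo_Freehling/ into
# gonzalo_freehling.uk.geocities-munged.com for the DNSBL lookup.
sub uri_to_query_name {
    my ($uri_string) = @_;
    my $uri  = URI->new($uri_string);
    my $host = lc $uri->host;
    return $host unless $virtual_tld{$host};

    # First non-empty path segment, lowercased to act as a DNS-style label.
    my ($segment) = grep { length } $uri->path_segments;
    return $host unless defined $segment;
    return lc($segment) . '.' . $host;
}

print uri_to_query_name('http://uk.geocities-munged.com/Gonzalo_Freehling/'), "\n";
# prints: gonzalo_freehling.uk.geocities-munged.com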
For this to work, the 'virtual TLD' domains would need to be flagged, either by setting a specific bit in the data returned during normal queries to the domain, or by maintaining a downloadable list.
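On the client side, checking such a flag bit might look like this (another sketch, using Net::DNS; the 0x80 bit and its meaning are purely hypothetical, nothing like it exists in the current data):

use strict;
use warnings;
use Net::DNS;

# Hypothetical: suppose bit 0x80 of the last octet in the returned
# 127.0.0.x address marked a 'virtual TLD' domain.
use constant VIRTUAL_TLD_BIT => 0x80;

sub is_virtual_tld {
    my ($domain, $zone) = @_;
    $zone ||= 'multi.surbl.org';

    my $resolver = Net::DNS::Resolver->new;
    my $reply    = $resolver->query("$domain.$zone", 'A') or return 0;

    for my $rr ($reply->answer) {
        next unless $rr->type eq 'A';
        my ($last_octet) = $rr->address =~ /\.(\d+)$/;
        return 1 if defined $last_octet && ($last_octet & VIRTUAL_TLD_BIT);
    }
    return 0;
}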
Does it make sense?
On Monday, October 10, 2005, 10:11:47 AM, Eric Montréal wrote:

> I have seen your discussion about blacklisting geocities. That would create so many false positives (FPs) that it's clearly impractical, but I wonder whether treating these domains as third- or fourth-level TLDs could be the way to go ...
> Example:
> This (-munged) real spammy address, http://uk.geocities-munged.com/Gonzalo_Freehling/, could be translated into gonzalo_freehling.uk.geocities-munged.com and then queried ...
> The translation would require some specific code for these 'virtual TLD' domains in URIDNSBL.pm (for SA3), but it would allow catching otherwise undetected URIs while causing no FPs for non-spammy geocities sites.
> For this to work, the 'virtual TLD' domains would need to be flagged, either by setting a specific bit in the data returned during normal queries to the domain, or by maintaining a downloadable list.
Yes, something like that is probably possible to do, but it's not what we designed SURBLs for originally. It would also generally require changing code both in the applications that use SURBLs and in the data back end on our side.
Really, we're not too interested in "catching" geocities. Since Yahoo is a legitimate company, we expect them to take care of their abuse issues, and it seems they are making a renewed effort to do a better job of that.
SURBLs are better suited to listing spam sites hosted in China, Russia, Brazil, or other places that are hard to reach and tend to ignore abuse complaints. Typically those are some of the same spammers that use zombies, open proxies, and other indirect methods to send their spams out. They also tend to register very many "disposable" domains, use them for a few days, then abandon them.
They sometimes appear to use keyed or randomized subdomains, paths, ids, hashes, etc., to track which spams were successfully delivered and/or complained about. Publicly listing information keyed that way may actually aid the spammers in listwashing or delivery confirmation, and it may create other privacy issues for the spam victims. For example, what if the key is some personal ID number or other private information belonging to the spam victim?
The best answer is to get the legitimate hosting and networking companies to police their own services, and that is happening more often.
Cheers,
Jeff C. -- Don't harm innocent bystanders.
> I have seen your discussion about blacklisting geocities. That would create so many false positives (FPs) that it's clearly impractical, but I wonder whether treating these domains as third- or fourth-level TLDs could be the way to go ...
> Example:
> This (-munged) real spammy address, http://uk.geocities-munged.com/Gonzalo_Freehling/, could be translated into gonzalo_freehling.uk.geocities-munged.com and then queried ...
This is along the same lines as a suggestion I made a few months ago: come up with a standardized format for checking partial or complete URLs against a blocklist. That would definitely allow targeting spammers who abuse free or cheap web hosting sites and who use redirector URLs to point to their real sites.
Personally, I'd prefer a protocol that used the actual URL, perhaps base64-encoded, over this approach, for a couple of reasons. First, this format only works for sites that use a URL structure similar to Geocities', and they don't all. Second (and IMHO more important), there are a whole lot of phish sites and hacked/trojaned sites with downloadable viruses or trojans that could be targeted precisely, without false positives, if you had a format that accommodated complete or nearly complete URLs of any structure.
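To make the idea concrete, here's a rough Perl sketch (the lookup zone is made up, and since DNS names are case-insensitive and labels max out at 63 octets, a real protocol would probably want base32 or a hash rather than plain base64):

use strict;
use warnings;
use MIME::Base64 qw(encode_base64);

# Encode a complete URL into a DNS query name under a lookup zone.
sub url_to_lookup_name {
    my ($url, $zone) = @_;
    $zone ||= 'url.example-surbl.org';      # hypothetical zone

    my $encoded = encode_base64($url, '');  # '' = no line breaks
    $encoded =~ tr{+/=}{-_}d;               # crude DNS-safe substitution

    # Split into labels of at most 63 characters (the DNS label limit);
    # the full name would also have to stay under 255 octets.
    my @labels = $encoded =~ /(.{1,63})/g;
    return join('.', @labels, $zone);
}

print url_to_lookup_name('http://uk.geocities-munged.com/Gonzalo_Freehling/'), "\n";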
But I like your idea. :)