I have seen your discussion about blacklisting geocities. That would create so many false positives (FPs) that it's clearly not workable, but I wonder if treating these domains as 3rd / 4th level TLDs could be the way to go ...
Example:
This real spammy address (munged here) http://uk.geocities-munged.com/Gonzalo_Freehling/ could be translated into gonzalo_freehling.uk.geocities-munged.com and then queried ...
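A rough sketch of that translation, in Python. The function name `path_user_to_hostname` is made up for illustration, and it assumes the account name is always the first path segment, which holds for the example above but not necessarily for other hosts:

```python
from urllib.parse import urlsplit


def path_user_to_hostname(url):
    """Move the first path segment in front of the host name.

    Hypothetical helper: assumes the account name is the first path
    segment, as in the Geocities-style example above.
    """
    parts = urlsplit(url)
    host = parts.hostname  # urlsplit already lowercases the host
    segments = [s for s in parts.path.split("/") if s]
    if not host or not segments:
        # Nothing to translate; fall back to the bare host (may be None).
        return host
    # Lowercase the account name so it can be used like a DNS label.
    return f"{segments[0].lower()}.{host}"


print(path_user_to_hostname("http://uk.geocities-munged.com/Gonzalo_Freehling/"))
```

The result is only a lookup key. It does not have to resolve as a real host, so the underscore in the first label should not matter for a blocklist query.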
This is along the same lines as a suggestion I made a few months ago, to come up with a standardized format for checking partial or complete URLs against a blocklist. That would definitely allow targeting spammers who abuse free or cheap web hosting sites and use the URLs there to redirect to their real sites.
Personally, I'd prefer a protocol that used the actual URL, perhaps base64-encoded, over this one, for a couple of reasons. First, this format only works for sites that use a URL structure similar to Geocities, and not all of them do. Second (and IMHO more important), there are a whole lot of phish sites and hacked/trojaned sites hosting downloadable viruses or trojans that could be targeted precisely, without false positives, if you had a format that accommodated complete or nearly complete URLs of any format.
But I like your idea. :)
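
To make the base64 idea a bit more concrete, here is a minimal sketch in Python. Nothing here is an existing standard: the function names, the normalization step (lowercased host, fragment dropped), the zone name `blocklist.example`, and the choice to split the key into DNS-sized labels are all assumptions made for illustration.

```python
import base64
from urllib.parse import urlsplit, urlunsplit

MAX_LABEL = 63  # DNS limit on the length of a single label


def encode_url_key(url):
    """Encode a lightly normalized URL as an unpadded URL-safe base64 token."""
    parts = urlsplit(url)
    # Assumed normalization: lowercase the host part, drop the fragment.
    normalized = urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path, parts.query, "")
    )
    token = base64.urlsafe_b64encode(normalized.encode("utf-8")).decode("ascii")
    return token.rstrip("=")


def decode_url_key(token):
    """Reverse encode_url_key by restoring the stripped padding."""
    padding = "=" * (-len(token) % 4)
    return base64.urlsafe_b64decode(token + padding).decode("utf-8")


def to_query_name(url, zone="blocklist.example"):
    """Split the token into DNS-sized labels under a hypothetical zone."""
    token = encode_url_key(url)
    labels = [token[i:i + MAX_LABEL] for i in range(0, len(token), MAX_LABEL)]
    return ".".join(labels + [zone])


url = "http://uk.geocities-munged.com/Gonzalo_Freehling/"
print(to_query_name(url))
print(decode_url_key(encode_url_key(url)))
```

One caveat if the key is carried in a DNS query: names are compared case-insensitively, so two base64 keys that differ only in case would look the same to the server. Something like base32 would sidestep that, at the cost of longer keys. Long URLs can also run past the overall length limit for a DNS name.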