On Wednesday, June 16, 2004, 11:39:26 AM, Justin Mason wrote:
> a quick note on this; it has to be done very carefully. Many spammers
> are using "link poisoning" stuff like this:
>
> Get ov<A href="http://www.gimbel.org"></A>er 300 medicat<B><FONT size=3>l</FONT></B>ons online sh<B><FONT size=3>l</FONT></B>pp<A href="http://www.omniscient.com"></A>ed over<A href="http://www.proton.net"></A>nig<A
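To make the trick concrete, here is a minimal sketch (mine, not from the original mail) that walks poisoned markup with Python's stdlib html.parser and reports, for each href, whether its anchor carries any visible text; the sample string is a fragment of the example above.

```python
# Sketch: audit anchors in link-poisoned spam HTML.
# Empty <A href=...></A> pairs contribute URLs but zero visible text.
from html.parser import HTMLParser

class LinkAudit(HTMLParser):
    """Collect each href together with the visible text inside its anchor."""
    def __init__(self):
        super().__init__()
        self.links = []          # (href, visible_text) pairs
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text)))
            self._href = None

sample = ('Get ov<A href="http://www.gimbel.org"></A>er 300 medicat'
          '<B><FONT size=3>l</FONT></B>ons online')
audit = LinkAudit()
audit.feed(sample)
for href, text in audit.links:
    print(href, "invisible" if not text else repr(text))
# -> http://www.gimbel.org invisible
```

The "l" inside the <B><FONT> pair is visible body text, not anchor text, so only the hrefs themselves are collected; a 0-length anchor renders nothing at all, which is exactly the point made further down in this thread.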
> (btw, there's arguments to be made that a better selection mechanism
> can "weed those out", but that needs to be careful too.
>
> - Ignore .org/.net/.com? spammer will use .biz, .info, and ccTLDs.
>
> - Ignore 0-length links (<a href=...></a>)? spammer will change
>   to use <a href=...>{RANDOMWORD}</a>.

No! 0-length links are invisible; RANDOMWORDs, or anything with length
greater than 0, are visible!
> - Ignore "dictionary words" somehow? spammer will use random URLs
>   from google, so "real" sites.
>
> so I don't think those approaches have much merit alone.)

I think I sent you some output of my scripts, which help me manually
validate URLs. It's enough to list all the URLs in a spam and the
number of times each appears, and you'll quickly see what should be
blacklisted.
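A rough sketch of that workflow (my reconstruction, not Joe's actual script): pull every href out of a spam, count how often each URL appears, and print them sorted so a human can quickly pick blacklist candidates. The regex is deliberately simple.

```python
# Sketch: list URLs in a spam with occurrence counts for manual review.
import re
from collections import Counter

HREF_RE = re.compile(r'href\s*=\s*["\']?(https?://[^"\'\s>]+)', re.IGNORECASE)

def url_report(message: str):
    """Return (url, count) pairs, most frequent first."""
    counts = Counter(m.group(1).lower() for m in HREF_RE.finditer(message))
    return counts.most_common()

spam = ('Get ov<A href="http://www.gimbel.org"></A>er 300 '
        'sh<A href="http://www.gimbel.org"></A>ipped '
        'over<A href="http://www.proton.net"></A>night')
for url, n in url_report(spam):
    print(n, url)
# -> 2 http://www.gimbel.org
#    1 http://www.proton.net
```

Sorting by frequency puts the spammer's real payload URL on top, since the poisoning decoys typically appear only once each.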
> Hand-checking could make it feasible.
Yes. The better idea, IMO, is to find a better way to present the URLs with some hints, and manually validate them before adding them to the blacklist. This is how I do it.
Best
Joe
> Jeff C.
Discuss mailing list
Discuss@lists.surbl.org
http://lists.surbl.org/mailman/listinfo/discuss