From martins@ensmp.fr Wed Jun 16 22:45:41 2004
From: Jose Marcio Martins da Cruz
To: discuss@lists.surbl.org
Subject: Re: [SURBL-Discuss] proxypots
Date: Wed, 16 Jun 2004 22:45:33 +0200
Message-ID: <200406162045.i5GKjX3L025328@ensmp.fr>
In-Reply-To: <728573649.20040616121625@supranet.net>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="===============2319679900807895533=="

--===============2319679900807895533==
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit

> > On Wednesday, June 16, 2004, 11:39:26 AM, Justin Mason wrote:
> > a quick note on this; it has to be done very carefully. Many spammers are
> > using "link poisoning" stuff like this:
> >
> > Get ov<a href="http://www.gimbel.org">er 300 medicat<font
> > size=3>lons online shlpp<a href="http://www.omniscient.com">ed over<a
> > href="http://www.proton.net">nig
>
> >> (btw, there's arguments to be made that a better selection mechanism
> >> can "weed those out", but that needs to be careful too.
> >>
> >> - - Ignore .org/.net/.com? spammer will use .biz, .info, and ccTLDs.
> >> - - Ignore 0-length links (<a href=x></a>)? spammer will change
> >> to use <a href=x>{RANDOMWORD}</a>.

No! 0-length links are invisible; RANDOMWORDs, or anything with length
greater than 0, are visible!

> >> - - Ignore "dictionary words" somehow? spammer will use random URLs
> >> from google, so "real" sites.
> >>
> >> so I don't think those approaches have much merit alone.)

I think I sent you a small sample of the output of my scripts, which help me
manually validate URLs. It's enough to list all the URLs in a spam and the
number of times each appears; you'll quickly see what should be blacklisted.

> Hand-checking could make it feasible.

Yes. The best approach, IMO, is to find a good way to present the URLs with
some hints, then manually validate them before adding them to the blacklist.
That's how I do it.

Best

Joe

> > Jeff C.
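For illustration, here is a minimal sketch (my own, in Python; the actual scripts mentioned above are not shown on the list) of that kind of URL-listing aid. It pulls URLs out of href attributes rather than the visible text, since link poisoning chops up the words a reader sees but must leave the href values intact for the links to work, then counts hosts so the repeated ones stand out for manual review:

```python
import re
from collections import Counter

# URLs inside href attributes survive link poisoning; the visible
# anchor text is what gets shredded across tags.
HREF_RE = re.compile(r'href="(https?://[^"]+)"', re.IGNORECASE)


def list_urls(message):
    """Return (host, count) pairs from a spam, most frequent first.

    Counting by host rather than full URL matches what actually gets
    blacklisted in a SURBL-style list.
    """
    urls = HREF_RE.findall(message)
    hosts = [re.sub(r'^https?://([^/"]+).*$', r'\1', u) for u in urls]
    return Counter(hosts).most_common()


# Hypothetical sample in the spirit of the poisoned example quoted above.
spam = ('Get ov<a href="http://www.gimbel.org">er 300 medicat</a>'
        '<a href="http://www.gimbel.org">lons shlpped over</a>'
        '<a href="http://www.proton.net">nig</a>ht')

for host, n in list_urls(spam):
    print(host, n)
```

Running this on the sample prints each host with its count (www.gimbel.org twice, www.proton.net once), which is the kind of hinted listing a human can scan quickly before deciding what to blacklist.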
> > _______________________________________________
> Discuss mailing list
> Discuss(a)lists.surbl.org
> http://lists.surbl.org/mailman/listinfo/discuss
>

--===============2319679900807895533==--