From jm@jmason.org Wed Jun 16 20:39:51 2004
From: jm@jmason.org
To: discuss@lists.surbl.org
Subject: [SURBL-Discuss] proxypots
Date: Wed, 16 Jun 2004 11:39:26 -0700
Message-ID: <20040616183927.4CECE590006@radish.jmason.org>
In-Reply-To: <1068018218.20040616015831@supranet.net>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="===============6518460581490341008=="
--===============6518460581490341008==
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Jeff Chan writes:
>We are also looking into some other potential spam
>URI data sources such as proxypots, etc.:
>
> http://proxypot.org/
Jeff --
a quick note on this: it has to be done very carefully. Many spammers are
using "link poisoning", like this:
Get over 300 medicatlons online shlpped overnight to your front door with no prescrlption.
All of those are "www.{RANDOMWORD}.{com|net|org}" links. Eventually there's
one real link, which *is* SURBL-listed. The random ones are chaff.
Now, SORBS for one seems to be listing some of these sites; presumably
because they have a spamtrap-driven feed without enough human moderation.
That's the danger here.
(btw, there are arguments to be made that a better selection mechanism
can "weed those out", but that needs to be done carefully too.
- - Ignore .org/.net/.com? spammers will use .biz, .info, and ccTLDs.
- - Ignore 0-length links ()? spammers will change
to use {RANDOMWORD}.
- - Ignore "dictionary words" somehow? spammers will use random URLs
from google, so "real" sites.
so I don't think those approaches have much merit alone.)
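As a rough illustration of the three heuristics above and how each one is
evaded, here is a minimal Python sketch. The function names, the tiny
DICTIONARY set, and the sample hostnames are all hypothetical, made up for
this example; nothing here is taken from SURBL, SORBS, or any real feed.

```python
# Sketch only: three naive "is this link chaff?" filters, each paired with
# the evasion described above. All names and sample data are hypothetical.

COMMON_TLDS = {"com", "net", "org"}
DICTIONARY = {"garden", "window", "river"}  # stand-in for a real word list


def second_level_label(host):
    """Return the label just left of the TLD (ignores ccTLD quirks like co.uk)."""
    labels = host.lower().rstrip(".").split(".")
    return labels[-2] if len(labels) >= 2 else ""


def ignore_common_tld(host):
    """Filter 1: treat any .com/.net/.org host as chaff."""
    return host.lower().rstrip(".").rsplit(".", 1)[-1] in COMMON_TLDS


def ignore_empty_anchor(anchor_text):
    """Filter 2: treat links with no visible anchor text as chaff."""
    return anchor_text.strip() == ""


def ignore_dictionary_word(host):
    """Filter 3: treat hosts whose name is a dictionary word as chaff."""
    return second_level_label(host) in DICTIONARY


# (filter name, filter, input the filter catches, input after the spammer adapts)
EVASIONS = [
    ("common TLD", ignore_common_tld, "www.garden.com", "www.garden.biz"),
    ("empty anchor", ignore_empty_anchor, "", "garden"),
    ("dictionary word", ignore_dictionary_word, "www.garden.com", "realsite.example"),
]

for name, check, caught, adapted in EVASIONS:
    print(f"{name}: catches {caught!r} = {check(caught)}, "
          f"catches {adapted!r} = {check(adapted)}")
```

Each filter flags the original chaff and misses the adapted chaff, which is
why none of them looks sufficient alone. Note that the first filter would
also discard a real spam link that happens to be under .com.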
- --j.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Exmh CVS
iD8DBQFA0JPeQTcbUG5Y7woRAnYYAJ9/fZaT3WLmU+gT8aAnT2rcduDo7QCg6BE1
dF1r9ciWtFpEdC4OBHdRSKE=
=mnKX
-----END PGP SIGNATURE-----
--===============6518460581490341008==--