[SURBL-Discuss] Discounting invalid TLD's in uri_to_domain

Jeff Chan jeffc at surbl.org
Wed Jun 9 01:28:54 CEST 2004

On Tuesday, June 8, 2004, 11:09:54 PM, Yusuf Goolamabbas wrote:
> I've filed bug 3467 in SA's bugzilla

> http://bugzilla.spamassassin.org/show_bug.cgi?id=3467

> suggesting that uri_to_domain discount URIs which don't end in valid
> TLDs. There are test cases in which SA's get_uri_list can pick up URIs
> of the form http://random.gif/ which will return random.gif as the
> domain and get fed into the pool of candidate domains to check for.

> I don't know what SpamCopURI's behaviour is with the test cases I've
> filed

To be honest, I don't know the exact client behavior either, but
philosophically we're original-spam-data-centric.  We tend to
capture whatever URIs are presented, and on occasion those can be
bogus URIs.  But those will likely be in the minority
since using them probably does the spammer little good.

Most of the code on the data and client sides probably doesn't
attempt to determine valid TLDs.  The systems are kept relatively
open-ended to organically deal with variability that occurs
naturally, for example when a new TLD is created.  It's
possible that could cause problems, but my take on it is that
things will generally work themselves out.  Spammers should
not have much incentive to load down their messages with broken
URIs.  Of course if this causes any major problems we would
like to know about it.
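
For illustration only, the kind of TLD check suggested in bug 3467 might
look roughly like the sketch below.  This is Python, not the Perl used by
SpamAssassin or SpamCopURI, and it is not taken from either code base:
the function name has_valid_tld and the abbreviated VALID_TLDS set are
assumptions made up for the example.  It also shows the trade-off
mentioned above, since a hard-coded list has to be updated whenever a
new TLD is created.

```python
from urllib.parse import urlsplit

# Hypothetical, deliberately incomplete TLD list (an assumption for this
# sketch). A real check would need the full list and regular updates.
VALID_TLDS = {"com", "net", "org", "info", "biz", "de", "uk"}


def has_valid_tld(uri):
    """Return True if the URI's host name ends in a TLD from VALID_TLDS."""
    # hostname is already lowercased by urlsplit; it is None if absent.
    host = (urlsplit(uri).hostname or "").rstrip(".")
    if "." not in host:
        return False
    tld = host.rsplit(".", 1)[-1]
    return tld in VALID_TLDS


# Candidate URIs as they might come out of a message body.
candidates = ["http://random.gif/", "http://example.com/page"]

# Keep only URIs whose host ends in a known TLD; random.gif is dropped.
kept = [uri for uri in candidates if has_valid_tld(uri)]
```

With the list above, http://random.gif/ is discarded and only the
example.com URI would be passed on as a candidate domain.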

Jeff C.
