Re: [SURBL-Discuss] ccTLDs and multiple queries

22 Apr 2004


      On Wednesday, April 21, 2004, 3:12:58 PM, Eric Kolve wrote:
...
On Wed, Apr 21, 2004 at 03:00:52PM -0700, Jeff Chan wrote:
...
On Wednesday, April 21, 2004, 12:21:16 PM, John Fawcett wrote:
...
From: "Eric Kolve"
...
Initially, when I released SpamCopURI I decided to pretty much ignore
whether the TLD was a country code or not.  This results in about
twice as many queries as necessary, but guarantees you get a
hit if the domain is listed.
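
To make the cost of that approach concrete, here is a minimal sketch in
Python.  It is not SpamCopURI's actual code; the function name and the
zone rhsbl.example are made up for illustration.

```python
def both_level_queries(host, zone="rhsbl.example"):
    """Build the DNS names to look up when a host is checked at both
    the second and the third level, regardless of its TLD.
    The zone name is a placeholder, not a real list."""
    labels = host.lower().rstrip(".").split(".")
    names = []
    for depth in (2, 3):
        # Skip a depth the hostname is too short to provide.
        if len(labels) >= depth:
            names.append(".".join(labels[-depth:]) + "." + zone)
    return names


print(both_level_queries("www.foo.co.uk"))
print(both_level_queries("www.spammer.com"))
```

For www.foo.co.uk the co.uk lookup is wasted, and for www.spammer.com
the www.spammer.com lookup is wasted, which is where the doubling of
queries comes from.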
Now that people are pointing this at other RBLs besides just SURBL,
should we continue to do second and third level queries? Or just
the query that we assume to be necessary?  My concern is that not
all RBLs will process the domains according to a list such as
http://www.bestregistrar.com/help/ccTLD.htm.  I suppose the worst
case scenario is we end up getting a miss when we should be getting
a hit because one side presumes that, say, TLD .za has a subdomain 'foo'
when the server doesn't.  The server side would expect a second level,
while
...
the client would do a third level query (this is why I wanted the wildcard
records).  I guess this really isn't that great a consequence considering
the savings and the fact that this shouldn't occur very often.
I will go ahead and make this change if everyone is comfortable with the
known risk.
If an RHSBL is listing a second level registry domain
(like .co.uk), then I think it's up to the list maintainer to implement
the wildcard so that xxxxx.co.uk returns an A record. I wouldn't
worry about taking into account such an extreme case,
since I cannot imagine any list wanting to do such widespread
blocking.
Yes, the two level ccTLDs like co.uk should never get into a
SURBL.  Only registrar-type domains should, like foo.co.uk.
...
I believe there should be a mechanism which distinguishes whether
a second or third level lookup is required, based on a static
list of domains known to have or not have subdomains.
If nothing is known then the default should be to check both
second and third as at present.
Aha, now I think I understand what's being proposed.
Currently SpamCopURI checks all domains at the second
and third level against a given SURBL, regardless of
whether the domain is in a ccTLD or not.
It sounds like Eric is proposing a change, where if a domain is
in the ccTLD list like co.uk, then the client should try to
extract and check a three level domain like foo.co.uk.  Otherwise
it should check two levels like foo.com.
Is that right?  If so it may be ok, though our list of ccTLDs is
slightly underspecified (there are some ccTLDs not in it).  Note
that my ccTLD list:
...
Yes.  This is exactly what I am proposing.
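
The proposed rule could be sketched roughly as follows.  This is only
an illustration in Python, not the SpamCopURI implementation; the
function name is hypothetical and TWO_LEVEL_TLDS holds just a few
sample entries rather than the full list.

```python
# Small sample of two-level ccTLDs; a real client would load the
# complete list from a file.
TWO_LEVEL_TLDS = {"co.uk", "com.za", "com.au"}


def domain_to_check(host):
    """Reduce a hostname to the single domain that gets queried:
    three levels under a known two-level ccTLD, two levels otherwise."""
    labels = host.lower().rstrip(".").split(".")
    if len(labels) >= 3 and ".".join(labels[-2:]) in TWO_LEVEL_TLDS:
        return ".".join(labels[-3:])
    return ".".join(labels[-2:])


print(domain_to_check("www.foo.co.uk"))  # foo.co.uk
print(domain_to_check("www.foo.com"))    # foo.com
```

Each hostname now costs one query instead of two, at the price of
depending on the list being right.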
Kewl.  Sounds good to me.  I'm cc'ing the SpamAssassin developers
to compare notes on how they're handling ccTLDs in message body
URI checks.
...
...
http://spamcheck.freeapp.net/two-level-tlds
is (derived from but) slightly more complete than the one at
http://www.bestregistrar.com/help/ccTLD.htm ....
Worst case is that we miss a few ccTLDs.  Probably not too big a
deal given that most of the spam domains are .com, .biz, etc.
I believe Eric is also making a finer point that other SURBL data
sources may miss some unexpected geographic domains where foo.za
occurred and only two-level base-ccTLDs like foo.com.za were
expected. Not sure how to handle unusual cases like that.  I
suppose we'll need to rely on the country code authorities to be
somewhat consistent with respect to what levels they will allow
in their ccTLD.
Philosophical point: it's always possible that some spam domains
slip through the cracks, but if that happens often enough and
we spot them, we can always blacklist them manually.  Perfection
may not be possible, but we're certainly greatly increasing the
spam detection rates with this approach overall.
...
My only concern is that we leave a wide enough hole that
we end up playing catch-up while spammers run through the various
ccTLDs that we have misclassified, using them for links.
Aha, but if a domain is not in the ccTLD list, won't we check it
on two levels on both the client and server sides and therefore
catch it?
In other words if somenewspamdomain.bg comes up, and it's not
in our ccTLD list, our client and server programs will
automatically test it as a two level domain and eventually catch
it.  In that case I think we're ok, and the only danger is
blocking new legitimate two level ccTLDs that we're not yet
aware of like newlegitimatetld.bg .
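
A rough Python sketch of that argument, under the assumption that the
client and the server reduce hostnames with the same kind of rule.
All names here are hypothetical, and the foo.za entry is the made-up
example from earlier in the thread, not a real registry.

```python
def reduce_host(host, two_level_tlds):
    """Keep three levels under a listed two-level ccTLD, else two."""
    labels = host.lower().rstrip(".").split(".")
    if len(labels) >= 3 and ".".join(labels[-2:]) in two_level_tlds:
        return ".".join(labels[-3:])
    return ".".join(labels[-2:])


def is_hit(host, client_list, server_list, reported_hosts):
    """The server lists reduced forms of reported hosts; the client
    reduces the host it sees and looks that up."""
    listed = {reduce_host(h, server_list) for h in reported_hosts}
    return reduce_host(host, client_list) in listed


shared = {"co.uk", "com.za"}

# .bg is in neither list: both sides fall back to two levels and agree.
print(is_hit("mail.somenewspamdomain.bg", shared, shared,
             ["www.somenewspamdomain.bg"]))   # True

# Only the client believes foo.za is a two-level ccTLD: the lists
# disagree, so the client asks for a name the server never listed.
print(is_hit("www.spam.foo.za", shared | {"foo.za"}, shared,
             ["www.spam.foo.za"]))            # False
```

The miss only appears when the two sides disagree about a ccTLD; an
entry absent from both lists is harmless.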
Jeff C.
