At 13:49 19/04/2004, Jeff Chan wrote:
On Sunday, April 18, 2004, 6:08:11 PM, Simon Byrnand wrote:
At 12:43 19/04/2004, Jeff Chan wrote:
- Extract base (registrar) domains from those URIs. This includes removing any and all leading host names, subdomains, www., randomized subdomains, etc. In order to determine the base domain it may be necessary to use a table of country code TLDs (ccTLDs) such as the partially incomplete one SURBL uses.
[...]
If a spammer were to register a domain in NZ it would look like:
spammer.co.nz, spammer.net.nz, spammer.gen.nz, etc. Randomised subdomains that they could create on their own nameservers would look like a65423xyz.spammer.co.nz or awef3242.fssf342.spammer.co.nz.
Will the current code (of both SpamCopURI and the backend processing of the SURBL servers, for that matter) incorrectly strip this down to co.nz? I ask because I have definitely seen DNS queries from SpamCopURI trying to look up co.nz.sc.surbl.org, which is wrong: that would cover a large fraction of the websites under the NZ domain hierarchy. It should be looking up spammer.co.nz, never co.nz.
Is there any reliable way for the code to know what a base registrar domain is and how many tiers there are under that domain hierarchy? (May also be a non-trivial problem)
The traditional solution to ccTLDs (Country Code TLDs) seems to be to make a table of them, and make sure any extracted domain is one level longer than the matching table entry. So for company.co.nz, don't take co.nz as the base domain, but instead use company.co.nz, since we know from the table that co.nz is a two-level country code TLD. My slightly incomplete table of ccTLDs is at:
Hmm, well your list has .co.nz and .net.nz but not .school.nz (as an example)
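To make the gap concrete, here is a rough sketch of the table approach as I understand it. This is not the actual SpamCopURI or SURBL code; the function name base_domain and the three-entry table are made up for illustration, and the table is deliberately incomplete in the same way.

```python
# Hypothetical, deliberately incomplete table of two-level ccTLDs.
# Note that school.nz is missing, as in the example above.
TWO_LEVEL_CCTLDS = {"co.nz", "net.nz", "gen.nz"}


def base_domain(host):
    """Return the base (registrar) domain for a host name, table-driven."""
    labels = host.lower().rstrip(".").split(".")
    # If the last two labels form a known two-level ccTLD,
    # keep one more label than usual.
    if len(labels) >= 3 and ".".join(labels[-2:]) in TWO_LEVEL_CCTLDS:
        return ".".join(labels[-3:])
    # Otherwise fall back to the last two labels.
    return ".".join(labels[-2:])


print(base_domain("a65423xyz.spammer.co.nz"))         # spammer.co.nz
print(base_domain("awef3242.fssf342.spammer.co.nz"))  # spammer.co.nz
print(base_domain("www.example.school.nz"))           # school.nz (too short)
```

The last call shows the failure mode: any two-level ccTLD missing from the table silently collapses to the registry-level name.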
What are the relative proportions of one-level to two-level country code TLDs?
Are there any other one-level hierarchies used by countries, apart from the generic .com, .org, .net, .biz, etc.? It might be easier (and safer?) to assume the other way around: assume it's a two-level country code unless listed. Then you only have to list the top level (.com, for example) rather than trying to keep track of things like .co.nz, .net.nz and so on, which are subject to change at the discretion of the local registrar...
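Something like the following is what I have in mind for the inverted approach. Again, this is only a sketch: base_domain_inverse and the ONE_LEVEL_TLDS set are names I made up, and the set holds just the generic TLDs mentioned above.

```python
# Hypothetical list of TLDs where registrations happen directly under the TLD.
# Everything not listed here is assumed to be a two-level hierarchy.
ONE_LEVEL_TLDS = {"com", "org", "net", "biz"}


def base_domain_inverse(host):
    """Return the base domain, assuming two-level TLDs unless listed."""
    labels = host.lower().rstrip(".").split(".")
    # Listed TLD: keep two labels (name + TLD).
    # Unlisted TLD: keep three labels (name + second level + TLD).
    keep = 2 if labels[-1] in ONE_LEVEL_TLDS else 3
    return ".".join(labels[-keep:])


print(base_domain_inverse("www.spammer.com"))                 # spammer.com
print(base_domain_inverse("awef3242.fssf342.spammer.co.nz"))  # spammer.co.nz
print(base_domain_inverse("www.example.school.nz"))           # example.school.nz
```

The catch is that any country that registers names directly under its TLD would have to go in the list too (.de does this, as far as I know), otherwise www.spammer.de would come out as www.spammer.de rather than spammer.de. That is a miss rather than an overly broad match like co.nz, which is why it might be the safer default.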
Maybe I missed something :)
Regards, Simon