Re: [SURBL-Discuss] RFC: Add hosts like tripod.com to two-level-tld list?

18 Nov 2005


      Alan,
...
Hi
2005/11/16, List Mail User track-at-plectere.com |surbl list|
<...>:
...
...
...
    JeffC of SURBL asked:


...
It has been suggested that we could deal with the tripod.com
[snip Jeff's original message/question]
    The hard-coding into applications is just plain wrong for a few
reasons.  Any list *should* be updatable.  Some RHS BLs already support
arbitrary subdomains (example: biz.mail.mud.yahoo.com has many more problems
than yahoo.com or most yahoo.com subdomains).  For DSN, RCVD and other
header rules, the "top" domain (e.g. "ca.us") may not have, or even be
required to have, an abuse and/or postmaster account (just one possible type
of example), but the subdomains either do have them or would trigger tests
or rules for them (a domain with no 'A' or 'MX' records may not need to
provide any mail services, though its subdomains should and do - your
losangeles.ca.us vs. "ca.us" example).  Also note that some MTAs (e.g.
Postfix) already check *all* of the subdomains in a hostname against RHS RBLs.
    One problem with this is that the syntax for DNS "wildcards" transcends
subdomain levels, so I believe the files/domains should be kept separate from
the "true" two-level TLDs, because unfortunately a check for '*.domain.tld'
should really be performed as well (e.g. '*.domain.tld' in a DNS zone file will
match 'l5.l4.l3.domain.tld' as well as the simple 'l3.domain.tld'), and this
leads to increased overhead for DNS-based net tests (without the check, the
containing domain cannot be determined by heuristics alone).
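
To illustrate the point that a wildcard "transcends subdomain levels", here is a minimal Python sketch. The function name wildcard_covers is made up for this example, and it deliberately ignores the real DNS rule that a wildcard does not apply when a closer name already exists in the zone, so treat it as a simplification, not a resolver.

```python
def wildcard_covers(wildcard: str, qname: str) -> bool:
    """Return True if a zone-file owner such as '*.domain.tld' could
    synthesize an answer for qname.

    Simplification (assumption of this sketch): names that exist closer to
    qname in the zone, which would block the wildcard in real DNS, are ignored.
    """
    if not wildcard.startswith("*."):
        raise ValueError("expected an owner name of the form '*.domain.tld'")
    parent = wildcard[2:].lower().rstrip(".")
    name = qname.lower().rstrip(".")
    # The query name must sit strictly below the wildcard's parent,
    # at any depth, not just one label down.
    return name.endswith("." + parent)


print(wildcard_covers("*.domain.tld", "l3.domain.tld"))        # True
print(wildcard_covers("*.domain.tld", "l5.l4.l3.domain.tld"))  # True
print(wildcard_covers("*.domain.tld", "domain.tld"))           # False
```

The second call is the troublesome case: from the name alone, nothing says whether l4.l3.domain.tld or l3.domain.tld are real subdomains or just labels absorbed by the wildcard.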
I don't fully "get" your text, but are wildcards not an easy solution?
More below, but the important point is that you cannot expect wildcards
to be in the RHS DNSBLs themselves.
...
If a domain is blacklisted it seems normal to blacklist all subdomains.
So blacklist *.domain.tld
Yes, agreed, but only implicitly, not by adding (possibly unknown)
subdomains into the BL itself.  I mean/intend for the onus to be on the
application to check.  More below.
...
If only a subdomain is blacklisted add *.subdomain.domain.tld but not
*.domain.tld
Again, yes, agreed.  The same as the previous point - i.e. they
are "added" only logically from the application's point of view, not actually
into the RHS DNSBL.
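
As a rough sketch of what "added only logically" could mean on the application side: the list holds exact names only, and the application walks up the parent chain itself. The names parent_chain and is_listed, and the in-memory set standing in for the DNSBL, are assumptions for illustration. The sketch naively stops before the last label, so it does not handle two-level TLDs such as "ca.us" specially.

```python
def parent_chain(fqdn: str) -> list[str]:
    """Return fqdn and each parent domain, most specific first,
    stopping before the bare TLD (naive: assumes a one-label TLD)."""
    labels = fqdn.lower().rstrip(".").split(".")
    return [".".join(labels[i:]) for i in range(len(labels) - 1)]


def is_listed(fqdn: str, listed: set[str]) -> bool:
    """True if fqdn or any of its parents is an exact entry in the list.
    No wildcard entries are stored in the list itself."""
    return any(name in listed for name in parent_chain(fqdn))


# Whole domain listed: every subdomain is implicitly listed.
print(is_listed("www.sub.domain.tld", {"domain.tld"}))      # True
# Only a subdomain listed: siblings and the parent are not.
print(is_listed("x.sub.domain.tld", {"sub.domain.tld"}))    # True
print(is_listed("other.domain.tld", {"sub.domain.tld"}))    # False
print(is_listed("domain.tld", {"sub.domain.tld"}))          # False
```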
...
*.domain.tld can be whitelisted, or it can be a central "surbl" choice
only to work with more levels for some domains.
A different issue, but a good point.  Probably even more valuable
for ESPs with known "hammy" *.l3a.SLD.TLD cases, but where you don't want
to whitelist the *.SLD.TLD case (it might be "grey" and/or other l3b.SLD.TLD
cases might be spammy).
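
One possible way to express that whitelist case in code, assuming a "most specific entry wins" policy (that policy, the function name classify and the two in-memory sets are assumptions of this sketch, not something SURBL or any application is known to do):

```python
def classify(fqdn: str, whitelist: set[str], blacklist: set[str]) -> str:
    """Walk from the full name toward the SLD and return the verdict of
    the first (most specific) name found in either list."""
    labels = fqdn.lower().rstrip(".").split(".")
    for i in range(len(labels) - 1):      # stop before the bare TLD
        name = ".".join(labels[i:])
        if name in whitelist:
            return "ham"
        if name in blacklist:
            return "spam"
    return "unknown"


white = {"l3a.sld.tld"}   # known "hammy" subdomain
black = {"l3b.sld.tld"}   # known spammy sibling

print(classify("news.l3a.sld.tld", white, black))  # ham
print(classify("x.l3b.sld.tld", white, black))     # spam
print(classify("www.sld.tld", white, black))       # unknown
```

SLD.TLD itself stays out of both lists here, which corresponds to the "grey" case.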
...
Applications will benefit if they are rewritten to check the full
domain instead of a stripped domain.  As far as I understand it those
written for the current standard will keep working like they are doing
now.
One point is that there already exist applications that check
the unstripped domain and all parent domains (except for possibly treating
the TLD specially).  This means that an application would have to check
more than a single DNS entry if the FQDN has more than a single dot.  What
those checks should be is part of the "wildcard" issue addressed below.
...
The nice part is that the applications don't need to store extra data.
It seems so easy to me that I'm certain that I'm missing something
(stupid).  I do understand that it's possible there will be less
caching from dns lookups. I don't know if wildcards are cached on the
wildcard level or with the exact supplied dns name.
At least with BIND, they are kept as the query, so if you query
a wildcard (i.e. *.something), that is what is cached.  It is conceivable
that other DNS implementations cache keyed on the response, which may not
always be the same thing.  I don't know what all possible name servers do.
...
I hope this is clear and not too stupid.
Yes, it is clear, and not stupid;  I should have gone into more
detail.  At issue is this: when an application is presented with a FQDN, what
is the first containing domain?  For l5.l4.l3.SLD.TLD, it may not be any
of l5.l4.l3.SLD.TLD, l4.l3.SLD.TLD or l3.SLD.TLD in the presence of wildcards.
This rests on the (previously unstated) assumption that you do not want to,
and cannot, cause operators to make wildcard entries in the RHS DNSBLs.  So
you would want to first determine, in reverse order (from the SLD to the
subdomains), whether any wildcard entry exists in DNS, to find the largest
containing domain.  A failure to do this could lead to a DoS by a spammer
wildcarding SLD.TLD but always using URIs of the form
ln.ln_minus1.ln_minus2...SLD.TLD, in effect causing a very large number of
worthless lookups, which would both place load on the name servers at both
ends and fill (or poison) the cache with a large number of negative result
entries.  The only way I see to avoid this is to determine the first
containing domain.

If this behavior is adopted, there is no gain or incentive for arbitrarily
large subdomain chains, and the typical case will remain one lookup.  But when
two or more would be needed, an additional lookup for every subdomain from
the SLD up would be needed to check for the existence of a wildcard first
(terminating if one is found).  This leads to one extra lookup for the
currently common case of l3.SLD.TLD, since before the lookup we don't know
whether it is a simple host within the SLD or a subdomain, which may be
wildcarded (and thus would never be in the blacklist).  This is of course
the corner case which must be checked.  What we are really trying to discover
is things like l4.l3.SLD.TLD where there exists an entry for *.SLD.TLD and
no entry for l3.SLD.TLD, or l5.l4.l3.SLD.TLD where a *.l3.SLD.TLD exists and
l3.SLD.TLD is "spammy", but SLD.TLD is not or is merely "grey", and *no*
entry for l4.l3.SLD.TLD exists because it is wildcarded (e.g. the tripod.com
case and many ESPs).

This also implies two more lookups for the common case of l3.SLD.TLD.  The
first "new" lookup is a check for the existence of *.SLD.TLD; if that fails
(though it will often succeed), the second is a check of whether the string
l3.SLD.TLD is a domain or a host, so we know whether to present one or two
queries to the DNSBL (two if it is a domain, seen by any SOA returned in a
query, since we then want to check both the subdomain and the parent).  The
result is a total of three more lookups than are done today in SA when
l3.SLD.TLD is a subdomain, not merely a host: quite literally four times as
many for that relatively common case, but only twice as many if a wildcard
is found (common, but not universal, for most spam domains).
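
The walk "from the SLD up, terminating if a wildcard is found" could be sketched roughly as follows. This is only an illustration: first_containing_domain and the has_wildcard callback are hypothetical names, how a live wildcard probe would actually be done is left open, the domain-vs-host (SOA) check is omitted, and a one-label TLD is assumed unless tld_labels says otherwise.

```python
from typing import Callable


def first_containing_domain(
    fqdn: str,
    has_wildcard: Callable[[str], bool],
    tld_labels: int = 1,
) -> tuple[str, int]:
    """Return (containing domain, number of wildcard probes made).

    has_wildcard(domain) is a caller-supplied predicate meaning
    "a '*.domain' entry exists in DNS" (hypothetical hook).
    """
    labels = fqdn.lower().rstrip(".").split(".")
    start = len(labels) - (tld_labels + 1)   # index of the SLD label
    if start < 0:
        raise ValueError("name is shorter than SLD.TLD")
    probes = 0
    # Walk from SLD.TLD toward the full name; stop at the first wildcard.
    # The full name itself is never probed (i stops at 1).
    for i in range(start, 0, -1):
        domain = ".".join(labels[i:])
        probes += 1
        if has_wildcard(domain):
            return domain, probes
    return ".".join(labels), probes


# *.l3.sld.tld exists: the chain below l3 is absorbed by the wildcard.
wild = {"l3.sld.tld"}
print(first_containing_domain("l5.l4.l3.sld.tld", wild.__contains__))
# -> ('l3.sld.tld', 2)

# *.sld.tld exists: one probe, however long the spammer makes the chain.
wild = {"sld.tld"}
print(first_containing_domain("l9.l8.l7.l6.l5.l4.l3.sld.tld", wild.__contains__))
# -> ('sld.tld', 1)
```

The second call shows why the order matters: with the wildcard found at SLD.TLD on the first probe, an arbitrarily long subdomain chain costs nothing extra.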
Of course, this isn't quite as bad as it sounds, since the DNS lookups for
the actual FQDN are shared prior to and between all the DNSBL lookups (i.e.
they only need be done once, no matter how many RHS DNSBLs are in use).  Also,
there is no need to abandon parallelism, since the subdomain checks can be
performed while the DNSBL lookups on the SLD.TLD are done (which will always
be done regardless of the outcome of the wildcard and subdomain checking).
Still, there will be an increase in latency, since some tests will be dependent
on the results of previous DNS queries.
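
A small sketch of that overlap, using Python's standard thread pool. The two callbacks (dnsbl_lookup and find_containing_domain) are hypothetical stand-ins for real DNS queries, and check_uri_host is a made-up name; nothing here reflects how SA actually schedules its lookups.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def check_uri_host(
    sld: str,
    fqdn: str,
    dnsbl_lookup: Callable[[str], bool],
    find_containing_domain: Callable[[str], str],
) -> dict[str, bool]:
    """Query the DNSBL for SLD.TLD while the wildcard/subdomain checks run,
    then do the one lookup that depends on their outcome."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        # These two do not depend on each other, so start both at once.
        sld_future = pool.submit(dnsbl_lookup, sld)
        containing_future = pool.submit(find_containing_domain, fqdn)
        containing = containing_future.result()
        results = {sld: sld_future.result()}
    # Dependent step: only possible once the containing domain is known.
    # This is where the added latency comes from.
    if containing != sld:
        results[containing] = dnsbl_lookup(containing)
    return results


# Fake lookups for demonstration: l3.sld.tld is listed, sld.tld is not.
listed = {"l3.sld.tld"}
print(check_uri_host(
    "sld.tld",
    "l5.l4.l3.sld.tld",
    dnsbl_lookup=lambda name: name in listed,
    find_containing_domain=lambda name: "l3.sld.tld",
))
# -> {'sld.tld': False, 'l3.sld.tld': True}
```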
I have continued to ignore the complications of processing any CNAMEs
which occur and would make this yet more complex than the mess above already
reads.
I hope this long-winded explanation is clearer for the points which
I glossed over the first time.
...
Alain
...
[snip - the rest, which was apparently clear enough]
Paul Shupak
    track@plectere.com
