Alain,
Hi
2005/11/16, List Mail User track-at-plectere.com |surbl list| <...>:
...
JeffC of SURBL asked:
It has been suggested that we could deal with the tripod.com [snip Jeff's original message/question]
The hard-coding into applications is just plain wrong for a few reasons. Any list *should* be updatable. Some RHS BLs already support arbitrary subdomains (example: biz.mail.mud.yahoo.com has many more problems than yahoo.com or most yahoo.com subdomains). For DSN, RCVD and other header rules, the "top" domain (e.g. "ca.us") may not have, or even be required to have, an abuse and/or postmaster account (just one possible type of example), but the subdomains either do have them or would trigger tests or rules for them (a domain with no 'A' or 'MX' records may not need to provide any mail services, though its subdomains should and do: your losangeles.ca.us vs. "ca.us" example). Also note that some MTAs (e.g. Postfix) already check *all* of the subdomains in a hostname against RHS RBLs.
One problem with this is that the syntax for DNS "wildcards" transcends subdomain levels, so I believe the files/domains should be kept separate from the "true" two-level TLDs, because unfortunately a check for '*.domain.tld' should really be performed as well (e.g. '*.domain.tld' in a DNS zone file will match 'l5.l4.l3.domain.tld' as well as the simple 'l3.domain.tld'). This leads to increased overhead for DNS-based net tests, but without the check the containing domain cannot be determined by heuristics alone.
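To make the "transcends subdomain levels" point concrete, here is a minimal Python sketch of the matching rule as I described it. The function name wildcard_covers is my own invention for illustration, and the rule is deliberately simplified: it ignores the fact that names which actually exist in the zone take precedence over a wildcard.

```python
def wildcard_covers(wildcard: str, name: str) -> bool:
    """Return True if `name` lies below the wildcard's parent at any depth.

    Simplified on purpose: a real zone lets existing names override the
    wildcard, which this sketch does not model.
    """
    if not wildcard.startswith("*."):
        raise ValueError("expected a wildcard of the form '*.domain.tld'")
    parent = wildcard[2:].lower().rstrip(".")
    name = name.lower().rstrip(".")
    # Requiring the leading dot keeps 'otherdomain.tld' from matching
    # '*.domain.tld', and keeps the parent itself from matching.
    return name.endswith("." + parent)


if __name__ == "__main__":
    for candidate in ("l3.domain.tld", "l5.l4.l3.domain.tld", "domain.tld"):
        print(candidate, wildcard_covers("*.domain.tld", candidate))
```

Both the one-level and the three-level names are covered, while the bare domain.tld is not, which is why a suffix heuristic alone cannot tell you where the containing domain starts.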
I don't fully "get" your text, but are wildcards not an easy solution?
See below, but the important point is that you cannot expect wildcards to be in the RHS DNSBLs themselves.
If a domain is blacklisted it seems normal to blacklist all subdomains. So blacklist *.domain.tld
Yes, agreed, but only implicitly, not by adding (possibly unknown) subdomains into the BL itself. I intend for the onus to be on the application to check. More below.
If only a subdomain is blacklisted add *.subdomain.domain.tld but not *.domain.tld
Again, yes, agreed. The same as the previous point - i.e. they are "added" only logically from the application's point of view, not actually into the RHS DNSBL.
*.domain.tld can be whitelisted, or it can be a central "surbl" choice only to work with more levels for some domains.
A different issue, but a good point. Probably even more valuable for ESPs with known "hammy" *.l3a.SLD.TLD cases, but where you don't want to whitelist the *.SLD.TLD case (it might be "grey" and/or other l3b.SLD.TLD cases might be spammy).
Applications will benefit if they are rewritten to check the full domain instead of a stripped domain. As far as I understand it those written for the current standard will keep working like they are doing now.
One point is that there already exist applications that check the unstripped domain and all parent domains (except for possibly treating the TLD specially). This means that an application would have to check more than a single DNS entry if the FQDN has more than a single dot. What those checks should be is part of the "wildcard" issue addressed below.
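As a rough illustration of what "check the unstripped domain and all parent domains" means, the Python sketch below lists the names such an application would look up. The function name candidate_names and the registry_labels parameter are hypothetical; registry_labels is simply my stand-in for "how many trailing labels are treated specially" (1 for .com, 2 for something like ca.us), not anything an existing application defines.

```python
def candidate_names(fqdn: str, registry_labels: int = 1) -> list[str]:
    """List the FQDN and each parent, most specific first.

    The walk stops at the shortest name that still has one label in
    front of the registry suffix (e.g. 'sld.tld' when registry_labels=1).
    """
    labels = fqdn.lower().rstrip(".").split(".")
    shortest = registry_labels + 1
    if len(labels) < shortest:
        return []  # nothing below the registry suffix to look up
    return [".".join(labels[i:]) for i in range(len(labels) - shortest + 1)]


if __name__ == "__main__":
    print(candidate_names("l5.l4.l3.sld.tld"))
    print(candidate_names("losangeles.ca.us", registry_labels=2))
```

For l5.l4.l3.sld.tld this gives four names, so a naive checker does one DNSBL query per label beyond the TLD, which is exactly the cost a spammer can inflate with long subdomain chains.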
The nice part is that the applications don't need to store extra data.
It seems so easy to me that I'm certain that I'm missing something (stupid). I do understand that it's possible there will be less caching from dns lookups. I don't know if wildcards are cached on the wildcard level or with the exact supplied dns name.
At least with BIND, they are kept as the query, so if you query a wildcard (i.e. *.something), that is what is cached. It is conceivable that other DNS implementations cache keyed on the response, which may not always be the same thing. I don't know what all possible name servers do.
I hope this is clear and not too stupid.
Yes, it is clear, and not stupid; I should have gone into more detail. At issue is, when an application is presented with a FQDN, what the first containing domain is: for l5.l4.l3.SLD.TLD, it may not be any of l5.l4.l3.SLD.TLD, l4.l3.SLD.TLD or l3.SLD.TLD in the presence of wildcards. I am assuming (previously unstated) that you do not want to, and cannot, make operators add wildcard entries to the RHS DNSBLs. So you would want to first determine, in reverse order (from the SLD to the subdomains), whether any wildcard entry exists in DNS, in order to find the largest containing domain.
A failure to do this could lead to a DoS by a spammer wildcarding SLD.TLD but always using URIs of the form ln.ln_minus1.ln_minus2...SLD.TLD, in effect causing a very large number of worthless lookups, which would both place load on the name servers at both ends and pollute the cache with a large number of negative-result entries. The only way I see to avoid this is to determine the first containing domain. If this behavior is adopted, there is no gain or incentive for arbitrarily long subdomain chains, and the typical case will remain one lookup. When two or more would be needed, though, an additional lookup for every subdomain from the SLD up would be needed to check for the existence of a wildcard first (terminating if one is found). This leads to one extra lookup for the currently common case of l3.SLD.TLD, since before the lookup we don't know whether it is a simple host within the SLD or a subdomain, which may be wildcarded (and thus would never be in the blacklist).
This is of course the corner case which must be checked. What we are really trying to discover is things like l4.l3.SLD.TLD where there exists an entry for *.SLD.TLD and no entry for l3.SLD.TLD, or l5.l4.l3.SLD.TLD where a *.l3.SLD.TLD exists and l3.SLD.TLD is "spammy", but SLD.TLD is not or is merely "grey", and *no* entry for l4.l3.SLD.TLD exists because it is wildcarded (e.g. the tripod.com case and many ESPs).
This also implies two more lookups for the common case of l3.SLD.TLD, where the first "new" lookup is a check for the existence of *.SLD.TLD, and, if that fails (though it will often succeed), the second is a check of whether l3.SLD.TLD is a domain or a host, so that we know whether to present one or two queries to the DNSBL. It is two if it is a domain (seen by any SOA returned in a query), since we then want to check both the subdomain and the parent. That results in a total of three more lookups than are done today in SA when l3.SLD.TLD is a subdomain, not merely a host: quite literally four times as many for that relatively common case, but only twice as many if a wildcard is found (common, but not universal, for most spam domains).
Of course, this isn't quite as bad as it sounds, since the DNS lookups for the actual FQDN are shared prior to and between all the DNSBL lookups (i.e. they need only be done once, no matter how many RHS DNSBLs are in use). Also, there is no need to abandon parallelism, since the subdomain checks can be performed while the DNSBL lookups on the SLD.TLD are done (which will always be done regardless of the outcome of the wildcard and subdomain checking). Still, there will be an increase in latency, since some tests will be dependent on the results of previous DNS queries.
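The wildcard walk described above, from the SLD toward the subdomains and stopping at the first wildcard, might be sketched in Python as follows. This is only an illustration under my own assumptions: first_containing_domain and registry_labels are hypothetical names, and has_wildcard is a caller-supplied function standing in for whatever DNS probe the application would really use. It does not cover the SOA (domain vs. host) check or CNAMEs.

```python
from typing import Callable, Optional


def first_containing_domain(
    fqdn: str,
    has_wildcard: Callable[[str], bool],
    registry_labels: int = 1,
) -> tuple[Optional[str], int]:
    """Walk from SLD.TLD toward the FQDN, stopping at the first wildcard.

    Returns (containing_domain, probes_made). `has_wildcard(domain)` is a
    hypothetical probe answering "does '*.<domain>' exist in DNS?".
    """
    labels = fqdn.lower().rstrip(".").split(".")
    probes = 0
    # Start at the registered domain; never probe the full name itself,
    # because a wildcard below the FQDN cannot contain the FQDN.
    for i in range(len(labels) - registry_labels - 1, 0, -1):
        domain = ".".join(labels[i:])
        probes += 1
        if has_wildcard(domain):
            return domain, probes
    return None, probes


if __name__ == "__main__":
    wildcarded = {"sld.tld"}  # pretend '*.sld.tld' exists
    deep = "l9.l8.l7.l6.l5.l4.l3.sld.tld"
    print(first_containing_domain(deep, wildcarded.__contains__))
```

The point of the early exit is that a spammer who wildcards SLD.TLD costs us one probe no matter how long the subdomain chain is, while the common l3.SLD.TLD case pays exactly one extra lookup.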
I have continued to ignore the complications of processing any CNAMEs which occur; they make this yet more complex than the mess above already reads.
I hope this long-winded explanation is clearer for the points which I glossed over the first time.
Alain
[snip - the rest, which was apparently clear enough]
Paul Shupak track@plectere.com