Hello,
Looking at the multi.surbl.org zone yesterday, I noticed approximately 373 subdomains in the list.
Here are a few examples:
www.fcudwedenagov.com www.freecat.biz www.hesvlabean.com www.hterrani.com ms7.pptel.net msn.41m.com mwetillf.iscool.net mx.servebbs.net mx2.dynu.net www.yelvertonstores.co.uk
Looking at http://www.surbl.org/implementation.html item 2, do these subdomains belong in the list?
"Extract base (registrar) domains from those URIs. This includes removing any and all leading host names, subdomains, www., randomized subdomains, etc. In order to determine the base domain it may be necessary to use a table of country code TLDs (ccTLDs) such as this partially-complete one SURBL uses. (Note that this file is only rarely updated. Please don't download it frequently.) For example, any domain found in the two level ccTLD list should have a three-level domain name extracted (like foo.co.uk) for matching against a SURBL. Domains not specifically on the two level ccTLD list (such as foo.com or foo.fr) should be checked at two levels."
I believe SpamAssassin's URIDNSBL reduces the URIs to the base domain (e.g. example.com, example.co.uk), so if it encountered "www.freecat.biz," for example, it would look up freecat.biz, which is not in the list.
Besides URIDNSBL, are there other URI lookup implementations for which it makes sense to include subdomains?
Thanks!
Brandon
On Friday, May 12, 2006, 7:59:57 AM, Brandon Hutchinson wrote:
Hello,
Looking at the multi.surbl.org zone yesterday, I noticed approximately 373 subdomains in the list.
Here are a few examples:
www.fcudwedenagov.com www.freecat.biz www.hesvlabean.com www.hterrani.com ms7.pptel.net msn.41m.com mwetillf.iscool.net mx.servebbs.net mx2.dynu.net www.yelvertonstores.co.uk
Looking at http://www.surbl.org/implementation.html item 2, do these subdomains belong in the list?
"Extract base (registrar) domains from those URIs. This includes removing any and all leading host names, subdomains, www., randomized subdomains, etc. In order to determine the base domain it may be necessary to use a table of country code TLDs (ccTLDs) such as this partially-complete one SURBL uses. (Note that this file is only rarely updated. Please don't download it frequently.) For example, any domain found in the two level ccTLD list should have a three-level domain name extracted (like foo.co.uk) for matching against a SURBL. Domains not specifically on the two level ccTLD list (such as foo.com or foo.fr) should be checked at two levels."
Most of the listed records with subdomains deeper than we'd normally list are from phishes. It's true that they don't follow the specification, but they're exceptional. Most of the domains *are* reduced to registered levels on the data side, where it's clear the domains belong to the phishers or spammers.
I believe SpamAssassin's URIDNSBL reduces the URIs to the base domain (e.g. example.com, example.co.uk), so if it encountered "www.freecat.biz," for example, it would look up freecat.biz, which is not in the list.
That's correct. It may check other levels too, but the spec says to check GTLDs at the second level and CCTLDs in the table at the third. There may be other outlying cases in terms of the number of levels that should be checked, but two and three levels of GTLDs and CCTLDs certainly cover most of the common spams.
Besides URIDNSBL, are there other URI lookup implementations for which it makes sense to include subdomains?
Not sure I understand the question. Can you elaborate?
It may help to know what problem you're trying to solve.
Jeff C. -- Don't harm innocent bystanders.
Hi Jeff,
I believe SpamAssassin's URIDNSBL reduces the URIs to the base domain (e.g. example.com, example.co.uk), so if it encountered "www.freecat.biz," for example, it would look up freecat.biz, which is not in the list.
That's correct. It may check other levels too, but the spec says to check GTLDs at the second level and CCTLDs in the table at the third. There may be other outlying cases in terms of the number of levels that should be checked, but two and three levels of GTLDs and CCTLDs certainly cover most of the common spams.
Besides URIDNSBL, are there other URI lookup implementations for which it makes sense to include subdomains?
Not sure I understand the question. Can you elaborate?
Since I don't think including subdomains in SURBL zone data does any good with SpamAssassin's URIDNSBL implementation, I was just wondering what else people are using to look up URIs in SURBL. Other sendmail milters that do not use URIDNSBL? Custom MIMEDefang code?
I don't have any problem with subdomains being included in the list. I'm just wondering "Who is benefiting from having subdomains in the list?"
Using the "www.freecat.biz" example: assuming this is a phishing domain, would also including "freecat.biz" in SURBL be a bad idea? Are there cases where we should "trust" the base domain even when a subdomain is used in a phishing email?
Thanks,
Brandon
On Friday, May 12, 2006, 9:31:57 AM, Brandon Hutchinson wrote:
Since I don't think including subdomains in SURBL zone data does any good with SpamAssassin's URIDNSBL implementation, I was just wondering what else people are using to look up URIs in SURBL. Other sendmail milters that do not use URIDNSBL? Custom MIMEDefang code?
SpamAssassin may check more than the specified levels. For example, it may check at levels two and three on GTLDs, or at least it did at one point.
I don't have any problem with subdomains being included in the list. I'm just wondering "Who is benefiting from having subdomains in the list?"
Using the "www.freecat.biz" example: assuming this is a phishing domain, would also including "freecat.biz" in SURBL be a bad idea? Are there cases where we should "trust" the base domain even when a subdomain is used in a phishing email?
If a subdomain is listed, the subdomain should be checked. It's not necessarily safe to check the base domain when a subdomain is listed. For example if phishing.freehost.com is blacklisted, checking freehost.com is probably not a good idea. I do realize this is somewhat off spec.
Jeff C. -- Don't harm innocent bystanders.
SpamAssassin may check more than the specified levels. For example, it may check at levels two and three on GTLDs, or at least it did at one point.
Looking at some of the SA 3.1.1 debug output, SA's URIDNSBL will query only at level 3 for domains with a country code (e.g. .co.uk), and level 2 for other GTLDs (.com).
Examples:
[5180] dbg: uri: parsed uri found, http://www.hydeparkcalling.co.uk/
[5180] dbg: uri: parsed domain, hydeparkcalling.co.uk
[5180] dbg: uridnsbl: domains to query: hydeparkcalling.co.uk
[6977] dbg: uri: parsed uri found, http://www.manage-performance.com
[6977] dbg: uri: parsed domain, manage-performance.com
[6977] dbg: uridnsbl: domains to query: manage-performance.com
So unless my understanding of SA's URIDNSBL is mistaken, and it certainly could be, we'll never catch any of the subdomains in SURBL. No big deal; someone probably is using some implementation of URI checking with SURBL that does.
If a subdomain is listed, the subdomain should be checked. It's not necessarily safe to check the base domain when a subdomain is listed. For example if phishing.freehost.com is blacklisted, checking freehost.com is probably not a good idea. I do realize this is somewhat off spec.
Thanks, this is what I was wondering.
Brandon
On Friday, May 12, 2006, 11:16:40 AM, Brandon Hutchinson wrote:
SpamAssassin may check more than the specified levels. For example, it may check at levels two and three on GTLDs, or at least it did at one point.
Looking at some of the SA 3.1.1 debug output, SA's URIDNSBL will query only at level 3 for domains with a country code (e.g. .co.uk), and level 2 for other GTLDs (.com).
Examples:
[5180] dbg: uri: parsed uri found, http://www.hydeparkcalling.co.uk/
[5180] dbg: uri: parsed domain, hydeparkcalling.co.uk
[5180] dbg: uridnsbl: domains to query: hydeparkcalling.co.uk
[6977] dbg: uri: parsed uri found, http://www.manage-performance.com
[6977] dbg: uri: parsed domain, manage-performance.com
[6977] dbg: uridnsbl: domains to query: manage-performance.com
So unless my understanding of SA's URIDNSBL is mistaken, and it certainly could be, we'll never catch any of the subdomains in SURBL.
Yes, it's possible SA is coded exactly to spec now, and some of these non-spec data won't get caught with SA.
No big deal; someone probably is using some implementation of URI checking with SURBL that does.
Yes, it's possible.
Jeff C. -- Don't harm innocent bystanders.
On Friday, May 12, 2006, 9:53:41 AM, Jeff Chan wrote:
If a subdomain is listed, the subdomain should be checked. It's not necessarily safe to check the base domain when a subdomain is listed. For example if phishing.freehost.com is blacklisted, checking freehost.com is probably not a good idea. I do realize this is somewhat off spec.
It's been pointed out that the description above may be somewhat unclear. To clarify, it's best to follow the specification:
http://www.surbl.org/implementation.html
1. For GTLDs like com, net, org, info, biz, etc., check at the second level.
2. For CCTLDs listed in the two-level-tlds list, check at the third level, etc. For CCTLDs not in that list, check at the second level.
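The two rules above can be sketched in Python as follows. This is only an illustration, not SURBL or SpamAssassin code: the function names are made up, and TWO_LEVEL_TLDS is a tiny hypothetical stand-in for the real two-level-tlds table, which is much longer.

```python
# Hypothetical, heavily abbreviated stand-in for SURBL's two-level-tlds table.
TWO_LEVEL_TLDS = {"co.uk", "org.uk", "com.au"}


def base_domain(hostname: str) -> str:
    """Reduce a hostname to the level the guidelines say to check."""
    labels = hostname.lower().rstrip(".").split(".")
    # Keep three labels when the last two form a listed two-level TLD,
    # otherwise keep two (GTLDs and CCTLDs not in the table).
    keep = 3 if ".".join(labels[-2:]) in TWO_LEVEL_TLDS else 2
    return ".".join(labels[-keep:])


def surbl_query_name(hostname: str, zone: str = "multi.surbl.org") -> str:
    """Build the DNS name to look up: reduced domain prepended to the zone."""
    return f"{base_domain(hostname)}.{zone}"


if __name__ == "__main__":
    for host in ("www.hydeparkcalling.co.uk", "www.freecat.biz"):
        print(host, "->", surbl_query_name(host))
```

With this reduction, www.freecat.biz is queried as freecat.biz, so a blacklist record for www.freecat.biz itself is never asked for.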
A vast majority of the time, those will match the levels in the blacklist. In a few off-spec cases we blacklist subdomains, but they are very rare and exceptional. It's best not to code to those rare exceptions, especially as it can double, triple, etc, the DNS queries largely unnecessarily.
The point about listed subdomains such as phishing.freehost.com was to *not* check levels closer to the root (even if I didn't explain that very clearly in the quote above). While phishing.freehost.com may be bad (and in theory ok to check), freehost.com may not be. Checking freehost.com could easily lead to FPs.
Really the best advice is to ignore the off-spec data. It doesn't help the results very much and arguably doesn't even belong in there.
Cheers,
Jeff C. -- Don't harm innocent bystanders.
Hi Brandon,
At 09:31 12-05-2006, Brandon Hutchinson wrote:
Since I don't think including subdomains in SURBL zone data does any good with SpamAssassin's URIDNSBL implementation, I was just wondering what else people are using to look up URIs in SURBL. Other sendmail milters that do not use
I use a milter.
Using the "www.freecat.biz" example: assuming this is a phishing domain, would also including "freecat.biz" in SURBL be a bad idea? Are there cases where we should "trust" the base domain even when a subdomain is used in a phishing email?
You would look up freecat.biz in the above example. See http://www.surbl.org/implementation.html for implementation guidelines. If it is a phishing email, I would not trust the base domain. A whitelist is also required.
Regards, -sm
On Friday, May 12, 2006, 12:47:10 PM, SM SM wrote:
At 09:31 12-05-2006, Brandon Hutchinson wrote:
Using the "www.freecat.biz" example: assuming this is a phishing domain, would also including "freecat.biz" in SURBL be a bad idea? Are there cases where we should "trust" the base domain even when a subdomain is used in a phishing email?
You would look up freecat.biz in the above example. See http://www.surbl.org/implementation.html for implementation guidelines. If it is a phishing email, I would not trust the base domain.
Probably we're not providing enough context to be clear. Brandon's concern was that there were records like www.freecat.biz in the blacklists that won't match the type of checking specified in the Implementation Guidelines:
http://www.surbl.org/implementation.html
Normally we would blacklist freecat.biz, not www.freecat.biz, if the domain were known bad. In a few rare cases hosts or subdomains are blacklisted where the domain may be ok, but the host or subdomain isn't. So phishing.legitimate-free-host.com might be blacklisted. That actually violates our own specification, so in a sense it's not too clever for us to blacklist. So that's addressing an inconsistency on the blacklist data side.
On the application side, if phishing.legitimate-free-host.com or www.freecat.biz appeared in a message, they should properly be reduced to legitimate-free-host.com and freecat.biz before checking against the blacklists. Unless the unqualified domains were actually blacklisted, they would not match (www.freecat.biz is not the same as freecat.biz). In a sense that is an error: a mismatch between the blacklist data and the application's handling of message URI data. But the error is really on the data side, so there's no need to do anything off-spec with the applications. Yes, it may cause a few spams or phishes to be missed, but they're very rare and obscure.
HTH,
Jeff C. -- Don't harm innocent bystanders.
Jeff Chan wrote:
On Friday, May 12, 2006, 12:47:10 PM, SM SM wrote:
...
On the application side, if phishing.legitimate-free-host.com or www.freecat.biz appeared in a message, they should properly be reduced to legitimate-free-host.com and freecat.biz before checking against the blacklists. Unless the unqualified domains were actually blacklisted, they would not match (www.freecat.biz is not the same as freecat.biz). In a sense that is an error: a mismatch between the blacklist data and the application's handling of message URI data. But the error is really on the data side, so there's no need to do anything off-spec with the applications. Yes, it may cause a few spams or phishes to be missed, but they're very rare and obscure.
j-chkmail doesn't bother with the number of parts in the domain. When it finds a URL with N parts (say pN.pN-1.pN-2...p2.p1), it does N checks and stops when it finds a match. So, if you list www.freecat.biz, j-chkmail will detect www.freecat.biz and xxx.www.freecat.biz, but not freecat.biz.
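A rough Python sketch of that kind of check, for illustration only. This is not j-chkmail's actual code: the function names are invented, an in-memory set stands in for the DNS lookups, and trying the longest name first and including the bare TLD as the Nth check are assumptions.

```python
def suffix_candidates(hostname):
    """Return every suffix of the hostname: N labels give N candidates."""
    labels = hostname.lower().rstrip(".").split(".")
    return [".".join(labels[i:]) for i in range(len(labels))]


def first_listed(hostname, listed):
    """Check each candidate in turn and stop at the first one that is listed."""
    for candidate in suffix_candidates(hostname):
        if candidate in listed:  # stand-in for a query against the blacklist
            return candidate
    return None


if __name__ == "__main__":
    listed = {"www.freecat.biz"}  # hypothetical blacklist contents
    for host in ("www.freecat.biz", "xxx.www.freecat.biz", "freecat.biz"):
        print(host, "->", first_listed(host, listed))
```

The cost is up to N lookups per hostname instead of one, which is the extra DNS traffic Jeff mentions.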
I think it's good to have subdomains listed in SURBL.
Regards,
Jose-Marcio