second and third level domains - again! - Discuss

second and third level domains - again!

John Fawcett

25 Apr 2004

5:57 p.m.

One of the things I noticed after upgrading to SpamCopURI 0.14 was that previously I had been identifying all mail containing ads.msn.com as spam and after the upgrade this was no longer happening.

(BTW ads.msn.com was still listed in sc.surbl.org when I observed this behaviour. It has since been removed.)

Version 0.14 includes the changes which were being discussed last week, so that if ads.msn.com is found in an email only msn.com is being checked against sc.surbl.org.

So the choices available to the list maintainer are either:

- list all of msn.com
- list none of msn.com

Since listing all of msn.com is likely to be too wide, this means msn.com will not get listed even if there are subdomains which are candidates for listing.

I've used msn as an example, but the same logic applies to any of the big names like yahoo etc where the list maintainer may want to have more granularity in what is listed rather than list the whole registered domain.

The solution could be to use a special return code which indicates "query again with more detail". (I remember someone bringing up something similar in the context of ccTLDs as well).

So if ads.msn.com were to be listed in sc.surbl.org it would need two records:

msn.com      IN A 127.0.0.255
ads.msn.com  IN A 127.0.0.2

The client (in this case SpamCopURI) would find a url ads.msn.com in the email but would query for msn.com as per the current logic.

The return value of 127.0.0.255 then indicates to the client to query for one level lower, ie ads.msn.com.

This same mechanism could be used for ccTLDs. sc.surbl.org could contain:

co.uk  IN A 127.0.0.255
co.nz  IN A 127.0.0.255

So that if I get xxxxxxxx.co.uk in an email, the client queries for co.uk and it will be told to query with the lower level. The client queries for xxxxxxxx.co.uk
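A minimal sketch of the client logic described above, assuming 127.0.0.2 means "listed" and 127.0.0.255 means "query again one level lower". The `ZONE` dictionary is a toy stand-in for sc.surbl.org, and the names `lookup` and `is_listed` are hypothetical; this is not SpamCopURI code.

```python
from typing import Optional

LISTED = "127.0.0.2"
QUERY_DEEPER = "127.0.0.255"  # proposed "more specific data exists" code

# Toy stand-in for the sc.surbl.org zone (illustrative entries only).
ZONE = {
    "msn.com": QUERY_DEEPER,
    "ads.msn.com": LISTED,
    "co.uk": QUERY_DEEPER,
    "bigspammer.co.uk": LISTED,
    "spammer.biz": LISTED,
}


def lookup(name: str) -> Optional[str]:
    """Return the A record value for name, or None if there is no record."""
    return ZONE.get(name)


def is_listed(host: str) -> bool:
    """Start at the second level and go one label deeper only when told to."""
    labels = host.lower().rstrip(".").split(".")
    for depth in range(2, len(labels) + 1):
        answer = lookup(".".join(labels[-depth:]))
        if answer == LISTED:
            return True
        if answer != QUERY_DEEPER:
            return False  # no record: stop, nothing more specific exists
    return False  # ran out of labels while still being told to go deeper
```

With this toy zone, `is_listed("ads.msn.com")` is true while `is_listed("www.msn.com")` is false, and the client never needs its own table of two-level ccTLDs.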

The disadvantage of this solution is that it needs changed logic on the client side. However, the client does not need to know anything about which ccTLDs have two or three level domains. The extra query overhead is minimal, since on a busy server these additional records would almost always be in the DNS resolver cache.

Even clients which don't implement this extra logic would however have to be careful to check the value of the A record rather than deduce the listing from the presence of an A record.
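To illustrate the point about older clients, here is a small sketch contrasting a presence-only check with a value check. Both function names are hypothetical, and the 127.0.0.255 value is the one proposed above, not an existing SURBL code.

```python
from typing import Optional

LISTED = "127.0.0.2"
QUERY_DEEPER = "127.0.0.255"  # proposed signalling value, not a listing


def naive_is_listed(answer: Optional[str]) -> bool:
    """Treats any A record as a listing; would misfire on the signal value."""
    return answer is not None


def careful_is_listed(answer: Optional[str]) -> bool:
    """Only the agreed listing value counts as a hit."""
    return answer == LISTED
```

A client using the first form would treat every msn.com or co.uk URL as spam once the signalling records were published.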

Maybe we're still in time to make changes like this before there is too much software to change (currently I think there are 3).

John


Eric Kolve

25 Apr 2004

7:50 p.m.

On Sun, Apr 25, 2004 at 05:57:56PM +0200, John Fawcett wrote:

...

One of the things I noticed after upgrading to SpamCopURI 0.14 was that previously I had been identifying all mail containing ads.msn.com as spam and after the upgrade this was no longer happening.

(BTW ads.msn.com was still listed in sc.surbl.org when I observed this behaviour. It has since been removed.)

To take care of this in the short term you can add an entry to your spamcop_uri.cf if you have open redirect resolution on:

open_redirect_list_spamcop_uri g.msn.com ads.msn.co

I have committed this change to the trunk, so you shouldn't have to make the change by hand in the future.

The only way we can take care of this 100% of the time would be to query for the 2nd through 3rd levels for TLDs and the 3rd through 4th levels for ccTLDs. This would catch subdomains if those were entered into the RBL. The downside is that this would double the number of queries. We could reduce the queries if we stored a wildcard A record for each of the 2nd and 3rd level entries and stored full entries when needed. Let me illustrate with the following examples:

RBL Hit - wildcard 3rd level
URL: http://www.foo.co.uk
DNS Query: www.foo.co.uk
Server Record: *.foo.co.uk
Result: 127.0.0.2

RBL Hit - 4th level
URL: http://ads.msn.co.uk
DNS Query: ads.foo.co.uk
Server Record: ads.foo.co.uk # note fourth level domain
Result: 127.0.0.2

RBL Miss
URL: http://www.msn.com
DNS Query: www.msn.com
Server Record: n/a
Result: n/a

RBL Hit - 3rd level
URL: http://ads.msn.com
DNS Query: ads.msn.com
Server Record: ads.msn.com # note 3rd level
Result: 127.0.0.2

RBL Hit 2nd level wildcard
URL: http://bad.viagra.spammer.biz
DNS Query: viagra.spammer.biz
Server Record: *.spammer.biz
Result: 127.0.0.2

Basically this is client querying at the subdomain level. The biggest downside to a scheme such as this is that it would result in less caching since there would inevitably be more unique queries.
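A rough sketch of the wildcard scheme in the examples above. It assumes the client keeps three labels for ordinary TLDs and four for two-level ccTLD suffixes, and it models wildcard matching in a simplified way (exact record first, then a `*.` record at the immediate parent). `TWO_LEVEL_SUFFIXES`, `RECORDS`, `query_name` and `resolve` are hypothetical names, and real DNS wildcard semantics are broader than this.

```python
from typing import Optional

# Illustrative only: the client would need some table like this.
TWO_LEVEL_SUFFIXES = {"co.uk", "co.nz"}

# Toy server data taken from the examples above.
RECORDS = {
    "*.foo.co.uk": "127.0.0.2",
    "ads.msn.com": "127.0.0.2",
    "*.spammer.biz": "127.0.0.2",
}


def query_name(host: str) -> str:
    """Trim the host to the subdomain level the client would query."""
    labels = host.lower().rstrip(".").split(".")
    suffix = ".".join(labels[-2:])
    keep = 4 if suffix in TWO_LEVEL_SUFFIXES else 3
    return ".".join(labels[-keep:])


def resolve(name: str) -> Optional[str]:
    """Exact match first, then a wildcard record at the immediate parent."""
    if name in RECORDS:
        return RECORDS[name]
    parent = name.partition(".")[2]
    return RECORDS.get("*." + parent)
```

For example, `resolve(query_name("http-host bad.viagra.spammer.biz".split()[1]))` returns the wildcard hit, while `www.msn.com` resolves to nothing.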

--eric


John Fawcett

8:22 p.m.

----- Original Message ----- From: "Eric Kolve"

...

On Sun, Apr 25, 2004 at 05:57:56PM +0200, John Fawcett wrote:

To take care of this in the short term you can add an entry to your spamcop_uri.cf if you have open redirect resolution on:

open_redirect_list_spamcop_uri g.msn.com ads.msn.co

I have committed this change to the trunk, so you shouldn't have to make the change by hand in the future.

I think you missed my earlier post suggesting this :-) Thanks for making the update.

...

The only way we can take care of this 100% of the time would be to query for the 2nd through 3rd levels for TLDs and the 3rd through 4th levels for ccTLDs. This would catch subdomains if those were entered into the RBL. The downside is that this would double the number of queries. We could reduce the queries if we stored a wildcard A record for each of the 2nd and 3rd level entries and stored full entries when needed. Let me illustrate with the following examples:

The mechanism I proposed is an alternative to wildcards and doesn't have the disadvantage of significantly increasing load.

The client would need a standard logic to query for second level. It only increases to third and beyond where the server says it has more specific data (by returning an A record with a specific value agreed for this purpose, e.g. 127.0.0.255).

In the case of a domain of x.co.uk there would only be one extra query because the client would query co.uk and then because of the return code it would know to query one level down x.co.uk. However the query for co.uk will almost certainly already be cached so the overall impact on client performance and surbl server load is a mere fraction. Or am I missing something?
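The caching argument can be checked with a toy model. The sketch below assumes a resolver that caches every answer, including "no record" (real resolvers cache negative answers according to the zone's SOA settings), and counts how many queries reach the list server. `CountingResolver` and `check` are hypothetical names.

```python
from typing import Dict, Optional

LISTED = "127.0.0.2"
QUERY_DEEPER = "127.0.0.255"  # proposed "query one level lower" value


class CountingResolver:
    """Caching resolver stand-in that counts queries sent upstream."""

    def __init__(self, zone: Dict[str, str]) -> None:
        self.zone = zone
        self.cache: Dict[str, Optional[str]] = {}
        self.upstream_queries = 0

    def lookup(self, name: str) -> Optional[str]:
        if name not in self.cache:
            self.upstream_queries += 1  # cache miss goes to the list server
            self.cache[name] = self.zone.get(name)
        return self.cache[name]


def check(resolver: CountingResolver, host: str) -> bool:
    """Second-level query first, deeper only on the signalling value."""
    labels = host.lower().split(".")
    for depth in range(2, len(labels) + 1):
        answer = resolver.lookup(".".join(labels[-depth:]))
        if answer == LISTED:
            return True
        if answer != QUERY_DEEPER:
            return False
    return False


resolver = CountingResolver({"co.uk": QUERY_DEEPER, "b.co.uk": LISTED})
results = [check(resolver, h) for h in ("a.co.uk", "b.co.uk", "c.co.uk")]
```

In this model three different x.co.uk hosts cost four upstream queries in total: one for co.uk, which is then served from cache, plus one per host.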

John

Simon Byrnand

26 Apr 2004

3:11 a.m.

At 05:50 26/04/2004, you wrote:

...

On Sun, Apr 25, 2004 at 05:57:56PM +0200, John Fawcett wrote:

...
One of the things I noticed after upgrading to SpamCopURI 0.14 was that previously I had been identifying all mail containing ads.msn.com as spam and after the upgrade this was no longer happening.

(BTW ads.msn.com was still listed in sc.surbl.org when I observed this behaviour. It has since been removed.)

To take care of this in the short term you can add an entry to your spamcop_uri.cf if you have open redirect resolution on:

open_redirect_list_spamcop_uri g.msn.com ads.msn.co

Is that line correct ? or should it be something like:

open_redirect_list_spamcop_uri g.msn.com *.ads.msn.com

? Not sure that I quite understand the syntax. Would the link you suggested cover a URL like this:

http://ads.msn.com/ads/adredir.asp?image=/ads/IMGSFS/pjoj57zhzmldxz6uc8.gif&...

Regards, Simon

Eric Kolve

4:16 a.m.

On Mon, Apr 26, 2004 at 01:11:28PM +1200, Simon Byrnand wrote:

...

At 05:50 26/04/2004, you wrote:

...
On Sun, Apr 25, 2004 at 05:57:56PM +0200, John Fawcett wrote:

...
One of the things I noticed after upgrading to SpamCopURI 0.14 was that previously I had been identifying all mail containing ads.msn.com as spam and after the upgrade this was no longer happening.

(BTW ads.msn.com was still listed in sc.surbl.org when I observed this behaviour. It has since been removed.)

To take care of this in the short term you can add an entry to your spamcop_uri.cf if you have open redirect resolution on:

open_redirect_list_spamcop_uri g.msn.com ads.msn.co

Is that line correct ? or should it be something like:

open_redirect_list_spamcop_uri g.msn.com *.ads.msn.com

*.ads.msn.com would match the following:

xxx.yyy.ads.msn.com, foo.ads.msn.com, but wouldn't match ads.msn.com.

I suppose you could do *ads.msn.com and match all of the above plus ads.msn.com, but you would also match fads.msn.com.

This is from the core address list matching that ships with SA.

ads.msn.com will only match ads.msn.com. Putting g.msn.com and ads.msn.com on the same line is no different than putting them on separate lines. I just thought I would try to keep msn stuff on the same line.
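The matching behaviour described above can be approximated in a few lines. This sketch uses Python's `fnmatch` as a stand-in for shell-style glob matching; it is an assumption that this mirrors the SpamAssassin address list matching closely enough for these patterns, and `matches` is a hypothetical helper, not part of SA or SpamCopURI.

```python
from fnmatch import fnmatchcase


def matches(pattern: str, host: str) -> bool:
    """Case-insensitive glob match where '*' matches any run of characters."""
    return fnmatchcase(host.lower(), pattern.lower())


# The cases discussed above:
cases = {
    ("*.ads.msn.com", "foo.ads.msn.com"): matches("*.ads.msn.com", "foo.ads.msn.com"),
    ("*.ads.msn.com", "ads.msn.com"): matches("*.ads.msn.com", "ads.msn.com"),
    ("*ads.msn.com", "fads.msn.com"): matches("*ads.msn.com", "fads.msn.com"),
    ("ads.msn.com", "ads.msn.com"): matches("ads.msn.com", "ads.msn.com"),
}
```

The leading `*.` requires at least one extra label, which is why the bare host needs its own entry.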

...

? Not sure that I quite understand the syntax. Would the link you suggested cover a URL like this:

http://ads.msn.com/ads/adredir.asp?image=/ads/IMGSFS/pjoj57zhzmldxz6uc8.gif&...

Yes, it will.

--eric


Simon Byrnand

4:28 a.m.

At 14:16 26/04/2004, Eric Kolve wrote:


*.ads.msn.com would match the following:

xxx.yyy.ads.msn.com, foo.ads.msn.com, but wouldn't match ads.msn.com.

I suppose you could do *ads.msn.com and match all of the above plus ads.msn.com, but you would also match fads.msn.com.

Good point... I guess I should have just mentioned the missing m at the end of .com

...

This is from the core address list matching that ships with SA.

ads.msn.com will only match ads.msn.com. Putting g.msn.com and ads.msn.com on the same line is no different than putting them on separate lines. I just thought I would try to keep msn stuff on the same line.

Ahhhh.... that clears things up... I thought there was some magic property to the third field that I just hadn't worked out yet :)

Regards, Simon

Eric Kolve

4:54 a.m.

On Mon, Apr 26, 2004 at 02:28:16PM +1200, Simon Byrnand wrote:


Good point... I guess I should have just meant the missing m at the end of .com

I totally missed that. I double checked the file that I committed and it's correct. Sorry about the confusion.

--eric


Jeff Chan

12:03 p.m.

On Sunday, April 25, 2004, 10:50:08 AM, Eric Kolve wrote:

...

RBL Hit - wildcard 3rd level
URL: http://www.foo.co.uk
DNS Query: www.foo.co.uk
Server Record: *.foo.co.uk
Result: 127.0.0.2

I'd rather catch all of foo.co.uk. If they were a real spammer domain we should hose them entirely.

...

RBL Hit - 3rd level
URL: http://ads.msn.com
DNS Query: ads.msn.com
Server Record: ads.msn.com # note 3rd level
Result: 127.0.0.2

Possible, but very unlikely case. Legitimate domains generally aren't going to have large, abusive subdomains. Redirection sites can't really be listed for some of the reasons mentioned in the previous message. However purely spam redirection sites could be listed.

...

RBL Hit 2nd level wildcard
URL: http://bad.viagra.spammer.biz
DNS Query: viagra.spammer.biz
Server Record: *.spammer.biz
Result: 127.0.0.2

I'd rather hit all of spammer.biz.

In my simplistic view of the world, wildcards don't buy us much. A base domain is either legitimate or it isn't.

Jeff C.

Jeff Chan

8:37 a.m.

On Sunday, April 25, 2004, 8:57:56 AM, John Fawcett wrote:

...

(BTW ads.msn.com was still listed in sc.surbl.org when I observed this behaviour. It has since been removed.)

Yes, I added both redirection sites to the SURBL whitelist, which is probably unnecessary but couldn't hurt since the redirectors themselves are legitimate (if unwise) sites.

Jeff C.

Jose Marcio Martins da Cruz

9:25 a.m.

Second and third levels from France...

.fr domains are reserved for government bodies and for companies legally registered in France. Foreign companies with an office in France but without a legal registration there can't have a ".fr" domain.

Also, there are second level domains dedicated to some professional activities : e.g. ".avocat.fr" for the lawyers...

So there are many second level TLDs in France (I couldn't get the complete list; I'll try to get it as soon as I have time):

.fr .tm.fr .gouv.fr .asso.fr .nom.fr .avocat.fr .notaire.fr .barreau.fr .mairie.fr ...
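As a sketch of why this matters to a client, the function below reduces a host to the domain that would be checked against the list, using only the second level suffixes named above. The set is known to be incomplete, and `base_domain` is a hypothetical helper rather than code from any SURBL client.

```python
# Second level suffixes taken from the list above; known to be incomplete.
FR_SECOND_LEVEL = {
    "tm.fr", "gouv.fr", "asso.fr", "nom.fr",
    "avocat.fr", "notaire.fr", "barreau.fr", "mairie.fr",
}


def base_domain(host: str) -> str:
    """Keep three labels under a known second level suffix, otherwise two."""
    labels = host.lower().rstrip(".").split(".")
    if len(labels) >= 3 and ".".join(labels[-2:]) in FR_SECOND_LEVEL:
        return ".".join(labels[-3:])
    return ".".join(labels[-2:])
```

Under these assumptions `www.example.asso.fr` reduces to `example.asso.fr`, while `www.example.fr` reduces to `example.fr`; a suffix missing from the table would wrongly collapse to the suffix itself.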

".re" domains (Ile de la Reunion) follow the same rules.

Best

Jose-Marcio


--
Jose Marcio MARTINS DA CRUZ
Ecole des Mines de Paris
60, bd Saint Michel
75272 - PARIS CEDEX 06
Tel.: (33) 01.40.51.93.41
http://j-chkmail.ensmp.fr
http://www.ensmp.fr/~martins
mailto:Jose-Marcio.Martins@ensmp.fr

Jeff Chan

11:58 a.m.

On Sunday, April 25, 2004, 8:57:56 AM, John Fawcett wrote:

...

Version 0.14 includes the changes which were being discussed last week, so that if ads.msn.com is found in an email only msn.com is being checked against sc.surbl.org.

Regarding redirection sites, we definitely do not want to blacklist the redirection sites of mostly legitimate sites like msn and yahoo.

If their redirection sites are being abused by spammers then they should have the added traffic as incentive to block the abusers.

If there were spam-only redirection sites then those could be easily blocked on.

Another reason not to block legitimate redirection sites is that SA 3.0's urirhsbl will check the redirection site against the SURBL also, and we don't want to block messages simply because a redirection site is used in them. On the other hand the redirection site could get a special A record such as you propose below to say "check further". But that's getting a bit complex on the client side for my liking.

...

So the choices available to the list maintainer are either:

- list all of msn.com
- list none of msn.com

Since listing all of msn.com is likely to be too wide, this means msn.com will not get listed even if there are subdomains which are candidates for listing.

I've used msn as an example, but the same logic applies to any of the big names like yahoo etc where the list maintainer may want to have more granularity in what is listed rather than list the whole registered domain.

The underlying principle as I see it is that most major sites will have functional anti-abuse and anti-spam policies, so either a base domain is good or bad. I know that seems simplistic, but it's easy and fast to implement AND it seems to match reality pretty well.

There are no drug spam sites hosted on yahoo for example, and if there were they would get shut down extremely quickly. The legitimate sites have an incentive to stay that way. Similarly spam ISPs and spam gangs have a seeming incentive to stay that way.

So the dividing line can generally be easily drawn at the registrar domain level.

...

The solution could be to use a special return code which indicates "query again with more detail". (I remember someone bringing up something similar in the context of ccTLDs as well).

So if ads.msn.com were to be listed in sc.surbl.org it would need two records:

msn.com      IN A 127.0.0.255
ads.msn.com  IN A 127.0.0.2

The client (in this case SpamCopURI) would find a url ads.msn.com in the email but would query for msn.com as per the current logic.

The return value of 127.0.0.255 then indicates to the client to query for one level lower, ie ads.msn.com.

This same mechanism could be used for ccTLDs. sc.surbl.org could contain:

co.uk  IN A 127.0.0.255
co.nz  IN A 127.0.0.255

So that if I get xxxxxxxx.co.uk in an email, the client queries for co.uk and it will be told to query with the lower level. The client queries for xxxxxxxx.co.uk

That's an interesting idea. Basically you want to signal redirection to higher domain levels with a special result for levels that should never get checked like co.uk.

That might be doable, but it would require extra logic on the client side as you note. That already sounds more complex than I like, though I see what you're getting at. Better to control what goes into the data (i.e. never let the TLD itself co.uk in), and make sure the client is following similar rules.

We will always catch bigspammer.co.uk with the current strategy.
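The "control what goes into the data" idea can be sketched as a simple gate on new entries. The suffix set here is purely illustrative and `acceptable_entry` is a hypothetical helper; how the SURBL data is actually vetted is not described in this thread.

```python
# Illustrative suffixes that should never be listed themselves.
NEVER_LIST = {"com", "biz", "uk", "co.uk", "nz", "co.nz"}


def acceptable_entry(domain: str) -> bool:
    """Reject empty names and bare TLD / second level suffix entries."""
    name = domain.lower().rstrip(".")
    return bool(name) and name not in NEVER_LIST
```

With such a gate, `co.uk` is refused while `bigspammer.co.uk` is accepted, so a client that reduces hosts to the same base domain will find it.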

I know a lot of what I argue for above seems simplistic when a more complex solution could have more interesting results, but very often the simpler solutions are better, especially in terms of resource consumption.

Jeff C.
