Most often "hit" SURBL domains

List overview All Threads
Download

newer

older

RE: [SURBL-Discuss] FP:...

Blogs still using SURBL?

Rob McEwen

30 Sep 2004 30 Sep '04

1:42 a.m.

RE: Most often "hit" SURBL domains

...

From time to time, ideas float around about how we can take some pressure

off of the SURBL name servers. Recently, most commonly queried URIs that are NOT (and should not) be blocked were mentioned in the hopes that people would "whitelist" these locally so their mail servers would stop querying SURBL for stuff like microsoft.com, ebay.com, etc.

I have a similar idea. Would it be possible to have a running list of the top 20 (or so... 50? 100?) most often queried URI's that are blocked by SURBL (and which should be blocked)? This way, we could take additional pressure off SURBL DNS servers by blacklisting these domains locally BEFORE doing SURBL checking on such messages?

I have a feeling that this has already been requested and implemented??

Rob McEwen

Show replies by date

Raymond Dijkxhoorn

30 Sep 30 Sep

1:48 a.m.

Hi!

...

I have a similar idea. Would it be possible to have a running list of the top 20 (or so... 50? 100?) most often queried URI's that are blocked by SURBL (and which should be blocked)? This way, we could take additional pressure off SURBL DNS servers by blacklisting these domains locally BEFORE doing SURBL checking on such messages?

I have a feeling that this has already been requested and implemented??

Its something that is suggested, and we are looking into ways to getting that inside for example SA3.1, the SA guys also had some suggestions.

So yes, excellent idea... ;)

Bye, Raymond.

Theo Van Dinter

1:56 a.m.

On Thu, Sep 30, 2004 at 01:48:36AM +0200, Raymond Dijkxhoorn wrote:

...

Its something that is suggested, and we are looking into ways to getting that inside for example SA3.1, the SA guys also had some suggestions.

I'm trying to get it into 3.0.1 as well, btw.

-- Randomly Generated Tagline: I view the JVM as just another architecture that Perl ought to be ported to. (That, and the Underwood typewriter...) -- Larry Wall in 199808050415.VAA24026@wall.org

Jeff Chan

1:59 a.m.

On Wednesday, September 29, 2004, 4:48:36 PM, Raymond Dijkxhoorn wrote:

...

Hi!

...

...
I have a similar idea. Would it be possible to have a running list of the top 20 (or so... 50? 100?) most often queried URI's that are blocked by SURBL (and which should be blocked)? This way, we could take additional pressure off SURBL DNS servers by blacklisting these domains locally BEFORE doing SURBL checking on such messages?

I have a feeling that this has already been requested and implemented??

...

Its something that is suggested, and we are looking into ways to getting that inside for example SA3.1, the SA guys also had some suggestions.

...

So yes, excellent idea... ;)

Yes, SA is adding a feature to hardcode or have a database of the 125 most often hit whitelist domains to 3.1 or 3.0.1. This will prevent domains like w3.org, yahoo.com, etc. from even being queried.

One issue is with the top spammer domains is that unlike the whitehats, the big spammer domains tend to change over time. The biggest spammers also seem to be the most dynamic.

So the whitehats may work better with local listing than the blackhats.

This is certainly a good idea though. Note that Eric Kolve also built in local black and whitelists to SpamCopURI.

Jeff C. -- "If it appears in hams, then don't list it."

Rob McEwen

2:13 a.m.

Regarding the top x number of most frequently queried blacklisted URIs, I was hoping that these could, perhaps, be listed in a simple text file that would be built once or twice a day in, hopefully, some kind of automated process. Because I don't use SpamAssassin, I don't know if the other ideas of implementing this would help me, but I could build a program which could periodically (once a day... every few hours) download this "most requested blocked" list and keep these blocked on my server as described. Would this be possible or practical?

Rob McEwen

-----Original Message----- From: discuss-bounces@lists.surbl.org [mailto:discuss-bounces@lists.surbl.org] On Behalf Of Jeff Chan Sent: Wednesday, September 29, 2004 8:00 PM To: SURBL Discuss Subject: Re: [SURBL-Discuss] Most often "hit" SURBL domains

On Wednesday, September 29, 2004, 4:48:36 PM, Raymond Dijkxhoorn wrote:

...

Hi!

...

...
I have a similar idea. Would it be possible to have a running list of the top 20 (or so... 50? 100?) most often queried URI's that are blocked by SURBL (and which should be blocked)? This way, we could take additional pressure off SURBL DNS servers by blacklisting these domains locally

BEFORE

...

...
doing SURBL checking on such messages?

I have a feeling that this has already been requested and implemented??

...

Its something that is suggested, and we are looking into ways to getting that inside for example SA3.1, the SA guys also had some suggestions.

...

So yes, excellent idea... ;)

Yes, SA is adding a feature to hardcode or have a database of the 125 most often hit whitelist domains to 3.1 or 3.0.1. This will prevent domains like w3.org, yahoo.com, etc. from even being queried.

One issue is with the top spammer domains is that unlike the whitehats, the big spammer domains tend to change over time. The biggest spammers also seem to be the most dynamic.

So the whitehats may work better with local listing than the blackhats.

This is certainly a good idea though. Note that Eric Kolve also built in local black and whitelists to SpamCopURI.

Jeff C. -- "If it appears in hams, then don't list it."

_______________________________________________ Discuss mailing list Discuss@lists.surbl.org http://lists.surbl.org/mailman/listinfo/discuss

Jeff Chan

2:24 a.m.

On Wednesday, September 29, 2004, 5:13:00 PM, Rob McEwen wrote:

...

Regarding the top x number of most frequently queried blacklisted URIs, I was hoping that these could, perhaps, be listed in a simple text file that would be built once or twice a day in, hopefully, some kind of automated process. Because I don't use SpamAssassin, I don't know if the other ideas of implementing this would help me, but I could build a program which could periodically (once a day... every few hours) download this "most requested blocked" list and keep these blocked on my server as described. Would this be possible or practical?

...

Rob McEwen

I'm sure it could be done, but DNS is easier. Note also the repeated DNS queries on the same domain are hopefully cached by the local resolver at least within the positive caching TTL.

If the zone file expire time applies, then the queries to the authoritative name servers would be minimal. (If the TTL applies, then the queries could be pretty frequent for a commonly appearing spam URI domain.)

Perhaps this is an experiment you could try (to check the positive caching behavior) for us?

Jeff C. -- "If it appears in hams, then don't list it."

Rob McEwen

3:03 a.m.

Jeff said:

...

Note also the repeated DNS queries on the same domain are hopefully cached by the local resolver at least within the positive caching TTL. ...Perhaps this is an experiment you could try (to check the positive caching behavior) for us

Good point, Jeff. I'll check into this.

However, I still think my idea may be more efficient overall because some non-hits will be checked with each message sometimes before the "guilty" URI is found. If a spammer purposely adds a variety of non-spams, either through purposeful poisoning, or incidentally via other typical obfuscation or "mixing it up" techniques... then this could mean a large variety of URIs looked up which could have been avoided??

Or, is this scenario I describe far fetched and not representative of what would actually happen?

Rob McEwen

Jeff Chan

3:27 a.m.

On Wednesday, September 29, 2004, 6:03:32 PM, Rob McEwen wrote:

...

Jeff said:

...

...
Note also the repeated DNS queries on the same domain are hopefully cached by the local resolver at least within the positive caching TTL. ...Perhaps this is an experiment you could try (to check the positive caching behavior) for us

...

Good point, Jeff. I'll check into this.

...

However, I still think my idea may be more efficient overall because some non-hits will be checked with each message sometimes before the "guilty" URI is found. If a spammer purposely adds a variety of non-spams, either through purposeful poisoning, or incidentally via other typical obfuscation or "mixing it up" techniques... then this could mean a large variety of URIs looked up which could have been avoided??

...

Or, is this scenario I describe far fetched and not representative of what would actually happen?

...

Rob McEwen

Most of the professional spams I've seen lately seem to have only the spammer's own domain in them.

Non-hits in general would be a potential problem given that most domains occurring in messages are neither on our white nor block lists. But fortunately the negative caching function of DNS will cache the non-hits just as positive caching caches the hits.

The downside is that those are subject to the negative caching TTL, so they will get re-queried against an authoritative server after the negative caching interval passes for a given record.

We have already tuned the positive and negative caching TTLs experimentally to 15 minutes. This is a value that appears to optimize both name server traffic and latency of records entering and leaving the subdomains (i.e. the lists). It may not be useful to tune these further, especially since the local whitelist function will help a lot with a large chunk of the most common negative caching of yahoo.com, w3.org, etc.

Jeff C. -- "If it appears in hams, then don't list it."

Andy Warner

5:07 a.m.

On Wed, 29 Sep 2004, Jeff Chan wrote:

...

On Wednesday, September 29, 2004, 6:03:32 PM, Rob McEwen wrote:

...
However, I still think my idea may be more efficient overall because some non-hits will be checked with each message sometimes before the "guilty" URI is found. If a spammer purposely adds a variety of non-spams, either through purposeful poisoning, or incidentally via other typical obfuscation or "mixing it up" techniques... then this could mean a large variety of URIs looked up which could have been avoided??

...
Or, is this scenario I describe far fetched and not representative of what would actually happen?

...
Rob McEwen

Most of the professional spams I've seen lately seem to have only the spammer's own domain in them.

Non-hits in general would be a potential problem given that most domains occurring in messages are neither on our white nor block lists. But fortunately the negative caching function of DNS will cache the non-hits just as positive caching caches the hits.

The downside is that those are subject to the negative caching TTL, so they will get re-queried against an authoritative server after the negative caching interval passes for a given record.

We have already tuned the positive and negative caching TTLs experimentally to 15 minutes. This is a value that appears to optimize both name server traffic and latency of records entering and leaving the subdomains (i.e. the lists). It may not be useful to tune these further, especially since the local whitelist function will help a lot with a large chunk of the most common negative caching of yahoo.com, w3.org, etc.

Jeff C.

If somebody wants to try a quick and dirty proof of concept on this without using the name servers and actual query volume I'm guessing the AbuseButler volume information correlates quite closely to the DNS query volume for most domains. (say in the neighborhood of .8 to .9 correl. if I had to guess). I can produce data in just about any format you'd like as all my data in in Postgres - converting a daily volume report into a local BL would be fairly trivial for a quick proof of concept.

e.g.: http://spamvertised.abusebutler.com/spamvertised.php?rep=last24

-- Andy

Steven Champeon

4:33 p.m.

on Wed, Sep 29, 2004 at 06:27:37PM -0700, Jeff Chan wrote:

...

Most of the professional spams I've seen lately seem to have only the spammer's own domain in them.

Really? Most of those I've seen lately have the spammers (obfuscated) domain and three bogus domains constructed out of the victim's localpart, e.g. http://schampeo.org, http://schampeo.net, http://schampeo.com - it's become so reliable a filter that I am toying around with skipping all SURBL checks on any such message.

-- join us! http://hesketh.com/about/careers/web_designer.html join us! hesketh.com/inc. v: +1(919)834-2552 f: +1(919)834-2554 w: http://hesketh.com join us! http://hesketh.com/about/careers/account_manager.html join us!

Jeff Chan

1 Oct 1 Oct

1:18 a.m.

On Thursday, September 30, 2004, 7:33:35 AM, Steven Champeon wrote:

...

on Wed, Sep 29, 2004 at 06:27:37PM -0700, Jeff Chan wrote:

...
Most of the professional spams I've seen lately seem to have only the spammer's own domain in them.

...

Really? Most of those I've seen lately have the spammers (obfuscated) domain and three bogus domains constructed out of the victim's localpart, e.g. http://schampeo.org, http://schampeo.net, http://schampeo.com - it's become so reliable a filter that I am toying around with skipping all SURBL checks on any such message.

Yeah I've seen some of those too. They didn't seem to be in the majority of my spams though.

Jeff C. -- "If it appears in hams, then don't list it."

David B Funk

30 Sep 30 Sep

3:59 a.m.

On Wed, 29 Sep 2004, Jeff Chan wrote:

...

On Wednesday, September 29, 2004, 4:48:36 PM, Raymond Dijkxhoorn wrote:

...
Hi!

...
...
I have a similar idea. Would it be possible to have a running list of the top 20 (or so... 50? 100?) most often queried URI's that are blocked by SURBL (and which should be blocked)? This way, we could take additional pressure off SURBL DNS servers by blacklisting these domains locally BEFORE doing SURBL checking on such messages?

I have a feeling that this has already been requested and implemented??

...
Its something that is suggested, and we are looking into ways to getting that inside for example SA3.1, the SA guys also had some suggestions.

...
So yes, excellent idea... ;)

Yes, SA is adding a feature to hardcode or have a database of the 125 most often hit whitelist domains to 3.1 or 3.0.1. This will prevent domains like w3.org, yahoo.com, etc. from even being queried.

One issue is with the top spammer domains is that unlike the whitehats, the big spammer domains tend to change over time. The biggest spammers also seem to be the most dynamic.

So the whitehats may work better with local listing than the blackhats.

This is certainly a good idea though. Note that Eric Kolve also built in local black and whitelists to SpamCopURI.

Jeff C.

A far better way to effect this is to just increase the TTL on those long-term blackhat domains. (A static list is effectively an infinitly large TTL, query once and keep for ever).

The static whitelist is necessary as it is not possible to tune the per-query NAK TTL, and you want the general NAK TTL to be low to improve responsiveness of "add" events. (Hmm, there's a thought, modify a DNS server to hand out customized TTLs on particular NAK responses).

You -can- tune the postive TTL on a per entry basis. So for the long-term blackhats just give them a large (say 24 hours) TTL and their querys will drop way down.

This presupposes that all client sites are running a locally caching DNS server. Any body -not- doing that should be banished and dis-allowed from using SURBL.

-- Dave Funk University of Iowa <dbfunk (at) engineering.uiowa.edu> College of Engineering 319/335-5751 FAX: 319/384-0549 1256 Seamans Center Sys_admin/Postmaster/cell_admin Iowa City, IA 52242-1527 #include <std_disclaimer.h> Better is not better, 'standard' is better. B{

Jeff Chan

4:11 a.m.

On Wednesday, September 29, 2004, 6:59:51 PM, David Funk wrote:

...

On Wed, 29 Sep 2004, Jeff Chan wrote:

...

...
One issue is with the top spammer domains is that unlike the whitehats, the big spammer domains tend to change over time. The biggest spammers also seem to be the most dynamic.

So the whitehats may work better with local listing than the blackhats.

...

A far better way to effect this is to just increase the TTL on those long-term blackhat domains. (A static list is effectively an infinitly large TTL, query once and keep for ever).

But did you see my other comments? The top blackhats change domains frequently. The biggest spammers appear to be the ones to change domains the most often. Their domains only stay on the top of the heap for a few days. So in terms of the biggest spammers, I don't think there are long term ones.

BIND definitely supports per-record positive TTLs, but rbldnsd I think doesn't at least in the dnset type of zone files we use for SURBLs, and the majority of public name servers are using rbldnsd.

Jeff C. -- "If it appears in hams, then don't list it."

David B Funk

6:15 a.m.

On Wed, 29 Sep 2004, Jeff Chan wrote:

...

On Wednesday, September 29, 2004, 6:59:51 PM, David Funk wrote:

...
A far better way to effect this is to just increase the TTL on those long-term black-hat domains. (A static list is effectively an infinitely large TTL, query once and keep for ever).

But did you see my other comments? The top black-hats change domains frequently. The biggest spammers appear to be the ones to change domains the most often. Their domains only stay on the top of the heap for a few days. So in terms of the biggest spammers, I don't think there are long term ones.

Yes, I did see your comments, including the one where you thought that having a static blacklist was "certainly a good idea though"

Did you understand my comments to the effect that a static blacklist was effectively the same as a -very- large TTL?

If you are against having a large TTL for selected black-hats then you should be screaming -against- the whole concept of a static blacklist.

My point was that intelligent use of TTLs will give improvement of DNS traffic without the inherent inflexibility of static lists. (Not to mention the problem of dealing with FPs that are cast in the concrete of static blacklist files all over the net).

...

BIND definitely supports per-record positive TTLs, but rbldnsd I think doesn't at least in the dnset type of zone files we use for SURBLs, and the majority of public name servers are using rbldnsd.

Jeff C.

TANSTAAFL

There is a reason for BIND being a resource hog.

20 years ago BIND was small/light-weight (I deployed my first BIND server in 1987). It grew in size/weight not because some developer wanted to craft 'bloat-ware' but because of the demands of a growing Internet (growing in size, meanness, etc).

If you want industrial grade features then maybe you need to consider using industrial strength software.

Jeff Chan

6:34 a.m.

On Wednesday, September 29, 2004, 9:15:33 PM, David Funk wrote:

...

On Wed, 29 Sep 2004, Jeff Chan wrote:

...

...
On Wednesday, September 29, 2004, 6:59:51 PM, David Funk wrote:

...
A far better way to effect this is to just increase the TTL on those long-term black-hat domains. (A static list is effectively an infinitely large TTL, query once and keep for ever).

But did you see my other comments? The top black-hats change domains frequently. The biggest spammers appear to be the ones to change domains the most often. Their domains only stay on the top of the heap for a few days. So in terms of the biggest spammers, I don't think there are long term ones.

...

Yes, I did see your comments, including the one where you thought that having a static blacklist was "certainly a good idea though"

...

Did you understand my comments to the effect that a static blacklist was effectively the same as a -very- large TTL?

...

If you are against having a large TTL for selected black-hats then you should be screaming -against- the whole concept of a static blacklist.

I was just being generous. A static whitelist makes vastly more sense than a static blacklist. The top whitelist entries change very little over the course of months. For example, neither yahoo.com nor w3.org are going away any time soon. In contrast, the top blacklist entries change daily as the biggest spammers abandon their domains and move to others.

...

My point was that intelligent use of TTLs will give improvement of DNS traffic without the inherent inflexibility of static lists. (Not to mention the problem of dealing with FPs that are cast in the concrete of static blacklist files all over the net).

A static whitelist and regular DNS service of a blacklist probably approach the ideal. Blacklist entries with long TTLs don't make as much sense for the reasons given above and earlier.

...

...
BIND definitely supports per-record positive TTLs, but rbldnsd I think doesn't at least in the dnset type of zone files we use for SURBLs, and the majority of public name servers are using rbldnsd.

Jeff C.

...

TANSTAAFL

...

There is a reason for BIND being a resource hog.

...

20 years ago BIND was small/light-weight (I deployed my first BIND server in 1987). It grew in size/weight not because some developer wanted to craft 'bloat-ware' but because of the demands of a growing Internet (growing in size, meanness, etc).

...

If you want industrial grade features then maybe you need to consider using industrial strength software.

If the Internet depended on BIND for RBLs, then RBLs would probably be unworkable. The memory and cpu requirements for rbldnsd are much less than BIND, and rbldnsd responds to queries at least twice as quickly as BIND.

rbldnsd is a more appropriate solution to RBLs than BIND. It's smaller, leaner and much better suited to the task.

Jeff C. -- "If it appears in hams, then don't list it."

7582

Age (days ago)

7583

Last active (days ago)

discuss@lists.surbl.org

14 comments

7 participants

tags (0)

participants (7)

Andy Warner
David B Funk
Jeff Chan
Raymond Dijkxhoorn
Rob McEwen
Steven Champeon
Theo Van Dinter