use of surbl to check non-body content?


Steven Champeon

11 Oct 2005

7:42 p.m.

I've noticed that SURBL (and URIBL, who I will contact later) lists several domains that have appeared in spam header contents as well as in body contents. I'd like to use SURBL (probably multi) as an optional domains BL check against headers known to contain domains, such as the Message-ID, From, and Reply-To headers, a la

Message-Id: 200510020442.j924gBkv021479@expoactive.net
From: ExpoActive advertising@expoactive.net
Reply-To: advertising@expoactive.net

From: "Steven McGuire" stevenmcguire@aaaaa2.com
List-Unsubscribe: mailto:leave-2005_1-6m_optin-10289508G@aaaaa2.com
Message-Id: LYRIS-10289508-169-2005.10.03-20.50.13--{vic#tim}@aaaaa2.com

From: "iMarketing Sales Leads" julieandrews@imailzone.info

Reply-To: "OAG" club@reachmail.net

From: TuneUp Software Newsletter newsletter@tune-up.com
Reply-To: newsletter4v2-reply@newsletter.tune-up.com

From: "Solutions" info@disklesspc.com
Reply-To: info@disklesspc.com

From: "Millionaires Concierge" info@millionaires-concierges1.com
Reply-To: info@millionaires-concierges1.com

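The header check proposed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the enemieslist implementation: the function names are hypothetical, the regular expression is a crude stand-in for real address parsing, and the DNS lookup is injectable so the logic can be tried without generating any query load. A careful implementation would also reduce each host name to its registered domain before querying, which this sketch does not do.

```python
import re
import socket
from email import message_from_string

SURBL_ZONE = "multi.surbl.org"
# Only the headers named in the proposal; Received:, To: and Cc: are left alone.
CHECKED_HEADERS = ("From", "Reply-To", "Message-ID")
# Crude pattern (an assumption of this sketch): whatever follows an "@".
DOMAIN_AFTER_AT = re.compile(r"@([A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+)")


def header_domains(raw_message):
    """Collect the domains that appear after an '@' in the checked headers."""
    msg = message_from_string(raw_message)
    found = set()
    for name in CHECKED_HEADERS:
        for value in msg.get_all(name, []):
            found.update(d.lower() for d in DOMAIN_AFTER_AT.findall(value))
    return found


def dns_listed(query_name):
    """Treat a 127.x.x.x answer as 'listed' and a failed lookup as 'not listed'."""
    try:
        return socket.gethostbyname(query_name).startswith("127.")
    except socket.gaierror:
        return False


def listed_header_domains(raw_message, is_listed=dns_listed):
    """Return the header domains for which the lookup reports a listing."""
    return sorted(
        d for d in header_domains(raw_message) if is_listed(f"{d}.{SURBL_ZONE}")
    )
```

Passing a stub as `is_listed` keeps experiments off the live zone; the default performs one DNS query per distinct header domain.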
As I've only received 23 spams not otherwise classifiable as worth blocking using other means (e.g., 419 scams which can be blocked by injection IP) this /month/, having successfully blocked all the rest, I'd really like to take advantage of the realtime nature of SURBLs.

I could see immediate results in the form of blocking literally 1/3 of the remaining spam I allow in here.

Comments? This would be an optional configuration for my enemieslist package, which I intend to have more widespread distribution eventually but which would not represent a crushing query load at present.

-- hesketh.com/inc. v: +1(919)834-2552 f: +1(919)834-2554 w: http://hesketh.com antispam news, solutions for sendmail, exim, postfix: http://enemieslist.com/


Jeff Chan

12 Oct

1:55 a.m.

On Tuesday, October 11, 2005, 10:42:28 AM, Steven Champeon wrote:

...

I've noticed that SURBL (and URIBL, who I will contact later) lists several domains that have appeared in spam header contents as well as in body contents. I'd like to use SURBL (probably multi) as an optional domains BL check against headers known to contain domains, such as the Message-ID, From, and Reply-To headers, a la

...

Message-Id: 200510020442.j924gBkv021479@expoactive.net
From: ExpoActive advertising@expoactive.net
Reply-To: advertising@expoactive.net

Are these spams being sent from zombies? If not, then we possibly should not be listing them. If they're sending from their own mailservers then it's vastly more efficient to just block their IPs at a low level, i.e., regular (local or global) RBL.

Regarding using SURBLs on headers, I guess I'd view that as mission creep and somewhat away from our original focus of URI domains.

Do any spam gangs put the URI domain on their headers when they use zombies? Seems to me they tend to forge everything except the URI.

Jeff C. -- Don't harm innocent bystanders.

Steven Champeon

5:54 a.m.

on Tue, Oct 11, 2005 at 04:55:30PM -0700, Jeff Chan wrote:

...

On Tuesday, October 11, 2005, 10:42:28 AM, Steven Champeon wrote:

...
I've noticed that SURBL (and URIBL, who I will contact later) lists several domains that have appeared in spam header contents as well as in body contents. I'd like to use SURBL (probably multi) as an optional domains BL check against headers known to contain domains, such as the Message-ID, From, and Reply-To headers, a la

...
Message-Id: 200510020442.j924gBkv021479@expoactive.net
From: ExpoActive advertising@expoactive.net
Reply-To: advertising@expoactive.net

Are these spams being sent from zombies? If not, then we possibly should not be listing them. If they're sending from their own mailservers then it's vastly more efficient to just block their IPs at a low level, i.e., regular (local or global) RBL.

You misunderstand me, I think. I'm not deliberately listing any domains in SURBLs; I'm proposing using the SURBLs DNS zone (e.g. "multi") to check domains that may be embedded in headers such as From, Reply-To, and Message-Id, where they are often used to direct bounces and replies back to the domain owners, while evading the meager blocks on sender host/domain and/or SMTP RCPT TO, or used as tracking devices.

...

Regarding using SURBLs on headers, I guess I'd view that as mission creep and somewhat away from our original focus of URI domains.

I'm not asking for SURBLs to list domains found in headers. I'm suggesting that domains listed in SURBLs because of their use in the bodies of spam may also turn up, on occasion, in the less-inspected headers of that same spam.

I'm just trying to reduce my spam inspection workload here by using reliable sources of known spammy domains to allow rejection of the message without body inspection (which in SA and procmail, et al requires that the message be accepted and inspection undertaken prior to delivery). I estimate that some 30% or more of spam we'd accepted and delivered or quarantined could have been rejected during the SMTP conversation, using SURBLs.

...

Do any spam gangs put the URI domain on their headers when they use zombies? Seems to me they tend to forge everything except the URI.

I don't know. But I do know that spammer domains - listed in SURBL and URIBL already - do tend to be found in headers likely to direct replies back to the spammer, and which may contain tracking devices also useful to the spammer (when inserted by compliant clients as References: or In-Reply-To: in the reply). I'm advocating rejecting these known spammy messages, which would otherwise be caught/tagged by SURBLs after delivery (and delivered or quarantined, after which it's in the hands of users to know whether or not to reply to ask to be removed), during the SMTP conversation, not after.
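As a rough sketch of what rejecting "during the SMTP conversation, not after" could look like once the headers have been received: the function below maps a set of header domains to an SMTP reply. The function name, the reply wording, and the injected `is_listed` lookup are assumptions made for illustration; how the reply is actually returned depends on the MTA (sendmail, exim, postfix) and is not shown.

```python
def end_of_data_reply(header_domains, is_listed, zone="multi.surbl.org"):
    """Pick the SMTP reply to give once the message headers have been seen.

    header_domains: domains pulled from From:, Reply-To:, Message-Id:
    is_listed:      callable taking a DNS query name and returning True or False
    """
    for domain in sorted(header_domains):
        if is_listed(f"{domain}.{zone}"):
            # A 5xx reply refuses the message outright, so nothing is
            # delivered or quarantined and the sending side is told why.
            return 550, f"5.7.1 Message refused: header domain {domain} is listed in {zone}"
    return 250, "2.0.0 Message accepted"
```

The point of the design is that the refusal happens before the receiving server takes responsibility for the message, which is what spares the later body inspection.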

-- hesketh.com/inc. v: +1(919)834-2552 f: +1(919)834-2554 w: http://hesketh.com antispam news, solutions for sendmail, exim, postfix: http://enemieslist.com/

Rob McEwen

6:18 a.m.

Steven,

I have found through experience that the FP rate is considerably higher when checking headers with SURBL. I can't even recall ALL the reasons why... but I know empirically... from actual experience... that this is true. (especially with IP addresses)

Also, because checking against headers results in more FPs and because this is not the official prescribed method, if you ever report such a FP, please be sure to mention that the URI was found in the header and that you **know** that checking such is not the official way of doing things.

This will save you from getting lectured and it will help SURBL folks to not mis-apply your evidence. For example, there are **some** FPs that will be triggered by using SURBL on headers where that URI **NEVER** appears in the body of legit messages, even though it might appear in the header of a legit message. In such a situation, it would be correct to keep such listed in SURBL. Get the idea?

Finally, I DO check headers against SURBL, just as you've described... but I weight it much less than SURBL-caught URIs in the body of the message. And I closely audit such mail... much more closely than regular SURBL-blocked messages.

Rob McEwen

Jeff Chan

7:09 a.m.

On Tuesday, October 11, 2005, 9:18:47 PM, Rob McEwen wrote:

...

Steven,

...

I have found through experience that the FP rate is considerably higher when checking headers with SURBL. I can't even recall ALL the reasons why... but I know empirically... from actual experience... that this is true. (especially with IP addresses)

I'm puzzled why there would be FPs. Are hammers forging spam domains in their headers? That would seem bizarre if so.

...

Also, because checking against headers results in more FPs and because this is not the official prescribed method, if you ever report such a FP, please be sure to mention that the URI was found in the header and that you **know** that checking such is not the official way of doing things.

Actually I'd suggest just reporting FPs from message bodies, but would be interested in hearing about FPs from headers, even though that's not the intended use.

Jeff C. -- Don't harm innocent bystanders.

ariel＠spambouncer.org

6:24 p.m.

...

I have found through experience that the FP rate is considerably higher when checking headers with SURBL. I can't even recall ALL the reasons why... but I know empirically... from actual experience... that this is true. (especially with IP addresses)

I can second this. I did some testing just for my own curiosity a while back because plenty of spammers, even those sending through open proxies, use their own registered domains to HELO and/or in the From: line. However, I found that the following stuff causes FPs:

* With IPs, often an IP that is used directly in a spam is on a hacked server that is not supposed to be a web server. If you block URIs on the IP, you get no false positives. If you block on headers, you often do, especially if you block on all foreign IPs rather than the first external IP/"handoff IP". That's because these trojaned servers also have a "real life" as something else, often a DNS or mail server (if a server), or a user workstation. If you deliberately want to cause FPs to pressure the owners to clean up the trojan, fine, but that is not what SURBLs are intended to do. That's SPEWS or (to a much lesser extent) the SBL.

* With domains, phishers and (increasingly) spammers are hosting web pages on hacked servers and using that server's domain in the spam URI. If you block on headers, again, you have a real risk of blocking legitimate email.
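The distinction drawn above between checking every foreign IP and checking only the first external ("handoff") IP can be sketched like this. It is a simplification under stated assumptions: the function name is hypothetical, the Received: headers are assumed to be supplied newest first, the connecting address is assumed to appear in square brackets, and the list of trusted networks is something each site would have to supply for its own relays.

```python
import ipaddress
import re

# Assumes the connecting address appears as [a.b.c.d] in each Received: header.
IPV4_IN_BRACKETS = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def handoff_ip(received_headers, trusted_networks):
    """Return the first IP, reading newest header first, outside the trusted networks."""
    nets = [ipaddress.ip_network(n) for n in trusted_networks]
    for header in received_headers:
        for text in IPV4_IN_BRACKETS.findall(header):
            try:
                ip = ipaddress.ip_address(text)
            except ValueError:
                continue  # something like [999.1.1.1] is not a valid address
            if not any(ip in net for net in nets):
                return str(ip)
    return None
```

Checking only this one address ignores the deeper Received: lines, which are where a trojaned machine's legitimate traffic, as described above, would otherwise trigger hits.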

I'd like to see a larger, more conservatively run RHSBL for headers than the AHBL RHSBL. But right now, there isn't one.

-- Catherine Hampton ariel@spambouncer.org The SpamBouncer * http://www.spambouncer.org/ Personal Home Page * http://www.devsite.org/

Jeff Chan

7:01 a.m.

On Tuesday, October 11, 2005, 8:54:26 PM, Steven Champeon wrote:

...

I'm not asking for SURBLs to list domains found in headers. I'm suggesting that domains listed in SURBLs because of their use in the bodies of spam may also turn up, on occasion, in the less-inspected headers of that same spam.

...

I'm just trying to reduce my spam inspection workload here by using reliable sources of known spammy domains to allow rejection of the message without body inspection (which in SA and procmail, et al requires that the message be accepted and inspection undertaken prior to delivery). I estimate that some 30% or more of spam we'd accepted and delivered or quarantined could have been rejected during the SMTP conversation, using SURBLs.

...

I do know that spammer domains - listed in SURBL and URIBL already - do tend to be found in headers likely to direct replies back to the spammer, and which may contain tracking devices also useful to the spammer (when inserted by compliant clients as References: or In-Reply-To: in the reply). I'm advocating rejecting these known spammy messages, which would otherwise be caught/tagged by SURBLs after delivery (and delivered or quarantined, after which it's in the hands of users to know whether or not to reply to ask to be removed), during the SMTP conversation, not after.

Sounds reasonable, even if it's not the original purpose of SURBLs.

What kinds of percentage of spam message header domains are showing up on SURBLs? I would imagine the hit rates might not be too high, so there may be a processing cost/benefit issue.

Jeff C. -- Don't harm innocent bystanders.

Rob McEwen

7:33 a.m.

Jeff asked:

...

What kinds of percentage of spam message header domains are showing up on SURBLs? I would imagine the hit rates might not be too high, so there may be a processing cost/benefit issue.

...and...

...

I'm puzzled why there would be FPs. Are hammers forging spam domains in their headers? That would seem bizarre if so.

I have to correct something... I misspoke. I **used** to use SURBLs for checking headers. I had forgotten that I had stopped doing so a few months ago because (1) too many FPs (for my admittedly strict standards), and (2) I made enough great improvements in other parts of my filtering that I felt I could back off on the SURBL-checking of headers.

(I was just too tired to think straight about this in my last e-mail).

But, let me mention that the overall FP rate is still very, very low. It was like 1/200 FPs, or less. (but I'm guessing)

Most often, if a FP occurred, it was because an IP address used in a spammer's URL would, for whatever reason, also appear in the headers of legit messages.

Also, have you ever seen those e-mails where some guy e-mails ALL 90 of his friends using outlook? Every once in a while, such an e-mail would pass through my server where one of these friends would be an employee of a spamming organization... thus triggering the FP. Of course, these tended to be the more marginally listed domains of SURBL... not the Russian pill spammers, but it still happened on rare occasion.

I recall catching about 50 extra spams a day on my 10K messages/day server by checking the header against SURBL. Statistically, not that much, but every 1/2 percent counts for something and these were ones which, at that time, wouldn't have been caught otherwise.

...

From a processing perspective, I don't think it is that big a deal. What I found to be really slow (that I also used to do but no longer do) is to convert domains to IPs and check these against spamhaus. The problem here is that some domains take a LONG time to convert to IP because of delays on that domain's DNS server. This method also caught about 50 extra spams per day... but at too high a processing cost.

I don't think that processing SURBL against headers was a big processing drain... but the FPs were too high for my very strict tastes. Still, it is a VERY good indicator of spam and might work well if integrated into a scoring system and not outright blocked for that alone.
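One way to picture the scoring idea, with header hits counting for less than body hits: the weights and threshold below are invented for illustration and are not values anyone in this thread reported using.

```python
# Hypothetical weights: a listed URI in the body counts for much more
# than a listed domain that only shows up in a header.
BODY_URI_HIT = 4.0
HEADER_DOMAIN_HIT = 1.5
SPAM_THRESHOLD = 5.0


def spam_score(body_hits, header_hits, other_score=0.0):
    """Add SURBL evidence to whatever score the other checks produced."""
    return (
        other_score
        + BODY_URI_HIT * len(body_hits)
        + HEADER_DOMAIN_HIT * len(header_hits)
    )


def classify(score):
    """Call a message spam only when the combined evidence crosses the threshold."""
    return "spam" if score >= SPAM_THRESHOLD else "ham"
```

With these made-up numbers a single header hit is never enough on its own, so a legitimate message that merely shares a header domain with a listing is not blocked for that alone.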

Rob McEwen

Jeff Chan

7:49 a.m.

On Tuesday, October 11, 2005, 10:33:55 PM, Rob McEwen wrote:

...

Jeff asked:

...

...
What kinds of percentage of spam message header domains are showing up on SURBLs? I would imagine the hit rates might not be too high, so there may be a processing cost/benefit issue.

...and...

...
I'm puzzled why there would be FPs. Are hammers forging spam domains in their headers? That would seem bizarre if so.

[...]

...

But, let me mention that the overall FP rate is still very, very low. It was like 1/200 FPs, or less. (but I'm guessing)

...

Most often, if a FP occurred, it was because an IP address used in a spammer's URL would, for whatever reason, also appear in the headers of legit messages.

Huh? SURBLs are mostly domains. Were you resolving SURBL domains then checking resolved IPs against header IPs? That would be, ahem, unusual.

Jeff C. -- Don't harm innocent bystanders.

Steven Champeon

4 p.m.

on Wed, Oct 12, 2005 at 01:33:55AM -0400, Rob McEwen wrote:

...

I have to correct something... I misspoke. I **used** to use SURBLs for checking headers. I had forgotten that I had stopped doing so a few months ago because (1) too many FPs (for my admittedly strict standards), and (2) I made enough great improvements in other parts of my filtering that I felt I could back off on the SURBL-checking of headers.

I'd only be checking the From:, Reply-To:, and Message-Id: (and, possibly, if I were to find a reason to do so, References: and In-Reply-To:), not the Received: or To: or Cc: etc. By "find a reason" I usually mean "get pissed that I got spam I could have blocked by the proper and appropriate application of just one more check" ;)

I'll admit I share JeffC's confusion about why legit mail would contain known spammer domains in the headers, but it sounds like you were more referring to IPs that had been the result of resolving a spammy domain, right?

...

Most often, if a FP occurred, it was because an IP address used in a spammer's URL would, for whatever reason, also appear in the headers of legit messages.

OK. Where in the headers? Do you recall? (No biggie if you can't)

...

I recall catching about 50 extra spams a day on my 10K messages/day server by checking the header against SURBL. Statistically, not that much, but every 1/2 percent counts for something and these were ones which, at that time, wouldn't have been caught otherwise.

Good, that's what I'm hoping for. I'm literally down to <10/day, less than that if you consider 419 spam the price of allowing hotmail to relay to any of your users :/ I'd like to achieve a spam-free day here, and I'm looking for the last in the line of defenses, without accepting and analyzing the messages.

...

I don't think that processing SURBL against headers was a big processing drain... but the FPs were too high for my very strict tastes. Still, it is a VERY good indicator of spam and might work well if integrated into a scoring system and not outright blocked for that alone.

My test implementation simply "tags" suspected messages with a header for filtering via procmail. I haven't seen any hits or FPs yet, but it's early days. But if my analysis is correct, it could mean as much as 1/3 of the spam I let in so far this month could have been caught and rejected.
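A tagging step of this kind might look roughly like the sketch below. The header name `X-Header-SURBL` and the function name are made up for illustration; the thread does not say what the test implementation actually adds.

```python
from email import message_from_string

# Hypothetical header name; a downstream filter such as procmail
# would match on it to file or quarantine the message.
TAG_HEADER = "X-Header-SURBL"


def tag_message(raw_message, hits):
    """Return the message with a tag header added when any header domain was listed."""
    msg = message_from_string(raw_message)
    if hits:
        msg[TAG_HEADER] = ", ".join(sorted(hits))
    return msg.as_string()
```

Tagging rather than rejecting makes it possible to count hits and FPs for a while before anything is refused outright.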

-- hesketh.com/inc. v: +1(919)834-2552 f: +1(919)834-2554 w: http://hesketh.com antispam news, solutions for sendmail, exim, postfix: http://enemieslist.com/

Steven Champeon

2:43 p.m.

on Tue, Oct 11, 2005 at 10:01:29PM -0700, Jeff Chan wrote:

...

...
I do know that spammer domains - listed in SURBL and URIBL already - do tend to be found in headers likely to direct replies back to the spammer, and which may contain tracking devices also useful to the spammer (when inserted by compliant clients as References: or In-Reply-To: in the reply). I'm advocating rejecting these known spammy messages, which would otherwise be caught/tagged by SURBLs after delivery (and delivered or quarantined, after which it's in the hands of users to know whether or not to reply to ask to be removed), during the SMTP conversation, not after.

Sounds reasonable, even if it's not the original purpose of SURBLs.

What kinds of percentage of spam message header domains are showing up on SURBLs? I would imagine the hit rates might not be too high, so there may be a processing cost/benefit issue.

Well, I don't allow much spam into my network - I reject it all as best I can. For reliable numbers, you'd need to ask someone with a large spam corpus. But of the 25 spams I let in so far this month (which doesn't count 419 scams, most of which came in via hotmail) 8 of them would have been blockable using uribl/surbl lookups. I figure 32% is a good enough number to at least try the approach.
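For reference, the percentages quoted in this thread work out as follows. The inputs are the figures given in the messages (8 of 25 here, and roughly 50 extra catches on a server handling about 10K messages a day earlier in the thread); the helper name is just for illustration.

```python
def share(part, total):
    """Express part/total as a percentage."""
    return 100.0 * part / total


# 8 of the 25 spams let in this month had a listed domain in a header.
header_blockable = share(8, 25)

# About 50 extra catches per day out of roughly 10,000 messages per day.
extra_catch_rate = share(50, 10_000)
```

The first figure is the 32% mentioned above; the second is the "1/2 percent" of total mail volume.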

-- hesketh.com/inc. v: +1(919)834-2552 f: +1(919)834-2554 w: http://hesketh.com antispam news, solutions for sendmail, exim, postfix: http://enemieslist.com/


discuss@lists.surbl.org
