I'm seeing a new spam variant that is clearly designed to get past SURBL. It is an HTML message that contains many (50~100) 'invisible' links; links that have no target text, just: <A href="http://garbage.sitename.tld"></A>
The intention is clear: they want to fill up the 20 'slots' of the spamcop_uri_limit with their junk links so the real "payload" URL can slip past unchecked. It's a statistical game: if the payload has only a 1-in-20 chance of being among the URIs picked by the randomizer, then 95% of the payloads slip by.
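For concreteness, the arithmetic behind that game can be sketched in a few lines of Python (a hypothetical model of uniform random sampling, not SpamCopURI's actual selection code):

# Chance that the lone "payload" URI lands in the checker's random sample
# of `limit` URIs, out of `total` URIs in the message (uniform sampling
# without replacement -- an assumed model for illustration).
def p_payload_checked(total, limit=20):
    return min(1.0, float(limit) / total)

print(p_payload_checked(100))   # 0.2  -- with ~100 links, the payload is still checked 20% of the time
print(p_payload_checked(400))   # 0.05 -- it takes ~400 links before 95% slip by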
To add insult to injury, they're tossing in random "\r" (ASCII-CR) characters into the "payload" hostname to try to break SpamAssassin's URI parsing.
Is it time to create rules to penalize large numbers of 'invisible' links?
The one thing that has me worried is that people may just crank up the spamcop_uri_limit value as a brute-force response to this trash (or use a simple-minded client that has no such limit at all). That will add an ever-increasing load on the SURBL DNS servers. I'm already seeing a steady-state average of 130 queries/second against my two servers (with spikes in the 150~175 range). The trend has been a steady increase (it passed the 100 Q/S mark last fall).
David B Funk wrote:
I'm seeing a new spam variant that is clearly designed to get past SURBL. It is an HTML message that contains many (50~100) 'invisible' links; links that have no target text, just: <A href="http://garbage.sitename.tld"></A>
Is it time to create rules to penalize large numbers of 'invisible' links?
It would also be good to discard pointless links before querying SURBLs; I'm not sure how easy that will be to code, though.
On Tuesday, February 22, 2005, 6:19:06 AM, Robert Brooks wrote:
David B Funk wrote:
I'm seeing a new spam variant that is clearly designed to get past SURBL. It is an HTML message that contains many (50~100) 'invisible' links; links that have no target text, just: <A href="http://garbage.sitename.tld"></A>
Is it time to create rules to penalize large numbers of 'invisible' links?
It would also be good to discard pointless links before querying SURBLs; I'm not sure how easy that will be to code, though.
Yes, there is a SpamAssassin Bugzilla entry with a feature request to ignore unclickable URIs:
http://bugzilla.spamassassin.org/show_bug.cgi?id=3976
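For anyone who wants to experiment before that feature lands, here is a rough sketch of the idea in Python (my own illustration, not the code attached to that bug; payload.example.tld is a made-up domain):

import re

# <a ... href="URL" ...>anchor body</a>, case-insensitive, dot matches newlines
ANCHOR_RE = re.compile(
    r'<a\b[^>]*\bhref\s*=\s*["\']?([^"\'\s>]+)[^>]*>(.*?)</a>',
    re.IGNORECASE | re.DOTALL)
TAG_RE = re.compile(r'<[^>]+>')   # used to strip tags nested inside the anchor

def clickable_urls(html):
    """Return only the hrefs whose anchors contain visible text."""
    urls = []
    for href, body in ANCHOR_RE.findall(html):
        if TAG_RE.sub('', body).strip():   # skip invisible links: <a href=...></a>
            urls.append(href)
    return urls

sample = ('<A href="http://garbage.sitename.tld"></A>'
          '<a href="http://payload.example.tld">click here</a>')
print(clickable_urls(sample))   # ['http://payload.example.tld']

One caveat: this also discards links whose anchor is an image, so a real implementation would need to treat <img> bodies as clickable.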
Jeff C. -- "If it appears in hams, then don't list it."
On Tue, 22 Feb 2005 04:35:51 -0600 (CST), David B Funk dbfunk@engineering.uiowa.edu wrote:
I'm seeing a new spam variant that is clearly designed to get past SURBL. It is an HTML message that contains many (50~100) 'invisible' links; links that have no target text, just: <A href="http://garbage.sitename.tld"></A>
The intention is clear: they want to fill up the 20 'slots' of the spamcop_uri_limit with their junk links so the real "payload" URL can slip past unchecked. It's a statistical game: if the payload has only a 1-in-20 chance of being among the URIs picked by the randomizer, then 95% of the payloads slip by.
To add insult to injury, they're tossing in random "\r" (ASCII-CR) characters into the "payload" hostname to try to break SpamAssassin's URI parsing.
Because of all the games played to break the parser, I discussed an idea a while back on the SpamCop newsgroups: use Java (or some other API, maybe Internet Explorer's) to render a spam's HTML into a virtual page and then scan its document objects, post-HTML-parsing, one at a time for links. That is close to what a user would actually "see" in a browser.
I have a hunch that "null" links, strange parsing tricks, and the like would be handled correctly by a DOM parser for HTML, but I've never tested it for lack of time. A Java API could be called under Linux, but IE's? Just an idea... I'm sure the spammers could figure out a way around that method, too. But the trick is that their HTML still has to render correctly for the user, or the spam doesn't work.
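The same idea can be illustrated with any real HTML parser; here is a minimal sketch using Python's standard library (my illustration, not the proposed Java/IE approach itself). After parsing, null anchors and mangled markup come out normalized, much as a browser's DOM would present them:

from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collect (href, visible text) pairs the way a browser's DOM would
    see them, after the parser has normalized the markup."""
    def __init__(self):
        super().__init__()
        self.links = []      # finished (href, text) pairs
        self._href = None    # href of the anchor we are inside, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            self._href = dict(attrs).get('href')
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == 'a' and self._href is not None:
            self.links.append((self._href, ''.join(self._text).strip()))
            self._href = None

p = LinkExtractor()
p.feed('<A href="http://garbage.sitename.tld"></A>'
       '<a href="http://real.example.tld">Buy <b>now</b></a>')
print(p.links)
# [('http://garbage.sitename.tld', ''), ('http://real.example.tld', 'Buy now')]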
"David B Funk" dbfunk@engineering.uiowa.edu wrote:
I'm seeing a new spam variant that is clearly designed to get past SURBL. It is an HTML message that contains many (50~100) 'invisible' links; links that have no target text, just: <A href="http://garbage.sitename.tld"></A>
In my spamfilter I check for this pattern and penalise any mail that includes <a href=...></a> with no anchor text (you have to be careful with the parsing, though, so as not to penalise <a name="URI"></a>, which is legitimate).
Also quite common is a single non-alphabetic character as the anchor text, e.g.:
<a href="URI">'</a> <a href="URI">.</a>
etc.
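A minimal sketch of those two checks (my own illustration, not jwSpamSpy's actual code):

import re

# <a ...attributes...>anchor body</a>, capturing both parts
A_TAG_RE = re.compile(r'<a\b([^>]*)>(.*?)</a>', re.IGNORECASE | re.DOTALL)

def suspicious_anchor_count(html):
    """Count anchors that are empty or carry a lone punctuation character.
    Anchors with a name= but no href= (legitimate fragment targets, e.g.
    <a name="section1"></a>) are not penalised."""
    count = 0
    for attrs, body in A_TAG_RE.findall(html):
        if not re.search(r'\bhref\s*=', attrs, re.IGNORECASE):
            continue   # <a name="..."></a> and friends are fine
        text = re.sub(r'<[^>]+>', '', body).strip()
        if text == '' or (len(text) == 1 and not text.isalnum()):
            count += 1
    return count

sample = ('<a name="top"></a>'
          '<a href="http://junk1.example"></a>'
          "<a href=\"http://junk2.example\">'</a>")
print(suspicious_anchor_count(sample))   # 2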
To add insult to injury, they're tossing in random "\r" (ASCII-CR) characters into the "payload" hostname to try to break SpamAssassin's URI parsing.
I strip out any CR/LF characters between the opening and closing double quote of a <a href=...> URL.
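In Python terms, that cleanup might look like this (a sketch under the assumption that the URL is enclosed in double quotes, as in Joe's description):

import re

HREF_RE = re.compile(r'(href\s*=\s*")([^"]*)(")', re.IGNORECASE)

def strip_crlf_in_hrefs(html):
    """Remove CR/LF characters injected inside quoted href values, so
    "http://pay\rload.example" parses as http://payload.example."""
    return HREF_RE.sub(
        lambda m: m.group(1)
                  + m.group(2).replace('\r', '').replace('\n', '')
                  + m.group(3),
        html)

print(strip_crlf_in_hrefs('<a href="http://pay\rload.example/x">hi</a>'))
# <a href="http://payload.example/x">hi</a>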
The next update of jwSpamSpy for Windows will query SURBL, which brings things full circle, since jwSpamSpy is the tool that extracts and provides much of SURBL's JP domain data feed :-)
Joe Wein
On Wednesday, February 23, 2005, 3:51:08 AM, Joe Wein wrote:
The next update of jwSpamSpy for Windows will query SURBL, which brings things full circle, since jwSpamSpy is the tool that extracts and provides much of SURBL's JP domain data feed :-)
Hi Joe, while it's nice that you want to build SURBLs into jwSpamSpy, we somewhat prefer that message processing be done in mail servers rather than in mail clients, particularly to minimize name server hits.
Jeff C. -- "If it appears in hams, then don't list it."
On Wednesday, February 23, 2005, 8:03:54 AM, Frank Ellermann wrote:
Jeff Chan wrote:
we somewhat prefer that message processing be done in mail servers rather than in mail clients, particularly to minimize name server hits.
Are DNS caches used by MTAs "better" than other DNS caches?
Yes, because they see more mail on a single server and therefore should have higher cache hit rates.
Jeff C. -- "If it appears in hams, then don't list it."
Jeff Chan wrote:
Are DNS caches used by MTAs "better" than other DNS caches?
Yes, because they see more mail on a single server and therefore should have higher cache hit rates.
I use the DNS server(s) assigned by my ISP(s), and the MX(s) of my ISP(s) could be using the same DNS servers. Maybe not at a very big ISP like T-Online, but for claranet.de they're probably the same servers. Bye, Frank
Jeff Chan wrote:
On Wednesday, February 23, 2005, 8:03:54 AM, Frank Ellermann wrote:
...
Are DNS caches used by MTAs "better" than other DNS caches?
Yes, because they see more mail on a single server and therefore should have higher cache hit rates.
Also, many spams are sent in a single message to many recipients. If the check is done at the mail server, only one query per URL is needed. At the same time, big mail servers often run a caching DNS server on the same machine, so queries are faster.
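A toy sketch of why that helps (illustration only, with a stand-in resolver): within one TTL window, a shared per-server cache answers every repeat lookup locally.

import time

class TTLCache:
    """Tiny DNS-style cache: one real lookup per domain per TTL window."""
    def __init__(self, ttl=900):          # SURBL's TTL is ~15 minutes
        self.ttl = ttl
        self.entries = {}                 # domain -> (expiry, result)
        self.real_queries = 0

    def lookup(self, domain, resolve):
        now = time.time()
        hit = self.entries.get(domain)
        if hit and hit[0] > now:
            return hit[1]                 # served from cache, no DNS traffic
        self.real_queries += 1
        result = resolve(domain)
        self.entries[domain] = (now + self.ttl, result)
        return result

cache = TTLCache()
fake_resolve = lambda d: 'listed'         # stand-in for the real DNS query
# Same spam sent to 1000 recipients on one server: one query, 999 cache hits.
for _ in range(1000):
    cache.lookup('spamdomain.example.multi.surbl.org', fake_resolve)
print(cache.real_queries)                 # 1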
Joe
"Jeff Chan" jeffc@surbl.org wrote
Hi Joe, while it's nice that you want to build SURBLs into jwSpamSpy, we somewhat prefer that message processing be done in mail servers rather than in mail clients, particularly to minimize name server hits.
Hi Jeff,
I appreciate those concerns, but I have tried to address them in my client design as much as possible, to minimize any impact on SURBL compared to a server-based approach:
1) I only use SURBL if I can't verify an email as ham or spam by any other method, such as sender whitelists, local domain blacklists, SBL records, ratware signatures, etc. jwSpamSpy is capable of detecting and tracking most of the pill / porn / warez domains without having to resort to SURBL -- after all, it is the engine that provides a lot of the SURBL data :-)
2) I keep an extensive local whitelist of domains (several thousand) that are never queried externally.
3) I perform local DNS caching, which eliminates a lot of duplicate queries on the wire.
4) After that, queries go through the provider's DNS server, which caches data; the client never connects directly to multi.surbl.org.
5) By default my filter checks for new mail every 10 minutes, less than the 15-minute TTL on SURBL records, so it's quasi-realtime. There should therefore be no major time lag between delivery to the ISP mailserver and the SURBL query issued by the client, which I think was your main concern with a client-based approach. My mail polling interval is conveniently shorter than the SURBL TTL :-)
Having said that, my long-term plan is also to offer a server-based solution that could run on Linux and other platforms. The existing client already supports multiple POP accounts on the same box, which all go through the same DNS caching, etc., as they would in a server-based version; the overall check order is sketched below.
Joe
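In outline, the check order Joe describes might look like this (a paraphrase in code; the function and list names are placeholders, not jwSpamSpy internals):

def classify(domain, whitelist, local_blacklist, dns_cache, surbl_query):
    """Short-circuit cheap local checks before generating any SURBL traffic."""
    if domain in whitelist:            # steps 1-2: local lists, no network at all
        return 'ham'
    if domain in local_blacklist:
        return 'spam'
    verdict = dns_cache.get(domain)    # step 3: local DNS cache
    if verdict is None:
        verdict = surbl_query(domain)  # step 4: provider resolver -> SURBL
        dns_cache[domain] = verdict
    return verdict

# usage with toy data:
cache = {}
print(classify('good.example', {'good.example'}, set(), cache, lambda d: 'spam'))  # 'ham'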
On Wednesday, February 23, 2005, 5:55:19 PM, Joe Wein wrote:
"Jeff Chan" jeffc@surbl.org wrote
Hi Joe, While it's nice that you want to build SURBLs into jwSpamSpy, we somewhat prefer that message processing be done in mail servers instead of mail clients particularly in order to minimize name server hits.
Hi Jeff,
I appreciate those concerns, but I have tried to address them in my client design as much as possible, to minimize any impact on SURBL compared to a server-based approach:
- I only use SURBL if I can't verify an email as ham or spam using any
other methods, such as sender whitelists, local domain blacklists, SBL records, ratware signatures, etc. jwSpamSpy is capable of detecting and tracking most of the pill / porn / warez domains etc. without having to resort to SURBL -- after all is the engine that provides a lot of the SURBL data :-)
- I keep an extensive local whitelist of domains (several 1000s) which are
never externally queried
- I perform local DNS caching, which will eliminate a lot of duplicate
queries on the wire
- After that I go through the DNS server of the provider which will cache
data; the client never directly connects to multi.surbl.org.
- By default my filter checks for new mail every 10 minutes, less than the
15 minute TTL on SURBL. It's quasi-realtime. Therefore there should be no major time lag between delivery to the ISP mailserver and the SURBL query issued by the client, which I think was your main concern with a client-based approach. My mail polling interval is conveniently shorter than the SURBL TTL :-)
Having said that, my long-term plan is to also to offer a server-based solution that could run on Linux and other platforms. The existing client already supports multiple pop accounts on the same box, which obviously all go through the same DNS caching, etc. as they would on a server-based version.
Joe
Thanks Joe, it all sounds pretty reasonable to me.
Jeff C. -- "If it appears in hams, then don't list it."
Speaking of anti-SURBL tactics, I got this turdlet today (snippet of HTML email below):
<DIV>We are giving out Free Import / Export / Wholesales/ Distributers / Retailers Contact Database</DIV>
<DIV> </DIV>
<DIV>If You interested Pls get at Following URL</DIV>
<DIV> </DIV>
<DIV><A onmouseover="window.status='http://www.impexp-data.com';return true;" onmouseout="window.status=' ';return true;" href="http://indigisys.com/chawla1/open.htm" target=_blank>Business Database</A> </DIV>
<DIV> </DIV>
<DIV>Free Business / Marketing Tools ( Free SMS to All over world Unlimited ) </DIV>
<DIV><A onmouseover="window.status='http://www.impexp-data.com/sms';return true;" onmouseout="window.status=' ';return true;" href="http://indigisys.com/chawla1/open.htm" target=_blank>FREE SMS Tools </A></DIV>
It *looks* like whoever owns indigisys.com wants to hide the fact that the links actually go to indigisys.com by pretending to point at impexp-data.com, which doesn't exist. Does SURBL's lookup code catch this?
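One way a filter can catch it (a sketch, not SURBL's or SpamAssassin's actual extraction code) is to pull every URL out of the raw markup, event-handler JavaScript included, so both the decoy and the real destination get queried:

import re

# Grab every http(s) URL anywhere in the markup -- href values *and*
# JavaScript strings in handlers like onmouseover="window.status=..."
URL_RE = re.compile(r"""https?://[^\s"'<>;]+""", re.IGNORECASE)

tag = ('<A onmouseover="window.status=\'http://www.impexp-data.com\';'
       'return true;" href="http://indigisys.com/chawla1/open.htm">x</A>')
print(sorted(set(URL_RE.findall(tag))))
# ['http://indigisys.com/chawla1/open.htm', 'http://www.impexp-data.com']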
On Monday, March 7, 2005, 9:07:37 PM, Steven Champeon wrote:
Speaking of anti-SURBL tactics, I got this turdlet today (snippet of HTML email below):
<DIV>We are giving out Free Import / Export / Wholesales/ Distributers / Retailers Contact Database</DIV>
<DIV> </DIV>
<DIV>If You interested Pls get at Following URL</DIV>
<DIV> </DIV>
<DIV><A onmouseover="window.status='http://www.impexp-data.com';return true;" onmouseout="window.status=' ';return true;" href="http://indigisys.com/chawla1/open.htm" target=_blank>Business Database</A> </DIV>
<DIV> </DIV>
<DIV>Free Business / Marketing Tools ( Free SMS to All over world Unlimited ) </DIV>
<DIV><A onmouseover="window.status='http://www.impexp-data.com/sms';return true;" onmouseout="window.status=' ';return true;" href="http://indigisys.com/chawla1/open.htm" target=_blank>FREE SMS Tools </A></DIV>
It *looks* like whoever owns indigisys.com wants to hide the fact that the links actually go to indigisys.com by pretending to point at impexp-data.com, which doesn't exist. Does SURBL's lookup code catch this?
SpamAssassin 2.64 running SpamCopURI seems to check both domains:
debug: checking url: http://indigisys.com/chawla1/open.htm
debug: returning cached data : indigisys.com.multi.surbl.org -> ARRAY(0x9351f4c)
debug: Receieved match prefix: 127.0.0
debug: Receieved mask: 32
debug: no match
debug: checking url: http://www.impexp-data.com%27;return
debug: returning cached data : impexp-data.com.multi.surbl.org -> ARRAY(0x9386f58)
debug: Receieved match prefix: 127.0.0
debug: Receieved mask: 32
As does SpamAssassin 3.0.1:
debug: URIDNSBL: query for indigisys.com took 0 seconds to look up (multi.surbl.org.:indigisys.com)
debug: URIDNSBL: query for impexp-data.com took 0 seconds to look up (multi.surbl.org.:impexp-data.com)
Those are the only SURBL applications I have easy access to, so I don't know how other tools handle this case, but SpamAssassin does the right thing. :-)
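For reference, the underlying check is just an A-record lookup of the registered domain prepended to multi.surbl.org. A minimal sketch using only the Python standard library (the 127.0.0.x bitmask interpretation follows SURBL's published conventions; the exact bit meanings are documented on surbl.org and may change):

import socket

def surbl_lookup(domain):
    """Query multi.surbl.org for an already-reduced registered domain.
    Returns the 127.0.0.x answer if listed, or None on NXDOMAIN. The
    last octet is a bitmask identifying which component lists matched."""
    try:
        return socket.gethostbyname(domain + '.multi.surbl.org')
    except socket.gaierror:        # not listed
        return None

# e.g. surbl_lookup('indigisys.com') returned no match in the debug output
# above; a listed domain would come back as something like '127.0.0.2'.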
Jeff C. -- "If it appears in hams, then don't list it."