Catherine Hampton of SpamBouncer (welcome to the SURBL Discuss list Catherine!) is kindly making available her carefully checked phishing domains and IPs for our inclusion in the SURBL phishing list. They're not currently added to ph.surbl.org, but the hooks are in place to make it live after some discussion here.
Catherine's data come from antiphishing.org plus her own trapped phishes. All are hand checked about once a day. When I reviewed a recent snapshot of the data:
http://www.spambouncer.org/dist/standalone/phishdata/current.txt
I found that 124 of the 193 domains were already listed on various SURBLs. The other new 69 looked quite phishy and probably ok to list.
For the IPs, we had 22 of the 74 listed, and I'll assume the others are probably zombies, etc. as Catherine suggested. Generally speaking there's little harm in listing IPs since most legitimate sites don't get referenced by IP, so there's good upside and little downside for listing them.
Please take a look at the data for yourself and comment.
Regarding expiring the data, Catherine told me:
I expire "Phish IP" listings every month. Phishers move around a LOT, probably because most of the IPs are on compromised or trojaned hosts and tend to get fixed within a couple of weeks.
I don't expire Phish domains formally right now, although eventually I plan to run them through regular "has this domain expired and not been renewed" checks. Since I only list domains designed specifically for phishing and used only by phishers as "Phish domains", they aren't likely to be used for anything else. (Domains like paypalll.com don't seem to have much legitimate use to me.)
which sound like reasonable policies to me.
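As a rough illustration of the monthly IP expiry described above, here is a minimal Python sketch. The 30-day window, the `expire_ip_listings` name, and the dict-of-dates layout are assumptions made for the example; the thread says only that "Phish IP" listings are expired "every month", and the addresses are documentation-range placeholders.

```python
from datetime import date, timedelta

# Assumed retention window; the thread only says "every month".
IP_MAX_AGE = timedelta(days=30)


def expire_ip_listings(listings, today):
    """Keep only the IP listings added within IP_MAX_AGE of `today`.

    `listings` maps an IP address string to the date it was listed.
    """
    return {ip: listed for ip, listed in listings.items()
            if today - listed <= IP_MAX_AGE}


# Placeholder data, not real phishing hosts.
listings = {
    "192.0.2.10": date(2005, 6, 15),  # 45 days old on 2005-07-30: dropped
    "192.0.2.20": date(2005, 7, 25),  # 5 days old: kept
}
current = expire_ip_listings(listings, today=date(2005, 7, 30))
```

Domains are deliberately left out of the sketch, since the policy above does not expire them on a timer.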
Does anyone have comments on adding these to the PH list?
Am I forgetting anything Catherine? :-)
Jeff C. -- Don't harm innocent bystanders.
It seems like this would be a hard thing to do by IPs. If you were to use ClamAV and the SpamAssassin hook (see wiki for it), you may get better near real-time phishing protection. That is what I do here anyway. I give ClamAV a 100 score. That's my 2 cents anyway.
-----Original Message----- From: Jeff Chan [mailto:jeffc@surbl.org] Sent: Saturday, July 30, 2005 10:23 PM To: SURBL Discuss; SpamAssassin Users; SpamAssassin Developers Subject: RFC: Adding SpamBouncer phishing data to ph.surbl.org
On Saturday, July 30, 2005, 11:47:40 PM, Greg Allen wrote:
It seems like this would be a hard thing to do by IPs. If you were to use ClamAV and the SpamAssassin hook (see wiki for it), you may get better near real-time phishing protection. That is what I do here anyway. I give ClamAV a 100 score. That's my 2 cents anyway.
Not exactly sure what you mean by "by IPs". SURBLs list whatever appears in the host portion of spam message body URIs. For most spams those are domain names, but for many phishes, they're IP addresses (e.g. http://1.2.3.4/). If they have IPs in them, we list the IPs. If they have domain names, we list the domain names.
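To make the host-portion idea concrete, here is a hedged Python sketch that turns a message-body URI into the DNS name a client might query. The function name `surbl_query_name` is invented for the example. The reversed-octet form for IP hosts reflects my understanding of the SURBL lookup convention rather than anything stated in this thread. The sketch also skips reducing a hostname to its registered domain, which a real client would need to do.

```python
import ipaddress
from urllib.parse import urlsplit


def surbl_query_name(uri, zone="ph.surbl.org"):
    """Build a DNS query name from the host portion of a URI.

    Assumption: IPv4 hosts are queried with their octets reversed,
    and domain names are queried as-is (no registered-domain reduction).
    """
    host = urlsplit(uri).hostname
    if host is None:
        raise ValueError(f"no host portion in {uri!r}")
    try:
        addr = ipaddress.IPv4Address(host)
    except ValueError:
        # Not a dotted-quad IP, so treat it as a domain name.
        return f"{host}.{zone}"
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


ip_name = surbl_query_name("http://1.2.3.4/")
domain_name = surbl_query_name("http://paypalll.com/login")
```

The sketch only builds the name; performing the DNS lookup and interpreting the answer is left out.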
ClamAV is designed to protect against viruses. While their anti-phishing function works well, phishes and spam are not viruses. They probably felt the need to do something because the phishing threat is pretty serious, or can be if people get tricked by them, but we've had a SURBL phishing list for about a year:
http://www.surbl.org/lists.html#ph
SURBLs are designed to check message body URIs, which is what spammers and phishers are usually trying to direct victims with, therefore our tool is a much better fit for the problem than a virus tool, IMO.
Jeff C.
ClamAV is designed to protect against viruses. While their anti-phishing function works well, phishes and spam are not viruses. They probably felt the need to do something because the phishing threat is pretty serious, or can be if people get tricked by them, but we've had a SURBL phishing list for about a year:
SURBLs are designed to check message body URIs, which is what spammers and phishers are usually trying to direct victims with, therefore our tool is a much better fit for the problem than a virus tool, IMO.
Whatever works most reliably is the best. (And that may be a combination.)
In ClamAV's case, they have designed it to catch some proportion of phish, and an appeal to "ClamAV is designed..." to restrict it to some limited category just doesn't pass muster -- it does what it was designed to do -- catch (most) viruses and catch many phish.
Also, with a simple blacklist you don't have logic built in for things like people mentioning the URIBL on a list like this, so you need recourse to whitelists and the program logic of SpamAssassin or some other "meta evaluation" method.
Presumably -- now you have me interested so I am going to check -- ClamAV does more than a naive pattern match on the URI and apparently they even have (had) endless debates in the ClamAV newsgroups/lists on this subject.
It's sort of like Tastes Great -- Less Filling. Silly argument when what we really want is great taste without getting fat. <grin> (Or pick one: revolvers vs. automatics, Macs vs. PCs, blonds vs. redheads, etc....)
Whatever works -- works.
And by the way: I REALLY appreciate your SURBL lists and hard work even if I think other tools supplement and help make your stuff even better.
My security principles include (but are not limited to):
1) Stop as much as possible at the outer perimeter (earlier the better)
2) Defense in depth
For us, the virus scanning happens before the Spam tests; early is good.
-- Herb Martin
Jeff Chan wrote:
Catherine Hampton of SpamBouncer (welcome to the SURBL Discuss list Catherine!) is kindly making available her carefully checked phishing domains and IPs for our inclusion in the SURBL phishing list. They're not currently added to ph.surbl.org, but the hooks are in place to make it live after some discussion here.
Outstanding. I get a ton of phishes. The SURBL checks I already use (primarily the SpamCop and Spamhaus SBL/XBL checks IIRC) catch most of the other crap I get.
The other thing I'd love to figure out is how to reliably tag all the 419 scams I tend to receive.
Jeff, if you can make this work, I owe both you and Catherine a keg of beer. :)
On Tuesday, August 2, 2005, 12:13:41 AM, Steve Sobol wrote:
Jeff Chan wrote:
Catherine Hampton of SpamBouncer (welcome to the SURBL Discuss list Catherine!) is kindly making available her carefully checked phishing domains and IPs for our inclusion in the SURBL phishing list. They're not currently added to ph.surbl.org, but the hooks are in place to make it live after some discussion here.
Outstanding. I get a ton of phishes. The SURBL checks I already use (primarily the SpamCop and Spamhaus SBL/XBL checks IIRC) catch most of the other crap I get.
Thanks for the feedback. We're working with a number of data sources to improve the phishing hit rate.
The other thing I'd love to figure out is how to reliably tag all the 419 scams I tend to receive.
Jeff, if you can make this work, I owe both you and Catherine a keg of beer. :)
I'm told that the latest 3.1.X SpamAssassin catches those pesky stock spams quite well. Let's hope they have 419s caught too.
But unless they advertise their own domains, SURBLs can't really catch either of those types of spam.
Jeff C. -- Don't harm innocent bystanders.
Outstanding. I get a ton of phishes. The SURBL checks I already use (primarily the SpamCop and Spamhaus SBL/XBL checks IIRC) catch most of the other crap I get.
SURBLs do tend to get the phish domains and IPs listed quickly, and Jeff's extremely strict "No false positives" standards have done a pretty decent job of keeping out domains belonging to innocent bystanders and (a trickier matter) domains belonging to servers that were hacked/trojaned/0wn3D and then used to host a phish site. That doesn't catch all phishes, of course, but it catches a good many of them.
The SpamBouncer filters catch a lot of new phishes, because of my set of "Phish Target" filters. These filters check for email claiming to be from a company targeted by phishers (like Ebay, Paypal, Washington Mutual Bank, etc.) to see whether it really came from there. If it didn't, the filter tags it "Phish Target/ Forged Origin", and then my spamtrap puts it in a file of probable phishes that weren't caught by the "Phish Domains", "Phish IPs" or "Phish URLs" filters.
So I usually update my phish recipes pretty quickly. It seemed a shame not to share that data more widely.
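The "Phish Target" idea can be sketched roughly as follows in Python. This is not SpamBouncer's implementation, which is a set of procmail recipes not shown in this thread. The target list, the function names, and the rule "the relay host must sit under the claimed domain" are all assumptions for illustration, and `relay_host` is expected to be a hostname the caller has already verified.

```python
from email import message_from_string
from email.utils import parseaddr

# Hypothetical target list; a real one would be far longer.
PHISH_TARGETS = {"ebay.com", "paypal.com"}


def claimed_domain(msg):
    """Return the lowercased domain of the From: address, or ''."""
    _, addr = parseaddr(msg.get("From", ""))
    return addr.rpartition("@")[2].lower()


def is_forged_phish_target(msg, relay_host):
    """True if mail claims a targeted company but came from elsewhere."""
    domain = claimed_domain(msg)
    if domain not in PHISH_TARGETS:
        return False
    relay = relay_host.lower()
    genuine = relay == domain or relay.endswith("." + domain)
    return not genuine


raw = "From: PayPal <service@paypal.com>\nSubject: Verify your account\n\nClick here."
msg = message_from_string(raw)
forged = is_forged_phish_target(msg, "dsl-1-2-3-4.example.net")
```

Mail flagged this way is only a candidate phish; per the description above, it still goes to a file for hand checking.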
The other thing I'd love to figure out is how to reliably tag all the 419 scams I tend to receive.
SpamBouncer doesn't catch them all, but it catches most of them. Want a couple of Procmail recipes for this? I don't think, however, that SURBLs will be much help with 419 spam because most of it doesn't use a domain or IP that belongs to the spammer/419er. Most of it uses free email sites and phone numbers for contacts.
Jeff, if you can make this work, I owe both you and Catherine a keg of beer. :)
Diet coke for me, please, but I'll happily accept. ;)
on Tue, Aug 02, 2005 at 12:13:41AM -0700, Steve Sobol wrote:
Jeff Chan wrote:
Catherine Hampton of SpamBouncer (welcome to the SURBL Discuss list Catherine!) is kindly making available her carefully checked phishing domains and IPs for our inclusion in the SURBL phishing list. They're not currently added to ph.surbl.org, but the hooks are in place to make it live after some discussion here.
Outstanding. I get a ton of phishes. The SURBL checks I already use (primarily the SpamCop and Spamhaus SBL/XBL checks IIRC) catch most of the other crap I get.
The other thing I'd love to figure out is how to reliably tag all the 419 scams I tend to receive.
Oh, 419/aff stuff is easy. It's all so consistent. It's one of the only content-oriented procmail rules I use:
# 419
:0 B
* (I am|My name is) ((D|M)(R|r)s?.|the manager|barrister|Engr|Tony|Emmanuel|Prince)
{ SPAM419=yes }

:0 B
* You may be surprised to receive this letter from me
{ SPAM419=yes }

:0 B
* (My name is SENATOR|Our present situation have made us to send you|With great pleasure I,|We are pleased to inform you|Congratulations to you as we|This is to inform you of the release|next-of-kin|next of kin|urgent response|urgent reply|compliments of the day)
{ SPAM419=yes }

:0
* (LOTTERY PROMOTION|LOTTO|drew the lucky numbers|international winner)
{ SPAM419=yes }
Then later on:
:0 f
* SPAM419 ?? yes
| formail -A"X-Confirmed-Spam: $MSG419"
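For readers who don't run procmail, the body checks can be approximated in Python. This is a sketch, not a port: it uses a subset of the phrases, the `looks_like_419` name is invented, and it escapes the dot after the honorific, which the recipe above does not. Matching is case-insensitive on the assumption that the recipes are too, since procmail ignores case unless the `D` flag is set.

```python
import re

# Subset of the body phrases from the recipes, for illustration only.
BODY_PATTERNS = [
    r"(I am|My name is) ((D|M)rs?\.|the manager|barrister|Engr|Tony|Emmanuel|Prince)",
    r"You may be surprised to receive this letter from me",
    r"(next-of-kin|next of kin|urgent response|urgent reply|compliments of the day)",
]
COMPILED = [re.compile(p, re.IGNORECASE) for p in BODY_PATTERNS]


def looks_like_419(body):
    """True if any of the 419/AFF phrases appears in the message body."""
    return any(p.search(body) for p in COMPILED)


scam = "Dear friend, my name is Barrister John and I await your urgent reply."
ham = "The meeting has moved to 3pm on Thursday."
flags = (looks_like_419(scam), looks_like_419(ham))
```

As with the recipes, a hit would set a flag for a later tagging step rather than reject the message outright.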
Caught about 69 of 115 AFF/419 spams that made it through my filters; on another box with a worse problem it caught 808 of 2220. So, not great, but better than nothing. And to be fair, I have 419 scams from before I added the procmail recipe, so I am not sure about the distribution of the successes. I don't remember getting a 419 scam lately that wasn't tagged, though.
Also, be sure you're checking for "helimore" patterns - one of the ratware packages they use generates a random HELO a la:
(123|2mails|ab[0-9]+c|abc|adplist|afzhg|ameinfo|azhg|bol|caramail|cookbe|coolde|coolgoose|coolre|coxlde|csiitb|cta|di-ve|dontbleftout|dontmissthis|emailwinnersclub|emarketmail|emzitd|emztd|eurosom|fastermail|fe[0-9]+son|fredrickanderson|fsmail|fubared|gawab|galmail|healthinsurance|helimore|hellrimore|heloimoex|heloimore|heythere|hotmail|imel|indxi|internationallotto|joininonit|juno|justice|laposte|latinmail|lawyer|lchost|libero|localhst|loclhst|lottery|lycos|madrid|madridspain|mail2world|mmail|mrson|msn|mxcson|netsape|netscae|netscpe|netscape|n2now|navar|nst2now|nut2now|ok|okey|okgy|okzy|omonmail|onemails|once|onmo|onmp|personal|phatomemail|qfgf|rdxx|rediffmail|rmk|sender|simbamail|sina|slickwebs|softice|somyingdd|spain|spinfinder|survey-pay|taylorsfamily|tellx|telstra|test|thaiservice|tiscali|tom|totalmail|twomails|visitmail|voila|vtomo|web-mail|whipmail|winning|wwinf|yahoo|yehey|z6|zwallet)[0-9]+.(biz|com)
A newer variant generates the HELO from the sender From: address, a la
# From: Income 4You income4u@pc4me.us
# HELO: pc4me893.com
The second-level domain part in the From: is prepended to a random numeric and then .com; that's a very reliable test as well. (Though not limited to 419/AFF scams - I've seen other spammers use it, too).
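The newer variant lends itself to a small check. The Python sketch below assumes the HELO is exactly the sender's second-level label, then one or more digits, then ".com", as in the example above. The function name is invented, and taking the second-to-last label as the "second-level domain part" is a simplification that would misfire on addresses under suffixes such as co.uk.

```python
import re
from email.utils import parseaddr


def helo_derived_from_sender(from_header, helo):
    """True if HELO looks like '<sender second-level label><digits>.com'."""
    _, addr = parseaddr(from_header)
    domain = addr.rpartition("@")[2].lower()
    labels = domain.split(".")
    if len(labels) < 2 or not labels[-2]:
        return False  # no usable domain in the From: header
    sld = labels[-2]
    pattern = re.escape(sld) + r"[0-9]+\.com"
    return re.fullmatch(pattern, helo.lower()) is not None


suspicious = helo_derived_from_sender(
    "Income 4You <income4u@pc4me.us>", "pc4me893.com")
```

A positive result is only a signal to combine with other tests, since, as noted above, other spammers use the same ratware.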
I use a geographic crossreference (IP netblock to ISO country code) and check the Received: and X-Originating-IP: headers for the injection point and refuse if the point of origin is one of:
(africa|AR|BF|BG|BJ|BW|CI|CY|DK|ES|GH|IL|KE|KR|LB|LV|ML|MR|MY|NG|NL|RW|SN|TG|ZA|ZW)
...where 'africa' refers to a few blocks registered to US firms that proxy mail out of Africa (africaonline.com, IIRC).
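A geographic cross-reference of this kind could look roughly like the Python sketch below. The netblock table is entirely made up: it maps reserved documentation ranges to arbitrary codes so the example runs without real allocation data. Extracting the injection IP from the Received: or X-Originating-IP: headers is left to the caller. Only the refused-code list comes from the message above.

```python
import ipaddress

# Hypothetical netblock-to-code table using documentation ranges only.
NETBLOCK_CODES = [
    (ipaddress.ip_network("192.0.2.0/24"), "NG"),
    (ipaddress.ip_network("198.51.100.0/24"), "US"),
    (ipaddress.ip_network("203.0.113.0/24"), "africa"),
]

# Refused points of origin, as listed in the message.
REFUSED = set(
    "africa AR BF BG BJ BW CI CY DK ES GH IL KE KR LB LV ML MR MY "
    "NG NL RW SN TG ZA ZW".split()
)


def origin_code(ip):
    """Return the code of the first netblock containing `ip`, else None."""
    addr = ipaddress.ip_address(ip)
    for network, code in NETBLOCK_CODES:
        if addr in network:
            return code
    return None


def should_refuse(injection_ip):
    """True if the injection point maps to a refused origin."""
    return origin_code(injection_ip) in REFUSED


decisions = [should_refuse(ip) for ip in
             ("192.0.2.55", "198.51.100.7", "203.0.113.9", "10.0.0.1")]
```

A linear scan is fine for a handful of blocks; a full country table would want a sorted or tree-based lookup instead.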
Finally, I simply block a whole slew of hosts (mostly European freemail providers) whose headers don't include the injection point. 340 of them at last count. And 296 "legit" hosts are marked as "419 sources", so I quarantine any mail from them. 63 of those are hotmail.com hosts...
Oh, and I'm testing a rule that will refuse mail from hotmail.com hosts that think the point of injection was a hotmail IP (a bug that an inside source has confirmed but says won't be fixed any time soon). IIRC, all of this mail comes in via some NAT interface or something, but I'm light on details.