-----Original Message-----
From: Jeff Chan [mailto:jeffc@surbl.org]
Sent: Tuesday, October 19, 2004 9:26 PM
To: SURBL Discussion list
Subject: Re: [SURBL-Discuss] Possible new redir FP
On Tuesday, October 19, 2004, 5:17:20 PM, Joe Wein wrote:
get2.us
I saw that in our spam feed too but skipped blacklisting after checking the site, as it looks like a redirector.
Instead I sent them email reporting the abuse and pointed out our open letter to redirector sites.
Joe
Excellent! Thanks Joe.
I checked them out a little more and their site and name servers seem clean in terms of SBL, etc., including the parent domain ScriptWiz.com. The apparent owner of the sites seems to have several different sites, all pretty enterprising and apparently legitimate.
I'm going to go ahead and whitelist all the redirection sites mentioned at:
caught.us get2.us getto.us hasballs.com hated.us ismyidol.com spotted.us went2.us wentto.us
None were whitelisted before, and only get2.us was listed in a SURBL.
Feedback, good or bad, on any of these is still welcome!
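For anyone who wants to check listing status like this by hand, here is a minimal sketch of a SURBL lookup. It assumes the combined zone multi.surbl.org and the per-list bit values that SpamAssassin rules used around the time of this thread (2=sc, 4=ws, 8=ph, 16=ob, 32=ab, 64=jp). The function names are hypothetical, and the bit values should be verified against current SURBL documentation before relying on them.

```python
import socket

# ASSUMPTION: bit assignments for multi.surbl.org as used circa 2004.
# Verify against current SURBL documentation before relying on them.
LIST_BITS = {2: "sc", 4: "ws", 8: "ph", 16: "ob", 32: "ab", 64: "jp"}


def surbl_query_name(domain, zone="multi.surbl.org"):
    """Build the DNS name to query: the domain prepended to the zone."""
    return f"{domain.strip('.').lower()}.{zone}"


def decode_lists(address):
    """Map a 127.0.0.x answer to the list names whose bits are set."""
    octets = address.split(".")
    if len(octets) != 4 or octets[:3] != ["127", "0", "0"]:
        return []
    mask = int(octets[3])
    return [name for bit, name in sorted(LIST_BITS.items()) if mask & bit]


def lookup(domain):
    """Return the lists a domain appears on, or [] if the lookup fails.

    Note: a failed lookup is treated as "not listed", which also hides
    resolver errors. A real checker should tell the two apart.
    """
    try:
        return decode_lists(socket.gethostbyname(surbl_query_name(domain)))
    except OSError:
        return []
```

Calling lookup("get2.us") would query get2.us.multi.surbl.org. What it returns depends on the live data, so no result is shown here.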
OK, hasballs.com is pretty funny ;)
--Chris
RE: FP Reduction Progress?
Jeff,
As we all know, there has been much work done in the past several weeks to remove those URIs in the SURBL lists which could cause FPs.
Is there any "measuring stick" to evaluate our progress? I recall someone posting a test to the list about a month or two ago which showed an FP percentage that really seemed to disturb you (Jeff). Do you recall that message? (I can't seem to find it.) It would be great if someone could find that message and then see what results that same testing would show today.
Certainly, we are probably far from done... but it seems like this effort these past several weeks would have helped much by now?
Rob McEwen
On Wednesday, October 20, 2004, 9:22:27 AM, Rob McEwen wrote:
As we all know, there has been much work done in the past several weeks to remove those URIs in the SURBL lists which could cause FPs.
Is there any "measuring stick" to evaluate our progress? I recall someone posting a test to the list about a month or two ago which showed an FP percentage that really seemed to disturb you (Jeff). Do you recall that message? (I can't seem to find it.) It would be great if someone could find that message and then see what results that same testing would show today.
Certainly, we are probably far from done... but it seems like this effort these past several weeks would have helped much by now?
We've definitely reduced the FPs, but we still need to keep checking the data to improve it further. For example, some of the DMOZ hits we found still need to be checked. Some of those are false positives and others would be false negatives if removed. Manual checking is tedious, but it is the best and perhaps only way to make that determination.
The list with the highest FP rate was WS at about 0.4%. Others are at least an order of magnitude (ten times) lower, which is a very significant difference. Everyone is welcome and encouraged to help check the WS hits against DMOZ and thereby improve the usefulness and performance of SURBLs. Note that a few have already been whitelisted (which tools like GetURI will show):
http://spamcheck.freeapp.net/whitelists/check-ws-dmoz.txt
A good method is to check them against some of the proposed inclusion policies at:
http://www.surbl.org/policy.html
Ryan's GetURI and Dallas' SURBL + Checker are both useful tools that automate some of those checks:
http://ry.ca/cgi-bin/geturi.cgi
http://www.rulesemporium.com/cgi-bin/uribl.cgi
Basically we're looking for domains that have legitimate (non-spam) uses or could reasonably be mentioned in hams. Those that have legitimate uses should not be listed.
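The cross-check described above (listed domains that also appear in a mostly legitimate directory, minus those already whitelisted) can be sketched roughly as follows. This is only an illustration, not how the check-ws-dmoz.txt file was actually produced. The function name and the sample domains are hypothetical.

```python
def review_candidates(blocklist, reference, whitelist):
    """Return listed domains that also appear in a mostly legitimate
    reference set and are not yet whitelisted.

    Each argument is an iterable of domain strings, one per entry.
    The result is only a queue for manual review, not a verdict.
    """
    def normalize(domains):
        return {d.strip().lower() for d in domains if d.strip()}

    listed = normalize(blocklist)
    legit = normalize(reference)
    cleared = normalize(whitelist)
    return sorted((listed & legit) - cleared)


# Hypothetical sample data, for illustration only.
ws_list = ["spammy.example", "Portal.example", "shop.example"]
directory = ["portal.example", "shop.example", "news.example"]
already_whitelisted = ["shop.example"]

to_check = review_candidates(ws_list, directory, already_whitelisted)
```

Here to_check holds only portal.example: it is listed, appears in the directory, and has not been whitelisted yet. Each such hit still needs a human to decide whether it is a false positive.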
As far as measurements go, probably the most meaningful results come from a really comprehensive ham corpus, such as some of the ones used to test SpamAssassin. Another way is by looking at the rate of false positive reports, but those are probably too infrequent to be a useful measure. Another measurement is the number of hits against what should be mostly legitimate domains in things like DMOZ, Wikipedia, etc. Those came out in apparently similar proportions to the SA ham corpus results. For example, the various list hit counts against DMOZ as of October 6 were:
   4 dmoz-blocklist.ab
  61 dmoz-blocklist.jp
 165 dmoz-blocklist.ob
   2 dmoz-blocklist.ph
   8 dmoz-blocklist.sc
1141 dmoz-blocklist.ws
1381 total
As of today it looks like this:
   4 dmoz-blocklist.ab
  44 dmoz-blocklist.jp
  26 dmoz-blocklist.ob
   4 dmoz-blocklist.ph
   3 dmoz-blocklist.sc
 943 dmoz-blocklist.ws
1024 total
That's only one measure against the 2.3 million unique domains and IPs in DMOZ, but it's still a possible hint at the FP rates in the different lists.
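To make the comparison between the two snapshots easier to follow, here is a small sketch that recomputes the totals, the per-list change, and each list's hits as a share of the roughly 2.3 million DMOZ entries. The counts are copied from the figures above. The variable and function names are hypothetical. The share of DMOZ is only a rough proxy and is not the same thing as an FP rate measured on a ham corpus.

```python
# Hit counts against DMOZ, copied from the two snapshots in this message.
OCT_6 = {"ab": 4, "jp": 61, "ob": 165, "ph": 2, "sc": 8, "ws": 1141}
TODAY = {"ab": 4, "jp": 44, "ob": 26, "ph": 4, "sc": 3, "ws": 943}

# Approximate number of unique domains and IPs in DMOZ, as stated above.
DMOZ_ENTRIES = 2_300_000


def summarize(before, after, universe):
    """Per-list change in hits and the newer count as a percent of universe."""
    rows = {}
    for name in sorted(before):
        rows[name] = {
            "before": before[name],
            "after": after[name],
            "change": after[name] - before[name],
            "share_of_dmoz_pct": 100 * after[name] / universe,
        }
    return rows


summary = summarize(OCT_6, TODAY, DMOZ_ENTRIES)
total_before = sum(OCT_6.values())
total_after = sum(TODAY.values())

for name, row in summary.items():
    print(f"{name}: {row['before']:>5} -> {row['after']:>5} "
          f"({row['change']:+d}, {row['share_of_dmoz_pct']:.4f}% of DMOZ)")
print(f"total: {total_before} -> {total_after}")
```

The totals agree with the 1381 and 1024 given in the listings, and the largest drop is in OB (165 to 26).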
To me the biggest surprise is the drop in OB hits. Outblaze folks, thanks much for working on that!
Jeff C. -- "If it appears in hams, then don't list it."
Jeff,
In reference to:
http://spamcheck.freeapp.net/whitelists/dmoz-blocklist.ab
I removed and whitelisted cafe24.com on my end. They seem to be a Korean free hoster that is being abused.
mynetmarketer.com hits several of my traps on a recurring basis and ignores all unsub requests. They have an abuse policy on their site and list terms and conditions, but they do not seem to enforce either. I plan to leave them listed as they seem dirty and I do not see any logical way they would end up in hams.
petsunlimited.com does not seem to be in my data and I do not recall removing it any time recently, so I'm not sure why it is showing up as an AB hit, but they should be whitelisted.
careerslb.com seems to have valid uses, but they spam like crazy and do not honor removal requests. I guess they should be whitelisted on the SURBL end, but I plan to keep them in my data to track their volume.
-- Andy
On Wednesday, October 20, 2004, 11:24:16 AM, Andy Warner wrote:
Jeff,
In reference to:
I removed and whitelisted cafe24.com on my end. They seem to be a Korean free hoster that is being abused.
mynetmarketer.com hits several of my traps on a recurring basis and ignores all unsub requests. They have an abuse policy on their site and list terms and conditions, but they do not seem to enforce either. I plan to leave them listed as they seem dirty and I do not see any logical way they would end up in hams.
petsunlimited.com does not seem to be in my data and I do not recall removing it any time recently, so I'm not sure why it is showing up as an AB hit, but they should be whitelisted.
careerslb.com seems to have valid uses, but they spam like crazy and do not honor removal requests. I guess they should be whitelisted on the SURBL end, but I plan to keep them in my data to track their volume.
Thanks for the research Andy! careerslb and petsunlimited clearly have potential legitimate uses so I'm whitelisting them. Neither is in SBL nor has any NANAS hits.
I've also whitelisted cafe24, but if we have anyone who can read Korean, I'd like to hear what they think the site is and whether it may have legitimate uses. Does anyone here have Korean-reading contacts? cafe24.com has about 1000 NANAS hits, is on SBL and has been registered since 1999. If they are a host or portal, they may have some significant abuse problems.
mynetmarketer.com is a lot less clear. Their NANAS hits look very spammy, yet they do have a very strong-sounding anti-spam policy:
http://www.mynetmarketer.com/spam.htm
They are not in SBL, have been registered since 2001, and have 212 NANAS hits. Does anyone know if they have any legitimate uses? I am not whitelisting them without more feedback.
Jeff C. -- "If it appears in hams, then don't list it."