[ST #592239]:postmaster: Re: [SURBL-Discuss] FP in OB ?

Jeff Chan jeffc at surbl.org
Fri Oct 15 13:56:53 CEST 2004


[forwarding my reply to Tony at Outblaze with his permission]

On Thursday, October 14, 2004, 8:34:41 PM, Tony RT wrote:
> [jeffc at surbl.org - Thu Oct 14 14:00:25 2004]:
>> Thanks Tony.
>> 
>> May I suggest that you consider checking a domain before
>> listing it?  Just because a few customers consider it spam
>> doesn't necessarily mean other customers might not want to
>> get it.  I ask because there seem to be some legitimate
>> sites getting onto your lists which some customers may
>> legitimately want to get.  For example none of the recent
>> FPs have had to do with pills, mortgages, warez, etc.
>> 
>> Another recent example is browsehappy.com run by the Web
>> Standards Project:
>> 
>>   http://webstandards.org/act/campaign/happy/
>> 
>> which seems pretty unlikely to be run by professional or even
>> casual spammers, no matter what users may report.  Users
>> are sometimes wrong, so data should be checked, IMO.
>> 
>> Jeff C.

> Jeff, the browsehappy.com problem was reported back to us by schampeon and
> the domain was immediately removed.

> The approach we take (listing a domain if it's new and appears in reported
> spam) does have FPs, I agree - but we haven't been able to find a good way to "check".

> We do take a cursory look at all the blocked domains each day, and if anything
> obvious shows up we remove it.  The problem is that a detailed review by a human
> is not really practical given the volume of domains we block per day.

> As you know/see, we are very responsive and remove very quickly.

> If you have any suggestions on how to improve the process, I'm all ears and
> will implement them as long as they don't consume too much human time
> (checking hundreds of domains one by one is just not practical).

> Cheers,
> TB

Hi Tony,
Thanks indeed for your responsiveness in removing FPs and in
addressing our concerns and those of your users.  Regarding
checks that can be done on the incoming data, many of the
suggestions in our draft policy for manual lists can be
automated:

  http://www.surbl.org/policy.html

and some of those may be useful for checking incoming
suspected spam domains.  What I'd suggest is perhaps using
these checks to score new domains and to flag any that rise
above a certain threshold.

For example, any domain in the SBL can probably be blacklisted
immediately.  Any domain not in the SBL probably adds to a
ham score, though not conclusively.  If you have access to the
headers and the senders are in xbl.spamhaus.org, then the
domain should probably be listed.  Any sender IP not in the XBL
should probably get ham points.  Any domain with few or zero
NANAS hits may be hammy.  Domains in DMOZ, Wikipedia, etc.
should perhaps get ham points, since it's unlikely the human
editors of those would add or allow spam domains.
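To make that concrete, here is a rough sketch in Python of how
the DNSBL part of such a score might be automated.  This is
only an illustration, not SURBL or Outblaze code: the zone
names sbl.spamhaus.org and xbl.spamhaus.org are real DNSBLs,
but the helper names (dnsbl_listed, score_domain), the point
weights, and the threshold are made up and would need tuning,
and the NANAS and DMOZ/Wikipedia checks are left out since they
would need scraping or search rather than a simple DNS lookup:

  import socket

  def dnsbl_listed(ip, zone):
      # A DNSBL is queried by reversing the IP's octets and
      # appending the zone; any A record means "listed".
      query = ".".join(reversed(ip.split("."))) + "." + zone
      try:
          socket.gethostbyname(query)
          return True
      except socket.gaierror:            # NXDOMAIN -> not listed
          return False

  def score_domain(domain, sender_ip=None):
      # Toy spam score: higher means more worth blacklisting.
      score = 0
      try:
          web_ip = socket.gethostbyname(domain)  # host of the URI domain
      except socket.gaierror:
          return score                           # unresolvable; skip for now
      if dnsbl_listed(web_ip, "sbl.spamhaus.org"):
          score += 10                    # SBL hit: list immediately
      else:
          score -= 1                     # small ham credit
      if sender_ip is not None:
          if dnsbl_listed(sender_ip, "xbl.spamhaus.org"):
              score += 5                 # sent from an exploited host
          else:
              score -= 1
      return score

  THRESHOLD = 5                          # made-up cutoff; tune on known FPs
  # e.g. for a domain reported in spam sent from 192.0.2.1:
  # if score_domain("example.com", "192.0.2.1") >= THRESHOLD: list it
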

Obviously most of the spam domains we get are fully spammy.
Perhaps some of these metrics can help flag ones that
are less spammy and worthy of a little further checking?

Your feedback, comments, questions, etc. would be welcome,
since we intend to use a policy like this for our own
manual list, ws.surbl.org.  We may also adopt parts of it
for our automated lists.

Cheers,

Jeff C.

P.S. Do you mind if I publish this response on our
SURBL discussion list?
--
"If it appears in hams, then don't list it."
