[SURBL-Discuss] Whitelist Please

Jeff Chan jeffc at surbl.org
Wed Sep 8 00:48:58 CEST 2004


On Tuesday, September 7, 2004, 9:10:50 AM, Frank Ellermann wrote:
> Jeff Chan wrote:

>> While I agree that these "spam to your friends with jokes,
>> greetings, prayers, whatever" sites are stupid and highly
>> abuse-prone, they do have some legitimate uses and should
>> probably not be blocked globally.

> IBTD.  You could split your whitelist into "Jeff found some
> potentially legitimate use" and "really innocent bystanders".

> The first white list should not be used to overrule SpamCop
> reports in sc.surbl.org.  Thousands of SC users have an idea
> why they report spam, and these ideas don't necessarily match
> your personal definition of "potentially legitimate use".

> Spam is about consent and not about "potentially legitimate
> use" or similar vague constructs.

Every form of spam classification can make errors.  Therefore
there must be some form of feedback or error correction, or
other strategies to deal with misclassifications.

Whitelisting is one strategy.

Another is trying to get enough spam reports, or even trapped
spam, to form a meaningful statistical impression of
spamminess.  If 1000 people report a domain as spammy, it
probably is.  If only 1 person says it's spammy, it is less
likely to be.
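
As a rough illustration of combining the two strategies (report
counts plus a whitelist), here is a minimal Python sketch.  It is
not how sc.surbl.org is actually built: the function name, the
(reporter, domain) input format, and the threshold of 10 distinct
reporters are all assumptions made up for the example.

```python
from collections import defaultdict


def spammy_domains(reports, whitelist, min_reporters=10):
    """Return domains reported by enough distinct people, minus whitelisted ones.

    reports       -- iterable of (reporter_id, domain) pairs (hypothetical format)
    whitelist     -- set of lowercase domains that must never be listed
    min_reporters -- assumed threshold; a real list would tune this value
    """
    reporters_by_domain = defaultdict(set)
    for reporter, domain in reports:
        # Count each reporter once per domain so one person cannot stuff the vote.
        reporters_by_domain[domain.lower()].add(reporter)

    return sorted(
        domain
        for domain, reporters in reporters_by_domain.items()
        if len(reporters) >= min_reporters and domain not in whitelist
    )


# Made-up data: 12 people report one domain, 15 report a whitelisted
# greeting-card site, and a single person reports a news site.
reports = [(f"user{i}", "pills.example") for i in range(12)]
reports += [(f"user{i}", "cards.example") for i in range(15)]
reports += [("user1", "news.example")]

listed = spammy_domains(reports, whitelist={"cards.example"})
```

With this data only pills.example ends up listed: the single report
against news.example falls below the threshold, and the whitelist
overrides the many reports against cards.example.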

It would be great to hear about other strategies.  Does
anyone have any ideas, research, etc. on this?

>> euniverse is either a spamhaus or not.

> It's not that simple.  We've already discussed this problem
> with the pyramid scheme "spamarrest.com", a spammer styling
> itself as "anti-spam".  IIRC they never made it as candidate
> for sc.surbl.org, the technical definition of spam works as
> expected.  It's unnecessary to add your personal definition
> of "potentially legitimate use" to sc.surbl.org if there is
> a way to catch obvious errors like BBC-links in 419 spam.

In grey cases, we must sometimes apply some judgement in order
to prevent false positives.  It's not fun or easy, but it needs
to be done, or else SURBLs could rapidly become much less useful.

>> it seems odd to me that one part of their operation would
>> be somewhat responsible, and another part would be blatantly
>> spamming.

> Yes, that's odd.  But this shouldn't be your problem, it's
> their weird business model.  Please use the SC input as is,
> don't try to censor it.

The point is to determine whether the organization is a
spam gang or not.  I do agree, however, that we should be
free to list any part of an organization that is mostly
spammy, even if other parts are not.

>> I place organizations that use their own mail servers in
>> a different class than those who are using zombies

> For the SC input there should be only two classes:  Obvious
> errors or votes as defined on your "I have a dream" page.

Perhaps my obvious errors are not the same as your obvious
errors.  ;-)

In case of disagreement we must whitelist or there is
potential for FPs.  When in doubt, we whitelist (or exclude
in the first place for manual blacklists).

> | we judge spam messages based on what they say, not where
> | they come from.

> There are no "rogue nations".  The average admin in China is
> like the average admin in Florida.  China is only bigger.

I assume most people are aware that many of the professional
spammer sites seem to be hosted in China, Brazil and Korea,
and that they continue to be hosted there.  Therefore we can
assume any anti-spam laws or abuse policies are not being
enforced there.

That said, it *is* the content that matters.  Pill spammers,
mortgage spammers, warez spammers, porn spammers, etc. all can
be blocked, regardless of where they host or zombie.

> | More reports means more votes that a given site is indeed
> | spam. The quality of data is reinforced by the conscientious
> | efforts of good people in reporting the spam. In this sense
> | it is democracy in action.

> Nothing about "potentially legitimate use" on the SC data page.
> IMHO that's a feature and no bug.  Simply tune the technical
> definition of spam until it matches your ideas of "potentially
> legitimate use".  Manual interventions should be _exceptions_
> for the sc.surbl.org zone.  Less work for you, and prepared to
> run in unattended mode.
>                             Bye, Frank

They *are* the exceptions.  Most of the SpamCop reports
get into sc.surbl.org.

Jeff C.


