[SURBL-Discuss] general questions.....

Rob McEwen rob at powerviewsystems.com
Tue Nov 23 21:14:59 CET 2004


>Differences being tastes in
>the definition of the classification

...which reminds me... I keep meaning to ask what constitutes an FP
when discussed on this list. This isn't always so black & white:

Consider the following classifications:

A. Definite hand-typed HAM

B. Closed Loop Opt-In NEWSLETTER (topically applicable to the recipient)

C. NEWSLETTER (topically applicable to the recipient) from a reputable
organization (no harvesting, few or no NANAS, no SpamHaus) where the person
didn't actually subscribe, but likes to read it... maybe it came because
they previously bought something or left a "receive other
offers/info" checkbox checked

D. More "spammy" NEWSLETTER (but topically applicable to the recipient)
where the mailer is fairly "clean" (some NANAS, no SpamHaus), but the user
didn't explicitly Opt-in. Maybe they left a "receive other offers" checkbox
checked in the past when filling out something else or ordering something
else.

E. More "spammy" ADVERTISEMENT (but topically applicable to the recipient)
where the mailer is very "clean" (no harvesting, few NANAS, no SpamHaus),
but the user didn't explicitly Opt-in. Maybe they left a "receive other
offers" checkbox checked in the past when filling out something else or
ordering something else

F. Definite spam (to varying degrees).

(I'm sure someone else could have done a better job of listing
hard-to-differentiate categories)

Of course, it is not always possible to know if an e-mail is "topically
applicable to the recipient". But even assuming that you do know, it is hard
for mail administrators to distinguish between B, C, and D. It is also
sometimes hard to distinguish between E and F.

The overwhelming percentage of Spam IS very distinguishable from A-E because
of things like obfuscation techniques, SpamTrap recipients, location of
sender's server, past history of sender, etc.

Still, this whole issue makes me question: how good are ham corpora?

Moreover, when a particular SURBL gets an FP rating of .002%, I think,
"that's great"... but then I wonder, "is this .002% actual human written
correspondence, or is it a newsletter, etc?"
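
One way to make that question answerable is to report FPs per ham
category rather than as a single number. Below is a rough Python sketch,
assuming each non-spam message in a corpus has already been hand-labeled
with one of the categories A-E above and checked against a list. The
function name, the data layout, and the sample counts are all made up for
illustration and do not come from any real corpus or tool.

```python
from collections import Counter

# Hypothetical labels mirroring categories A-E from the list above.
HAM_CATEGORIES = ("A", "B", "C", "D", "E")


def fp_rate_by_category(messages):
    """Break false positives down by ham category.

    messages: iterable of (category, was_listed) pairs, where was_listed
    is True if the message hit the list being tested.
    Returns {category: (false_positives, total, percent)} for every
    category in HAM_CATEGORIES that has at least one message.
    """
    totals = Counter()
    hits = Counter()
    for category, was_listed in messages:
        totals[category] += 1
        if was_listed:
            hits[category] += 1
    return {
        cat: (hits[cat], totals[cat], 100.0 * hits[cat] / totals[cat])
        for cat in HAM_CATEGORIES
        if totals[cat]
    }


# Invented sample: 1000 non-spam messages, 2 of which hit the list.
sample = (
    [("A", False)] * 600
    + [("B", False)] * 300
    + [("D", False)] * 98
    + [("D", True)] * 2
)

report = fp_rate_by_category(sample)
total_hits = sum(fp for fp, _, _ in report.values())
total_msgs = sum(n for _, n, _ in report.values())
overall = 100.0 * total_hits / total_msgs

for cat, (fp, n, pct) in report.items():
    print(f"{cat}: {fp}/{n} = {pct:.3f}%")
print(f"overall: {overall:.3f}%")
```

In this invented sample the overall rate looks small, yet every hit falls
in category D and none in hand-typed ham, which is exactly the distinction
a single FP percentage hides.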

Rob McEwen



