Jeff Chan wrote to Jeff Chan:
OK, I updated the policy page, incorporating Ryan's top rules and general organizational comments:
http://www.surbl.org/policy.html
Please let me/us know what you think of it now.
Hi Jeff,
Aha! I like it very much. I suspect it will still evolve a bit--as most good things do--but it gets the point across, and also provides a lot of good, useful information that will assist human classifiers in listing (only) the spammiest domains.
On a related note, do we want to say anything in this document (or possibly another document) about whitelisting criteria? There are really three main categories:
1. Blacklist material (that's what your policy addresses very well)
1.5. "Almost" blacklist material (the grey ones), à la the "UC" list: domains that are used almost entirely by spammers, but may have a few borderline uses
2. Domains that should not be listed, but are not necessarily of "whitelist" merit. These are mostly the domains where insufficient data (or effort) exists to make a determination, which, for good or for ill, is where the bulk of our human efforts are currently focused.
3. Domains that are white; i.e., have definite legitimate uses
OK, that's four. If we really want to reduce FPs, we need to carefully consider *all* of these categories when analysing potential domains. I spend just as much time pulling domains out of ham as I do pulling domains out of spam.
The distinction between 2 and 3 is sometimes almost as difficult as the distinction between 1 and 2.
- Ryan