----- Original Message -----
From: "Jeff Chan" <jeffc@surbl.org>
Hi Paul,
Thanks very much for sharing your data. Your results look about as expected for the other lists in terms of FPs and spam detection. Summarizing your numbers:
AB: 521886 spam 604 ham
WS: 996200 spam 12578 ham
JP: 1234602 spam 4376 ham
OB: 1139181 spam 36760 ham
SC: 751549 spam 1095 ham
PH: 383 spam 1 ham
XS: 939134 spam 6283 ham
XS unique: 10456 spam 5300 ham
For XS, the spam-to-ham ratio of its unique hits is only about 2:1 (10456 spam to 5300 ham), which means it has too many FPs, and it doesn't hit much unique spam. Both are to be expected given the lack of significant legitimate domain filtering and the high inclusion threshold. We will work to improve both considerably before we propose adding it to the production data in multi.
In terms of the spam-to-ham ratios of the current lists, OB is underperforming the others, judging by your data. I'm cc'ing Suresh at Outblaze so he can see the measurements you got.
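For anyone who wants to recheck the ratios discussed above, here is a quick sketch in Python. It assumes the counts in the summary are total hits per list; the names HITS, PRODUCTION and spam_to_ham_ratio are made up for illustration and are not part of any SURBL or SpamAssassin tool.

```python
# Hit counts as summarized in this message: list name -> (spam hits, ham hits).
HITS = {
    "AB": (521886, 604),
    "WS": (996200, 12578),
    "JP": (1234602, 4376),
    "OB": (1139181, 36760),
    "SC": (751549, 1095),
    "PH": (383, 1),
    "XS": (939134, 6283),
    "XS unique": (10456, 5300),
}

# Lists currently in production (XS is still only a candidate).
PRODUCTION = ("AB", "WS", "JP", "OB", "SC", "PH")


def spam_to_ham_ratio(spam, ham):
    """Return the number of spam hits per ham hit."""
    return spam / ham


ratios = {name: spam_to_ham_ratio(s, h) for name, (s, h) in HITS.items()}

# Print the lists from worst to best ratio.
for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1]):
    print(f"{name:10s} {ratio:8.1f}:1")

# The production list with the lowest spam-to-ham ratio.
worst_production = min(PRODUCTION, key=lambda name: ratios[name])
print("Lowest ratio among production lists:", worst_production)
```

A higher ratio is better here, since it means fewer ham hits per spam hit. The ratio alone says nothing about how much spam a list catches, so it should be read alongside the raw spam counts.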
All the lists need to hit less ham, and more aggressive checking and whitelisting is probably needed, assuming the data sources don't change their inclusion policies. I hope to address this in the future.
Jeff C.
Happy to help, Jeff.
Don't forget, though, that we have many custom SA rulesets, and thresholds that can be applied per mailbox, so my ham-to-spam ratio is not necessarily going to reflect a 'stock' install (this is ham or spam as defined by a variable threshold, not by someone actually identifying the message!).
Paul