[SURBL-Discuss] FP rate
Jeff Chan
jeffc at surbl.org
Thu Dec 9 21:07:31 CET 2004
On Thursday, December 9, 2004, 11:12:35 AM, Theo Dinter wrote:
> On Thu, Dec 09, 2004 at 10:43:01AM -0800, Jeff Chan wrote:
>> I should have mentioned that these data are from 8 September.
>> The current rates are probably slightly+ different.
> My latest results, btw:
> OVERALL%     SPAM%     HAM%    S/O   RANK  SCORE  NAME
>   119502    106956    12546  0.895   0.00   0.00  (all messages)
>   70.083   78.3023   0.0080  1.000   1.00   0.00  URIBL_JP_SURBL
>   66.425   74.2146   0.0159  1.000   0.99   0.00  URIBL_OB_SURBL
>   71.986   80.4265   0.0319  1.000   0.99   0.00  URIBL_WS_SURBL
>   22.178   24.7793   0.0000  1.000   0.97   0.00  URIBL_SC_SURBL
>   15.251   17.0397   0.0000  1.000   0.96   0.00  URIBL_AB_SURBL
>    0.018    0.0178   0.0239  0.426   0.46   0.00  URIBL_PH_SURBL
A couple of things perhaps worth adding:
1. The SC and AB spam detection rates would likely be closer
to the 70% range if the spam corpus were restricted to the
same time periods the SC and AB data cover: 3 days for SC
and 7 days for AB.
2. Theo's ham corpus is a subset of the collective
SpamAssassin ham corpus, so FP rates for other
populations may differ. Relative differences between
FP rates are still meaningful within this corpus.
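
To make the FP comparison concrete, here is a rough Python sketch that
turns the HAM% column back into approximate message counts and
recomputes S/O. The corpus sizes and percentages are copied from
Theo's table. The helper names and the formula S/O = SPAM% / (SPAM% +
HAM%) are my assumptions; the formula appears consistent with the
table but is not taken from the tool that produced it.

```python
# Figures copied from the quoted table; helper names are hypothetical.
SPAM_TOTAL = 106956  # spam messages in the corpus
HAM_TOTAL = 12546    # ham messages in the corpus

# name: (SPAM%, HAM%)
RATES = {
    "URIBL_JP_SURBL": (78.3023, 0.0080),
    "URIBL_OB_SURBL": (74.2146, 0.0159),
    "URIBL_WS_SURBL": (80.4265, 0.0319),
    "URIBL_SC_SURBL": (24.7793, 0.0000),
    "URIBL_AB_SURBL": (17.0397, 0.0000),
    "URIBL_PH_SURBL": (0.0178, 0.0239),
}


def hits(percent, total):
    """Approximate message count behind a rounded percentage."""
    return round(percent / 100.0 * total)


def spam_over_overall(spam_pct, ham_pct):
    """Assumed S/O: share of the combined hit rates that is spam."""
    return spam_pct / (spam_pct + ham_pct)


def fp_counts():
    """Approximate ham hits (false positives) per list."""
    return {name: hits(ham, HAM_TOTAL) for name, (_, ham) in RATES.items()}


if __name__ == "__main__":
    for name, (spam_pct, ham_pct) in RATES.items():
        fp = hits(ham_pct, HAM_TOTAL)
        so = spam_over_overall(spam_pct, ham_pct)
        print(f"{name:16s} ham hits ~{fp:2d}  S/O ~{so:.3f}")
```

Under these assumptions the HAM% figures correspond to only a handful
of ham messages per list, so the relative FP differences rest on very
small counts.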
Jeff C.
--
"If it appears in hams, then don't list it."