[SURBL-Discuss] FP rate

Jeff Chan jeffc at surbl.org
Thu Dec 9 21:07:31 CET 2004


On Thursday, December 9, 2004, 11:12:35 AM, Theo Dinter wrote:
> On Thu, Dec 09, 2004 at 10:43:01AM -0800, Jeff Chan wrote:
>> I should have mentioned that these data are from 8 September.
>> The current rates are probably at least slightly different.

> My latest results, btw:

> OVERALL%   SPAM%     HAM%     S/O    RANK   SCORE  NAME
>  119502   106956    12546    0.895   0.00    0.00  (all messages)
>  70.083  78.3023   0.0080    1.000   1.00    0.00  URIBL_JP_SURBL
>  66.425  74.2146   0.0159    1.000   0.99    0.00  URIBL_OB_SURBL
>  71.986  80.4265   0.0319    1.000   0.99    0.00  URIBL_WS_SURBL
>  22.178  24.7793   0.0000    1.000   0.97    0.00  URIBL_SC_SURBL
>  15.251  17.0397   0.0000    1.000   0.96    0.00  URIBL_AB_SURBL
>   0.018   0.0178   0.0239    0.426   0.46    0.00  URIBL_PH_SURBL

A couple of things perhaps worth adding:

1.  The SC and AB spam detection rates would likely be closer
to the 70% range if the spam corpus were restricted to the
time windows that the SC and AB data cover (3 and 7 days,
respectively).

2.  Theo's ham corpus is a subset of the collective
SpamAssassin ham corpus, so FP rates may differ for other
populations.  Within this corpus, though, the relative
differences between the lists' FP rates are meaningful.
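
To make the FP figures more concrete, here is a rough Python
sketch that turns the HAM% column back into approximate ham
message counts and recomputes S/O.  It assumes S/O is
SPAM% / (SPAM% + HAM%), which matches the rows quoted above
but is not confirmed anywhere in this thread.  The names
ROWS, HAM_TOTAL, s_o and ham_hits are made up for
illustration; the numbers are copied from Theo's table.

```python
# Figures copied from the quoted table: name -> (SPAM%, HAM%).
ROWS = {
    "URIBL_JP_SURBL": (78.3023, 0.0080),
    "URIBL_OB_SURBL": (74.2146, 0.0159),
    "URIBL_WS_SURBL": (80.4265, 0.0319),
    "URIBL_SC_SURBL": (24.7793, 0.0000),
    "URIBL_AB_SURBL": (17.0397, 0.0000),
    "URIBL_PH_SURBL": (0.0178, 0.0239),
}

# Ham corpus size from the "(all messages)" row.
HAM_TOTAL = 12546


def s_o(spam_pct, ham_pct):
    """Assumed S/O formula: spam% / (spam% + ham%)."""
    total = spam_pct + ham_pct
    return spam_pct / total if total else float("nan")


def ham_hits(ham_pct, ham_total=HAM_TOTAL):
    """Approximate number of ham messages behind a HAM% value."""
    return round(ham_pct / 100 * ham_total)


if __name__ == "__main__":
    for name, (spam_pct, ham_pct) in ROWS.items():
        print(f"{name:16s} S/O={s_o(spam_pct, ham_pct):.3f} "
              f"ham hits~{ham_hits(ham_pct)}")
```

On these rows the nonzero HAM% values work out to roughly
1 (JP), 2 (OB), 4 (WS) and 3 (PH) ham messages out of 12546,
so the absolute FP rates rest on very few messages.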

Jeff C.
--
"If it appears in hams, then don't list it."