On Wednesday, September 15, 2004, 7:06:34 PM, David Hooton wrote:
On Wed, 15 Sep 2004 16:43:32 -0700, Jeff Chan <jeffc@surbl.org> wrote:
we thought it might be useful to make the PJ data available as a separate list, at least within multi.surbl.org, the combined SURBL. We'd like to get your comments on this.
I think having a separate list makes sense if the data quality is different to that of the pooled data it was previously connected to.
We're also wondering whether the PJ data should be taken out of WS, or left in, if we do make PJ a distinct list.
There's no point in lowering the hit rate of the superset; any additional score added to a spam is better than none at all.
Please comment,
The more choice and control we provide SURBL users, the better. If we have the ability to sustainably break data out like this and provide ongoing data quality ratings to aid score adjustments, I think we should do it.
Thanks for your feedback, David. Does anyone else have comments about the possibility of a separate PJ list? Making separate lists from the WS data is a little different from the direction we've been going lately, so it would be nice to get comments on it. We're still somewhat undecided about whether to do it or not...
As you can see from the first message about this, the FP rates of PJ look significantly lower than those of WS as a whole.
Jeff C.