[SURBL-Discuss] Weighted Reporting - was 'FPs in WS'

Jeff Chan jeffc at surbl.org
Mon Aug 23 03:10:07 CEST 2004


On Tuesday, August 17, 2004, 8:25:31 AM, Andy Warner wrote:
> The AbuseButler data seems to have had a fairly low FP rate in large
> part because it is based on weighted reporting. Only the most frequently
> reported domains make it onto the list. It isn't perfect and there have
> been some FPs (mainly on very popular brand name domains that are
> misreported and get past whitelisting). If other folks want to pass along
> their URI hits to help improve the volume ratings feel free to drop me a
> line. At the moment weighted SpamCop data is still the largest source
> of data, but private trap data volume is growing.

The data in sc.surbl.org is also weighted based on number of
reports.  (You and I came up with very similar solutions for
handling the SpamCop data.)
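
As a rough illustration of that kind of report-count weighting, here
is a minimal Python sketch.  It is not the actual sc.surbl.org or
AbuseButler code: the threshold value, the whitelist entries, the
sample domains, and the function name are all assumptions made up
for the example.

```python
from collections import Counter

# Hypothetical values for illustration only; the real cutoff and
# whitelist used by sc.surbl.org or AbuseButler are not given here.
REPORT_THRESHOLD = 3
WHITELIST = {"example.com"}


def build_blocklist(reported_domains, threshold=REPORT_THRESHOLD,
                    whitelist=WHITELIST):
    """Return domains reported at least `threshold` times, minus whitelisted ones."""
    # Each report of a domain adds one to that domain's count.
    counts = Counter(domain.lower() for domain in reported_domains)
    return sorted(
        domain
        for domain, n in counts.items()
        if n >= threshold and domain not in whitelist
    )


# One entry per spam report that mentioned the domain.
reports = [
    "spam-pills.test", "spam-pills.test", "spam-pills.test",
    "example.com", "example.com", "example.com", "example.com",
    "rarely-seen.test",
]

blocklist = build_blocklist(reports)
print(blocklist)
```

With these sample reports only spam-pills.test is listed: the popular
but whitelisted example.com is skipped despite its high count, and
rarely-seen.test stays off because a single report is below the
cutoff.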

But the WS list source data does not always have this "spam
volume" data behind it.  In some cases, the source data are
just plain lists with no counts of how often the entries appeared
in spam.  So weighting is probably not available across all of
the WS data.

I agree it's a useful concept, though.  I think of it as a form
of voting.

Jeff C.


