On Friday, September 17, 2004, 6:46:56 AM, Chris Santerre wrote:
I do NOT like the idea of more lists.
- The lists are dynamic, so FP rates will change.
It's true that FP rates vary over time for all lists, but the FP rate of PJ looks consistently lower than that of WS.
- Too many lists make it more difficult for the devs to GA and perceptron run all of them, causing a slowdown in scoring for SA and others.
While it's true that a PJ list would be one more rule for the SpamAssassin mass checks to score, I doubt that one more list would slow it down significantly in the larger picture. Mass checks are already scoring a gazillion other rules....
- Run a diff and find out where we have our FPs.
The diffs between WS and PJ are about 26k records out of 46k records, perhaps too many to check by hand. Or did you mean just the FPs?
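For what it's worth, the diff itself is cheap to produce even if the result is too large to review by hand. A minimal sketch, assuming each list is available as plain text with one domain or IP per line (the function names and the input format here are hypothetical, not the actual SURBL tooling):

```python
def normalize(record):
    """Trim whitespace and lowercase so the same domain always compares equal."""
    return record.strip().lower()


def diff_lists(ws_records, pj_records):
    """Split two record collections into WS-only, PJ-only, and shared sets.

    Assumption: each record is a single domain or IP string. Blank
    entries are skipped.
    """
    ws = {normalize(r) for r in ws_records if r.strip()}
    pj = {normalize(r) for r in pj_records if r.strip()}
    return {
        "only_ws": ws - pj,  # records WS has that PJ lacks
        "only_pj": pj - ws,  # records PJ has that WS lacks
        "both": ws & pj,     # records common to both lists
    }
```

Checking only the records in "only_ws" that have also been reported as FPs would narrow the review to a much smaller set than the full diff.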
- More lookups for multi
multi doesn't work that way. We can have an infinite number of lists in multi (for the same overall universe of domains and IPs) and it's still just one lookup per wild URI. That's a major advantage of a combined list: one lookup gets you all the lists.
Remember that the PJ records are already in multi, as part of WS, so there would be no new records added by having PJ separate, just some changed return codes and some slightly longer TXT records with "[PJ]" added.
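To illustrate why another list costs no extra lookups, here is a rough sketch of how a client could decode a single combined answer. It assumes the answer is a 127.0.0.x address whose last octet is a bitmask with one bit per list. The bit values below, including the one for PJ, are placeholders for illustration only and are not the published assignments.

```python
import ipaddress

# Hypothetical bit assignments, for illustration only. The real values
# are whatever the list operators publish for the combined list.
LIST_BITS = {"sc": 2, "ws": 4, "ob": 16, "ab": 32, "pj": 64}


def decode_multi(answer):
    """Return the sorted names of the lists whose bit is set in the answer.

    Assumption: the answer is a 127.0.0.x address and the last octet is
    a bitmask of list memberships.
    """
    octets = ipaddress.IPv4Address(answer).packed
    if octets[:3] != bytes([127, 0, 0]):
        raise ValueError("not a 127.0.0.x answer: %s" % answer)
    mask = octets[3]
    return sorted(name for name, bit in LIST_BITS.items() if mask & bit)
```

With these placeholder bits, an answer of 127.0.0.68 (64 + 4) would decode to both pj and ws from one query, so splitting PJ out only changes the return code and not the number of lookups.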
- Too many list options will drive some potential users away.
Most users probably just use the defaults. We would want to add PJ to the default configs for SA3, if we do it.
- K.I.S.S.
The only reason I see for having more lists is if the data is specifically different throughout the whole list.
i.e.: phishing, UC, regular spam, blog, etc.
His list data is the same kind as WS. So really... why separate?
sc, ws, ob and ab all have email spam URI data, but they're all separate lists because they represent different types of data sources (human reports, manual lists, filtered traps, etc.).
I actually wanted the JW data to be separate in the beginning because it was a distinctly different and new data source with a different inclusion process, different spamtrap feeds, etc.
We just keep getting our FP rate lower and it will all be good.
We definitely need to get the FPs in WS lower, independent of anything else. FPs only hurt WS and make it less useful to people.
Jeff C.