Hi Rob
Alain:
I like your brainstorming because you just might come up with an idea or two that I haven't thought of before... so don't let me slow you down too much....
Well, reading a year of list messages gives some ideas ;-) (Though I did skip most of the messages reporting FP URIs.)
However, do note that there are some limitations to the "if on more than one list, probably not an FP" philosophy.
I'm well aware of that. At the moment I'm mostly asking for info about it. It could be that requiring "at least 2" or "at least 3" lists still catches a very high % of the spam messages (% measured against all lists OR'ed together), so that this combination still performs very well. Given the low number of FPs on the separate lists, even decreasing this a little bit would give a big boost.
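To make clear what I mean by that %, here is a rough Python sketch of the measurement. The list names and the sample hits are made up for illustration only; real numbers would have to come from an actual spam corpus.

```python
def catch_rate(hits_per_message, min_lists):
    """Fraction of the messages caught by the OR of all lists that are
    still caught when at least `min_lists` lists must agree."""
    caught_by_any = [hits for hits in hits_per_message if len(hits) >= 1]
    if not caught_by_any:
        return 0.0
    caught_by_n = [hits for hits in caught_by_any if len(hits) >= min_lists]
    return len(caught_by_n) / len(caught_by_any)


# Hypothetical data: for each spam message, the set of lists that hit on it.
sample_hits = [
    {"list_a", "list_b", "list_c"},
    {"list_b", "list_c"},
    {"list_a"},
    {"list_b", "list_c", "list_d"},
    set(),  # missed by every list, so it does not count against any rule
]

for n in (1, 2, 3):
    print(f"at least {n}: {catch_rate(sample_hits, n):.0%} of the OR'ed catch")
```

If the "at least 2" figure on real data stays close to 100%, the combination costs almost nothing in catch rate.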
This philosophy works great if the potential FP was due to "stupid human error" from one guy regarding one list. However, there are a number of scenarios where a mass mailing spam campaign may trigger a URI to get listed on multiple SURBL lists, even if that particular URI is found in ham. Of course, we do all that we can to minimize this possibility... and I believe that we are doing a great job and we are continually getting better...
I think you're already doing a very good job weeding out FPs. I'm aware that there could be conditions where an FP gets onto more than one list. I'm also confident that those FPs are reported faster and thus solved faster too.
But, again, there are diminishing returns to the "if found in multiple SURBL lists, less chance of FP" idea. This is true to an extent, but **not** always true.
It's rather easy for me to implement a "scoring" combination of the different lists, and this is easy to configure. (I suspect most end users will understand it, and it's easy to "publish" new recommended weights.) So I think "at least 2" could be the default, generating no FPs (or almost none). The main filtering app that I write a plugin for (SpamPal) has nice whitelisting, but this needs a few weeks of use before it becomes really effective.
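A minimal Python sketch of the kind of scoring I have in mind. This is not the actual plugin code; the list names, weights and threshold are all hypothetical placeholders.

```python
# Hypothetical weights: equal weights of 1.0 with a threshold of 2.0
# reproduce the plain "at least 2 lists" rule.
DEFAULT_WEIGHTS = {"list_a": 1.0, "list_b": 1.0, "list_c": 1.0}
DEFAULT_THRESHOLD = 2.0


def combined_score(listed_on, weights=DEFAULT_WEIGHTS):
    """Sum the weights of every list the URI appears on.

    Lists without a configured weight contribute 0.
    """
    return sum(weights.get(name, 0.0) for name in listed_on)


def is_spam_uri(listed_on, weights=DEFAULT_WEIGHTS, threshold=DEFAULT_THRESHOLD):
    """Treat the URI as spam once the combined score reaches the threshold."""
    return combined_score(listed_on, weights) >= threshold


print(is_spam_uri({"list_a"}))            # one list only: below the threshold
print(is_spam_uri({"list_a", "list_b"}))  # two lists: reaches the threshold

# "Publishing" new recommended weights is then just a matter of handing out
# a new table, e.g. trusting one list enough to trigger on its own.
tuned = {"list_a": 2.0, "list_b": 1.0, "list_c": 0.5}
print(is_spam_uri({"list_a"}, weights=tuned))
print(is_spam_uri({"list_b", "list_c"}, weights=tuned))
```

With equal weights this is just a count of lists, so the default behaves exactly like "at least 2", while a user who trusts one list more can change that without touching any code.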
PS. I hope I was clear, it's getting late here.
Alain