At 20:38 2004-09-30 -0400, Rob McEwen wrote:
> I understand that you don't want domains to be whitelisted solely on the basis of their web site traffic if they really shouldn't be whitelisted. I've addressed this particular concern at least twice now, and it still seems as though you didn't actually read that part of my posts?
Yes I did.
I just have a major problem with this approach in general.
Extensive whitelisting will never solve the real problem: too many FPs in the input. Using traffic, size, and similar data to decide which sites to whitelist solves even less. It might minimize some of the more obvious FPs, but I'm actually more worried about the FPs that are not so obvious: the smaller sites and companies that are not on any "largest/most visited/etc" lists and don't get noticed immediately. Those will linger on as FPs until someone who happens to hand-check a message recognizes the domain as legit. The really big ones will show up and be whitelisted quite quickly anyway, but they are just a small percentage of the actual FPs that I encounter. Most are smaller or non-US sites that didn't immediately ring a bell, and they will still not ring a bell regardless of how many "big company/lots of traffic/etc" lists we use as whitelisting sources.
Arguing that "if it eliminates FP X, Y days earlier than it would have been whitelisted anyway, it's worth the effort" doesn't cut it. What gets done is limited by resources and focus. If effort and focus go into another whitelist source, less effort will go into something else that might be more worthwhile, such as, in my opinion, making the initial listings trackable.
I also have a particular problem or two with using Alexa.
Alexa produces dubious data using dubious methods. I think associating with them is a bad idea, and using anything below their top 50 as an indication that a domain is legit and non-spammy will produce a new set of bad data instead of cleaning up the initial one.
Patrik