On Thursday, September 30, 2004, 12:51:32 PM, Rob McEwen wrote:
(1) There is going to be a very, very, very strong correlation of Alexa's rankings to what sites people are actually visiting most often. (Is there a more accurate list out there, anywhere?)
(2) I think that the top 20,000 Alexa sites are going to have a very high probability of being domains which get mentioned in hams fairly often.
(3) This test I propose will probably find very, very few hits on SURBL in the first place. And, as I said, not all of these should be automatically removed from SURBL. I specifically said that these should be **double checked**... NOT automatically removed. You talk as if I suggested that these be automatically removed. I just said to double-check these.
(4) If we only find one or two domains which really should be removed, this could be of potentially great benefit toward reducing FPs. Especially since smithbarney.com, an obvious candidate for whitelisting, was one of the least-frequented sites on this list of 20,000.
If this results in a significant reduction of FPs, then perhaps we should do it again with alexa.com rankings 20k through 50k??
I agree with your theory that there is probably a strong correlation between commonly visited sites and those mentioned in hams.
However, the point is moot if they won't give us a snapshot of the data. They will sell a feed of the data for money, but that's less interesting.
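
If a ranking snapshot did become available, the double-check discussed above would amount to a simple intersection. Here is a minimal sketch in Python; the function name, the input shapes (a most-visited-first list of domains and a list of currently listed domains), and the sample domains are all assumptions for illustration, not an actual Alexa or SURBL format. It only flags candidates for manual review and removes nothing.

```python
def flag_for_review(ranked_domains, listed_domains, top_n=20000):
    """Return (rank, domain) pairs for listed domains found in the top_n ranks.

    ranked_domains: domains ordered most-visited first (hypothetical snapshot).
    listed_domains: domains currently on the blocklist (hypothetical export).
    Nothing is removed here; the result is only a list to double-check by hand.
    """
    # Normalize case and whitespace so "Example.COM " matches "example.com".
    listed = {d.strip().lower() for d in listed_domains}
    hits = []
    for rank, domain in enumerate(ranked_domains, start=1):
        if rank > top_n:
            break  # only the top_n most-visited sites are of interest
        name = domain.strip().lower()
        if name in listed:
            hits.append((rank, name))
    return hits


# Made-up sample data using reserved example domains.
ranked = ["example.com", "example.org", "example.net"]
listed = ["EXAMPLE.org", "spam.example", "example.net"]
candidates = flag_for_review(ranked, listed, top_n=2)
```

With `top_n=2`, only `example.org` is flagged; `example.net` is listed but ranks third, so it falls outside the window. Rerunning with a larger `top_n` would cover the 20k through 50k range in the same way.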
Jeff C. -- "If it appears in hams, then don't list it."