We are going to remove the original 3k+ BigEvil domains from WS. We feel there may be too many FPs coming from it, as our methods of checking have gotten 5 million times better than what I used to do alone.
So rather than checking them all for FPs, we will simply remove them.
We feel this shouldn't cause any problems, and any domains still being used will be reported to us, and added back in.
Does anyone have any objections to this?
--Chris (Yankees suck.)
Good plan...
Could I get a copy of the file? Is it in spamgate? If yes, under what name?
thanks
Alex
On Friday, October 15, 2004, 8:06:53 AM, Alex Broens wrote:
Could I get a copy of the file?
http://spamcheck.freeapp.net/bigevil-uniq
Jeff C. -- "If it appears in hams, then don't list it."
Thanks Jeff
I may use it locally: being non-US, the chances of an FP are minimal, and I will be able to weed out any FPs without disturbing the world.
Alex
On Friday, October 15, 2004, 8:01:31 AM, Chris Santerre wrote:
We are going to remove the original 3k+ BigEvil domains from WS. We feel there may be too many FPs coming from it, as our methods of checking have gotten 5 million times better than what I used to do alone.
So rather than checking them all for FPs, we will simply remove them.
We feel this shouldn't cause any problems, and any domains still being used will be reported to us, and added back in.
Does anyone have any objections to this?
Thanks Chris. Since there were no objections, I have removed the ~3k old domains that are only in BigEvil from WS. This will not remove any that are in BigEvil and also in any of the other data sources feeding into WS.
Note that I have not whitelisted the old BigEvil domains. Instead the process is a "reverse join" where records that are only in the old BigEvil are removed from WS. Any domains or IPs that are in any WS data sources other than BigEvil will continue to get listed. That includes any that were in BigEvil but get added to WS from another source. So the process is somewhat dynamic, and not a simple static removal or whitelisting. In other words, this change won't prevent any domains in BigEvil from getting listed in WS if they get added by another WS data source.
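To make the "reverse join" concrete, here is a minimal Python sketch of the idea. It is not the actual WS build process: the function names, the "other-feed" source, and the example domains are all made up for illustration, and it assumes each data source can be treated as a plain set of records.

```python
# Illustrative sketch only; names and data are hypothetical, not the real WS tooling.

def build_ws(sources, legacy_name="bigevil"):
    """Return every record that at least one non-legacy source lists."""
    listed = set()
    for name, records in sources.items():
        if name != legacy_name:
            listed |= set(records)
    return listed


def removed_by_reverse_join(sources, legacy_name="bigevil"):
    """Return the legacy records that no other source backs up."""
    legacy = set(sources.get(legacy_name, ()))
    return legacy - build_ws(sources, legacy_name)


sources = {
    "bigevil": {"old-only.example", "shared.example"},
    "sa-blacklist": {"shared.example", "new.example"},
    "other-feed": set(),
}

ws = build_ws(sources)
removed = removed_by_reverse_join(sources)
print(sorted(ws))
print(sorted(removed))

# The removal is not a whitelist: if another source later reports the
# same domain, it is listed again on the next build.
sources["other-feed"].add("old-only.example")
ws_after = build_ws(sources)
print("old-only.example" in ws_after)
```

Because the result is recomputed from the current sources each time, nothing is blocked permanently, which is the difference from a static removal or a whitelist.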
Hopefully that will decrease the FPs in WS, though there is still work to do if we're serious about reducing FPs.
Jeff C. -- "If it appears in hams, then don't list it."
Jeff Chan wrote:
Hopefully that will decrease the FPs in WS, though there is still work to do if we're serious about reducing FPs.
I don't want to say too much ('cause I don't really know what I'm talking about!), but back in the day of separate cf files, it wasn't bigevil.cf that gave me FPs, it was sa-blacklist.cf. When they were separate I didn't actually use sa-blacklist because of the FPs. To be honest, I wasn't too pleased when bigevil started to exist only as the WS SURBL, because of the sa-blacklist FPs.
What I'm saying may be totally off base, but if you're looking for FPs to remove, how about checking the sa-blacklist stuff that's still in WS? Does that even make sense?
Daniel
On Friday, October 15, 2004, 10:13:41 PM, Daniel Kleinsinger wrote:
Jeff Chan wrote:
Hopefully that will decrease the FPs in WS, though there is still work to do if we're serious about reducing FPs.
I don't want to say too much ('cause I don't really know what I'm talking about!), but back in the day of separate cf files, it wasn't bigevil.cf that gave me FPs, it was sa-blacklist.cf. When they were separate I didn't actually use sa-blacklist because of the FPs. To be honest, I wasn't too pleased when bigevil started to exist only as the WS SURBL, because of the sa-blacklist FPs.
One reason for getting rid of BigEvil is that it was created under a different set of assumptions and attitudes about what should or should not be listed, compared to the current ideas. There is quite a bit of other data in WS that should also be removed because it was created under older policies.
That said, any domains that appear both in the old BigEvil list and in the more recent WS sources will still be listed.
What I'm saying may be totally off base, but if you're looking for FPs to remove, how about checking the sa-blacklist stuff that's still in WS? Does that even make sense?
Yes, some of the other sources are probably larger sources of FPs than BigEvil. We've asked some of those other sources to check their data, particularly against the DMOZ hits, as many of those are probably FPs.
Jeff C. -- "If it appears in hams, then don't list it."