On Friday, April 23, 2004, 3:37:34 PM, Jeff Chan wrote:
On Friday, April 23, 2004, 7:38:46 AM, Chris Santerre wrote:
This is where BigEvil may start going. I can change mine in 2 secs to use /\d00\dhosting/, but as soon as I do that, it will be removed from be.surbl.org. For obvious reasons they can't use wildcards. All signs point to me changing BigEvil over to search for this kind of stuff, and simply adding any static ones I have to ws.surbl.org. But we'll see.
This is where our philosophies clash slightly.
SURBLs just want a list of known spam domains.
SA rulesets with wildcards try to match entire possible/probable classes of domain names based on observing prior types of variation.
Both approaches have their merits.
For my purposes, I'd just prefer to get the domains that have already been found in spam. I acknowledge that this doesn't have the predictive value of the class approach, but it also makes FPs less likely in principle. (Though in reality it's not very likely that any legitimate sites are suddenly going to start using rxmeds1.com, rxmeds2.com, rxmeds3.com, etc.)
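To make the contrast concrete, here is a minimal Python sketch of the two approaches. The names listed_domains, class_pattern, on_list and matches_class are made up for illustration, and the rxmeds pattern is an assumed example of a wildcard rule, not one taken from BigEvil.cf. A real SURBL lookup is a DNS query rather than a set lookup.

```python
import re

# Hypothetical exact-match list: only domains already seen in spam.
listed_domains = {"rxmeds1.com", "rxmeds2.com", "rxmeds3.com"}

# Hypothetical wildcard rule covering the whole class of names.
class_pattern = re.compile(r"^rxmeds\d+\.com$")

def on_list(domain):
    """List approach: hit only if this exact domain was reported."""
    return domain.lower() in listed_domains

def matches_class(domain):
    """Class approach: hit on any name fitting the observed pattern."""
    return class_pattern.match(domain.lower()) is not None
```

With these definitions, rxmeds4.com is caught by matches_class but not by on_list until someone reports it. That is the predictive value the list gives up in exchange for never matching a name that was not actually seen.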
I should amend this: SURBLs don't care what domains are in them. be.surbl.org handles most of the wildcarded domains from BigEvil.cf and MidEvil.cf just fine.
Only the more complex ones, with patterns fancier than simple alternation, are not expanded into separate domain names by expand_regex.pl.
Still, in cases where a pattern is too large to expand into all possible domains, I'd prefer to get a list of the actual reported domains for use in be.surbl.org, rather than have the pattern discarded.
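As a rough illustration of what expanding simple alternation means, here is a Python sketch. It is not expand_regex.pl, whose internals are not shown here. The function name expand_alternation, the limit parameter and the rule for what counts as too complex are all assumptions made for the example.

```python
import itertools
import re

# One flat (non-nested) alternation group such as (1|2|3).
GROUP = re.compile(r"\(([^()]*)\)")
# Constructs this sketch refuses to expand: character classes,
# quantifiers, anchors, \d-style escapes and the unescaped dot.
UNSUPPORTED = re.compile(r"[\[\]*+?{}^$]|\\[dwsDWSb]|(?<!\\)\.")

def expand_alternation(pattern, limit=1000):
    """Return every literal domain matched by a pattern made only of
    literal text and flat (a|b|c) groups, or None if the pattern is
    too complex or would expand to more than `limit` names."""
    if UNSUPPORTED.search(pattern):
        return None
    # Splitting on a capturing group alternates literal text (even
    # indices) with the contents of each group (odd indices).
    pieces = GROUP.split(pattern)
    choices = []
    for i, piece in enumerate(pieces):
        if i % 2 == 0:
            # Leftover parentheses or bars mean nesting or
            # top-level alternation, which is not handled here.
            if any(c in piece for c in "()|"):
                return None
            choices.append([piece])
        else:
            choices.append(piece.split("|"))
    total = 1
    for options in choices:
        total *= len(options)
    if total > limit:
        return None
    return ["".join(combo).replace("\\.", ".")
            for combo in itertools.product(*choices)]
```

Under these assumptions, expand_alternation(r"rxmeds(1|2|3)\.com") yields the three rxmeds names, while a pattern like \d00\dhosting returns None. In that second case the only way to populate a list is from the domains actually reported.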
Jeff C.