Jeff Chan wrote:
- Spammers use randomized subdomains on many levels above the third or fourth. It would be impossible and also meaningless in many cases to try to capture all of those levels, given the common randomization.
ACK. At the moment we're discussing one level above the base.
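To make "one level above the base" concrete, here is a minimal sketch of how a checker might reduce a spamvertized hostname to the names worth looking up. The function name `candidate_names` and the tiny `two_level_suffixes` set are assumptions for illustration only; a real tool would need a maintained list of such suffixes.

```python
def candidate_names(host, two_level_suffixes=frozenset({"co.uk"})):
    """Return the base domain and, if present, the name one level above it.

    Deeper (possibly randomized) labels are deliberately ignored.
    `two_level_suffixes` is a hypothetical, incomplete set of suffixes
    under which the registrable name needs three labels.
    """
    labels = host.lower().rstrip(".").split(".")
    base_len = 3 if ".".join(labels[-2:]) in two_level_suffixes else 2
    base_len = min(base_len, len(labels))

    names = [".".join(labels[-base_len:])]  # the base domain
    if len(labels) > base_len:
        # exactly one more label on top of the base, nothing deeper
        names.append(".".join(labels[-(base_len + 1):]))
    return names
```

For a host like "a1b2.x9.spammer.tripod.cl" this yields "tripod.cl" and "spammer.tripod.cl"; the randomized labels further out never enter the lookup.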
- It doesn't focus on what we're trying to go after: the many freshly registered "disposable" spam domains.
The idea is to identify spam by spamvertized URLs. For some time, subdomains of free hosters were very popular with spammers. That's why I recognize something like "tripod.cl" or "wanadoo.es".
- If a hosting company is legitimate, they will kick out any spammers using subdomains under their parent domain.
Some hosters needed a clue-by-four. Did I mention tripod.cl? Or terra.es? At the moment new domains are state of the art (if spamming is an art), but that will change.
[joke-of-the-domain spam]
Yes, collateral damage is easily avoided. Don't list them.
That _is_ collateral damage for the recipients of this spam, who never solicited it and don't want it. If you refuse to list spammers only because some other users might exist who want this crap, then you hurt all users who don't want it.
And vice versa. In that conflict of interests it's not the job of SURBL to protect spammers, but to protect the victims.
Should we ***block everyone else's use*** of the Joke of the day domain?
If this joke-of-the-day is reported often enough via SpamCop as spam, then it should be listed in SC.surbl.org. Otherwise you would censor the SC input data for personal reasons, and that would be wrong.
You should only play god if you're absolutely sure that SC and the SC users screwed up (and this will happen; the spammers try it again and again). SC is only a script; it can't think.
Remember, the goal is to include domains that *only appear in spams*, and to exclude domains that appear in hams. I think that's very clear and simple, not at all obscure. :-)
The goal for SC.surbl.org is to list spamvertized domains, and to identify spam based on the listed domains. That is perfectly neutral; it leaves no room for "some users really want a mortgage from this bank" or similar excuses.
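Identifying spam from a listed domain comes down to a DNS lookup of the domain under the list's zone. The following is only a sketch under stated assumptions: the helper names `query_name` and `is_listed` are made up here, the zone is written as it appears in this thread, and treating any answer in 127.0.0.0/8 as "listed" is the usual DNS blocklist convention rather than something confirmed above.

```python
import socket


def query_name(domain, zone="sc.surbl.org"):
    """Build the DNS name to look up for a domain in the given list zone."""
    return f"{domain.lower().rstrip('.')}.{zone}"


def is_listed(domain, zone="sc.surbl.org", resolve=socket.gethostbyname):
    """Return True if the lookup resolves to a 127.x.x.x answer.

    `resolve` is injectable so the logic can be exercised without
    touching the network; a failed lookup is treated as "not listed".
    """
    try:
        answer = resolve(query_name(domain, zone))
    except socket.gaierror:
        return False
    return answer.startswith("127.")
```

The check itself knows nothing about who might want the mail; it only reports whether the domain is on the list.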
If we include every domain that anyone has ever considered spam, our data will be too full of false positives for other people to use it.
That's why you have technical rules for the SC input data: it's not "anyone", but substantiated facts reflecting SC reports.
It would be a lie if you excluded spamvertized domains only for personal reasons. Sometimes "legit" companies really are stupid enough to spamvertize their own domain directly, and then they should be listed if the required number of SC users says so.
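The rule I'm arguing for can be sketched in a few lines. This is not the actual SC processing: the function name `domains_to_list`, the `threshold` value and the `whitelist` parameter are all assumptions made up for illustration. The point is that listing follows the report count, and any manual override is an explicit, visible exception.

```python
from collections import Counter


def domains_to_list(reported_domains, threshold, whitelist=frozenset()):
    """Return the domains reported at least `threshold` times.

    `reported_domains` holds one entry per report. `whitelist` contains
    only the manual overrides, i.e. cases where someone is sure the
    reports themselves were wrong.
    """
    counts = Counter(d.lower() for d in reported_domains)
    return sorted(
        domain
        for domain, n in counts.items()
        if n >= threshold and domain not in whitelist
    )


# Made-up report data: three reports for one domain, two for another.
reports = ["joke.example"] * 3 + ["bank.example"] * 2 + ["rare.example"]
listed = domains_to_list(reports, threshold=2)
```

With these made-up reports both "joke.example" and "bank.example" cross the threshold; keeping either one out requires putting it on the whitelist on purpose.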
Bye, Frank