On Sunday, April 11, 2004, 6:52:56 AM, William Stearns wrote:
On Sat, 10 Apr 2004, Jeff Chan wrote:
It's so simple that I might be tempted to call it elegant:
- Resolve the incoming spam domains from SC into A and
perhaps NS records.
NS is a little tougher. I could see us hurting people that left
their name service with their registrar or large ISPs that simply host a lot of domains. It would also be possible for a spammer to simply claim that ns1.earthlink.net and ns2.aol.com are their secondary and tertiary when they aren't. It's not possible - or at least tougher - to fake A records.
Indeed, that's a strong argument for using A records only. I really appreciate your feedback!
- Keep a persistent tally counting those IPs. (a history)
- For As or NSes of incoming domains that match many identical
or nearby IP tallies (i.e., the new domains use known bad old IPs), drop their inclusion thresholds in some statistically cool and relevant way.
Update: I'm thinking of storing the tallies in class-C-sized bins. That's very quick and gets "nearness" automatically. (In other words, any IPs in the same /24 would be counted together initially.) How does that sound? Do I lose much by that deliberate imprecision? How often do the spammers move IPs? Would numerical nearness matter/help in detecting them?
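
To make the /24 binning idea concrete, here is a rough Python sketch, not the actual engine. The names bin_key, IpTally and inclusion_threshold are made up for illustration. The tally lives in memory only (a real history would need to be written to disk). The threshold rule, which drops the threshold by one for every `step` prior hits in the same /24, is purely a placeholder assumption standing in for whatever statistically sound rule ends up being used.

```python
import ipaddress
from collections import Counter


def bin_key(ip: str) -> str:
    """Map an IPv4 address to its enclosing /24, e.g. '192.0.2.0/24'."""
    # strict=False lets us pass a host address and get its network back.
    return str(ipaddress.ip_network(f"{ip}/24", strict=False))


class IpTally:
    """In-memory history of how often each /24 has hosted a reported domain."""

    def __init__(self) -> None:
        self.counts: Counter = Counter()

    def record(self, ip: str) -> None:
        """Count one sighting of an A record that resolved to this IP."""
        self.counts[bin_key(ip)] += 1

    def hits(self, ip: str) -> int:
        """Prior sightings for any IP in the same /24 (0 if never seen)."""
        return self.counts[bin_key(ip)]


def inclusion_threshold(base: int, prior_hits: int,
                        step: int = 5, floor: int = 1) -> int:
    """Placeholder rule (an assumption, not a tuned formula):
    lower the threshold by 1 per `step` prior hits, never below `floor`."""
    return max(floor, base - prior_hits // step)
```

For example, after recording 192.0.2.10 and 192.0.2.200, a new domain resolving to 192.0.2.99 would see two prior hits, because all three addresses fall in the same bin. The imprecision is deliberate: neighbors that share a /24 with a spammer but are unrelated get counted too, which is exactly the trade-off asked about above.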
The approach made sense to me as well.
http://www.stearns.org/sa-blacklist/spamip.current.txt is the report created from the A record harvesting. I'd be glad to provide the raw data collected over the past 5 months. I also have SOA and whois data for the entire sa-blacklist.
Thanks for the reference! Checking the list of IPs will probably answer some of my questions about numbers above, though I kind of want to be a bit not-invented-here :-) about the engine operation so I can prove the merits of using the SC data alone.
That said, I will check out the IPs!
Jeff C.