On Thursday, September 2, 2004, 5:43:26 PM, Loren Wilton wrote:
Given the lack of commonality, it may not make much sense to add to the mail spam lists, since it would be an extra 2000+ records that would probably not get hits on mail.
The MT-Blacklist doesn't seem to update too frequently (the last new record was from 8/29) and has about 2000 records. Matthew's list has been pretty sparse so far. So I'm still pondering things.
Just from a technical/philosophical point of view, I think a separate list is desirable, although I agree that making it part of multi would probably be the way to go, and I agree with the basic concept that "spam is spam".
However, I think the reasons for a separate list are:
1. Separate source feed. A new list allows the source feed to be more easily documented.
2. (As stated) little overlap with email spammers, at least so far.
3. Probably a different update cycle and removal (from old age) cycle requirement.
The different means of updating and possibly different aging method are high on my list of reasons for suggesting a separate list. On the other hand, having it part of multi would be nice, since (I assume, possibly incorrectly) one query could check a lot of lists based on the bitmap.
Correct. I'm still wavering on whether a blog spam list should be part of multi. There are programs that use multi but (unadvisedly) don't differentiate between the source lists. That kind of argues for keeping multi focussed on only mail spam and making a blog spam list separate. On the other hand there's much less overhead in adding a list internally to multi than setting up a whole new list.
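
To make the bitmap point concrete, here is a rough sketch of how a client could tell the source lists apart from a single multi answer. It assumes the answer comes back as a 127.0.0.x address whose last octet is a bitmask. The list names and bit values below are made up for illustration and are not the real assignments.

```python
# Hypothetical bit assignments, for illustration only.
LIST_BITS = {
    2: "mail-list-a",
    4: "mail-list-b",
    64: "blog-spam",
}
MAIL_LISTS = {"mail-list-a", "mail-list-b"}


def decode_multi(answer):
    """Return the names of the source lists encoded in a multi answer."""
    octets = answer.split(".")
    if len(octets) != 4 or octets[0] != "127":
        raise ValueError("unexpected answer: %r" % answer)
    mask = int(octets[3])
    # Report every list whose bit is set, in ascending bit order.
    return [name for bit, name in sorted(LIST_BITS.items()) if mask & bit]


def is_mail_spam(answer):
    """True only if at least one mail-oriented list has the domain."""
    return any(name in MAIL_LISTS for name in decode_multi(answer))
```

A program that treats any answer at all as a hit would count the blog-only result as mail spam. Checking the bits, as is_mail_spam does here, avoids that.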
It probably would also be good to devote some thought to how entries will be added to this list and validated. We surely don't want some annoyed blog spammer spamming the list with every valid domain they can find!
Yes, data quality is always an issue. Any of these ventures will struggle if spammers are able to poison the data. Keeping legitimate domains out of any feed is key and provisions would need to be made for that.
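
One possible provision, sketched here only as an illustration, is to screen every submission against a whitelist of known legitimate domains before it reaches the feed. The function names and the whitelist contents are hypothetical, and a real process would likely need more than this (manual review, multiple reports, and so on).

```python
def is_whitelisted(domain, whitelist):
    """True if domain equals, or is a subdomain of, a whitelisted domain."""
    domain = domain.lower().rstrip(".")
    return any(domain == w or domain.endswith("." + w) for w in whitelist)


def screen_submissions(candidates, whitelist):
    """Split submitted domains into (accepted, rejected) lists."""
    accepted, rejected = [], []
    for domain in candidates:
        if is_whitelisted(domain, whitelist):
            rejected.append(domain)  # legitimate domain, keep it out of the feed
        else:
            accepted.append(domain)
    return accepted, rejected
```

With a whitelist of {"example.com"}, a submission of "www.example.com" would be rejected while an unrelated domain would pass through for further checks.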
Jeff C.