*snip*
At nearly 500k records we'd definitely want to do this with rbldnsd. BIND is already getting pretty slow with lists approaching 100k records. For comparison, sbl.spamhaus.org has about 5k records, xbl has about 1.7 million records, and list.dsbl.org has nearly 4 million records. multi.surbl.org, our largest production list, has about 45k records.
It keeps growing fast!
How does anyone else feel about turning this list of sex sites into a SURBL?
Jeff C.
+1 for me. Keeping it a separate list is a great idea. It gives admins the choice, especially ISPs who may only wish to tag spam but allow customers to look at the occasional naughty boom boom kissy kissy :)
--Chris
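For anyone wondering what the rbldnsd route mentioned above might involve, here is a minimal sketch, assuming the list arrives as one domain per line and that it would be served from an rbldnsd `dnset` data file. The function names, the zone name `adult.example.org`, the return address `127.0.0.2`, and the TXT text are all placeholders chosen for illustration, not anything decided in this thread. The `:A:TXT` default-value line follows my reading of the rbldnsd documentation, so check it against the man page before relying on it.

```python
def to_dnset(domains, a_record="127.0.0.2", txt="Listed as adult site"):
    """Render an iterable of domain names as text for an rbldnsd dnset file.

    Assumption: a line of the form ':A:TXT' sets the default A and TXT
    values for the entries that follow it.
    """
    seen = set()
    lines = [f":{a_record}:{txt}"]
    for raw in domains:
        # Normalise: trim whitespace, lowercase, drop a trailing root dot.
        name = raw.strip().lower().rstrip(".")
        # Skip blanks, comment lines, and duplicates.
        if not name or name.startswith("#") or name in seen:
            continue
        seen.add(name)
        lines.append(name)
    return "\n".join(lines) + "\n"


def query_name(domain, zone="adult.example.org"):
    """Build the DNS name a client would look up to test one domain.

    The zone is a hypothetical placeholder; a client would resolve the
    returned name and treat an A record answer as "listed".
    """
    return f"{domain.strip().lower().rstrip('.')}.{zone}"
```

With roughly 500k input lines the output is just a flat text file, which is the kind of data rbldnsd is designed to load, rather than a BIND zone with one record per domain.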
On Mon, 19 Jul 2004 10:08:32 -0400, Chris Santerre csanterre@merchantsoverseas.com wrote:
*snip*
+1 for me. Keeping it a separate list is a great idea. It gives admins the choice, especially ISPs who may only wish to tag spam but allow customers to look at the occasional naughty boom boom kissy kissy :)
--Chris
Has anyone got any FP stats on this data while using it in Squidguard?
It looks like very useful data, but how is it managed?
It could be very interesting data to have a trial of, at least.
David Hooton wrote:
snip
Has anyone got any FP stats on this data while using it in Squidguard?
It looks like very useful data, but how is it managed?
It could be very interesting data to have a trial of, at least.
Fabrice Prigent of the University of Toulouse maintains the database and told me that he has an automated mechanism to verify contributed domains and that he verifies contributed domains himself in case of any doubt.
I am a contributor to the database, and the weekly scan for adult sites produces anywhere between 500 and 5000 domains per week. The set of scripts that I wrote has been tuned for 18 months, and I have stopped verifying the list of domains it produces, since I have not seen false positives for a long time. The scripts use a scoring method, and by checking the medium-score domains I usually find a bunch of false negatives (adult sites not rated as adult by the scripts), which are also contributed to the database.
In short: I believe the quality is very high, and new versions can be downloaded daily by FTP.
Off topic: those of you who have a mother language other than English or Dutch, and who have computing power (1 recent Intel CPU), Unix and bandwidth (1 Mbit), can receive the scripts to add your mother language and find adult sites in that language.
-Marc
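To illustrate the score-based triage described above, here is a minimal sketch. It is not Marc's actual scripts: the `triage` function, the 0-to-1 score scale, and the two thresholds are assumptions made up for the example. The idea shown is only that high scores are listed automatically while the medium band is set aside for a manual check, which is where the false negatives tend to turn up.

```python
def triage(scores, low=0.3, high=0.8):
    """Split scored domains into auto-listed and needs-review groups.

    scores: mapping of domain name -> score (hypothetical 0..1 scale).
    low/high: hypothetical thresholds bounding the medium band.
    Domains scoring below `low` are dropped.
    """
    listed, review = [], []
    for domain, score in sorted(scores.items()):
        if score >= high:
            listed.append(domain)   # confident enough to list unchecked
        elif score >= low:
            review.append(domain)   # medium score: check by hand
    return listed, review
```

For example, `triage({"a.example": 0.9, "b.example": 0.5, "c.example": 0.1})` puts `a.example` in the listed group and `b.example` in the review group, and drops `c.example`.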
I agree with the previous comment that the sex-site issue is really not a spam issue, but a content filtering issue. And since the purpose of SURBL is to stop spam, it's probably beyond its scope.
However, if it was a quality free SURBL, we would likely use it.
Bret