David Hooton wrote:
snip
How does anyone else feel about turning this list of sex sites into a SURBL?
Jeff C.
+1 for me. Keeping it a separate list is a great idea. Gives admins the choice. Especially ISPs who may only wish to tag spam, but allow customers to look at the occasional naughty boom boom kissy kissy :)
--Chris
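For anyone unfamiliar with how a SURBL-style list is queried, here is a minimal sketch. It assumes the usual DNS blocklist convention: the domain is prepended to the list's zone, and an A record in 127.0.0.0/8 means "listed". The zone name `adult.example.org` and both function names are hypothetical; no such list exists yet.

```python
import socket

# Hypothetical zone name, for illustration only; no such list exists yet.
ZONE = "adult.example.org"

def query_name(domain: str, zone: str = ZONE) -> str:
    """Build the DNS name to look up: <domain>.<zone>."""
    return f"{domain.strip('.').lower()}.{zone}"

def is_listed(domain: str, zone: str = ZONE) -> bool:
    """Return True if the list answers with an address in 127.0.0.0/8."""
    try:
        addr = socket.gethostbyname(query_name(domain, zone))
    except OSError:
        # NXDOMAIN or any resolver failure is treated as "not listed".
        return False
    return addr.startswith("127.")
```

Keeping this as a separate zone is what gives admins the choice: a mail filter can query the spam zones only, and skip this one.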
Has anyone got any FP stats on this data while using it in Squidguard?
It looks like very useful data, but how is it managed?
Could be very interesting data to have a trial of at least.
Fabrice Prigent of the University of Toulouse maintains the database and told me that he has an automated mechanism to verify contributed domains and that he verifies contributed domains himself in case of any doubt.
I am a contributor to the database and the weekly scan for adult sites produces anything between 500 and 5000 domains per week. The set of scripts that I wrote has been tuned for 18 months, and I have stopped verifying the list of domains that it produces, since I have not seen false positives for a long time. The scripts use a scoring method, and by checking the medium-score domains I usually get a bunch of false negatives (adult sites not rated as adult by the scripts) that are also contributed to the database.
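A rough sketch of the score-based triage described above, not the actual scripts: high-scoring domains are accepted automatically, and medium-scoring ones are set aside for a manual check to catch false negatives. The thresholds, the score scale, and the `triage` function are all assumptions for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical thresholds; the real scripts' values and scale are not stated.
LOW = 40
HIGH = 80

def triage(scores: Dict[str, int],
           low: int = LOW,
           high: int = HIGH) -> Tuple[List[str], List[str]]:
    """Split scored domains into (auto-accepted, needs manual review)."""
    # Scores at or above `high` are treated as adult without review.
    accepted = sorted(d for d, s in scores.items() if s >= high)
    # Medium scores are where false negatives tend to hide.
    review = sorted(d for d, s in scores.items() if low <= s < high)
    return accepted, review
```

Domains scoring below the lower threshold are simply dropped in this sketch.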
In short: I believe the quality is very high, and new versions can be downloaded daily by FTP.
Off topic: those of you who have a mother language other than English or Dutch, and who have computing power (1 recent Intel CPU), Unix, and bandwidth (1 Mbit), can receive the scripts to add your mother language and find adult sites in that language.
-Marc