On Wednesday, October 6, 2004, 11:26:33 AM, Daniel Quinlan wrote:
> I would not suggest using either to whitelist automatically, but if you get several of these sources and count the number of hits for each domain, then you should be able to prioritize and possibly automatically whitelist the ones that hit in a large number of databases.
Let us know if you think of any others. DMOZ and Wikipedia hadn't occurred to me before.
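For what it's worth, the cross-source tally you describe could be prototyped with something like the rough Python sketch below. It assumes each source (DMOZ, Wikipedia external links, and so on) has already been flattened to a plain text file with one registered domain per line; the file names and the two-source threshold are made up for illustration.

  from collections import Counter

  # Placeholder file names: each is assumed to hold one registered
  # domain per line, extracted from the corresponding source.
  SOURCES = ["dmoz-domains.txt", "wikipedia-domains.txt", "other-source-domains.txt"]
  MIN_SOURCES = 2   # only consider domains seen in several databases

  def load_domains(path):
      # One lower-cased domain per line, blank lines skipped.
      with open(path) as f:
          return set(line.strip().lower() for line in f if line.strip())

  hits = Counter()
  for path in SOURCES:
      for domain in load_domains(path):
          hits[domain] += 1   # each source counts at most once per domain

  # Domains listed in enough independent sources become candidates for
  # (manual or automatic) whitelisting, highest hit counts first.
  for domain, count in hits.most_common():
      if count >= MIN_SOURCES:
          print(count, domain)

Nothing here is tied to any particular source; it just shows the counting and thresholding step.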
> Can anyone think of any other large, hand-built or checked directories or databases of (legitimate) URIs?
> Is it possible to pull URIs out of semantic webs?
> I would also take snapshots, but for a different reason than the one Jeff suggested: take two snapshots of each source (DMOZ on two separate days, etc.) and use their intersection as the authoritative list, since some spammer links (especially ones added by a bot) will drop off once they are found.
Those are all good ideas. Do you know whether spammer links actually get deleted? How do the folks who maintain those sites find abusers or bots?
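If we do go the two-snapshot route, the intersection step itself is trivial. Roughly (again just a sketch, with made-up file names standing in for two dated dumps of the same source):

  # Keep only domains present in both dumps of the same source, taken on
  # different days, so short-lived (spammer-added) entries drop out.
  def load_domains(path):
      with open(path) as f:
          return set(line.strip().lower() for line in f if line.strip())

  day_one = load_domains("dmoz-domains-2004-10-04.txt")   # hypothetical dump
  day_two = load_domains("dmoz-domains-2004-10-06.txt")   # hypothetical dump

  stable = day_one & day_two   # authoritative list = intersection of snapshots
  for domain in sorted(stable):
      print(domain)

The hard part is getting clean domain dumps out of each source in the first place, not the set math.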
Jeff C. -- "If it appears in hams, then don't list it."