If you're pretty confident your scripts can catch the spam domains with few false positives, I'd say go for it.
It should only 'detect' sites already present on (our) WS list ... I think that will hardly ever give FPs. In the beginning we could let the script check only (manually added) known spam sites; that should prevent FPs even more...
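Roughly what I mean, as a minimal sketch (the seed file name and format are my own assumptions, nothing we actually have yet):

# Minimal sketch: only report domains that are already on a manually
# maintained seed list of known spam sites, so a bug in the detection
# logic can't introduce new false positives on its own.

def load_seed_list(path="known-spam-domains.txt"):
    # One domain per line; '#' starts a comment. File name is an assumption.
    with open(path) as fh:
        return {line.strip().lower() for line in fh
                if line.strip() and not line.startswith("#")}

def report_candidates(detected_domains, seed):
    # Pass through only domains we already know are bad.
    return sorted(d for d in set(detected_domains) if d.lower() in seed)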
Your methodology in general sounds good to me (rough sketch after the list):
- Spam domain appears in traps.
- Web page is static and appears in other spams.
- Add to WS.
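Something like this, I imagine (the trap-hit threshold, the plain SHA-1 page fingerprint, and the function names are all assumptions on my part, not your actual scripts):

# Rough sketch of the pipeline above: a domain qualifies when it shows
# up in several trap hits AND its landing page fingerprint matches one
# already seen in other spam. A static page hashes the same on every
# fetch, which is what makes the fingerprint comparison work.
import hashlib
import urllib.request

MIN_TRAP_HITS = 3  # assumed threshold

def page_fingerprint(domain):
    # Fetch the front page and hash it.
    with urllib.request.urlopen(f"http://{domain}/", timeout=10) as resp:
        return hashlib.sha1(resp.read()).hexdigest()

def ws_candidates(trap_hits, spam_page_hashes):
    # trap_hits: {domain: number of spamtrap messages it appeared in}
    # spam_page_hashes: fingerprints of pages already seen in other spam
    out = []
    for domain, hits in trap_hits.items():
        if hits < MIN_TRAP_HITS:
            continue
        try:
            if page_fingerprint(domain) in spam_page_hashes:
                out.append(domain)
        except OSError:
            pass  # unreachable page: skip rather than guess
    return out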
Perhaps you could publish your scripts for review, if you'd like that. :-)
Sure.... I would first like to see if anyone can think of a flaw in this method. If no one can think of anything, I could try to code this in the next couple of days. I'll publish the result as soon as it's available.
I think the most complicated part is the 'filtering usable (non-hidden) URLs out of received spam' part. I was thinking of reusing code written by the SA crew. Has anyone tried that before?
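For comparison, the naive version of that step would just run a regex over the text parts, something like the sketch below; SA's real URI code also digs hidden and obfuscated links out of HTML, redirectors, etc., which is exactly why reusing it appeals to me:

# Crude sketch of pulling visible URLs out of a raw spam message with a
# plain regex; deliberately does not attempt the hidden/obfuscated links
# that SpamAssassin's URI parser handles.
import email
import email.policy
import re

URL_RE = re.compile(r"https?://([A-Za-z0-9.-]+)", re.IGNORECASE)

def visible_domains(raw_message: bytes):
    msg = email.message_from_bytes(raw_message, policy=email.policy.default)
    domains = set()
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            text = part.get_content()
            domains.update(m.group(1).lower() for m in URL_RE.finditer(text))
    return domains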
bye, Chris