[SURBL-Discuss] RE: (1) Another Possible FP, and (2) header parsing issues

Bill Landry billl at pointshare.com
Fri Aug 13 23:33:22 CEST 2004


----- Original Message ----- 
From: Rik van Riel

> Once I have a working script to extract URLs from a spamtrap
> feed, I'll make it available as free software.  Possibly even
> bundled with Spamikaze ;)

Here is a script I run against my spamtrap mailboxes to output a list of
domain names:

egrep -i "http|www" main.mbx | cut -d ":" -f2 | cut -b 3- | cut -d "/" -f1 |
sed "s/=2E/\./g" | grep "\..*\." | egrep -v " |>|=|@|\..*\..*\." |
cut -d "." -f2-3 | tr -d "<" | sort | uniq

Depending on your mailbox format, it may work for you as well.
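
To see what each stage of the pipeline does, here is a small worked sketch
run against a throwaway sample file. The function name extract_domains and
the sample lines are made up for illustration and are not part of any real
spamtrap feed. It uses grep -E in place of egrep and sort -u in place of
sort | uniq, which should behave the same way on most systems.

```shell
#!/bin/sh
# Hypothetical wrapper (the name extract_domains is an assumption):
# runs the filter chain over the mailbox file given as $1.
extract_domains() {
    grep -Ei 'http|www' "$1" |            # keep lines that mention a URL
        cut -d ':' -f2 |                  # text after the first colon, e.g. //www.example.com/x
        cut -b 3- |                       # drop the first two bytes (the leading "//")
        cut -d '/' -f1 |                  # keep only the host part
        sed 's/=2E/./g' |                 # decode quoted-printable dots
        grep '\..*\.' |                   # require at least two dots
        grep -Ev ' |>|=|@|\..*\..*\.' |   # drop junk and hosts with three or more dots
        cut -d '.' -f2-3 |                # www.example.com -> example.com
        tr -d '<' |                       # strip any stray angle bracket
        sort -u                           # unique, sorted list
}

# Made-up sample input, one case per line.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Visit http://www.example.com/offer now
<a href="http://www.spam-site.net/x">
http://www=2Ebad=2Eorg/
Subject: visit www.foo.com
EOF

extract_domains "$sample"
rm -f "$sample"
```

On this sample the output should be bad.org, example.com and spam-site.net,
one per line. The Subject line is dropped because the text after its colon
still contains a space. Note that the chain only keeps hosts with exactly
two dots, so a bare http://example.com/ or a host like www.example.co.uk
is skipped.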

Bill


