Jeff Chan writes:
> On Sunday, August 22, 2004, 4:54:36 PM, Christiaan Besten wrote:
> > I think the most complicated part is the 'filtering usable (non-hidden) URLs out of received spam' part. I was thinking of reusing code designed by the SA crew. Has anyone tried that before?
>
> Have not tried it, but agree it's a good approach. Message and URI parsing from spams can be non-trivial.
I'd recommend:
1. a *really* simple SpamAssassin 3.0.0 plugin be written, that just dumps $scanner->get_uri_list() to STDOUT. (this is *really* easy. honest)
2. create a config file that loads that plugin and sets up a fake "rule" that runs it.
3. Then when you want to grab URLs from a spam mail, run "spamassassin -c configfile -L -t < msg" on it; to process a bigger batch of spam, use "mass-check -c configfile".
4. Profit!
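
For what it's worth, steps 1 and 2 could look roughly like the sketch below. This is untested and only an illustration: the package name UriDump, the eval function name dump_uris, the rule name URI_DUMP, and the /path/to location are all made up for the example. It assumes the standard plugin calls (Mail::SpamAssassin::Plugin, register_eval_rule) plus the get_uri_list() method mentioned in step 1.

```perl
# UriDump.pm -- hypothetical plugin name, not something shipped with SpamAssassin.
#
# Matching config file lines for step 2 (names and path are assumptions):
#
#   loadplugin UriDump /path/to/UriDump.pm
#   body       URI_DUMP  eval:dump_uris()
#   score      URI_DUMP  0.001    # non-zero, since zero-scored rules are skipped
#
package UriDump;

use strict;
use warnings;
use Mail::SpamAssassin::Plugin;

our @ISA = qw(Mail::SpamAssassin::Plugin);

sub new {
    my ($class, $mailsa) = @_;
    $class = ref($class) || $class;
    my $self = $class->SUPER::new($mailsa);
    bless($self, $class);

    # Make dump_uris() callable from a rule as eval:dump_uris()
    $self->register_eval_rule("dump_uris");
    return $self;
}

sub dump_uris {
    my ($self, $scanner) = @_;

    # Print every URI SpamAssassin extracted from this message, one per line
    print "$_\n" for $scanner->get_uri_list();

    # Always report "no match": the fake rule exists only for its side effect
    return 0;
}

1;
```

Since the rule never fires, it should not change any message's score; the only visible effect is the list of URIs on STDOUT.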
If someone does this, please let me know how they find the doco, etc., and put a page up on the SpamAssassin wiki about it... I'm trying to encourage more plugins ;)
--j.