Hi All,
'Tis my first post to this list... so I'll try to make it a good one.
I have a heap (and will continue to have future heaps) of spam with URIs that don't hit any of the SURBLs. We'll hand-classify the URIs, of course, but are there any objections to scripting the submission against http://www.rulesemporium.com/cgi-bin/uribl.cgi?report=1 ?
The basic idea would be to run one automatic pass that analyzes our SA headers to see whether any of the *_URI_BL rules already matched; if none did, add the message to a processing queue. Then, for each URI in each message of that queue, do the lookup again with Net::DNS, since the URI might have been added to the list in the 24 hours or so since the spam was received. That should leave a comparatively short list of URIs (and spams) that still don't appear in the SURBL. From there, we can hand-classify the remainder (i.e., delete any that aren't spammer sites), have our second-pass script strip the SA markup (if present) from the spam, and automatically submit the hand-picked URIs and their spams via the web interface.
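For the second-pass lookup, I'm picturing something roughly like this -- just a sketch, and it assumes the URI has already been boiled down to its registered domain, since that's the part that actually gets queried against the list:

#!/usr/bin/perl
use strict;
use warnings;
use Net::DNS;

# Second-pass check: given a domain pulled out of a URI, ask
# ws.surbl.org whether it is listed.  A listed domain resolves to an
# address in 127.0.0.0/8; no answer means it isn't (yet) listed.
my $resolver = Net::DNS::Resolver->new;

sub surbl_listed {
    my ($domain) = @_;                    # e.g. "spammersite.example"
    my $reply = $resolver->query("$domain.ws.surbl.org", "A");
    return 0 unless $reply;               # NXDOMAIN => not listed
    for my $rr ($reply->answer) {
        next unless $rr->type eq "A";
        return 1 if $rr->address =~ /^127\./;
    }
    return 0;
}

# Anything still unlisted goes onto the hand-classification pile.
while (my $domain = <STDIN>) {
    chomp $domain;
    print "$domain\n" unless surbl_listed($domain);
}

(If I've read the docs right, numeric IPs get their octets reversed before the lookup; the sketch ignores that case.)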
Questions:

1) Is this approach reasonable? (i.e., am I going to hear screams from someone if I script this, assuming I take precautions, rate-limit the submissions, and check the results before turning it loose?)
2) Is there already a more efficient way to submit URIs? (Besides running my own list, which, I guess, isn't too unreasonable :-)
3) Is there any advantage to submitting the same URI more than once (i.e., from different spam messages)? It seems like the answer is probably "no", but I'll gladly accept enlightenment.
4) Should I be submitting to multiple SURBLs, or just stick with ws.surbl.org?
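To make question 1 a bit more concrete, the submission pass I have in mind would be little more than the following. The form field names ('uri' and 'spam') are placeholders I made up -- I'd obviously pull the real parameter names out of the uribl.cgi form before running anything -- and the sleep is the crude rate limit:

#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;

my $report_url = 'http://www.rulesemporium.com/cgi-bin/uribl.cgi?report=1';
my $ua = LWP::UserAgent->new(agent => 'surbl-submitter/0.1');

# @to_submit would be filled by the hand-classification step: each
# entry is a URI plus the de-markup'd spam that contained it.
my @to_submit = ();

for my $item (@to_submit) {
    # NOTE: 'uri' and 'spam' are made-up field names -- check the
    # actual form on the reporting page first.
    my $resp = $ua->post($report_url,
        { uri => $item->{uri}, spam => $item->{spam} });
    warn "submit failed for $item->{uri}: " . $resp->status_line
        unless $resp->is_success;
    sleep 30;    # crude rate limiting between submissions
}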
Since implementing SURBLs in SA 2.63 about a week ago, we've had amazing success. So much so that we're having occasional word-wrap issues with the X-Spam-Level: (stars) header. :-)
Now I want to give something back.
- Ryan