I'm adding a very brief section to the Quick Start of the SURBL site:
Implementation guidelines
Here are some very brief instructions for folks writing software to use SURBL lists. Your code should:
- Extract URIs from message bodies
  (Ideally, extraction should include full resolution of redirections
  to the final target domain name. This can be a non-trivial problem.)
- Extract base (registrar) domains from those URIs
- Not do name resolution on the domains
- Look up the domain name in the SURBL by prepending it to
the name of the SURBL, e.g., domainundertest.com.sc.surbl.org, then doing an address (A) record DNS lookup. No result means the domain is not in the list. A result of 127.0.0.2 means it is in the list.
- Handle numeric IPs in URIs similarly, but reverse the
octet ordering before comparison against the RBL. This is standard practice for RBLs. For example, http://1.2.3.4/ is checked as 4.3.2.1.sc.surbl.org
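The lookup steps above can be sketched roughly as follows. This is only a minimal illustration, not a reference implementation: the function names (extract_hosts, query_name, is_listed) and the simple URI regex are made up for this example, and it assumes the host it is given is already the base domain. It leaves out redirection resolution and base-domain extraction, both of which need more machinery than fits here.

```python
import ipaddress
import re
import socket
from urllib.parse import urlsplit

SURBL_ZONE = "sc.surbl.org"
# Deliberately simple pattern (an assumption); real mail needs a sturdier parser.
URI_PATTERN = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def extract_hosts(body):
    """Return the host part of each http(s) URI found in a message body."""
    hosts = []
    for uri in URI_PATTERN.findall(body):
        try:
            host = urlsplit(uri).hostname
        except ValueError:
            continue  # malformed URI, skip it
        if host:
            hosts.append(host.rstrip("."))
    return hosts


def query_name(host, zone=SURBL_ZONE):
    """Build the DNS name to query for a domain or a dotted-quad IP."""
    try:
        ipaddress.IPv4Address(host)
    except ValueError:
        # Not a numeric IP: prepend the domain to the list name as-is.
        return f"{host}.{zone}"
    # Numeric IP: reverse the octet order first, as with other RBLs.
    return ".".join(reversed(host.split("."))) + "." + zone


def is_listed(host, zone=SURBL_ZONE, resolve=socket.gethostbyname):
    """True if the A record lookup returns 127.0.0.2, False on no result.

    Only the A record for the list name is looked up; the domain under
    test itself is never resolved.
    """
    try:
        answer = resolve(query_name(host, zone))
    except socket.gaierror:
        return False  # no result: not in the list
    return answer == "127.0.0.2"
```

The resolve parameter is there so the lookup can be swapped for a stub when testing, or for a resolver with proper timeouts and caching in production.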
SURBL lists are unusual in having both names and numbers in the same list. For example, 2.0.0.127 and example.com, along with similar actual spam domains and addresses, are in all SURBL lists. Numeric addresses in SURBLs are ones that occurred as numbers in spams.
Can anyone think of any additions, corrections, or suggestions before I announce it more broadly?
Jeff C.