It's time to return to the question of a combined SURBL list again mainly because David Hooton's anti-phishing list is now ready. The list is too small to be a separate list at a little over a hundred entries, but it will probably grow. So I'd like to make it part of a combined list.
1. We had discussed two strategies for a combined list's A records before:
I. Separate A records:
spammer.com IN A 127.0.0.1
            IN A 127.0.0.2
Where the two addresses indicate it being on two lists.
II. Bitmasked address:
spammer.com IN A 127.0.0.3
where .3 means it's on the two lists corresponding to the .1 and .2 bits, and similarly for other lists in other bit positions.
In the first case resolution on spammer.com.combined-list-name-here.surbl.org would give two separate addresses, 127.0.0.1 and 127.0.0.2, and in the second case it would give one result, 127.0.0.3. The querying program would need to act accordingly. For the bitmasked case some SA code would need to be written or reused. As mentioned earlier, various other RBLs combine lists using either of these two strategies. Here's an example cited earlier of the bitmask style:
Using the DNSBL
In opm.blitzed.org, the A record has an IP address of 127.1.0.x where x is a bitmask of the types of proxy that have been reported to be running on the host. The values of the bitmask are as follows:
WinGate 1
SOCKS 2
HTTP CONNECT 4
Router 8
HTTP POST 16
So the code using a combined list could be made to detect specific results, i.e., the specific list which triggered a matching A record could be determined, and not just that it matched "all" or any from the original lists. On the other hand, the fact that a match from any list occurred may be good enough for some users. Personally I prefer the more detailed explanation that would come from being able to distinguish the source list, but that's a question of the program implementation and not the combined list itself. The combined list itself would always encode the source list, whether the querying program knew or cared how to decode that or not.
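To make the bitmask style concrete, here is a minimal Python sketch of how a querying program might decode the last octet of an opm.blitzed.org-style answer. The bit values are the ones quoted above; the names `OPM_BITS` and `decode_opm` are made up for illustration and are not taken from any existing SA code.

```python
import ipaddress

# Bit values as quoted from the opm.blitzed.org description above.
OPM_BITS = {
    1: "WinGate",
    2: "SOCKS",
    4: "HTTP CONNECT",
    8: "Router",
    16: "HTTP POST",
}

def decode_opm(address):
    """Return the proxy types encoded in a 127.1.0.x answer (hypothetical helper)."""
    octets = ipaddress.IPv4Address(address).packed
    if octets[:3] != bytes([127, 1, 0]):
        raise ValueError("not a 127.1.0.x answer: %s" % address)
    mask = octets[3]
    # Report every type whose bit is set in the last octet.
    return [name for bit, name in sorted(OPM_BITS.items()) if mask & bit]
```

A program that only cares whether there was any match can ignore the decoding and just test that an A record came back at all.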
2. Another question would be the name of the combined list. Since there would be three or more lists, someone had suggested a name of "all" before. That sounds good to me unless there are other suggestions.
3. I'm assuming TXT records are no longer really feasible in a combined list and that descriptive messages will need to be signalled by the list (127. address) matched. I suppose it would be possible to create custom TXT records for every entry, but a generic TXT (or perhaps none) might be more likely. Is a generic TXT better than none? Even in a BIND file, where it incurs some use of space?
4. TTLs: If an entry has matches on more than one list, should it get a unique TTL? If so, should such a custom TTL on the multiply-matching entry be the longest TTL or the shortest TTL? I lean towards inheriting the shortest TTL from the matching source lists, plus setting a default TTL for the combined zone file to be near the longest.
Am I right in thinking that TTLs are largely irrelevant for rbldnsd, since it reloads zone info whenever the files change? In other words, does the rbldnsd cache clear for a given zone when the zone reloads, or do the cached entries with TTLs longer than the last reload interval remain in the cache? (I'm kind of hoping for the simpler case of rbldnsd clearing whenever reloading.)
5. We will likely want to combine the ws and be lists into a single entry in a combined list, probably using the .1 bit for both of them, since both lists contain the enumerated (non-wildcarded) domains from SA regular expressions. Also, things are moving towards combining the non-wildcarded domains sa-blacklist and BigEvil/MidEvil, so this would somewhat short-circuit that process and future-proof things.
Comments?
Jeff C.
At 11:02 13/05/2004, you wrote:
- We will likely want to combine the ws and be lists into a single entry in a combined list, probably using the .1 bit for both of them, since both lists contain the enumerated (non-wildcarded) domains from SA regular expressions. Also, things are moving towards combining the non-wildcarded domains sa-blacklist and BigEvil/MidEvil, so this would somewhat short-circuit that process and future-proof things.
Is it necessarily a good idea to combine lists like ws and be into a single entity when the sources of information are different ? (One comes from Bill, one comes from Chris) What policies of inclusion and removal do they each have ?
Say that a legitimate domain were somehow blocked, how would an end user know if it was Bill's data or Chris's that actually had it listed, to try and get it removed ? Etc...
So from a technical point of view, fine no problem, but I wonder a bit about compatibility of listing policies etc..
Regards, Simon
Hi!
Is it necessarily a good idea to combine lists like ws and be into a single entity when the sources of information are different ? (One comes from Bill, one comes from Chris) What policies of inclusion and removal do they each have ?
Say that a legitimate domain were somehow blocked, how would an end user know if it was Bill's data or Chris's that actually had it listed, to try and get it removed ? Etc...
You get a different answer from DNS... pretty clear :)
So from a technical point of view, fine no problem, but I wonder a bit about compatibility of listing policies etc..
It's only as a combined list to avoid double lookups for people who check with all 3 available lists, as most do currently. Saves 2/3 of the load on the nameservers :)
Bye, Raymond.
On Wednesday, May 12, 2004, 4:21:04 PM, Raymond Dijkxhoorn wrote:
Is it necessarily a good idea to combine lists like ws and be into a single entity when the sources of information are different ? (One comes from Bill, one comes from Chris) What policies of inclusion and removal do they each have ?
Say that a legitimate domain were somehow blocked, how would an end user know if it was Bill's data or Chris's that actually had it listed, to try and get it removed ? Etc...
You get a different answer from DNS... pretty clear :)
So from a technical point of view, fine no problem, but I wonder a bit about compatibility of listing policies etc..
It's only as a combined list to avoid double lookups for people who check with all 3 available lists, as most do currently. Saves 2/3 of the load on the nameservers :)
Actually I was getting tricky and proposing to collapse ws and be into a single response within a combined list. This was mainly to prevent needing to remove separate be entries later since it will probably be merged into ws eventually. I was proposing short circuiting that process in the combined list.
Jeff C.
Hi!
It's only as a combined list to avoid double lookups for people who check with all 3 available lists, as most do currently. Saves 2/3 of the load on the nameservers :)
Actually I was getting tricky and proposing to collapse ws and be into a single response within a combined list. This was mainly to prevent needing to remove separate be entries later since it will probably be merged into ws eventually. I was proposing short circuiting that process in the combined list.
I would really wanna go for the separate answers so people know what they get back as an answer. All RBLs do it like that. That way the end user also has a clear view of what's going on.
If we stop announcing one of the lists we need to work anyway to get things 'stopped' on end user side.
Bye, Raymond.
On Wednesday, May 12, 2004, 4:12:07 PM, Simon Byrnand wrote:
At 11:02 13/05/2004, you wrote:
- We will likely want to combine the ws and be lists into a single entry in a combined list, probably using the .1 bit for both of them, since both lists contain the enumerated (non-wildcarded) domains from SA regular expressions. Also, things are moving towards combining the non-wildcarded domains sa-blacklist and BigEvil/MidEvil, so this would somewhat short-circuit that process and future-proof things.
Is it necessarily a good idea to combine lists like ws and be into a single entity when the sources of information are different ? (One comes from Bill, one comes from Chris) What policies of inclusion and removal do they each have ?
Say that a legitimate domain were somehow blocked, how would an end user know if it was Bill's data or Chris's that actually had it listed, to try and get it removed ? Etc...
So from a technical point of view, fine no problem, but I wonder a bit about compatibility of listing policies etc..
Yes, the reason for it is that the be list will probably come under the policies of the ws list eventually, as they are planned (behind the scenes) to be merged together. This would hasten the process, at least in the combined list. As individual lists, that merging may happen later.
Note that this is only referring to the enumerable, non-wildcarded domains from both. The wildcarded and impractical-to-enumerate domains from both may find their way into a combined regular expression SA ruleset, probably a revamped BigEvil or somesuch.
Jeff C.
On Wed, 12 May 2004 16:02:33 -0700, Jeff Chan wrote:
- Another question would be the name of the combined list. Since there would be three or more lists, someone had suggested a name of "all" before. That sounds good to me unless there are other suggestions.
This could potentially lead to confusion if you subsequently add another list not included in the "all" for some reason. That may seem unlikely now, but who knows? How about "multi"?
John.
-----Original Message-----
From: discuss-bounces@lists.surbl.org [mailto:discuss-bounces@lists.surbl.org] On Behalf Of John Wilcock
Sent: Thursday, 13 May 2004 4:06 PM
To: Jeff Chan; SURBL Discussion list
Subject: Re: [SURBL-Discuss]
On Wed, 12 May 2004 16:02:33 -0700, Jeff Chan wrote:
- Another question would be the name of the combined list. Since there would be three or more lists, someone had suggested a name of "all" before. That sounds good to me unless there are other suggestions.
This could potentially lead to confusion if you subsequently add another list not included in the "all" for some reason. That may seem unlikely now, but who knows? How about "multi"?
How about surbl.surbl.org? Being the primary rbl, this kinda makes sense :)
Also on the other points about how to respond - I suggest the different IP per list as being smartest; the octet based response is too non-specific. This only leaves the question of how we handle multiple listings :)
Cheers!!
Dave
======================================================================== Pain free spam & virus protection by: www.mailsecurity.net.au Forward undetected SPAM to: spam@mailsecurity.net.au ========================================================================
Hi!
How about surbl.surbl.org? Being the primary rbl, this kinda makes sense :)
Also on the other points about how to respond - I suggest the different IP per list as being smartest; the octet based response is too non-specific. This only leaves the question of how we handle multiple listings :)
We give back multiple IPs, super simple :)
Bye, Raymond.
On Thursday, May 13, 2004, 1:01:01 AM, David Hooton wrote:
How about surbl.surbl.org? Being the primary rbl, this kinda makes sense :)
Also on the other points about how to respond - I suggest the different IP per list as being smartest; the octet based response is too non-specific.
Both ways are specific. The multiple A record response would give a separate address for each corresponding list. The bitmasked response has a distinct bit set in the returned address for each list. A 3 means bits 1 and 2 are set, corresponding to the lists in the 1 and 2 positions. It's true that the bitmask approach gives only one query result, even for inclusion in multiple lists, though.
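As a rough illustration of why both styles carry the same information, the Python sketch below turns either kind of answer into the same set of list numbers. The function names are hypothetical, and it assumes the separate-record style numbers its lists 1, 2, ... in the last octet as in the spammer.com example.

```python
def lists_from_separate(addresses):
    """Strategy I: each returned A record's last octet names one list."""
    return {int(address.rsplit(".", 1)[1]) for address in addresses}

def lists_from_bitmask(address, max_bits=8):
    """Strategy II: each set bit in the single last octet names one list."""
    mask = int(address.rsplit(".", 1)[1])
    return {1 << i for i in range(max_bits) if mask & (1 << i)}

# Both answers for spammer.com describe membership in lists 1 and 2.
separate = lists_from_separate(["127.0.0.1", "127.0.0.2"])
bitmask = lists_from_bitmask("127.0.0.3")
```

The difference is on the wire: the first style returns one record per list, the second always returns a single record that the client has to decode.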
This only leaves the question on how do we handle multiple listings :)
I assume you're referring to how the client handles the multiple or bitmasked responses.
Jeff C.
-----Original Message-----
From: discuss-bounces@lists.surbl.org [mailto:discuss-bounces@lists.surbl.org] On Behalf Of Jeff Chan
Sent: Thursday, 13 May 2004 7:01 PM
To: 'SURBL Discussion list'
Subject: Re: [SURBL-Discuss]
This only leaves the question on how do we handle multiple listings :)
I assume you're referring to how the client handles the multiple or bitmasked responses.
Yes :) Have the SA plugins been written with this in mind or is this a mod that will have to be made?
Cheers!
Dave
This only leaves the question on how do we handle multiple listings :)
I assume you're referring to how the client handles the multiple or bitmasked responses.
Yes :) Have the SA plugins been written with this in mind or is this a mod that will have to be made?
We brought up the subject earlier with the SA developers and the code already exists for use with other similar bitmasked lists, but it would need to be incorporated into the code using SURBLs. SA 3.0 can handle either type of combined list and they are ready for us. Actually they are waiting for us to make a decision so they can code for it before 3.0 gets released next month or so.
SpamCopURI would also need to be updated to handle the multiple list type, however.
Jeff C.
Based on comments received so far, the following is proposed for a combined SURBL list:
Name: mutli.surbl.org
The sc and ws lists and a phishing list would be combined into a single, bitmasked SURBL mutli.surbl.org. Bitmasking means that there will only be one entry per spam URI domain name or IP address, but that entry will have an IP address that resolves according to which lists it belongs to:
1 = comes from sc.surbl.org
2 = comes from ws.surbl.org (and be.surbl.org)
4 = comes from phishing list
Where if an entry belongs to one of the lists it will have an address where the last octet has that value, for example 127.0.0.4 means it comes from the phishing list and 127.0.0.1 means it's in the data used in sc.surbl.org. An entry on multiple lists gets the sum of those list numbers as the last octet, so 127.0.0.3 means an entry is on both ws.surbl.org and sc.surbl.org. In this way membership in multiple lists is encoded into a single response.
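A minimal Python sketch of how the combined entries might be generated from the source data, using the bit values proposed above. The names `LIST_BITS` and `combine` and the example domains are invented for illustration. Note that it ORs the bits together instead of literally summing them, so that a domain appearing in both the ws and be data (which share the value 2) still ends up with 2 and not 4.

```python
# Bit values from the proposal above; ws and be share one bit.
LIST_BITS = {"sc": 1, "ws": 2, "be": 2, "phishing": 4}

def combine(source_lists):
    """Map each domain to its 127.0.0.N answer.

    source_lists: dict of list name -> iterable of domains (hypothetical input shape).
    """
    masks = {}
    for name, domains in source_lists.items():
        bit = LIST_BITS[name]
        for domain in domains:
            # OR is idempotent, so shared bits are never counted twice.
            masks[domain] = masks.get(domain, 0) | bit
    return {domain: "127.0.0.%d" % mask for domain, mask in masks.items()}

combined = combine({
    "sc": ["a.example"],
    "ws": ["a.example", "b.example"],
    "be": ["b.example"],
    "phishing": ["c.example"],
})
```

Here a.example would resolve to 127.0.0.3, b.example to 127.0.0.2 and c.example to 127.0.0.4.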
Default TTL for the combined list is generally the longest of the included lists, which is six hours, while individual entries inherit the shortest TTL which can be 10 minutes for sc data. That allows individual entries to expire in BIND appropriately to their data source.
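The TTL rule can be sketched in a few lines of Python. Only the six hour maximum and the 10 minute sc value come from the text above; which lists actually carry the six hour TTL is an assumption made here for the sake of the example, as are the names `SOURCE_TTL`, `zone_default_ttl` and `entry_ttl`.

```python
SIX_HOURS = 6 * 60 * 60
TEN_MINUTES = 10 * 60

# Assumed per-list TTLs in seconds; only the sc value and the
# six hour maximum are stated in the proposal.
SOURCE_TTL = {"sc": TEN_MINUTES, "ws": SIX_HOURS, "phishing": SIX_HOURS}

def zone_default_ttl(source_ttl):
    """Zone-wide default: the longest TTL among the included lists."""
    return max(source_ttl.values())

def entry_ttl(lists, source_ttl):
    """Per-entry TTL: the shortest TTL among the lists the entry is on."""
    return min(source_ttl[name] for name in lists)
```

With these numbers an entry on both sc and ws would expire after 10 minutes, while an entry only on ws would keep the six hour default.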
TXT message for each entry is generic, pointing to a page describing the different lists and their data sources.
All this is still open to discussion, but let's lock in the bitmasking scheme, unless there are any strong objections, so that the SA programs can start to be written or modified to use a combined list.
A combined list would be in addition to the individual lists, which would continue to exist.
Comments anyone?
Jeff C.
Jeff Chan wrote:
1 = comes from sc.surbl.org
2 = comes from ws.surbl.org (and be.surbl.org)
4 = comes from phishing list
Nice, although I'd prefer 2, 4, 8, etc. without 1. 127.0.0.1 pops up under the strangest conditions, it's better to leave it alone. I vaguely recall a case when 127.0.0.1 was on the SCBL and parts of the internet stopped working ;-)
Bye, Frank
On Friday, May 14, 2004, 8:06:46 PM, Frank Ellermann wrote:
Jeff Chan wrote:
1 = comes from sc.surbl.org
2 = comes from ws.surbl.org (and be.surbl.org)
4 = comes from phishing list
Nice, although I'd prefer 2, 4, 8, etc. without 1. 127.0.0.1 pops up under the strangest conditions, it's better to leave it alone. I vaguely recall a case when 127.0.0.1 was on the SCBL and parts of the internet stopped working ;-)
Bye, Frank
Thanks Frank. In principle, this shouldn't be an issue since SURBLs should only be used on message bodies and not headers*, but perhaps we should change to 2, 4, 8 to be safe and like other RBLs.
* If 127.0.0.1 appeared on a regular message-header-parsing RBL, then I could see how that could potentially break things since the loopback address can show up as a hop in mail processing. (Numeric RBLs often start at .2 probably for that reason.)
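To see what starting at 2 would buy, the Python sketch below enumerates every answer a three-list bitmask could produce. The assignment of sc, ws and phishing to 2, 4 and 8 is only an assumption for the example (nothing has been decided), and `all_answers` is a hypothetical helper.

```python
from itertools import combinations

ORIGINAL_BITS = {"sc": 1, "ws": 2, "phishing": 4}  # as first proposed
SHIFTED_BITS = {"sc": 2, "ws": 4, "phishing": 8}   # assumed 2, 4, 8 assignment

def all_answers(bits):
    """Return every 127.0.0.N answer reachable from some non-empty set of lists."""
    names = sorted(bits)
    answers = set()
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            mask = 0
            for name in combo:
                mask |= bits[name]
            answers.add("127.0.0.%d" % mask)
    return answers
```

Since every shifted value is even, no combination can have the low bit set, so 127.0.0.1 never appears as an answer under the 2, 4, 8 scheme, whereas it does under the original numbering.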
Does anyone have any additional comments on this question?
Jeff C.
As Mark Ackerman pointed out, the new list name should be "multi" not mutli.... (that should be spelled motley or muttly I suppose. LOL! ;-)
Jeff C.