Hello again. I believe I posted here once before on the subject of URL
shorteners. I am the author of the GPL software known as
TightURL, and operator of the .com of the same name, where there is a
public installation of this software.
I was supposed to come back with some kind of response about checking
URL submissions at submission time vs. re-checking later, but various
things in life intervened, and I have yet to implement anything to track
how many URLs I reject at submission time, but I am pleased to report
that re-checking previously accepted URLs has clearly been a success. I
am presently rechecking all my URLs every 6 hours *if* they meet the
following conditions:
* Is not already blocked in my database
* Was added within a window period (7 days) -or-
* Has recent activity and more than a threshold of hits recorded
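As a rough illustration only, the conditions above could be expressed along these lines. This is a sketch, not TightURL's actual code: the function and parameter names are hypothetical, the hit threshold and the definition of "recent activity" are assumed values (only the 7-day window comes from the list above), and it reads the list as "not blocked, and either added within the window or recently active with enough hits".

```python
from datetime import datetime, timedelta

WINDOW = timedelta(days=7)           # window period stated in the list above
ACTIVITY_WINDOW = timedelta(days=7)  # assumption: what counts as "recent activity"
HIT_THRESHOLD = 100                  # assumption: the real threshold is not stated


def needs_recheck(blocked, added_at, last_hit_at, hits, now):
    """Return True if a stored URL should be re-checked against the lists."""
    if blocked:
        # Already blocked in the database: nothing to re-check.
        return False
    if now - added_at <= WINDOW:
        # Recently added URLs are always re-checked.
        return True
    # Older URLs are re-checked only if they are still being used.
    recently_active = (
        last_hit_at is not None and now - last_hit_at <= ACTIVITY_WINDOW
    )
    return recently_active and hits > HIT_THRESHOLD
```

A job run every 6 hours would then call `needs_recheck` for each stored URL and look up only those for which it returns True.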
Using this formula, I've been blocking URLs that were submitted over a
year prior to blocking them, along with many recent submissions. I have
not added the capability to unblock them later, largely because visual
inspection of the blocked URLs gives me a pretty good feeling they'll be
blocked again sooner or later. I may have to do something about that,
but I'm not seeing the need right now.
Anyway, today's question pertains to people who are mucking around in
their DNS resolution, or have DNS providers that are falsifying DNS
responses in order to generate ad income from typing errors in people's
browsers. I recently concluded some support for a new user whose every
URL submission appeared to be listed in both SURBL and URIBL, which my
code checks to see if the URL should be rejected.
This user was somehow getting his own IP address returned to him any
time the true response was really NXDOMAIN. This was some local thing
on his end, but I presume the same problem exists if your DNS provider
is tampering with the responses, except in that case you'd get some IP
address from the public IP space that misdirects you to some "helpful"
site if you happen to be using a web browser.
My workaround for this problem was to check whether the first 4
characters of the response are "127." Every BL or URI BL I know about
returns a response somewhere in the range of 127.0.0.0 - 127.0.0.255,
unless I'm mistaken. While I guess this still leaves open the
possibility of false positives on my users' end if they both make use of
loopback network hosts and have had their DNS responses tampered
with, what I wanted to know was whether anyone knows of a BL or URI BL that
returns, or plans on returning, something other than a loopback address as
a positive response to a lookup?
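For illustration, here is a minimal sketch of that kind of check. It is not the TightURL code: the function name is hypothetical, and instead of comparing the first 4 characters of the string it parses each answer and tests membership in 127.0.0.0/8, which is a slightly stricter variant of the same idea. It assumes the DNS lookup has already been done and that an NXDOMAIN result is passed in as an empty list.

```python
import ipaddress

# Loopback block that BL / URI BL answers are expected to fall in.
LOOPBACK = ipaddress.ip_network("127.0.0.0/8")


def is_listed(answers):
    """Treat a lookup as a positive listing only if an answer is a loopback address.

    `answers` is an iterable of A-record strings; an empty iterable
    stands for NXDOMAIN (not listed).
    """
    for answer in answers:
        try:
            addr = ipaddress.ip_address(answer)
        except ValueError:
            # Not a valid IP address at all: ignore it.
            continue
        if addr in LOOPBACK:
            return True
    # Any other address (e.g. a "helpful" ad server substituted for
    # NXDOMAIN by a tampering resolver) is not counted as a listing.
    return False
```

With this, a response such as `["127.0.0.2"]` counts as listed, while a public address injected in place of NXDOMAIN does not.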
- Ron
On Sunday, July 13, 2008, 9:12:25 AM, Joseph Brennan wrote:
> Jeff Chan <jeffc(a)surbl.org> wrote:
>> I think we
>> probably can't reveal the exact listing criteria in case they're
>> useful for the bad guys. I know it's somewhat inappropriate to
>> ask for comments without revealing details. I suppose I'm asking
>> for general responses then. :)
> So you'll keep ob, but take some undisclosed action to improve its
> accuracy. Sounds worthwhile to me.
Thanks! Yes, we would never get rid of OB entirely. It does
have some good data, but with too many FPs. The goal would be to
keep as much of the good data as possible while eliminating most
of the bad. Unfortunately some of the good data may be thrown
out with the bad; baby with the bathwater, so to speak. IMO FPs
are much worse than FNs, so some increase in FNs balances out a
decrease in FPs. Trying to decide if it's worth doing....
Jeff C.
On Saturday, July 12, 2008, 7:00:29 PM, Joseph Brennan wrote:
> --On Saturday, July 12, 2008 1:41 AM -0700 Jeff Chan <jeffc(a)surbl.org>
> wrote:
>> We can probably significantly reduce the false positives on
>> ob.surbl.org, the SURBL list based on Outblaze's URI blacklist:
>>
>> http://www.surbl.org/lists.html#ob
>>
>> at the cost of some possibly minor false negatives.
[...]
>> Should we make that change?
> This agrees with our experience using SURBL for a few years. We've
> seen the occasional fp with the ob listings, and none I am aware of
> with the rest of SURBL.
> What's the change?
Thanks much for the feedback Daryl and Joseph! I think we
probably can't reveal the exact listing criteria in case they're
useful for the bad guys. I know it's somewhat inappropriate to
ask for comments without revealing details. I suppose I'm asking
for general responses then. :)
Jeff C.
We can probably significantly reduce the false positives on
ob.surbl.org, the SURBL list based on Outblaze's URI blacklist:
http://www.surbl.org/lists.html#ob
at the cost of some possibly minor false negatives.
SpamAssassin has live (weekly?) statistics about the performance
of their rules, including all SURBL lists, against their ham and
spam corpora at their Rule QA site:
http://ruleqa.spamassassin.org/
As you can see OB ranks significantly below the other SURBL
lists, with much higher FP rates around 0.1% compared to 0.01% to
0.025% or so. (Note that the Rule QA site seems to have occasional
glitches, so if the numbers seem out of range one week, check
again later.)
Should we make that change?
Jeff C.