Hello again. I believe I posted here once before on the subject of URL
shorteners. I am the author of the GPL software known as
TightURL, and operator of the .com of the same name, where there is a
public installation of this software.
I was supposed to come back with some kind of response about checking
URL submissions at submission time vs. re-checking later, but various
things in life intervened. I have yet to implement anything to track
how many URLs I reject at submission time, but I am pleased to report
that re-checking previously accepted URLs has clearly been a success. I
am presently rechecking all my URLs every 6 hours *if* they meet the
following conditions:
* Is not already blocked in my database
* Was added within a window period (7 days) -or-
* Has recent activity and more than a threshold of hits recorded
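For anyone who wants to see that rule written down, here is a rough sketch in Python. It is not TightURL's actual code. It assumes the conditions combine as "not blocked, and (recently added, or active with enough hits)". The function name, the field names, the 7-day activity window and the hit threshold of 100 are all made up for illustration; only the 7-day submission window comes from the list above.

```python
from datetime import datetime, timedelta

ADDED_WINDOW = timedelta(days=7)     # submission window from the list above
ACTIVITY_WINDOW = timedelta(days=7)  # assumption: what counts as "recent activity"
HIT_THRESHOLD = 100                  # assumption: the real threshold is not stated


def needs_recheck(blocked, added_at, last_hit_at, hits, now):
    """Return True if a stored URL should be re-checked on this run."""
    if blocked:
        # Already blocked in the database: nothing more to check.
        return False
    recently_added = (now - added_at) <= ADDED_WINDOW
    active = (
        last_hit_at is not None
        and (now - last_hit_at) <= ACTIVITY_WINDOW
        and hits > HIT_THRESHOLD
    )
    return recently_added or active


now = datetime(2010, 6, 1, 12, 0)
# A URL submitted over a year ago that is still getting traffic qualifies.
old_but_busy = needs_recheck(False, now - timedelta(days=400),
                             now - timedelta(hours=3), 5000, now)
```

A job run every 6 hours would call something like this for each stored URL and re-query the blocklists only for the ones that qualify.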
Using this formula, I've been blocking URLs that were submitted more than
a year before they were blocked, along with many recent submissions. I have
not added the capability to unblock them later, largely because visual
inspection of the blocked URLs gives me a pretty good feeling they'll be
blocked again sooner or later. I may have to do something about that,
but I'm not seeing the need right now.
Anyway, today's question pertains to people who are mucking around with
their DNS resolution, or have DNS providers that are falsifying DNS
responses in order to generate ad income from typing errors in people's
browsers. I recently concluded some support for a new user whose every
URL submission appeared to be listed in both SURBL and URIBL, which my
code checks to see if the URL should be rejected.
This user was somehow getting his own IP address returned to him any
time the true response was really NXDOMAIN. This was some local thing
on his end, but I presume the same problem exists if your DNS provider
is tampering with the responses, except in that case you'd get some IP
address from the public IP space that misdirects you to some "helpful"
site if you happen to be using a web browser.
My workaround for this problem was to check whether the first 4
characters of the response are "127.". Every BL or URI BL I know about
returns a response somewhere in the range of 127.0.0.0 - 127.0.0.255,
unless I'm mistaken. While I guess this still leaves open the
possibility of false positives on my users' end if they both make use of
loopback network hosts and have had their DNS responses tampered
with, what I wanted to know is: does anyone know of a BL or URI BL that
returns, or plans on returning, something other than a loopback address as
a positive response to a lookup?
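To make the check concrete, here is a minimal sketch in Python. It is not the code TightURL uses. Instead of the string prefix test described above, it parses the answer and tests membership in 127.0.0.0/8, which also rejects malformed answers such as "127.example". The function name is hypothetical, and the DNS lookup itself is left out; the function only classifies an answer you already have.

```python
import ipaddress

# The loopback network that blocklists conventionally answer from.
LOOPBACK_NET = ipaddress.ip_network("127.0.0.0/8")


def is_listing_response(answer):
    """Return True only if a DNS answer looks like a genuine BL listing.

    Anything outside 127.0.0.0/8 (for example an ad server address
    substituted for NXDOMAIN) is treated as "not listed".
    """
    try:
        addr = ipaddress.ip_address(answer)
    except ValueError:
        # Not a parseable IP address at all.
        return False
    return isinstance(addr, ipaddress.IPv4Address) and addr in LOOPBACK_NET


listed = is_listing_response("127.0.0.2")      # typical positive answer
hijacked = is_listing_response("192.0.2.10")   # rewritten NXDOMAIN
```

This has the same blind spot mentioned above: if the tampered answer is itself a loopback address, it will still look like a listing.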
- Ron