Hello again. I believe I posted here once before on the subject of URL shorteners. I am the author of the GPL software known as TightURL, and the operator of the .com of the same name, where there is a public installation of this software.
I was supposed to come back with some kind of response about checking URL submissions at submission time vs. re-checking later, but various things in life intervened, and I have yet to implement anything to track how many URLs I reject at submission time. I am pleased to report, however, that re-checking previously accepted URLs has clearly been a success. I am presently rechecking all my URLs every 6 hours *if* they meet the following conditions:
* Is not already blocked in my database
* Was added within a window period (7 days)
-or-
* Has recent activity and more than a threshold of hits recorded
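As a rough sketch of the selection rule above (not the actual TightURL code), the test could be written as a single predicate. The record fields, the activity window, and the hit threshold are all hypothetical placeholders; only the 7-day window comes from the post. The sketch also assumes the rule reads as "not blocked, and either recently added or recently active with enough hits".

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class UrlRecord:
    """Hypothetical shape of a stored short URL; field names are illustrative."""
    blocked: bool
    added_at: datetime
    last_hit_at: Optional[datetime]
    hits: int


ADD_WINDOW = timedelta(days=7)        # window period stated in the post
ACTIVITY_WINDOW = timedelta(days=7)   # assumption: "recent activity" is not defined in the post
HIT_THRESHOLD = 100                   # assumption: the real threshold is not given


def needs_recheck(rec: UrlRecord, now: datetime) -> bool:
    """Return True if the URL should be re-checked on this pass."""
    if rec.blocked:
        return False  # already blocked, nothing to re-check
    recently_added = now - rec.added_at <= ADD_WINDOW
    recently_active = (
        rec.last_hit_at is not None
        and now - rec.last_hit_at <= ACTIVITY_WINDOW
    )
    return recently_added or (recently_active and rec.hits > HIT_THRESHOLD)
```

A job scheduled every 6 hours would then only need to run this predicate over the table and re-check the URLs that pass it.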
Using this formula, I've been blocking URLs that were submitted more than a year before they were blocked, along with many recent submissions. I have not added the capability to unblock them later, largely because visual inspection of the blocked URLs gives me a pretty good feeling they'll be blocked again sooner or later. I may have to do something about that, but I'm not seeing the need right now.
Anyway, today's question pertains to people who are mucking around with their DNS resolution, or who have DNS providers that falsify DNS responses in order to generate ad income from typing errors in people's browsers. I recently concluded some support for a new user whose every URL submission appeared to be listed in both SURBL and URIBL, which my code checks to see if the URL should be rejected.
This user was somehow getting his own IP address returned to him any time the true response was really NXDOMAIN. This was some local thing on his end, but I presume the same problem exists if your DNS provider is tampering with the responses, except in that case you'd get some IP address from the public IP space that misdirects you to some "helpful" site if you happen to be using a web browser.
My workaround for this problem was to check whether the first 4 characters of the response are "127." Every BL or URI BL I know about returns a response somewhere in the range of 127.0.0.0 - 127.0.0.255, unless I'm mistaken. I guess this still leaves open the possibility of false positives on my users' end if they both make use of loopback network hosts and have had their DNS responses tampered with. What I wanted to know is whether anyone knows of a BL or URI BL that returns, or plans on returning, something other than a loopback address as a positive response to a lookup?
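For illustration, here is a minimal sketch of that kind of check, again not the actual TightURL code. It assumes the A-record answer from the blacklist lookup has already been obtained as a string; the function name is hypothetical. It parses the address rather than comparing the first 4 characters, so it covers the same 127.0.0.0/8 range as the "127." prefix test but also rejects malformed answers.

```python
import ipaddress

# The whole loopback block, matching a "127." prefix test.
LOOPBACK_NET = ipaddress.ip_network("127.0.0.0/8")


def is_blocklist_hit(answer: str) -> bool:
    """Treat a DNS answer as a listing only if it is a 127.x.x.x address.

    Anything else (a public IP injected by a tampering resolver, an
    IPv6 address, or junk) is treated as "not listed".
    """
    try:
        addr = ipaddress.ip_address(answer.strip())
    except ValueError:
        return False  # not a parseable IP address at all
    return addr in LOOPBACK_NET
```

With this in place, an answer such as 127.0.0.2 counts as a listing, while a public address handed back in place of NXDOMAIN does not. A resolver that rewrites NXDOMAIN into a loopback address would still slip through, which is the remaining false-positive case described above.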
- Ron