Brian Ipsen wrote:
Hi,
Well - to make things easy, I guess it's just a matter of checking whether any text is present from the <A HREF=""> to the </A> ... So <A HREF="http://domain.org/"></A> won't trigger anything - but <A HREF="http://domain.org/">Some text</A> will....
<a href=http://domain.org/><!--sometext--></a> should not trigger
<a href=http://domain.org/><img src=toto.jpg width=0 height=0></a> should not trigger
<!-- <a href=http://domain.org/>text</a> --> probably should not trigger either, but this needs checking.
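To make the cases above concrete, here is a minimal sketch of such a check in Python, using the standard library's html.parser. It is only an illustration, not what my scripts do: the names VisibleLinkFinder and visible_links are made up for this example, and "visible" is assumed to mean "the anchor contains some non-whitespace text".

```python
from html.parser import HTMLParser


class VisibleLinkFinder(HTMLParser):
    """Collect the href of every <a> element that contains some text."""

    def __init__(self):
        super().__init__()
        self.href = None       # href of the anchor currently open, if any
        self.has_text = False  # True once text is seen inside that anchor
        self.visible = []      # hrefs of anchors that contained text

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lowercased from HTMLParser.
        if tag == "a":
            self.href = dict(attrs).get("href")
            self.has_text = False

    def handle_data(self, data):
        # Comments go to handle_comment, so they never count as text here.
        if self.href is not None and data.strip():
            self.has_text = True

    def handle_endtag(self, tag):
        if tag == "a" and self.href is not None:
            if self.has_text:
                self.visible.append(self.href)
            self.href = None


def visible_links(html):
    """Return the hrefs of anchors in html that wrap non-whitespace text."""
    finder = VisibleLinkFinder()
    finder.feed(html)
    finder.close()
    return finder.visible
```

With this sketch, an empty anchor, an anchor holding only a comment, an anchor holding only an image, and an anchor that sits entirely inside a comment are all ignored. Note the limitation: an anchor wrapping a normally sized image is visible to the reader but is not reported, because only text is considered.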
It seems to me very difficult to maintain URL blacklists (BLs) without any manual handling. What you can do is have scripts extract the URLs and run a number of checks on them, so that the results are presented in a form that is easy to review manually.
Here is how my scripts present one such example:
# 461  1 7 0.292  4.167  14.286 : .. bangor.com
# 461  1 7 0.292  4.167  14.286 : .. hankel.com
# 461 18 7 5.250 75.000 257.143 : BL mainstreamsoft.biz
# 461  1 7 0.292  4.167  14.286 : .. marmalade.com
# 461  1 7 0.292  4.167  14.286 : .. monolith.com
# 461  1 7 0.292  4.167  14.286 : .. sao.com
# 461  1 7 0.292  4.167  14.286 : .. shiplap.com
This is a short example, with only seven URLs. Usually, when the number of URLs is greater, there are two or three URLs to blacklist.
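As a rough illustration of this kind of report, the Python sketch below counts how often each domain occurs in a list of extracted URLs and marks the ones that dominate. It does not reproduce the columns of the listing above. The function names, the 50% threshold, and the "last two labels" rule for finding the domain are all assumptions made for this example, and the final decision is still meant to be taken by hand.

```python
from collections import Counter
from urllib.parse import urlsplit


def domain_of(url):
    """Return the last two labels of the host name, or "" if there is none.

    Assumption: this naive rule is good enough for the example; it is
    wrong for suffixes such as co.uk.
    """
    host = urlsplit(url).hostname or ""
    return ".".join(host.lower().split(".")[-2:])


def rank_domains(urls, share_threshold=0.5):
    """Return (flag, count, percent, domain) rows, sorted by domain.

    flag is "BL" for a blacklist candidate and ".." otherwise. The
    threshold is a hypothetical cut-off, not a value from a real setup.
    """
    counts = Counter(domain_of(u) for u in urls)
    counts.pop("", None)  # drop URLs without a usable host name
    total = sum(counts.values())
    rows = []
    for domain, n in sorted(counts.items()):
        share = n / total
        flag = "BL" if share >= share_threshold else ".."
        rows.append((flag, n, round(100 * share, 3), domain))
    return rows


if __name__ == "__main__":
    sample = ["http://www.mainstreamsoft.biz/x"] * 18 + [
        "http://bangor.com/",
        "http://hankel.com/",
    ]
    for flag, n, pct, domain in rank_domains(sample):
        print(f"{n:3d} {pct:7.3f} : {flag} {domain}")
```

A fixed share threshold is the simplest possible rule. It is only meant to bring the likely candidates to the top of the list for manual review.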
Just my 5 cents of input ;-)
Also my 0.5 cents... 8-)
Joe
/Brian