Re: [SURBL-Discuss] probable impact of cid:.* urls in uri_to_domain

23 Apr 2004


On Friday, April 23, 2004, 1:15:49 AM, Yusuf Goolamabbas wrote:
...
Hi. Currently URIDNSBL.pm uses SA's get_uri_list to get a list of URIs
from a message. The current regex also seems to pick up URIs of the form
cid:random_characters.
...
cid:.* seems to refer to content-ids (attachments in the same message).
When these URIs are run through uri_to_domain, they come back unchanged
as cid:.*
...
My feeling is that a message can contain some artificial cid:.* URLs,
which may skew the set of random domains used for SURBL lookups.
...
I am not sure whether cid:.* URLs should be returned from get_uri_list() or
whether they should be stripped correctly in uri_to_domain. Quite a few of
the values after cid: seem to refer to host names/domain names.

I'll leave a detailed response to those more familiar with
URIDNSBL internals, but the goal is to remove all but the
base domain before comparing it to an SURBL.  So I'm hoping
any deliberately randomized characters and any other extra
stuff is discarded before RBL comparison.  Only the basic
domain should be checked against the SURBL.
Jeff C.
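
For illustration only, here is a rough Python sketch of the behaviour
described above: drop cid: URIs (they have no network host) and reduce
everything else to its base domain before any SURBL lookup. This is not
the URIDNSBL.pm code. The function name base_domain and the
TWO_LEVEL_TLDS table are hypothetical, and the table is a tiny sample
rather than a real list. Numeric IP hosts are not handled.

```python
from urllib.parse import urlsplit

# Hypothetical, deliberately tiny sample of two-level TLDs.
# A real checker would need a complete, maintained list.
TWO_LEVEL_TLDS = {"co.uk", "com.au", "co.jp"}


def base_domain(uri):
    """Return the base domain of an http(s) URI, or None if there is none."""
    parts = urlsplit(uri.strip())
    # cid:, mailto: and similar URIs carry no web host, so skip them.
    if parts.scheme not in ("http", "https"):
        return None
    host = parts.hostname
    if not host:
        return None
    labels = host.rstrip(".").split(".")
    if len(labels) < 2:
        return None
    # Keep three labels when the last two form a known two-level TLD,
    # otherwise keep two. Randomized leading labels are discarded.
    keep = 3 if ".".join(labels[-2:]) in TWO_LEVEL_TLDS else 2
    return ".".join(labels[-keep:])
```

With this sketch, a cid: URI yields None and is never looked up, while a
URI with a randomized host prefix is reduced to the domain underneath it.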
