[SURBL-Discuss] probable impact of cid:.* urls in uri_to_domain

Yusuf Goolamabbas yusufg at outblaze.com
Fri Apr 23 17:15:49 CEST 2004

Hi,

Currently URIDNSBL.pm uses SA's get_uri_list to get the list of URIs
from a message. The current regex also seems to pick up URIs of the form
cid:random_characters.

cid:.* seems to refer to Content-IDs, i.e. attachments in the same
message. When these URIs are run through uri_to_domain, they come back
unchanged as cid:.*

My feeling is that a message could contain artificial cid:.* URLs,
which may skew the set of random domains used for SURBL lookups.
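
To put a rough number on the skew, here is a small Python sketch (not
SA code). It assumes lookups are limited to a fixed-size sample drawn
uniformly at random from the extracted list; the function name and the
figures in the usage note are made up for illustration.

```python
from fractions import Fraction

def expected_real_domains(real, padding, sample_size):
    """Expected count of real domains in a uniform random sample
    drawn without replacement from real + padding entries.

    'padding' stands for artificial cid: entries (an assumption
    about how a sender might dilute the list).
    """
    total = real + padding
    if total == 0:
        return Fraction(0)
    # Cannot sample more entries than exist.
    n = min(sample_size, total)
    # Mean of a hypergeometric draw: n * real / total.
    return Fraction(n * real, total)
```

Under those assumptions, 5 real domains padded with 95 cid: entries and
a sample of 20 would leave, on average, only 1 real domain in the
lookup set.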

I am not sure whether cid:.* URLs should be returned from get_uri_list()
at all, or whether they should be stripped correctly in uri_to_domain.
Quite a few of the values after cid: look like host names/domain names.
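
As a rough illustration of the second option, here is a Python sketch
(not the Perl in SA or URIDNSBL.pm) of a domain-extraction step that
drops cid: URIs instead of passing them through. The names
lookup_domain, domains_for_lookup and NETWORK_SCHEMES are hypothetical,
and the scheme list is only an assumption about what is worth looking
up. It returns the full host name and does not reduce it to a
registered domain.

```python
from urllib.parse import urlsplit

# Illustrative assumption: only these schemes yield a host worth checking.
NETWORK_SCHEMES = {"http", "https", "ftp"}

def lookup_domain(uri):
    """Return the host part of a URI, or None if it should be skipped."""
    parts = urlsplit(uri.strip())
    if parts.scheme not in NETWORK_SCHEMES:
        # cid: and any other scheme not listed above is dropped here,
        # even if the text after "cid:" looks like a host name.
        return None
    return parts.hostname

def domains_for_lookup(uris):
    """Collect unique hosts, in first-seen order, skipping filtered URIs."""
    seen = []
    for uri in uris:
        host = lookup_domain(uri)
        if host and host not in seen:
            seen.append(host)
    return seen
```

With this, a list such as cid:image001@example.com plus
http://www.example.com/ would contribute only www.example.com to the
lookup set. The same check could equally sit in the list-building step.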

Regards, Yusuf
