[SURBL-Discuss] Fwd: URI's not recognized

Menno van Bennekom mvbengro at xs4all.nl
Fri May 7 23:52:24 CEST 2004


Thanks John!
This looks like a straightforward change.
I'll try it on my test mail server after the weekend (or sooner if I can't
control myself).

Regards
Menno

>  From: Menno van Bennekom <mvbengro at xs4all.nl>
>> At first redirects like this were not recognized:
>> http://rd.yahoo.com*http://spammer.spam.biz
>> So I removed ^ from the BIZ expression:
>> uri BIZ_TLD  /(?:https?:\/\/|mailto:)[^\/]+\.biz(?:\/|$)/i
>>
>> Still the following was not recognized:
>> <a href=3Dhttp://away.goingabroadd.biz/aps/cms/>
>> Because of the 3D (and other stuff spammers put there lately).
>> It is only recognized after changing 'uri BIZ_TLD' to 'body BIZ_TLD'.
>> But I use SpamCopURI, and that also doesn't recognize URIs with things
>> in front of http.
>> And I can't tell SpamCopURI to use the 'body' check instead of uri.
>> How can I make the URI subroutine recognize these URIs?
>> Would using SpamAssassin v3.0 help?
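
To make the effect of dropping the ^ concrete, here is a small standalone
Perl sketch. It only exercises the two pattern variants against the two
sample strings from the message; it does not go through SpamAssassin, and
the matches() helper and the variable names are made up for illustration.

```perl
use strict;
use warnings;

# Anchored and unanchored variants of the BIZ_TLD pattern from the post.
my $anchored   = qr/^(?:https?:\/\/|mailto:)[^\/]+\.biz(?:\/|$)/i;
my $unanchored = qr/(?:https?:\/\/|mailto:)[^\/]+\.biz(?:\/|$)/i;

# Hypothetical helper: returns 1 if $text matches $pattern, 0 otherwise.
sub matches {
    my ($pattern, $text) = @_;
    return $text =~ $pattern ? 1 : 0;
}

my $redirect = 'http://rd.yahoo.com*http://spammer.spam.biz';
my $raw_html = '<a href=3Dhttp://away.goingabroadd.biz/aps/cms/>';

# With ^, the host part may not contain a slash, so the second http:// in
# the redirect blocks the match. Without ^, the match starts at the second
# http:// instead.
printf "anchored,   redirect: %d\n", matches($anchored,   $redirect);
printf "unanchored, redirect: %d\n", matches($unanchored, $redirect);
printf "unanchored, raw html: %d\n", matches($unanchored, $raw_html);
```

The unanchored pattern matches the raw href=3D text as well, which fits the
observation that the rule fires as a 'body' rule. That suggests the pattern
is not the problem for 'uri' rules; the URI is probably never extracted in
the first place.
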
>
> Presumably it's not being picked up because http does not occur
> on a word boundary. I have a similar example which is picked up
> through SpamCopURI because the url is correctly enclosed in
> double quotes.
>
> <a href=3D"http://rd.yahoo.com/winery/college/banbury/*http:/len=
> derserv.com?partid=3Darlenders">
>
> In order to pick up unquoted URLs preceded by quoted-printable
> characters (like =3D), a modification is required to the
> PerMsgStatus.pm SpamAssassin module, which doesn't
> currently decode quoted-printable characters before checking
> for URL patterns.
>
> If I add a call to MIME::QuotedPrint::decode_qp in get_uri_list
> then your example is correctly picked up.
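
As a rough illustration of why the decode step helps, here is a minimal
sketch. The extract_uris() helper and its simplified URL pattern are
assumptions made for this example; this is not the real get_uri_list code
and not the diff below. Only MIME::QuotedPrint's decode_qp is the real call
mentioned above.

```perl
use strict;
use warnings;
use MIME::QuotedPrint qw(decode_qp);

# Hypothetical helper, NOT SpamAssassin's get_uri_list: optionally decode
# quoted-printable text, then collect http(s) URIs that start on a word
# boundary. The URL pattern is deliberately simplified.
sub extract_uris {
    my ($text, $decode) = @_;
    $text = decode_qp($text) if $decode;
    my @uris = $text =~ m{\b(https?://[^\s"'<>]+)}gi;
    return @uris;
}

my $raw = '<a href=3Dhttp://away.goingabroadd.biz/aps/cms/>';

# Undecoded, "3Dhttp" puts http in the middle of a word, so \b fails.
my @before = extract_uris($raw, 0);

# Decoded, =3D becomes "=", and http now starts on a word boundary.
my @after = extract_uris($raw, 1);

print "before decode: ", scalar(@before), " URI(s)\n";
print "after decode:  @after\n";
```

Whether decoding the whole text like this is safe for every message is a
separate question; the sketch only shows the word-boundary effect on the
one sample line.
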
>
> Here's a diff file of the changes I made to PerMsgStatus
> (which also deal with HTML-encoded characters and
> double http protocols).
>
> -----------------cut----------------------
> --- PerMsgStatus.pm.orig        2004-04-25 12:50:05.000000000 +0200
> +++ PerMsgStatus.pm     2004-05-07 14:33:55.000000000 +0200
> @@ -44,7 +44,8 @@
>  use Mail::SpamAssassin::Conf;
>  use Mail::SpamAssassin::Received;
>  use Mail::SpamAssassin::Util;
> -
> +use HTML::Entities;
> +use MIME::QuotedPrint;
>  use constant HAS

