This is an update to yesterday's post on URLs that are not currently parsed by SA version 2.63.
Further cases:
6. MSN redirection service (g.msn.com)
Workaround for PerMsgStatus.pm: $uri =~ s/^http:\/\/g.msn.com\/[^\*]+\?http\:(.*)$/http\:$1/g;
7. Use of HTML escape sequences in the URL, e.g. http://toform.net/mcp/879/1352/cap112.html. To translate these into the equivalent ASCII characters, I have used HTML::Entities rather than reinvent the wheel.
Workaround for PerMsgStatus.pm: use HTML::Entities; $uri = HTML::Entities::decode($uri);
Here is a cumulative diff containing the workarounds for these and the previous cases. The diff is against PerMsgStatus.pm 2.63 already patched with SpamCopURI 0.09.
Hopefully someone can include these in version 3 and more elegantly....
diff PerMsgStatus.pm.orig PerMsgStatus.pm
----cut-------
45a47
> use HTML::Entities;
1777a1780,1789
> dbg("Got URI: $uri");
> $uri =~ s/\%68/h/g;
> $uri =~ s/\%74/t/g;
> $uri =~ s/\%70/p/g;
> $uri =~ s/http:\/([^\/])/http:\/\/$1/g;
> $uri =~ s/http:\/\/http:\/\//http:\/\//g;
> $uri =~ s/^http:\/\/(?:drs|rd).yahoo.com\/[^\*]+\*(.*)$/$1/g;
> $uri =~ s/^http:\/\/g.msn.com\/[^\*]+\?http\:(.*)$/http\:$1/g;
> $uri = HTML::Entities::decode($uri);
> dbg("URI after filter: $uri");
----cut-------
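For anyone who wants to try the substitutions outside SpamAssassin first, here is a rough standalone sketch. The normalize_uri name and the example.com sample URIs are made up for illustration and are not part of SA or SpamCopURI. The sketch also escapes the dots in the host names and uses decode_entities, which the patch does not, and it assumes HTML::Entities (from the HTML-Parser distribution) is installed.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Entities ();

# Hypothetical helper, not part of SpamAssassin: applies the same
# substitutions as the patch, in the same order.
sub normalize_uri {
    my ($uri) = @_;

    # Undo %-encoding of the letters h, t and p ("%68%74%74%70" -> "http").
    $uri =~ s/%68/h/g;
    $uri =~ s/%74/t/g;
    $uri =~ s/%70/p/g;

    # Repair a single slash after the scheme, then a doubled scheme.
    $uri =~ s{http:/([^/])}{http://$1}g;
    $uri =~ s{http://http://}{http://}g;

    # Strip Yahoo redirectors: keep whatever follows the "*".
    $uri =~ s{^http://(?:drs|rd)\.yahoo\.com/[^*]+\*(.*)$}{$1};

    # Strip the MSN redirector: keep the target after "?http:".
    $uri =~ s{^http://g\.msn\.com/[^*]+\?http:(.*)$}{http:$1};

    # Finally decode HTML entities such as "&#99;".
    my $decoded = HTML::Entities::decode_entities($uri);
    return $decoded;
}

# Made-up sample inputs, one per case.
for my $sample (
    '%68%74%74%70://example.com/',
    'http:/example.com/x',
    'http://rd.yahoo.com/abc/*http://example.com/',
    'http://g.msn.com/xyz/1?http://example.com/',
    'http://example.com/&#99;ap.html',
) {
    print "$sample => ", normalize_uri($sample), "\n";
}
```

Each sample should come out as a plain http://example.com/... URI. Note that entity decoding runs last, as in the patch.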
At 00:29 19/04/2004, John Fawcett wrote:
[snip]
Hi John,
Can you do a unified diff (instead of the kind you did) of your latest changes? Also, do you know if there is much difference between SpamCopURI 0.09 and 0.10 in the area you have changed? (I'm using the latter.)
Regards, Simon
----- Original Message ----- From: "Simon Byrnand"
See below for the unified diff. The changes are to PerMsgStatus.pm, an SA module, rather than to SpamCopURI. However, the same file has already been changed during the SpamCopURI install. The diff assumes the 0.09 install of SpamCopURI has already been done. I'll install 0.10 sometime in the next few days and let you know if I find any incompatibilities.
John
--- PerMsgStatus.pm.orig 2004-04-20 08:46:05.000000000 +0200
+++ PerMsgStatus.pm 2004-04-18 14:24:52.000000000 +0200
@@ -44,6 +44,7 @@
 use Mail::SpamAssassin::Conf;
 use Mail::SpamAssassin::Received;
 use Mail::SpamAssassin::Util;
+use HTML::Entities;

 use constant HAS_MIME_BASE64 => eval { require MIME::Base64; };

@@ -1776,6 +1777,16 @@
 $uri = "${base_uri}$uri";
 }
 }
+ dbg("Got URI: $uri");
+ $uri =~ s/\%68/h/g;
+ $uri =~ s/\%74/t/g;
+ $uri =~ s/\%70/p/g;
+ $uri =~ s/http:\/([^\/])/http:\/\/$1/g;
+ $uri =~ s/http:\/\/http:\/\//http:\/\//g;
+ $uri =~ s/^http:\/\/(?:drs|rd).yahoo.com\/[^\*]+\*(.*)$/$1/g;
+ $uri =~ s/^http:\/\/g.msn.com\/[^\*]+\?http\:(.*)$/http\:$1/g;
+ $uri = HTML::Entities::decode($uri);
+ dbg("URI after filter: $uri");

 # warn("Got URI: $uri\n");
 push @uris, $uri;