Randy Brukardt of rrsoftware.com mentioned that checking plain domains occurring in message bodies against SURBLs was pretty productive. (E.g., look for domain.com in addition to www.domain.com or http://www.domain.com).
Perhaps this could be something interesting to at least try experimentally or to think about.
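For anyone who wants to try this, here is a minimal sketch of how a plain domain might be turned into a SURBL lookup. The function names are hypothetical, the zone `multi.surbl.org` is assumed to be the combined list, and stripping a leading "www." is only a crude stand-in for reducing a host to its registered domain.

```python
import socket


def surbl_query_name(host, zone="multi.surbl.org"):
    """Build the DNS name to query for a host (hypothetical helper)."""
    host = host.strip().rstrip(".").lower()
    # Crude simplification: treat www.domain.com and domain.com alike.
    if host.startswith("www."):
        host = host[4:]
    return f"{host}.{zone}"


def is_listed(host, zone="multi.surbl.org"):
    """Return True if the lookup resolves to a 127.x.x.x answer."""
    try:
        answer = socket.gethostbyname(surbl_query_name(host, zone))
    except socket.gaierror:
        # Name did not resolve: treat as not listed.
        return False
    return answer.startswith("127.")
```

With this, `surbl_query_name("www.domain.com")` and `surbl_query_name("domain.com")` both produce the same query name, which is the point of checking plain domains as well.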
Jeff C.
Jeff Chan wrote to SpamAssassin Developers:
Randy Brukardt of rrsoftware.com mentioned that checking plain domains occurring in message bodies against SURBLs was pretty productive. (E.g., look for domain.com in addition to www.domain.com or http://www.domain.com).
Perhaps this could be something interesting to at least try experimentally or to think about.
Yep. Good idea, overall. There are a few gotchas:
TLDs sometimes double as file extensions (.com, .pl). We might have to whitelist command.com, and the entire country of Poland. :-)
Looking at the above sentence, leading/trailing punctuation might be a potential snag. E.g.: 4 cheap pillz, go to somethingsleazy.com, and give us your money.
Since the domain is in plain text and doesn't contain a protocol or subdomain (e.g., 'www'), I haven't yet seen a mail client that will display it as a clickable URL. Thus, with this, we're probably mostly fighting the "type this in" or "cut and paste into your browser" type of spammer. So, if we do this, implementers could force spammers to obfuscate the domains beyond recognition. They'll have to do their own munging, and we might try to catch it, but that's risky. Consider what deobfuscation could falsely match: "i looked on the boss' computer and found porn. info forthcoming...", or even, "spammer dot com operations are a plague on civilized nations".
Any implementations will probably have to run against large ham corpora to see if anything like the above becomes falsely *extracted* as a URI, regardless of whether the current data happens to cause a FP.
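To make the gotchas above concrete, here is a rough sketch of a strict extractor for plain domains. Everything in it is an assumption for illustration: the tiny `TLDS` set, the `FILE_LIKE` skip list, and the `AMBIGUOUS_TLDS` set are hypothetical placeholders, and a real implementation would need a full TLD list and testing against ham corpora.

```python
import re

# Hypothetical, deliberately tiny TLD set; not a real TLD list.
TLDS = {"com", "net", "org", "info", "biz", "pl"}

# Hypothetical skip list: names that look like domains but are usually files.
FILE_LIKE = {"command.com"}

# TLDs that double as common file extensions (e.g. Perl scripts end in .pl).
AMBIGUOUS_TLDS = {"pl"}

# A token made only of letters, digits, dots and hyphens.
CANDIDATE = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9.-]*[A-Za-z0-9])?")


def extract_bare_domains(text):
    """Return plain domain.tld tokens found in text, strictly matched."""
    found = []
    for token in text.split():
        # Drop leading/trailing punctuation such as the commas around
        # "somethingsleazy.com," in running prose.
        token = token.strip(".,;:!?\"'()<>[]{}").lower()
        if not CANDIDATE.fullmatch(token):
            continue
        labels = token.split(".")
        if len(labels) < 2 or "" in labels:
            continue
        if labels[-1] not in TLDS or labels[-1] in AMBIGUOUS_TLDS:
            continue
        if token in FILE_LIKE:
            continue
        found.append(token)
    return found
```

Because tokens are split on whitespace and nothing is deobfuscated, "found porn. info forthcoming..." yields no match here, while "go to somethingsleazy.com, and" yields the domain without its trailing comma.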
I'd advise keeping implementations simple and strict by default (i.e., no deobfuscation; maybe just clickable links only), and allowing the user to control the amount of fuzziness they'd like to match on.
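One way to picture "strict by default, fuzzier on request" is a user-selectable level, sketched below. The `Strictness` names, the `HOST` pattern, and `extract_hosts` are all hypothetical and are not taken from any existing plugin; the patterns are simplified and do not validate TLDs.

```python
import re
from enum import IntEnum


class Strictness(IntEnum):
    LINKS_ONLY = 0    # only explicit http:// or https:// URLs (default)
    WWW_HOSTS = 1     # also hosts written as www.something.tld
    BARE_DOMAINS = 2  # also plain domain.tld in running text


# Simplified host pattern: dotted labels ending in an alphabetic label.
HOST = r"(?:[a-z0-9-]+\.)+[a-z]{2,}"

PATTERNS = {
    Strictness.LINKS_ONLY: re.compile(r"https?://(" + HOST + ")", re.I),
    Strictness.WWW_HOSTS: re.compile(r"\b(www\." + HOST + ")", re.I),
    Strictness.BARE_DOMAINS: re.compile(r"\b(" + HOST + ")", re.I),
}


def extract_hosts(text, level=Strictness.LINKS_ONLY):
    """Collect hosts using every pattern at or below the chosen level."""
    hosts = set()
    for lvl, pattern in PATTERNS.items():
        if lvl <= level:
            hosts.update(m.lower() for m in pattern.findall(text))
    return hosts
```

Usage: `extract_hosts(body)` keeps the conservative default, and a user who accepts more false positives can pass `level=Strictness.BARE_DOMAINS`.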
- Ryan
Hi!
Looking at the above sentence, leading/trailing punctuation might be a potential snag. I.e.: 4 cheap pillz, go to somethingsleazy.com, and give us your money.
Since the domain is in plain text and doesn't contain a protocol or subdomain (i.e., 'www'), I haven't yet seen a mail client that will display it as a clickable URL. Thus, with this, we're probably mostly
It was clickable in pine, at least...
Bye, Raymond.
Raymond Dijkxhoorn wrote to SURBL Discussion list:
Hi!
Since the domain is in plain text and doesn't contain a protocol or subdomain (i.e., 'www'), I haven't yet seen a mail client that will display it as a clickable URL. Thus, with this, we're probably mostly
It was clickable in pine, at least...
Really? How? Do you mean double-click to highlight? :-)
My (UNIX) PINE, when running in an x-term (i.e., locally), allows clicking on anchor'd URIs in HTML parts, and then it just calls VIEWER, if defined. I've yet to see it highlight anything in plain text. Remote PINE doesn't support any sort of mouse input.
What kind of PINE are you using? Looks like PINE 4.61 compiled for linux, from your headers.
- Ryan
Hi!
Since the domain is in plain text and doesn't contain a protocol or subdomain (i.e., 'www'), I haven't yet seen a mail client that will display it as a clickable URL. Thus, with this, we're probably mostly
It was clickable in pine, at least...
Really? How? Do you mean double-click to highlight? :-)
Just clickable, like any other URL inside pine...
View selected URL "http://www'" ?
My (UNIX) PINE, when running in an x-term (i.e., locally), allows clicking on anchor'd URIs in HTML parts, and then it just calls VIEWER, if defined. I've yet to see it highlight anything in plain text. Remote PINE doesn't support any sort of mouse input.
What kind of PINE are you using? Looks like PINE 4.61 compiled for linux, from your headers.
Yups, 4.61, linux.
Mouse input isn't really needed, you can use tab ;)
Bye, Raymond.
Raymond Dijkxhoorn wrote to SURBL Discussion list:
Hi!
Since the domain is in plain text and doesn't contain a protocol or subdomain (i.e., 'www'), I haven't yet seen a mail client that will display it as a clickable URL. Thus, with this, we're probably mostly
It was clickable in pine, at least...
Really? How? Do you mean double-click to highlight? :-)
Just clickable, like any other URL inside pine...
View selected URL "http://www'" ?
ROFLMAO. OK, I was able to reproduce this behaviour.
wwwhat?!
View selected URL "http://wwwhat?!" ?
Heh.
*That* one. You have enable-msg-view-web-hostnames enabled, and that highlights anything starting with 'www'. <-- including just "www'". :-) And, note the trailing single quote that PINE's viewer snags. www' isn't too exciting. Neither is wwwhat?!
So, www.domain.com, or wwwabbits.com would be considered URLs, but plain domain.com is still not considered a URL by my PINE, no matter what viewer preferences I set. Does that confirm what you see, too?
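The difference can be imitated with two regular expressions. This is only a sketch of the behaviour described above, not PINE's actual code; the `LOOSE` and `STRICT` patterns and both function names are made up for illustration.

```python
import re

# Loose: any whitespace-delimited token that starts with "www",
# roughly the behaviour observed above.
LOOSE = re.compile(r"(?<!\S)www\S*", re.I)

# Stricter: "www." followed by dotted labels and an alphabetic final label.
STRICT = re.compile(r"(?<!\S)www\.(?:[a-z0-9-]+\.)+[a-z]{2,}", re.I)


def loose_hits(text):
    """Tokens a 'starts with www' heuristic would treat as URLs."""
    return LOOSE.findall(text)


def strict_hits(text):
    """Tokens that actually look like www.host.tld."""
    return STRICT.findall(text)
```

Neither pattern picks up a plain domain.com, which matches what I see: the hostname option only keys on the leading 'www'.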
We probably don't need to take PINE's viewer too seriously in this context. :-)
Mouse input isn't really needed, you can use tab ;)
Sure. In this day and age, it's probably best to talk in non-input-specific terms, eh? :-)
- Ryan