I found the following domains listed in a ~20K ham corpus from the last couple of days:
Domain                    Age in Days   Score      (#msgs)
attac.biz                 826           -3.3333    (1)
blah.com                  3462          -3.3333    (1)
chartshop.com             2275          -3.3333    (1)
publicaster.com           965           -3.3333    (1)
resortvacationstogo.com   1849          -3.3333    (1)
send4fun.com              1681          -30.0000   (1)
surveymonkey.com          1770          -3.3333    (4)
topcities.com             1926          -2.5000    (1)
whtirc.com                246           -3.2491    (1)
whtradio.com              196           -2.3630    (1)
Full GetURI output: http://ry.ca/geturi/runs/20040910-fps.html
Quick look:
attac.biz Looks real fishy, but appeared in some travel newsletter. Related to eturbonews.com. Maybe de-list, but don't whitelist.
blah.com This was in a ProFTPd mailing list message: "how do I set up a virtual host for blah.com".
chartshop.com Astrology.com newsletter. People really do sign up for those things.
surveymonkey.com AFAICT, mostly legit surveys
send4fun.com Jokes, example was person to person links
topcities.com Free subdomain host.
whtirc.com, whtradio.com Web Hosting Talk newsletter. Yes, it's legit.
publicaster.com Used in some legit newsletters/mass media
resortvacationstogo.com Looks mostly legit, and they've been around for 1800+ days without any NANAS hits at all. Related to vacationstogo.com.
With the exception of attac.biz, I'd say whitelist the lot of these, unless anyone knows some reason why not. :-)
The *real* cool part is GetURI (devel version) actually processed all of these messages in one run, producing ~4MB of output. Mozilla, on the other hand, crashed rather unceremoniously. Be thankful I re-ran on only the messages that generated the FPs. :-)
Also, if someone wants to go through the *other* domains (i.e., those on a grey (not blue) background), there are probably quite a few other whitelist candidates there. (gc.ca, for instance :-)
- Ryan
On Friday, September 10, 2004, 10:43:55 PM, Ryan Thompson wrote:
I found the following domains listed in a ~20K ham corpus from the last couple of days:
Domain                    Age in Days   Score      (#msgs)
attac.biz                 826           -3.3333    (1)
blah.com                  3462          -3.3333    (1)
chartshop.com             2275          -3.3333    (1)
publicaster.com           965           -3.3333    (1)
resortvacationstogo.com   1849          -3.3333    (1)
send4fun.com              1681          -30.0000   (1)
surveymonkey.com          1770          -3.3333    (4)
topcities.com             1926          -2.5000    (1)
whtirc.com                246           -3.2491    (1)
whtradio.com              196           -2.3630    (1)
Full GetURI output: http://ry.ca/geturi/runs/20040910-fps.html
Quick look:
attac.biz Looks real fishy, but appeared in some travel newsletter. Related to eturbonews.com. Maybe de-list, but don't whitelist.
blah.com This was in a ProFTPd mailing list message: "how do I set up a virtual host for blah.com".
chartshop.com Astrology.com newsletter. People really do sign up for those things.
surveymonkey.com AFAICT, mostly legit surveys
send4fun.com Jokes, example was person to person links
topcities.com Free subdomain host.
whtirc.com, whtradio.com Web Hosting Talk newsletter. Yes, it's legit.
publicaster.com Used in some legit newsletters/mass media
resortvacationstogo.com Looks mostly legit, and they've been around for 1800+ days without any NANAS hits at all. Related to vacationstogo.com.
With the exception of attac.biz, I'd say whitelist the lot of these, unless anyone knows some reason why not. :-)
These all look relatively ok to me. I'm whitelisting them.
Note that there is some slight overlap with the "Large ham corpus" and another recent FP report, in the astrology, surveymonkey, and blah.com mentions.
Counter-arguments welcome, but they must include some evidence that these don't appear in hams.
Jeff C.
Hi!
blah.com This was in a ProFTPd mailing list message: "how do I set up a virtual host for blah.com".
Uhm, if people need to type 'some' domain, it's often blah.com or foo.com; that's common use, and the spammer using blah.com takes nice advantage of this.
These all look relatively ok to me. I'm whitelisting them.
Note that there is some slight overlap with the "Large ham corpus" and another recent FP report, in the astrology, surveymonkey, and blah.com mentions.
Counter-arguments welcome, but they must include some evidence that these don't appear in hams.
Oh well, if I get fresh ones from blah.com I'll let you know; so far I have not.
Bye, Raymond.
Raymond Dijkxhoorn wrote to Jeff Chan and SURBL Discussion list:
Hi!
blah.com This was in a ProFTPd mailing list message: "how do I set up a virtual host for blah.com".
Uhm, if people need to type 'some' domain, it's often blah.com or foo.com; that's common use, and the spammer using blah.com takes nice advantage of this.
Yeah, and, unfortunately, there isn't much we can do about that to list them, since they'll definitely be used in ham.
- Ryan
On Saturday, September 11, 2004, 10:47:03 AM, Ryan Thompson wrote:
Raymond Dijkxhoorn wrote to Jeff Chan and SURBL Discussion list:
blah.com This was in a ProFTPd mailing list message: "how do I set up a virtual host for blah.com".
Uhm, if people need to type 'some' domain, it's often blah.com or foo.com; that's common use, and the spammer using blah.com takes nice advantage of this.
Yeah, and, unfortunately, there isn't much we can do about that to list them, since they'll definitely be used in ham.
I'm making a list of generic placeholder domains that people use in examples, with the idea to whitelist them all to prevent FPs.
The good news is that most of them seem to be very old vanity registrations, so they're probably not going to be taken over by spammers any time soon. Here are a few I could think of:
domain.com domainname.com mydomain.com somedomain.com foo.com foobar.com example.com example.us
Can anyone think of others commonly used in examples?
Jeff C.
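A minimal sketch of the exact-match whitelisting described above, in Python. This is an illustration only, not SURBL or GetURI code: the names `PLACEHOLDER_WHITELIST`, `base_domain` and `is_whitelisted` are hypothetical, the entries are just the ones proposed in the message above, and reducing a hostname to its last two labels is a simplifying assumption that would be wrong for names under something like co.uk.

```python
# Hypothetical exact-match whitelist; entries are the placeholder
# domains proposed in the message above, nothing more.
PLACEHOLDER_WHITELIST = frozenset({
    "domain.com", "domainname.com", "mydomain.com", "somedomain.com",
    "foo.com", "foobar.com", "example.com", "example.us",
})


def base_domain(host: str) -> str:
    """Naively reduce a hostname to its last two labels.

    Assumption: no special handling of multi-label suffixes (co.uk etc.).
    """
    labels = host.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def is_whitelisted(host: str) -> bool:
    """True if the host's base domain is an exact whitelist entry."""
    return base_domain(host) in PLACEHOLDER_WHITELIST
```

With this sketch, `is_whitelisted("www.Foo.com")` is true, while a name that merely resembles an entry, such as `foo.net`, is not matched because the lookup is exact and not wildcarded.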
Hi Jeff,
At 16:08 11-09-2004, Jeff Chan wrote:
domain.com domainname.com mydomain.com somedomain.com
Can anyone think of others commonly used in examples?
Yes, block all mail from people who use these domains as examples.
Regards, -sm
On Sat, Sep 11, 2004 at 04:08:15PM -0700, Jeff Chan wrote:
I'm making a list of generic placeholder domains that people use in examples, with the idea to whitelist them all to prevent FPs. The good news is that most of them seem to be very old vanity [...] Can anyone think of others commonly used in examples?
The same with org and net. And edu, but we can ignore those.
Are you sure we want to whitelist 'em all pre-emptively? Some of these combos seem to be for sale. (Want to buy spammer.net?)
Not sure it's "commonly" but off the top of my head:
baz (comes after foo, bar) localdomain internet intranet outside inside local site test evil good safe hacker cracker spammer blackhat us them friend friendly enemy hostile private secret secure black white red green one two three alpha beta gamma omega big acme wireless
On Sat, 11 Sep 2004, Jeff Chan wrote:
On Saturday, September 11, 2004, 10:47:03 AM, Ryan Thompson wrote:
Raymond Dijkxhoorn wrote to Jeff Chan and SURBL Discussion list:
blah.com This was in a ProFTPd mailing list message: "how do I set up a virtual host for blah.com".
Uhm, if people need to type 'some' domain, it's often blah.com or foo.com; that's common use, and the spammer using blah.com takes nice advantage of this.
Yeah, and, unfortunately, there isn't much we can do about that to list them, since they'll definitely be used in ham.
I'm making a list of generic placeholder domains that people use in examples, with the idea to whitelist them all to prevent FPs.
Um, people, this precise issue is ancient history. The IETF community ran into this buzz-saw years ago, and there is now a set of reserved domain names -just- for sake of documentation & examples.
See RFC 2606, and check out the registration info for "example.com".
So we just need to encourage people to use the reserved names (e.g. "example.com") rather than 'foo/bar/blah.com'.
On Saturday, September 11, 2004, 10:37:57 PM, David Funk wrote:
Um, people, this precise issue is ancient history. The IETF community ran into this buzz-saw years ago, and there is now a set of reserved domain names -just- for sake of documentation & examples.
See RFC 2606, and check out the registration info for "example.com".
So we just need to encourage people to use the reserved names (e.g. "example.com") rather than 'foo/bar/blah.com'.
Thanks Dave. I'll add the ones mentioned in RFC 2606:
example.com example.net example.org
but we should also include common examples from everyday use, even if they're not technically RFC compliant.
Jeff C.
Jeff Chan wrote:
I'll add the ones mentioned in RFC 2606: example.com example.net example.org but we should also include common examples from everyday use, even if they're not technically RFC compliant.
Be very careful with this. I know that test.de and invalid.de are okay, but generally you can't fix all silly abuse ideas.
It's better to support 2606 (example.cno, i.e. example.com/.net/.org, plus *.example, *.test, *.invalid, *.localhost). For the TLDs that's no problem; you already know that example/test/invalid/localhost are reserved. And *.local isn't better (but not officially reserved).
But something like, say, local.tv, who knows? Maybe they are spammers, maybe they don't exist, maybe they are okay. One SLD rarely or never abused might be NIC.* in all existing TLDs.
Bye, Frank
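For reference, the RFC 2606 names mentioned above (the reserved TLDs .test, .example, .invalid, .localhost, and the second-level names example.com, example.net, example.org) can be checked with a short Python sketch. The function name `is_rfc2606_reserved` and the constants are hypothetical and only illustrate the rule; nothing here is part of any existing tool.

```python
# Names reserved by RFC 2606 for testing and documentation.
RESERVED_TLDS = frozenset({"test", "example", "invalid", "localhost"})
RESERVED_SLDS = frozenset({"example.com", "example.net", "example.org"})


def is_rfc2606_reserved(host: str) -> bool:
    """True if host falls under an RFC 2606 reserved TLD or reserved SLD."""
    labels = host.lower().rstrip(".").split(".")
    # Anything under a reserved TLD is reserved, e.g. foo.test.
    if labels[-1] in RESERVED_TLDS:
        return True
    # Otherwise only the three example.* second-level names qualify.
    return ".".join(labels[-2:]) in RESERVED_SLDS
```

Note that names such as local.tv or example.us are not covered by this check, which matches the point made above that they are ordinary registrable names.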
On Sunday, September 12, 2004, 11:04:57 AM, Frank Ellermann wrote:
Jeff Chan wrote:
I'll add the ones mentioned in RFC 2606: example.com example.net example.org but we should also include common examples from everyday use, even if they're not technically RFC compliant.
Be very careful with this. I know that test.de and invalid.de are okay, but generally you can't fix all silly abuse ideas.
We probably would not be whitelisting geographic names like these.
It's better to support 2606 (example.cno, i.e. example.com/.net/.org, plus *.example, *.test, *.invalid, *.localhost). For the TLDs that's no problem; you already know that example/test/invalid/localhost are reserved. And *.local isn't better (but not officially reserved).
We won't be whitelisting any entire TLDs, even these test ones.
But something like, say, local.tv, who knows? Maybe they are spammers, maybe they don't exist, maybe they are okay. One SLD rarely or never abused might be NIC.* in all existing TLDs.
There's no capability for wildcarding whitelist entries, nor do I plan to add any. local.tv would be a geographic name which we would not generically whitelist.
Jeff C.
Jeff Chan wrote:
we should also include common examples from everyday use,
Okay, so now you have example.cno (example.com/.net/.org); what else? IMHO the three example.cno form a complete list. There are no other common examples, and as you say local.* / test.* / example.* don't qualify as placeholders:
We probably would not be whitelisting geographic names like these.
local.museum, test.aero, or example.coop are not better than similar SLDs in any ccTLD. AFAIK some TLDs had a policy to reserve "all" TLDs as SLDs, but that was before the creation of new TLDs (aero / biz / coop / info / museum / name / pro).
Bye, Frank