How about transforming a Geocities URL into a DNS query like this:
http://uk.geocities.com/xyz/
becomes
xyz.geocities.com.multi.surbl.org
And then we could SURBL check the individual pages within Geocities. In a way geocities.com is just another TLD.
John.
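A minimal sketch of the proposed lookup, assuming the account name (the first path component) is simply prepended to geocities.com before the SURBL zone. The function names are illustrative, and the exact signalling of path-encoded queries is left open later in the thread:

    # Illustrative sketch of the proposed Geocities-to-SURBL transformation.
    # Assumes the first path component (the account name) becomes an extra
    # label under geocities.com; the exact signalling is an open question.
    from urllib.parse import urlparse
    import socket

    def geocities_surbl_query(url, zone="multi.surbl.org"):
        path = urlparse(url).path                    # e.g. "/xyz/"
        account = path.strip("/").split("/")[0]      # "xyz"
        return f"{account}.geocities.com.{zone}"     # "xyz.geocities.com.multi.surbl.org"

    def is_listed(name):
        # SURBL-style lists answer listed names with an A record (127.0.0.x).
        try:
            socket.gethostbyname(name)
            return True
        except socket.gaierror:
            return False

    print(geocities_surbl_query("http://uk.geocities.com/xyz/"))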
On Thursday, December 1, 2005, 12:38:22 AM, John Graham-Cumming wrote:
How about transforming a Geocities URL into a DNS query like this:
http://uk.geocities.com/xyz/
becomes
xyz.geocities.com.multi.surbl.org
And then we could SURBL check the individual pages within Geocities. In a way geocities.com is just another TLD.
John.
Yes, that's an interesting suggestion that, IIRC, has been proposed before. It would require rewriting SURBL applications and some work on the data side also. It would also require a way to signal that paths should be checked and at what level in the query (i.e., what subdomain level) the path information is encoded.
My quick answer is that it's not the type of information we originally designed the systems to capture and check, and that the best solution would be for Yahoo to properly deal with their abuse issues. Hopefully they will start doing a better job of that.
Jeff C. -- Don't harm innocent bystanders.
Jeff Chan wrote:
Yes, that's an interesting suggestion that, IIRC, has been proposed before.
Apologies if it has; I did read the threads on Geocities in the archive, but couldn't see this as a possible solution.
It would require rewriting SURBL applications and some work on the data side also. It would also require a way to signal that paths should be checked and at what level in the query (i.e., what subdomain level) the path information is encoded.
All true, but I think the SURBL data itself would be backwards compatible with users who don't implement the geocities code since they'd just check geocities.com against the SURBL and come up ok.
John.
On Thursday, December 1, 2005, 1:58:39 AM, John Graham-Cumming wrote:
Jeff Chan wrote:
It would require rewriting SURBL applications and some work on the data side also. It would also require a way to signal that paths should be checked and at what level in the query (i.e., what subdomain level) the path information is encoded.
All true, but I think the SURBL data itself would be backwards compatible with users who don't implement the geocities code since they'd just check geocities.com against the SURBL and come up ok.
Yes, it probably would be backward compatible for most applications.
Jeff C. -- Don't harm innocent bystanders.
John Graham-Cumming wrote:
I did read the threads on geocities in the archive, but couldn't see this as a possible solution.
http://article.gmane.org/gmane.mail.spam.rbl.surbl/4889
Let's see how Jeff's Tripod experiment works; there are tons of Geocities URLs in spam, I get at least 10 per day.
SpamCop still almost always refuses to LART Yahoo, so that idea won't work as expected on sc.surbl.org at the moment.
Maybe score the mere string "geocities" like "viagra" (?)
Bye, Frank
On Friday, December 2, 2005, 7:24:19 AM, Frank Ellermann wrote:
John Graham-Cumming wrote:
I did read the threads on geocities in the archive, but couldn't see this as a possible solution.
Let's see how Jeff's Tripod experiment works; there are tons of Geocities URLs in spam, I get at least 10 per day.
SpamCop still almost always refuses to LART Yahoo, so that idea won't work as expected on sc.surbl.org at the moment.
Maybe score the mere string "geocities" like "viagra" (?)
Bye, Frank
Frank brings up a very good practical point. There may be hundreds or thousands of new Tripod and Geocities accounts being set up *per day* by spammers. ***Since these hosting accounts are free, they seem to be created and abused at a much higher rate than domain names.*** That means it may not be practical to try to blacklist them all, due to the large volume of additions.
Therefore it may be better to set up a rule for them, and/or for Yahoo and Lycos to do the right thing and get more active about shutting the accounts down.
Cheers,
Jeff C. -- Don't harm innocent bystanders.
Frank Ellermann wrote:
John Graham-Cumming wrote:
I did read the threads on geocities in the archive, but couldn't see this as a possible solution.
http://article.gmane.org/gmane.mail.spam.rbl.surbl/4889
Let's see how Jeff's Tripod experiment works; there are tons of Geocities URLs in spam, I get at least 10 per day.
SpamCop still almost always refuses to LART Yahoo, so that idea won't work as expected on sc.surbl.org at the moment.
Maybe score the mere string "geocities" like "viagra" (?)
Bye, Frank
Since it doesn't look like this will be implemented anytime soon, I've made my own auto-generated SpamAssassin rules for both Geocities and Tripod spam.
This list is similar in its principles to the good old BigEvilList ...
You can download and test it from there: http://nospam.mailpeers.net/
Feedback appreciated (good or bad, in or outside of the list) ...
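For readers unfamiliar with the approach, here is a rough sketch of how such rules could be auto-generated from a list of spamvertised URLs. The rule names, scores, and regex shape are guesses, not the actual contents of the published ruleset:

    # Hypothetical generator for BigEvil-style SpamAssassin uri rules from a
    # list of spamvertised free-hosting URLs. Rule names, scores and the
    # regex shape are illustrative assumptions, not the actual subevil.cf.
    from urllib.parse import urlparse

    def make_rule(url, idx, score=3.0):
        parsed = urlparse(url)
        host = parsed.hostname or ""
        account = parsed.path.strip("/").split("/")[0]
        site = "GEO" if "geocities" in host else "TRIPOD"
        name = f"SUBEVIL_{site}_{idx:04d}"
        target = f"{host}/{account}".replace(".", r"\.").replace("/", r"\/")
        return "\n".join([
            f"uri       {name} /{target}/",
            f"describe  {name} spamvertised free-hosting page {host}/{account}",
            f"score     {name} {score}",
        ])

    urls = ["http://uk.geocities.com/xyz/", "http://example.tripod.com/promo/"]
    print("\n\n".join(make_rule(u, i) for i, u in enumerate(urls, 1)))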
Hi!
Since it doesn't look like this will be implemented anytime soon, I've made my own auto-generated SpamAssassin rules for both Geocities and Tripod spam.
This list is similar in its principles to the good old BigEvilList ...
You can download and test it from there: http://nospam.mailpeers.net/
We are working with Tripod to get it all cleaned out, so if you have WORKING ones still, let us know and we'll pass them on. Most of it is already cleaned out.
Bye, Raymond.
On Friday, December 2, 2005, 3:39:56 PM, Raymond Dijkxhoorn wrote:
Hi!
Since it doesn't look like this will be implemented anytime soon, I've made my own auto-generated SpamAssassin rules for both Geocities and Tripod spam.
This list is similar in its principles to the good old BigEvilList ...
You can download and test it from there: http://nospam.mailpeers.net/
We are working with Tripod to get it all cleaned out, so if you have WORKING ones still, let us know and we'll pass them on. Most of it is already cleaned out.
AWESOME! Get 'em for us! :D
Jeff C. -- Don't harm innocent bystanders.
Raymond Dijkxhoorn wrote:
Since it doesn't look like this will be implemented anytime soon, I've made my own auto-generated SpamAssassin rules for both Geocities and Tripod spam. This list is similar in its principles to the good old BigEvilList ... You can download and test it from there: http://nospam.mailpeers.net/
We are working with Tripod to get it all cleaned out, so
Great !
if you have WORKING ones still, let us know and we'll pass them on. Most of it is already cleaned out.
Most of the URLs in my list are still alive and redirecting ...
I added an indicator in the 'describe' field telling you which sites are alive or closed. The score is now higher for closed ones, which stay on the list for 4 days before they are cleared (it's unlikely a spammer will keep spamming a closed site after more than 4 days). Those that are 'temporarily unavailable' (503) are not counted as closed.
The list is updated every hour.
Check it there: http://nospam.mailpeers.net/subevil.cf
Other spammy URLs can now be submitted on the main page at http://nospam.mailpeers.net/
Waiting for feedback ...
Bye, Raymond.
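A minimal sketch of the alive/closed check described above, assuming a plain HTTP fetch. The redirect-target keywords and timeout are illustrative; the real checker is the list author's own:

    # Sketch (not the actual checker) of the alive/closed classification:
    # 404 or a redirect to an error/policy page counts as closed, 503 as
    # temporarily unavailable (not closed), anything else as alive. Closed
    # entries would then be dropped from the list after 4 days.
    import urllib.request
    import urllib.error

    CLOSED_MARKERS = ("error", "policy")   # illustrative redirect-target keywords

    def check_url(url):
        try:
            with urllib.request.urlopen(url, timeout=15) as resp:
                final_url = resp.geturl().lower()
                if any(m in final_url for m in CLOSED_MARKERS):
                    return "closed"        # redirected to an error/policy page
                return "alive"
        except urllib.error.HTTPError as e:
            if e.code == 404:
                return "closed"
            if e.code == 503:
                return "unavailable"       # temporarily unavailable, keep listed
            return "alive"
        except urllib.error.URLError:
            return "unavailable"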
Eric Montréal wrote:
Raymond Dijkxhoorn wrote:
Since it doesn't look like this will be implemented anytime soon, I've made my own auto-generated SpamAssassin rules for both Geocities and Tripod spam. This list is similar in its principles to the good old BigEvilList ... You can download and test it from there: http://nospam.mailpeers.net/
We are working with Tripod to get it all cleaned out, so
Great !
if you have WORKING ones still, let us know and we'll pass them on. Most of it is already cleaned out.
Most of the URLs in my list are still alive and redirecting ...
Eric, could you post a clear text list of those active sites?
It would help to report to Lycos in case we're missing some.
thanks
Alex
Alex Broens wrote:
Eric Montréal wrote:
Most of the URLs in my list are still alive and redirecting ...
Eric, could you post a clear text list of those active sites?
It would help to report to Lycos in case we're missing some.
Here is the list :
http://nospam.mailpeers.net/alive_spammy.txt
WARNING: following those links with a browser might not be safe. Either use wget or, at the least, disable Java / JavaScript before visiting.
The list is updated once per hour; please don't download it more often than that.
If you have other spamvertised URLs at geocities or Tripod, please submit them using the form at http://nospam.mailpeers.net/
thanks
Enjoy !
Eric
On Friday, December 2, 2005, 3:31:41 PM, Eric Montréal wrote:
I've made my own auto-generated SpamAssassin rules for both Geocities and Tripod spam.
This list is similar in its principles to the good old BigEvilList ...
You can download and test it from there: http://nospam.mailpeers.net/
Feedback appreciated (good or bad, in or outside of the list) ...
This would work, but it could be somewhat difficult to maintain and distribute. Note that that doesn't mean I think it's a bad idea. Anything that reasonably stops spam is a positive IMO.
Jeff C. -- Don't harm innocent bystanders.
Jeff Chan wrote:
On Friday, December 2, 2005, 3:31:41 PM, Eric Montréal wrote:
I've made my own auto-generated SpamAssassin rules for both Geocities and Tripod spam.
This list is similar in its principles to the good old BigEvilList ...
You can download and test it from there: http://nospam.mailpeers.net/
Feedback appreciated (good or bad, in or outside of the list) ...
This would work, but it could be somewhat difficult to maintain
The list generation / maintenance is somewhat automated and will be even more so if there is enough interest in it.
What I would need now is a broader set of URLs, since I only have a partial view of what's going on at a global level.
and distribute.
Not sure what you mean by distribution, but if successful, it might need mirrors.
Note that that doesn't mean I think it's a bad idea. Anything that reasonably stops spam is a positive IMO.
Should work as the BigEvil list worked ... until it became too big and you found a way to solve the problem ...
In the meantime those rules are easy to add and don't require any change in SpamAssassin.
Jeff C.
Don't harm innocent bystanders.
I'll do my best ;-)
Eric
Hi All,
Any feedback on how effective this is ?
Regards Warren
----- Original Message ----- From: "Eric Montréal" erv@mailpeers.net To: "Jeff Chan" jeffc@surbl.org; "SURBL Discussion list" discuss@lists.surbl.org Sent: Sunday, December 04, 2005 11:57 AM Subject: Re: [SURBL-Discuss] Re: One way to handle the Geocities spam
Jeff Chan wrote:
On Friday, December 2, 2005, 3:31:41 PM, Eric Montréal wrote:
I've made my own auto-generated SpamAssassin rules for both Geocities and Tripod spam.
This list is similar in its principles to the good old BigEvilList ...
You can download and test it from there: http://nospam.mailpeers.net/
Feedback appreciated (good or bad, in or outside of the list) ...
This would work, but it could be somewhat difficult to maintain
The list generation / maintenance is somewhat automated and will be even more so if there is enough interest in it.
What I would need now is a broader set of URLs, since I only have a partial view of what's going on at a global level.
and distribute.
Not sure what you mean by distribution, but if successful, it might need mirrors.
Note that that doesn't mean I think it's a bad idea. Anything that reasonably stops spam is a positive IMO.
Should work as the BigEvil list worked ... until it became too big and you found a way to solve the problem ...
In the meantime those rules are easy to add and don't require any change in SpamAssassin.
Jeff C.
Don't harm innocent bystanders.
I'll do my best ;-)
Eric
Hi,
Warren Robinson wrote:
Hi All,
Any feedback on how effective this is ?
...
I've made my own auto-generated SpamAssassin rules for both Geocities and Tripod spam.
This list is similar in its principles to the good old BigEvilList ...
You can download and test it from there: http://nospam.mailpeers.net/
Feedback appreciated (good or bad, in or outside of the list) ...
I took a look some time ago, and I noticed some 440 rules, if I remember correctly.
The problem with this is that it may be efficient only for small servers, and you should clean up old unused rules.
What's nice with URLBL is that you read the message once to extract all URLs and after that, you query the database looking for the domain names you've found.
The problem with your rules is that the message will probably be scanned 441 times looking for complex Perl expressions.
But this is a general problem with SpamAssassin and its hundreds of rules.
In other words, with your rules, you'll greatly increase the scanning time without adding much effectiveness to your filter. Good for small servers!
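A small sketch of the contrast drawn above, with made-up rule and domain names: a URIBL-style check extracts the URLs once and then does cheap lookups, while a per-rule approach rescans the whole message for every rule:

    # Illustration of the two approaches: one pass to extract hosts followed
    # by cheap lookups, versus one full body scan per blacklist rule.
    import re

    URL_RE = re.compile(r"https?://([^/\s]+)", re.IGNORECASE)

    def uribl_style(message, listed_domains):
        # Extract hosts once, then membership tests (or DNS queries in practice).
        hosts = {m.group(1).lower() for m in URL_RE.finditer(message)}
        return any(h.endswith(d) for h in hosts for d in listed_domains)

    def per_rule_style(message, rules):
        # One full scan of the message per rule; cost grows with the rule count.
        return any(r.search(message) for r in rules)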
Hi,
Jose Marcio Martins da Cruz wrote:
I took a look some time ago, and I noticed some 440 rules, if I remember correctly.
The ruleset is between 300 and 500 rules.
The problem with this is that it may be efficient only for small servers, and you should clean up old unused rules.
URLs that are shut down (404) or redirected to 'error' or 'policy violation' pages are automatically removed after 4 days.
When I started, I thought that would be enough, since after a while the hosts (Geocities / Tripod) would cancel the spammy domains.
It works wonders with Tripod, who have usually already shut down the offending sites before I list them (maybe due to the work of Raymond Dijkxhoorn), but the situation with Geocities is much worse than I expected, to say the least.
If you have a look at the current list http://nospam.mailpeers.net/subevil.cf you'll see that of 372 rules only 2 are for Tripod (there might be a sampling problem here, or spammers are fed up with their pages being destroyed before the spam is even completely sent and have stopped abusing Tripod!) and the other 370 are for Geocities.
Among them, *** only 6 *** have been closed in the past 4 days! That's less than 2%.
No wonder spammers have been using Yahoo / Geocities for months and will keep doing so!
I also noticed most of the pages look very similar, so probably only a few spammers are behind them. This similarity makes the automated detection of their encoded relocation scripts trivial in about 85% of cases, but it raises new questions about the exact relationship between them and Geocities / Yahoo.
The complete list of spammy redirection pages Yahoo / Geocities is still hosting can be found here: http://nospam.mailpeers.net/alive_spammy.txt. Does anyone know where it should be sent at Yahoo?
------
If the situation does not improve, the list will obviously grow, and I'll have to figure out if old addresses are 'recycled' in new spam runs or not.
I just added a length-limited version (only the most recent 200 addresses) to address your server performance concerns while keeping most of its effectiveness. http://nospam.mailpeers.net/subevil200.cf
If there is enough interest, I can also add the code for rules merging. That would greatly improve both memory usage and speed.
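A sketch of what that merging might look like, assuming the per-site patterns are folded into a few alternation regexes; the chunk size and names are illustrative, not Eric's announced code:

    # Sketch of "rules merging": fold many per-site patterns into a handful
    # of alternation regexes so each message is scanned once per merged rule
    # instead of once per listed site. Chunk size and names are illustrative.
    import re

    def merge_rules(accounts, chunk_size=50):
        """accounts: list of (host, first_path_component) pairs."""
        merged = []
        for i in range(0, len(accounts), chunk_size):
            chunk = accounts[i:i + chunk_size]
            alternation = "|".join(
                re.escape(f"{host}/{path}") for host, path in chunk
            )
            merged.append(re.compile(alternation, re.IGNORECASE))
        return merged

    rules = merge_rules([("uk.geocities.com", "xyz"), ("www.geocities.com", "abc123")])
    print(any(r.search("http://uk.geocities.com/xyz/?id=1") for r in rules))   # True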
What's nice with URLBL is that you read the message once to extract all URLs and after that, you query the database looking for the domain names you've found.
The problem with your rules is that the message will probably be scanned 441 times looking for complex Perl expressions.
But this is a general problem with SpamAssassin and its hundreds of rules.
Yes, the explosion of bigevil.cf (over 1M!) and the need for a more efficient way were among the main reasons SURBL was created.
The reasons why those Geocities sites won't be integrated in SURBL were previously discussed. I still think it would be possible, but, at least for now, a .cf ruleset is the only way.
Also, contrary to bigevil, it should not expand indefinitely, but a large part of the problem (and the solution) is in the hands of Yahoo / Geocities. If they start cleaning up the mess, we'll have fewer entries, and their service will become less attractive to spammers. That's where I see the next battle.
In other words, with your rules, you'll greatly increase the scanning time without adding much effectiveness to your filter. Good for small servers!
The problem with those Geocities spams is that they tend to generate a very low score and go undetected. Sure, the rules won't hit a lot of the total server mail load, but the hits they do get will contribute significantly to reducing false negatives (15 to 20% on my server; that's what got me interested in a solution).
On servers such as mine, where I have a 4-level response (3 levels of tagging + quarantine if > 20), it also helps push the mildly spammy ones above 20 so my users won't even be bothered by them.
One feature that might boost performance for this kind of ruleset would be conditional rule skipping.
With a single test on (geocities|tripod), it would be fast and easy to skip all the other tests for the 95% of all mail that references neither of these sites.
Improving SpamAssassin's performance is very important, but it is beyond the scope of my simple ruleset.
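Still, the conditional-skipping idea can be sketched outside SpamAssassin as a cheap pre-filter; this is only an illustration of the concept, with an assumed per-hit score:

    # Sketch of conditional rule skipping: one cheap test for
    # (geocities|tripod) gates the expensive per-site rules, so the ~95% of
    # mail without those hosts skips them entirely. Score value is assumed.
    import re

    PREFILTER = re.compile(r"(geocities|tripod)\.com", re.IGNORECASE)

    def score_message(message, site_rules, per_hit_score=3.0):
        if not PREFILTER.search(message):   # fast path for most mail
            return 0.0
        return sum(per_hit_score for r in site_rules if r.search(message))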
Regards,
Eric.
Warren Robinson wrote:
Hi All,
Any feedback on how effective this is ?
It would be good to share these. Most of those I've looked at are of the form *.geocities.com/*/?, so I can just block these since /? is very rare in general.
Of course, it would be better if Yahoo tracked the args to any page to detect those which correspond to the actual spam, and they could also run a content filter against those which get hit too much.
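An illustrative regex for the shape described above (a geocities.com URL whose single path component is followed by "/?"); the actual GeocitiesRd rule mentioned below may be written differently:

    # Illustrative regex for *.geocities.com/*/? style URLs; the real
    # GeocitiesRd rule in subevil.cf may differ.
    import re

    GEO_TRAILING_QMARK = re.compile(
        r"https?://[^/\s]*geocities\.com/[^/?\s]+/\?", re.IGNORECASE
    )

    print(bool(GEO_TRAILING_QMARK.search("http://uk.geocities.com/xyz/?")))   # True
    print(bool(GEO_TRAILING_QMARK.search("http://uk.geocities.com/xyz/")))    # False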
mouss wrote:
Warren Robinson wrote:
Hi All, Any feedback on how effective this is ?
It would be good to share these. Most of those I've looked at are of the form *.geocities.com/*/?, so I can just block these since /? is very rare in general.
The rule called GeocitiesRd in the rule set does just that : http://nospam.mailpeers.net/subevil.cf
Of course, it would be better if Yahoo tracked the args to any page to detect those which correspond to the actual spam,
I agree, but the new spams generally don't use this tracking system anymore, so it's becoming less useful ...
The encoded JavaScript redirector most of those pages contain is a piece of cake to detect, and as the saying goes, where there's a will, there's a way.
I even prepared a list of live spam URLs to make it easier for them: http://nospam.mailpeers.net/alive_spammy.txt
When will Yahoo / Geocities stop protecting spammers on their network ?
Only Yahoo can answer the question, and we've already been waiting far too long ...
and they could also run a content filter against those which get hit too much.
I suspect they don't get that many hits, so it would not be a good indicator. If they had a sizable load, the sites would be 'temporarily disabled', since the hourly traffic on free Geocities sites is capped at around 3 megs / hour ...
Eric