On Friday, August 20, 2004, 9:24:10 AM, Mike Atkinson wrote:
On 8/19/2004 at 4:20 PM, Jeff Chan wrote:
Does anyone mind if we try this out and see what effect it has on traffic?
Go for it!
OK, I'm changing the TTLs on all the SURBL lists to 25 minutes, which is a midway point between the 20 minutes I was going to try for SC and the 30 minutes we proposed for the rest of the lists. 25 minutes is also a good sampling interval for capturing data that has a 1 hour characteristic period.
This is mainly an experiment to see what effect it has on DNS traffic. If the traffic on SC is not a lot lower, I'll change it back to 10 minutes. If the traffic on the other lists is not a lot higher, I'll leave it at 25 minutes.
Jeff C.
On Friday, August 20, 2004, 3:11:20 PM, Jeff Chan wrote:
On Friday, August 20, 2004, 9:24:10 AM, Mike Atkinson wrote:
On 8/19/2004 at 4:20 PM, Jeff Chan wrote:
Does anyone mind if we try this out and see what effect it has on traffic?
Go for it!
OK, I'm changing the TTLs on all the SURBL lists to 25 minutes, which is a midway point between the 20 minutes I was going to try for SC and the 30 minutes we proposed for the rest of the lists. 25 minutes is also a good sampling interval for capturing data that has a 1 hour characteristic period.
This is mainly an experiment to see what effect it has on DNS traffic. If the traffic on SC is not a lot lower, I'll change it back to 10 minutes. If the traffic on the other lists is not a lot higher, I'll leave it at 25 minutes.
Summary of different TTLs we've tried in the past two weeks:
Week 1: SC 10 minute TTLs, other zones 1 hour
Week 2: All zones 25 minute TTLs
After one week of the zones at 25 minute TTLs, I think we can see the following patterns:
http://nmrl.kconline.com/rbldnsd/
SC traffic is about one sixth lower (2500 average weekday queries per 5 minutes versus about 3000)
Other formerly 1 hour zones are nearly unchanged or only slightly higher in traffic at the reduced 25 minute TTL.
multi is a special case since it's the default SpamAssassin 3.0 list. It seems to be rising, but that's perhaps best explained by growth in SA 3 adoption, especially given that the other individual lists like ws, be, ab, ob (which SA 3 does not use) look relatively unchanged.
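To put the sc numbers in proportion, here is a small sketch of the arithmetic using only the two approximate averages quoted above (3000 and 2500 queries per 5 minutes). The sub name `pct_change` is made up for illustration. The same gap reads as about one sixth lower when measured from the 10 minute figure, or about 20% higher when measured from the 25 minute figure.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Percentage change when going from $from to $to queries per 5 minutes.
sub pct_change {
    my ($from, $to) = @_;
    return 100 * ($to - $from) / $from;
}

# Approximate weekday averages for sc quoted above (queries per 5 minutes).
my ($at_10_min, $at_25_min) = (3000, 2500);

printf "10 -> 25 minute TTL: %+.1f%%\n", pct_change($at_10_min, $at_25_min);  # -16.7%
printf "25 -> 10 minute TTL: %+.1f%%\n", pct_change($at_25_min, $at_10_min);  # +20.0%
```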
So I'd like to propose reducing the TTLs further from 25 to 20 minutes. If traffic across all lists rises a lot, then we can probably say 25 to 20 minutes is a critical point. We know that traffic is significantly higher at 10 minutes from SC, but let's see if it goes up much at 20.
I'm going to go ahead and make this change now unless there are major objections. If the traffic at 20 is not up a lot, perhaps we'll try 15 minute TTLs next week. At some point it should rise and we would back off. But we don't need to try 10 minutes since we already know it results in higher traffic.
Comments?
BTW I'm aware that doing this on a Friday evening is not the best way to get feedback. ;-)
Jeff C.
Jeff,
How are these stats being taken and converted to mrtg? Been looking for something like this for a loooooong time... (for the non programmer)
Thanks
Alex
On Saturday, August 28, 2004, 12:48:21 AM, Alex Broens wrote:
How are these stats being taken and converted to mrtg? Been looking for something like this for a loooooong time... (for the non programmer)
Hi Alex, Mike Atkinson described how he did this on the zones list. Presumably he doesn't mind me copying it below.
Jeff C.
__
Ok, first, rbldnsd needs some command line flags to cause it to dump some stats to a file. Normally the stats are logged at the same interval that the daemon checks for zone file updates and the stats are cumulative rather than per interval. So I inserted the following into my startup command line for rbldnsd:
-c 300 -s +rbldnsd.stat
('-c 300' sets the zone file update check, and therefore the logging interval, to 300 seconds. '-s +rbldnsd.stat' names the log file that will be created in the rbldnsd directory; it won't take a path for some reason. The + at the beginning of the file name makes each output cover the period since the last output rather than being cumulative since the program started.)
Some Perl (it could be more elegant) to parse out the numbers we want, named 'rbldnsdstat.pl':
----<cut>-----
#!/usr/bin/perl

# Put the location of your rbldnsd stats file next
$logfile = "/usr/local/etc/rbldnsd/rbldnsd.stat";

# Require the number of the list to report on
if (!$ARGV[0]) {
    print "\nUsage: rbldnsdstat.pl number_of_list_to_parse\n";
    exit(1);
}

$rbl_list = $ARGV[0];

# Some systems might need the full path for 'tail'
$line = `tail -1 $logfile`;

# The last stats line is split on colons
@rbldnsd_data = split(/:/,$line);

# Get the data for the 1st list, or 2nd list, etc.
# Each list is five fields further along than the previous one.
if    ($rbl_list == 1)  { print "$rbldnsd_data[2]\n$rbldnsd_data[1]\n"; }
elsif ($rbl_list == 2)  { print "$rbldnsd_data[7]\n$rbldnsd_data[6]\n"; }
elsif ($rbl_list == 3)  { print "$rbldnsd_data[12]\n$rbldnsd_data[11]\n"; }
elsif ($rbl_list == 4)  { print "$rbldnsd_data[17]\n$rbldnsd_data[16]\n"; }
elsif ($rbl_list == 5)  { print "$rbldnsd_data[22]\n$rbldnsd_data[21]\n"; }
elsif ($rbl_list == 6)  { print "$rbldnsd_data[27]\n$rbldnsd_data[26]\n"; }
elsif ($rbl_list == 7)  { print "$rbldnsd_data[32]\n$rbldnsd_data[31]\n"; }
elsif ($rbl_list == 8)  { print "$rbldnsd_data[37]\n$rbldnsd_data[36]\n"; }
elsif ($rbl_list == 9)  { print "$rbldnsd_data[42]\n$rbldnsd_data[41]\n"; }
elsif ($rbl_list == 10) { print "$rbldnsd_data[47]\n$rbldnsd_data[46]\n"; }
else {
    print "\nInput argument out of range..\n";
    print "Edit the script if more than 9 (plus totals) lists to check...\n\n";
}

exit(0);
----<cut>-----
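The field numbers in the script follow a regular pattern: list n reads fields 5(n-1)+2 and 5(n-1)+1 of the colon-split line. A minimal sketch of that index arithmetic follows, assuming the same five-fields-per-list layout the script implies. The sub name `fields_for_list` and the sample line are made up for illustration and are not real rbldnsd output.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the two counters for list number $n, in the
# same order the script prints them (list 1 -> fields 2 and 1,
# list 2 -> fields 7 and 6, and so on). Returns an empty list when the
# line has no such list.
sub fields_for_list {
    my ($line, $n) = @_;
    my @data = split /:/, $line;
    my $base = 5 * ($n - 1);
    return unless $n >= 1 && $base + 2 <= $#data;
    return ($data[$base + 2], $data[$base + 1]);
}

# Made-up sample line with two lists, only to exercise the arithmetic.
my $sample = "t a:10:4:6:0:0 b:20:9:11:0:0";
my ($first, $second) = fields_for_list($sample, 2);
print "$first\n$second\n";
```

With something like this, adding more lists would not need another elsif branch, at the cost of relying on the five-field spacing staying the same.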
The MRTG conf file (the stats will be in the stats file in the same order that you have them listed in your startup command line, with the overall totals being the last listing. You will have to edit the following Target, Title, PageTop to match the names of the rbldnsd lists as you have them configured in your setup):
----<cut>-----
### Global Config Options

# for UNIX
WorkDir: /www/rbldnsd

### Global Defaults

options[_]: gauge,growright,integer,noinfo,nopercent,nolegend,dorelpercent
RunAsDaemon: Yes
Interval: 5
# WithPeak[_]: ymw
PageTop[^]: <FONT FACE="Arial">
YLegend[_]:RBLDNSD Queries
ShortLegend[_]:Queries / 5 Minute
LegendI[_]:Positive: 
LegendO[_]:All Reqs: 

#####################
Target[ws.surbl.org]: `perl /etc/mrtg/rbldnsdstat.pl 1`
# The MaxBytes value is extra big to avoid problems..
MaxBytes[ws.surbl.org]: 4800000
Title[ws.surbl.org]: RBLDNSD - ws.surbl.org
PageTop[ws.surbl.org]: <H2>ws.surbl.org dns requests </H2>
#####################
Target[sc.surbl.org]: `perl /etc/mrtg/rbldnsdstat.pl 2`
MaxBytes[sc.surbl.org]: 4800000
Title[sc.surbl.org]: RBLDNSD - sc.surbl.org
PageTop[sc.surbl.org]: <H2>sc.surbl.org dns requests </H2>
#####################
Target[be.surbl.org]: `perl /etc/mrtg/rbldnsdstat.pl 3`
MaxBytes[be.surbl.org]: 4800000
Title[be.surbl.org]: RBLDNSD - be.surbl.org
PageTop[be.surbl.org]: <H2>be.surbl.org dns requests </H2>
#####################
Target[ob.surbl.org]: `perl /etc/mrtg/rbldnsdstat.pl 4`
MaxBytes[ob.surbl.org]: 4800000
Title[ob.surbl.org]: RBLDNSD - ob.surbl.org
PageTop[ob.surbl.org]: <H2>ob.surbl.org dns requests </H2>
#####################
Target[ab.surbl.org]: `perl /etc/mrtg/rbldnsdstat.pl 5`
MaxBytes[ab.surbl.org]: 4800000
Title[ab.surbl.org]: RBLDNSD - ab.surbl.org
PageTop[ab.surbl.org]: <H2>ab.surbl.org dns requests </H2>
#####################
Target[multi.surbl.org]: `perl /etc/mrtg/rbldnsdstat.pl 6`
MaxBytes[multi.surbl.org]: 4800000
Title[multi.surbl.org]: RBLDNSD - multi.surbl.org
PageTop[multi.surbl.org]: <H2>multi.surbl.org dns requests </H2>
#####################
Target[kc-cbl.surbl.org]: `perl /etc/mrtg/rbldnsdstat.pl 7`
MaxBytes[kc-cbl.surbl.org]: 4800000
Title[kc-cbl.surbl.org]: RBLDNSD - kc-cbl.surbl.org
PageTop[kc-cbl.surbl.org]: <H2>kc-cbl.surbl.org dns requests </H2>
#####################
Target[total.surbl.org]: `perl /etc/mrtg/rbldnsdstat.pl 8`
MaxBytes[total.surbl.org]: 4800000
Title[total.surbl.org]: RBLDNSD - total.surbl.org
PageTop[total.surbl.org]: <H2>total.surbl.org dns requests </H2>
----<cut>-----
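Since the per-list stanzas differ only in the list name and number, they could also be generated rather than edited by hand. This is a rough sketch, not part of Mike's setup: the sub name `mrtg_stanza` is made up, and the list names and script path are simply the ones from the conf file above. The order of the names must match the order on your rbldnsd command line, with the totals last.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: build one MRTG target stanza for list $zone,
# which is list number $n in the stats file.
sub mrtg_stanza {
    my ($zone, $n) = @_;
    return join "\n",
        "#####################",
        "Target[${zone}]: `perl /etc/mrtg/rbldnsdstat.pl ${n}`",
        "MaxBytes[${zone}]: 4800000",
        "Title[${zone}]: RBLDNSD - ${zone}",
        "PageTop[${zone}]: <H2>${zone} dns requests </H2>",
        "";
}

# List names taken from the conf file above, in the same order.
my @zones = qw(
    ws.surbl.org sc.surbl.org be.surbl.org ob.surbl.org
    ab.surbl.org multi.surbl.org kc-cbl.surbl.org total.surbl.org
);

# Lists are numbered from 1 in rbldnsdstat.pl.
for my $i (0 .. $#zones) {
    print mrtg_stanza($zones[$i], $i + 1);
}
```

The output only covers the per-list part; the global options at the top of the conf file still have to be written by hand.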
OK, we tried TTLs at 20 minutes for one week. Now for completeness let's try them at 15 minutes.
We already know from SC that at 10 minutes the queries are up very significantly (about 20% compared to 20 or 25 minutes).
At 20 minutes the traffic doesn't seem much different from 25 minutes or only slightly higher, so let's see if traffic goes up much at 15 minute TTLs. If so then 20 minutes is probably optimal. If not, then 15 minutes is probably optimal.
http://nmrl.kconline.com/rbldnsd/
Either way TTL tuning will be done after next week.
Jeff C.
OK, we tried one week of TTLs at 15 minutes (after a week each at 25 and then 20 minutes) and the DNS traffic levels don't seem much higher because of it:
http://nmrl.kconline.com/rbldnsd/
The measurement is perhaps slightly complicated by the fact that people may be moving over to multi.surbl.org with SpamAssassin 3.0, but looking at the relatively unchanging lists, the shorter TTL seems to have had little effect on traffic, or to have increased it only slightly. (Note how traffic on multi is growing while the others, presumably queried by SA 2.63 and 2.64 or by urirhsbl with SA 3.0, essentially aren't.)
We already know the traffic is significantly higher at 10 minute TTLs, from the sc.surbl.org setting before (in the first week on the sc graph), so 15 minutes is probably optimal in terms of quickest additions and deletions from the list while also minimizing DNS traffic.
So I'd like to propose that we stick with 15 minute TTLs for all SURBLs.
Any comments?
Are we on the right track with this?
Jeff C.