steps to visualize HTTP server logging – part II
gnuplot 2d
Edit /etc/apache2/mod_log_config.conf (which is referenced by httpd.conf) and add a new LogFormat named “plots”:
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
LogFormat "%h %{%d.%m.%Y:%H:%M:%S}t %D %U" plots
The “%” directives used in the “plots” format mean:
%…h Remote host
%…{format}t time & date
%…D time taken to serve the request, in microseconds.
%…U URL path requested
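To make the field layout concrete, here is a minimal Python sketch (not part of the original setup) that splits one line written in the “plots” format into typed fields. The names PlotsEntry and parse_plots_line are made up for this illustration; the sample line is the one quoted further down in this post.

```python
from datetime import datetime
from typing import NamedTuple


class PlotsEntry(NamedTuple):
    """One log line in the "plots" format (hypothetical helper type)."""
    host: str            # %h, remote host
    timestamp: datetime  # %{%d.%m.%Y:%H:%M:%S}t, time & date
    micros: int          # %D, time taken to serve the request, in microseconds
    path: str            # %U, URL path requested


def parse_plots_line(line: str) -> PlotsEntry:
    """Split one whitespace-separated "plots" log line into typed fields."""
    # maxsplit=3 keeps a path that contains spaces in one piece
    host, stamp, micros, path = line.split(maxsplit=3)
    return PlotsEntry(
        host=host,
        timestamp=datetime.strptime(stamp, "%d.%m.%Y:%H:%M:%S"),
        micros=int(micros),
        path=path.strip(),
    )


if __name__ == "__main__":
    entry = parse_plots_line("66.249.111.111 30.08.2009:14:15:17 4372853 /blog/")
    # print the response time in seconds instead of microseconds
    print(entry.host, entry.timestamp.isoformat(), entry.micros / 1e6, entry.path)
```

The strptime pattern is deliberately the same string as in the LogFormat line, so a change to the timestamp layout only has to be mirrored in one place.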
Add another CustomLog directive to your VirtualHost section in httpd.conf:
DocumentRoot /home/h/hensler.net/public_html/bernhard/
ServerName bernhard.hensler.net
IndexOptions
DirectoryIndex index.htm index.html index.shtml start.htm start.html start.shtm index.php
CustomLog "/usr/local/visas/logfiles/hensler.net/%Y/%m/%d/access_log" vhost_combined
CustomLog "/usr/local/visas/logfiles/hensler.net/bernhard.access_log" plots
Concatenate the logs from all virtual hosts, e.g.:
$ cat hensler.access_log niko.access_log bernhard.access_log max.access_log > plot_log
A sample line looks like this:
66.249.111.111 30.08.2009:14:15:17 4372853 /blog/
Then start gnuplot from the command line and enter the following commands:
$ gnuplot
reset
set terminal png small color
set output "2dplot.png"
set title "average response time"
set style data points
set pointsize 1
set grid
set xlabel "time"
set timefmt "%d.%m.%Y:%H:%M:%S"
set format x "%H:%M\n%d/%b"
set xdata time
set xrange [ "30.08.2009:00:00" : "30.08.2009:23:59" ]
set ylabel "response time"
set yrange [ 0 : 10000 ]
plot "/usr/local/visas/logfiles/hensler.net/plot_log" using 2:3 title "2d"
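The plot above draws one point per request. If you want a real average per hour instead, the log can be pre-aggregated before plotting. The following is only a sketch under the assumption that every line follows the “plots” format (host, timestamp, microseconds, path); the function name hourly_average_micros and the second sample line are invented for the example.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, Iterable


def hourly_average_micros(lines: Iterable[str]) -> Dict[datetime, float]:
    """Average the %D field (microseconds) per hour of the request timestamp."""
    totals = defaultdict(lambda: [0, 0])  # hour -> [sum of micros, request count]
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or truncated lines
        stamp = datetime.strptime(fields[1], "%d.%m.%Y:%H:%M:%S")
        hour = stamp.replace(minute=0, second=0)  # truncate to the full hour
        totals[hour][0] += int(fields[2])
        totals[hour][1] += 1
    return {hour: total / count for hour, (total, count) in sorted(totals.items())}


if __name__ == "__main__":
    sample = [
        "66.249.111.111 30.08.2009:14:15:17 4372853 /blog/",
        "66.249.111.111 30.08.2009:14:45:00 1000000 /blog/",  # made-up line
    ]
    for hour, avg in hourly_average_micros(sample).items():
        # write the hour in the same layout as the log timestamps
        print(hour.strftime("%d.%m.%Y:%H:%M:%S"), round(avg))
```

Because the output keeps the timestamp layout of the log, the same timefmt setting applies; redirected to a file, the two columns could be plotted with “using 1:2”.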
gnuplot 3d
Then read this excellent article, “A New Visualization for Web Server Logs”, and create a Perl script:
#
# prepare-for-gnuplot.pl: convert access log files to gnuplot input
# Raju Varghese. 2007-02-03

use strict;

my $tempFilename    = "./tmp/temp.dat";
my $ipListFilename  = "./tmp/iplist.dat";
my $urlListFilename = "./tmp/urllist.dat";

my (%ipList, %urlList);

# convert a dotted-decimal IP address to an integer so that addresses sort numerically
sub ip2int {
    my ($ip) = @_;
    my @ipOctet = split (/\./, $ip);
    my $n = 0;
    foreach (@ipOctet) {
        $n = $n*256 + $_;
    }
    return $n;
}

# prepare temp file to store log lines temporarily
open (TEMP, ">$tempFilename") or die "cannot write $tempFilename: $!";

# reads log lines from stdin or files specified on command line
while (<>) {
    chomp;
    my ($ip, $time, $D, $url, $sc) = split;
    $time =~ s/\[//;
    next if ($url =~ /(gif|jpg|png|js|css)$/);   # skip images, scripts and stylesheets
    print TEMP "$ip $time $D $url $sc\n";
    $ipList{$ip}++;
    $urlList{$url}++;
}

# process IP addresses: number them in ascending numeric order
my @sortedIpList = sort {ip2int($a) <=> ip2int($b)} keys %ipList;
my $n = 0;
open (IPLIST, ">$ipListFilename") or die "cannot write $ipListFilename: $!";
foreach (@sortedIpList) {
    ++$n;
    print IPLIST "$n $ipList{$_} $_\n";
    $ipList{$_} = $n;
}
close (IPLIST);

# process URLs: number them by descending hit count
my @sortedUrlList = sort {$urlList{$b} <=> $urlList{$a}} keys %urlList;
$n = 0;
open (URLLIST, ">$urlListFilename") or die "cannot write $urlListFilename: $!";
foreach (@sortedUrlList) {
    ++$n;
    print URLLIST "$n $urlList{$_} $_\n";
    $urlList{$_} = $n;
}
close (URLLIST);

# second pass: replace each IP and URL by its rank
close (TEMP);
open (TEMP, "<$tempFilename") or die "cannot read $tempFilename: $!";
while (<TEMP>) {
    chomp;
    my ($ip, $time, $D, $url, $sc) = split;
    print "$time $ipList{$ip} $urlList{$url} $sc\n";
}
close (TEMP);
Save the Perl script (here as gnuplot.pl), make sure the ./tmp directory it writes to exists, and run it from the command line, redirecting the output to a file:
$ perl gnuplot.pl "/usr/local/visas/logfiles/hensler.net/bernhard.access_log" > gnuplot.input
The fields in gnuplot.input, the output file of the Perl script, are date/time, ip rank, url rank.
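The two ranks are the core of the technique: IP addresses are numbered in ascending numeric order, URLs by descending number of hits. As a cross-check of that logic, here is a small Python sketch of the same idea. It is not a port of the Perl script; the function name rank_ips_and_urls and the sample data are invented, and it assumes that the host field always holds an IPv4 address (no hostnames, no IPv6).

```python
from collections import Counter
from ipaddress import IPv4Address
from typing import Dict, Iterable, Tuple


def rank_ips_and_urls(
    entries: Iterable[Tuple[str, str]],
) -> Tuple[Dict[str, int], Dict[str, int]]:
    """Map each IP and each URL in (ip, url) pairs to its 1-based rank."""
    entries = list(entries)
    ips = {ip for ip, _ in entries}
    url_counts = Counter(url for _, url in entries)
    # IPs: sort by numeric address value, the job ip2int does in the Perl script
    sorted_ips = sorted(ips, key=lambda ip: int(IPv4Address(ip)))
    ip_rank = {ip: n for n, ip in enumerate(sorted_ips, start=1)}
    # URLs: sort by descending hit count, so rank 1 is the most requested URL
    url_rank = {url: n for n, (url, _) in enumerate(url_counts.most_common(), start=1)}
    return ip_rank, url_rank


if __name__ == "__main__":
    sample = [  # made-up (ip, url) pairs
        ("10.0.0.2", "/a"),
        ("9.255.0.1", "/b"),
        ("10.0.0.2", "/b"),
        ("192.168.1.1", "/b"),
    ]
    ip_rank, url_rank = rank_ips_and_urls(sample)
    print(ip_rank)
    print(url_rank)
```

The numeric sort key matters: a plain string sort would place "9.255.0.1" after "192.168.1.1", which would scatter neighbouring addresses along the Y axis.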
Start gnuplot from the command line ($ gnuplot) and enter the following commands:
reset
set terminal png small color
set output "3dplot.png"
set style data dots
set xdata time
set timefmt "%d.%m.%Y:%H:%M:%S"
set zlabel "Content"
set ylabel "IP address"
splot "gnuplot.input" using 1:2:3 title "3d"
Image taken from oreillynet, because my website does not produce enough data. Its axes are:
- X, the time axis: a full day from midnight to midnight of November 16.
- Y, the requester’s IP address, with the conventional dotted decimal format sorted and given an ordinal number between 1 and 120,000, representing the number of clients that accessed the web server.
- Z, the URL (or content) sorted by popularity. Of the approximately 60,000 distinct pages on the site, the most popular URLs are near the zero point of the Z-axis and the least popular ones at the top.
http://www.ibm.com/developerworks/linux/library/lgnuplot
http://www.oreillynet.com/pub/a/sysadmin/2007/02/02/3d-logfile-visualization.html?page=1
http://phasorburn.com/index.php/archive/excel-0-gnuplot-1
A final part will cover load-testing tools such as OpenSTA and JMeter.
See also Part I of this tutorial.