Webmasters need to know if and when their sites are down or unreachable. I manage several web sites hosted on shared servers, all of which have been intermittently unreachable for short periods because of mysterious problems the web hosting company couldn't identify. Problems like these can be almost impossible to diagnose. I wanted a way to monitor the sites and, if not identify the specific problem, at least learn when a particular site was unreachable, so I could show tech support that the site was indeed experiencing a problem.
One solution is to use a tool like Nagios (www.nagios.org), a full-featured network services monitoring tool. However, I don't always want to use one of my home machines for continuous network monitoring. Crafting a lightweight Perl script to do the basics of web site health monitoring is a good compromise solution. The script can be quickly and easily installed just about anywhere.
The algorithm is pretty simple: periodically attempt to connect to a web site of interest and record the result in a log file. We could simply write a shell script that pings a domain and saves the output in a log, but to be of real use we want something a bit richer, with room for future expansion if we want to get fancy and incorporate more intricate diagnostics. Perl is a good choice, not only because it offers more basic capability than shell script but because Perl's LWP library (libwww-perl, the "Library for WWW in Perl") is excellent for working with HTTP automatically. I don't work much with Perl, but this type of small, handy-dandy utility is where the language excels.
Let's go through the program step-by-step. At the end, we'll show the complete program.
#!/usr/bin/perl
# wping.pl - a perl script to monitor accessibility of web pages/sites.
use LWP; # Loads LWP classes.
use Fcntl ':flock'; # needed for file locking
$browser = LWP::UserAgent->new;
Next, we retrieve the command line arguments and save the process ID (PID) using Perl's "$$" variable:
($url, $pause) = @ARGV;
($pid) = $$;
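The two assignments above assume both arguments were supplied. As a minimal sketch (the parse_args helper and its error messages are hypothetical, not part of the original script), the arguments could be checked before the probe loop starts:

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): validate the two
# command-line arguments and die with a short message if they are unusable.
sub parse_args {
    my @args = @_;
    die "Usage: wping.pl URL PAUSE_SECONDS\n" unless @args == 2;
    my ($url, $pause) = @args;
    die "URL must start with http:// or https://\n"
        unless $url =~ m{^https?://}i;
    die "PAUSE_SECONDS must be a positive whole number\n"
        unless $pause =~ /^[1-9][0-9]*$/;
    return ($url, $pause);
}

# Literal values keep the sketch self-contained.
my ($url, $pause) = parse_args('http://www.example.com', '11');
print "probing $url every $pause seconds\n";
```

In wping.pl itself the call would be parse_args(@ARGV).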
We now establish the location of the log file the program will use, checking Perl's %ENV hash for an environment variable called WPING_LOGFILE_DIR. If WPING_LOGFILE_DIR is not in the environment, the program defaults to the home directory ($ENV{HOME}):
# set up log file
$logfile_dir = $ENV{WPING_LOGFILE_DIR};
if (! $logfile_dir) {
$home = $ENV{HOME};
$logfile = $home . "/wping.log";
} else {
$logfile = $logfile_dir . "wping.log";
}
Where does WPING_LOGFILE_DIR get created? I created the ~/logs directory and put the following in my home directory's .profile configuration file:
if [ -d "$HOME/logs" ] ; then
WPING_LOGFILE_DIR="$HOME/logs/"
export WPING_LOGFILE_DIR
fi
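The script joins the directory and the file name by plain string concatenation, so the result depends on WPING_LOGFILE_DIR ending in a slash. As an alternative, here is a sketch using the core File::Spec module, which inserts the separator itself. The logfile_path helper is a hypothetical name, and the paths are made-up examples:

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: build the log file path from an optional log
# directory and a fallback home directory. File::Spec->catfile supplies
# the separator, with or without a trailing slash on the directory.
sub logfile_path {
    my ($logfile_dir, $home) = @_;
    my $dir = (defined $logfile_dir && length $logfile_dir)
        ? $logfile_dir
        : $home;
    return File::Spec->catfile($dir, 'wping.log');
}

# Example directories only; the real script would pass
# $ENV{WPING_LOGFILE_DIR} and $ENV{HOME}.
print logfile_path('/home/me/logs/', '/home/me'), "\n";
print logfile_path(undef,            '/home/me'), "\n";
```

On a Unix-like system the first call yields /home/me/logs/wping.log and the second /home/me/wping.log.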
Next, we write a message to the log file showing the URL we are about to probe. All log file entries are timestamped, using a subroutine called get_timestamp(), which you will find in the complete source code at the end of this article. Also notice the use of $pid in the output message. I include the process ID (PID) because we may want to run several instances of wping to probe different domains. Since output from all of them goes into the same log file, the PID lets us tell which instance produced which line when we later analyze the log. And because multiple instances may be running, we lock the log file so only one process at a time can write to it. If the lock isn't acquired for this initial message, we don't worry about it and move on; it isn't critical to have the startup message in the log file:
# output start of probe to log file
open(INFO, ">>$logfile");
if (flock(INFO, LOCK_EX)) {
$timestamp = get_timestamp();
print INFO $timestamp . " ** wping started for ". $url." ** $pid\n";
close(INFO);
}
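The open/flock/print/close sequence appears several times in the script. One way to factor it out is sketched below, using a lexical filehandle and the three-argument form of open. The append_log_line name and its 1/0 return convention are assumptions for illustration, not part of the original program:

```perl
use strict;
use warnings;
use Fcntl qw(:flock);

# Hypothetical helper: append one line to the log under an exclusive lock.
# Returns 1 on success and 0 if the file could not be opened, locked,
# or closed cleanly.
sub append_log_line {
    my ($path, $line) = @_;
    open(my $fh, '>>', $path) or return 0;
    unless (flock($fh, LOCK_EX)) {
        close($fh);
        return 0;
    }
    print {$fh} $line;
    # close() flushes the buffered line and releases the lock.
    return close($fh) ? 1 : 0;
}
```

A caller that doesn't care about a missed entry, as in the script above, can simply ignore the return value.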
Finally, we get to the main loop that does the work of probing, saving the status to the log file and then sleeping for the time period we specified in the second command line argument. Notice that if a file lock isn't acquired, the program simply continues execution; a gap in status messages can be attributed to failure to acquire the lock:
# continually probe domain, pausing $pause number of seconds between each probe.
while (1) {
$response = $browser->get( $url );
$timestamp = get_timestamp();
if ($response->is_success) {
$status_msg = $timestamp . " -- $url -- " . $response->status_line . " -- $pid\n";
open(INFO, ">>$logfile");
if (flock(INFO, LOCK_EX)) {
print INFO $status_msg;
close(INFO);
}
} else {
$status_msg = $timestamp . " -- $url -- " . $response->status_line . " -- $pid\n";
$code = $response->code( ); # not used, included to show how to retrieve HTTP code.
open(INFO, ">>$logfile");
if (flock(INFO, LOCK_EX)) {
print INFO $status_msg;
close(INFO);
}
}
sleep($pause);
} # end while
You may notice that the else clause seems redundant. The code doesn't have to be structured this way, since $response->status_line contains the HTTP status code and message whether or not the probe succeeds. The reason for this structure is to leave a place for additional functionality that runs only when a probe is unsuccessful. The most obvious thing to include is an automatic email notification to the webmaster. We could also attempt more detailed diagnosis in the else block. The nice abstraction of $response->is_success is one of LWP's advantages.
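If a notification is added to the failure branch, sending one on every failed probe could flood a mailbox during a long outage. The sketch below shows one possible policy: signal an alert only when the number of consecutive failures reaches a threshold. The make_failure_tracker name and the threshold of 3 are hypothetical, and the actual sending of mail is left out:

```perl
use strict;
use warnings;

# Hypothetical helper: returns a closure that is called once per probe
# with a true value for success or a false value for failure. The closure
# returns 1 only on the probe where the run of consecutive failures
# reaches $threshold, and 0 otherwise.
sub make_failure_tracker {
    my ($threshold) = @_;
    my $failures = 0;
    return sub {
        my ($ok) = @_;
        if ($ok) {
            $failures = 0;    # any success resets the run
            return 0;
        }
        $failures++;
        return $failures == $threshold ? 1 : 0;
    };
}

# Simulated probe results: 1 = success, 0 = failure.
my $should_alert = make_failure_tracker(3);
for my $ok (1, 0, 0, 0, 0, 1, 0) {
    print "alert\n" if $should_alert->($ok);
}
```

With the simulated results above, "alert" is printed once, on the third failure in a row. In wping the closure would be called with $response->is_success after each probe.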
I like to put programs like wping.pl in my local $HOME/bin directory. I also like making a symbolic link to them, so I can invoke them without typing the ".pl" suffix yet still have files containing the suffix. So, assuming you've placed this program in your $HOME/bin directory (and have added code to your .profile that puts $HOME/bin in your $PATH environment variable), you would then do something like:
$ cd $HOME/bin
$ ln -s wping.pl wping
Now we can run the program:
$ wping http://www.oreilly.net 11 &
This sets wping to probe oreilly.net every 11 seconds. Notice that in this case the program is running as a background process. If you want to SSH into a shared server to initiate wping, you will need to run it in the background, or it will die when you exit SSH. You can also initiate wping as a cron job, but remember wping will run until it is killed. You can easily modify the code to accept a parameter limiting how long wping runs. The choice is yours. Because we are using a flexible dynamic language, you can get as elaborate as you like. This version of wping is a skeleton to get you started.
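One way to add such a run-time limit is to replace the endless loop with a loop that checks a deadline. This is only a sketch: the probe_until name and the idea of passing the probe as a code reference are assumptions, not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: call $probe repeatedly, sleeping $pause seconds
# after each call, until at least $max_seconds have elapsed.
# Returns the number of probes that were made.
sub probe_until {
    my ($max_seconds, $pause, $probe) = @_;
    my $deadline = time() + $max_seconds;
    my $count    = 0;
    while (time() < $deadline) {
        $probe->();
        $count++;
        sleep($pause);
    }
    return $count;
}

# Demonstration with a one-second budget and a stand-in probe.
my $made = probe_until(1, 1, sub { print "probing\n" });
print "$made probe(s) made\n";
```

In wping.pl the code reference would wrap the body of the existing loop, and the limit would come from an extra command-line argument.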
No tool is infallible. If there is a problem on the server running wping, a probe may report a failure even though the site is fine (a false positive). What you can then do is run two instances of wping from different servers and domains and compare the results. wping, as presented here, doesn't attempt detailed diagnosis of the HTTP problems it encounters; you could certainly modify it to do so.
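Comparing two logs starts with parsing them. The sketch below reads lines in the format wping writes (timestamp, URL, status line, PID, separated by " -- ") and counts the probes that did not return a 2xx code. The helper names are hypothetical, and the sample lines are invented for illustration, not real log output:

```perl
use strict;
use warnings;

# Hypothetical helper: split one wping log line into its fields.
# Returns a hash reference, or nothing for lines that are not probe
# results (such as the "** wping started" message).
sub parse_log_line {
    my ($line) = @_;
    return unless $line =~
        /^(\d{4}-\d\d-\d\d \d\d:\d\d:\d\d) -- (\S+) -- (\d{3})\b(.*) -- (\d+)\s*$/;
    return { time => $1, url => $2, code => $3, pid => $5 };
}

# Count probes whose HTTP code is outside the 2xx range, the range
# that $response->is_success treats as success.
sub count_failures {
    my @lines = @_;
    my $failures = 0;
    for my $line (@lines) {
        my $entry = parse_log_line($line) or next;
        $failures++ unless $entry->{code} =~ /^2/;
    }
    return $failures;
}

# Invented sample lines in wping's log format, for illustration only.
my @sample = (
    "2007-05-01 10:00:00 ** wping started for http://www.example.com ** 4242\n",
    "2007-05-01 10:00:01 -- http://www.example.com -- 200 OK -- 4242\n",
    "2007-05-01 10:00:12 -- http://www.example.com -- 500 Internal Server Error -- 4242\n",
);
print count_failures(@sample), " failed probe(s)\n";
```

Running count_failures over the log from each server and comparing the failure times gives a rough way to tell a site outage from a problem local to one monitoring host.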
#!/usr/bin/perl
# wping.pl - a perl script to monitor accessibility of web pages/sites.
# Usage: wping.pl http://www.foo.com pause
#
# Where http://www.foo.com is the web site to monitor and pause is how long
# you want to pause between probes (in seconds)
# Copyright (C) 2007 Adrien Lamothe
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
use LWP; # Loads LWP classes.
use Fcntl ':flock'; # needed for file locking
$browser = LWP::UserAgent->new;
($url, $pause) = @ARGV;
($pid) = $$;
# set up log file
$logfile_dir = $ENV{WPING_LOGFILE_DIR};
if (! $logfile_dir) {
$home = $ENV{HOME};
$logfile = $home . "/wping.log";
} else {
$logfile = $logfile_dir . "wping.log";
}
# output start of probe to log file
open(INFO, ">>$logfile");
if (flock(INFO, LOCK_EX)) {
$timestamp = get_timestamp();
print INFO $timestamp . " ** wping started for ". $url." ** $pid\n";
close(INFO);
}
# continually probe domain, pausing $pause number of seconds between each probe.
while (1) {
$response = $browser->get( $url );
$timestamp = get_timestamp();
if ($response->is_success) {
$status_msg = $timestamp . " -- $url -- " . $response->status_line . " -- $pid\n";
open(INFO, ">>$logfile");
if (flock(INFO, LOCK_EX)) {
print INFO $status_msg;
close(INFO);
}
} else {
$status_msg = $timestamp . " -- $url -- " . $response->status_line . " -- $pid\n";
$code = $response->code( ); # not used, included to show how to retrieve HTTP code.
open(INFO, ">>$logfile");
if (flock(INFO, LOCK_EX)) {
print INFO $status_msg;
close(INFO);
}
}
sleep($pause);
} # end while
sub get_timestamp {
($sec, $min, $hour, $mday, $mon, $year, $wday, $yday, $isdst) = localtime(time);
sprintf "%4d-%02d-%02d %02d:%02d:%02d", $year+1900, $mon+1, $mday, $hour, $min, $sec;
}
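The get_timestamp() subroutine above assembles the string by hand from the fields of localtime. As an alternative sketch, the core POSIX module's strftime produces the same "YYYY-MM-DD HH:MM:SS" layout. The optional $epoch argument is an addition made here for testing and is not in the original subroutine:

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Alternative sketch of get_timestamp() using POSIX::strftime.
# With no argument it formats the current local time; an epoch value
# may be passed in to format a specific moment.
sub get_timestamp {
    my ($epoch) = @_;
    $epoch = time() unless defined $epoch;
    return strftime("%Y-%m-%d %H:%M:%S", localtime($epoch));
}

print get_timestamp(), "\n";
```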