Overview

System monitoring in this document is based on technologies available in most variants of Linux. The goal of this howto is to provide a simple method for monitoring system elements such as network activity and CPU load. If you're technically minded, skip to the summary to see the complete layout.

Source

Performance metrics can be retrieved from a variety of sources, most notably SNMP. The source is interchangeable: how and what you track is up to you. Let's start with CPU load, since it is a fairly important gauge of system performance. CPU load can be read from top, retrieved via SNMP, or taken from the simple uptime command. Most sources show the load averages for the last 1, 5, and 15 minutes. Uptime is simple, so we'll use that source. To monitor the system, we need to store the performance metrics and generate graphs from the stored information.
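Because the source is interchangeable, here is a minimal sketch of one alternative on Linux: the kernel exposes the same three load averages in /proc/loadavg. The function name read_loadavg is made up for illustration, and the rest of this howto still uses uptime.

```shell
# Print the 1-, 5-, and 15-minute load averages from a loadavg-style file.
# The file argument is optional and defaults to /proc/loadavg (Linux only).
# read_loadavg is a hypothetical helper name, not part of the original howto.
read_loadavg() {
    file="${1:-/proc/loadavg}"
    # The first three whitespace-separated fields are the load averages;
    # the remaining fields (run queue, last PID) are discarded.
    read -r load1 load5 load15 _rest < "$file" || return 1
    printf '%s %s %s\n' "$load1" "$load5" "$load15"
}
```

Calling read_loadavg with no argument prints the current values in the same "load1 load5 load15" order used throughout this document.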

Storing Performance Metrics

We want to store the load averages, so we use the stream editor (sed) to extract the three load numbers from the uptime output. Use a text editor to create a file named loadAverage with the following contents:

uptime | sed -e 's/.* load average: \(.*\), \(.*\), \(.*\)/\1 \2 \3/'

Make loadAverage executable and run it:

chmod 755 loadAverage
./loadAverage

Compare this output with uptime. Uptime provides several pieces of information, including the load averages at the end. Executing the loadAverage script pipes the uptime information to the stream editor, which returns only the relevant load numbers.
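To see what the sed expression does without depending on live output, the sketch below runs it against a captured uptime line. The sample values and the function name parse_load are illustrative assumptions; the expression itself is the one from the loadAverage script.

```shell
# Extract the three load averages from one line of uptime output on stdin.
# parse_load is a hypothetical wrapper around the howto's sed expression.
parse_load() {
    sed -e 's/.* load average: \(.*\), \(.*\), \(.*\)/\1 \2 \3/'
}

# A captured sample line stands in for live output (values are illustrative).
sample=' 14:32:01 up 10 days,  3:04,  2 users,  load average: 0.15, 0.10, 0.05'
printf '%s\n' "$sample" | parse_load   # prints: 0.15 0.10 0.05
```

Everything up to and including "load average: " is discarded, and the three comma-separated values are re-emitted separated by spaces.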

Now we're ready to insert those values into a database. Use a text editor to create a file named creation.sql containing the following text:

CREATE DATABASE `system`;
USE `system`;
CREATE TABLE `loadAverage` (
  `date` timestamp NOT NULL default CURRENT_TIMESTAMP,
  `load1` float default NULL,
  `load5` float default NULL,
  `load15` float default NULL
);
grant all privileges on system.* to user@"localhost" identified by 'user';
flush privileges;

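The last two lines of the file grant "user" access to the database tables. Make sure the account your webserver scripts connect as is granted access here so the data can be retrieved. (MySQL 8.0 and later no longer accept IDENTIFIED BY in a GRANT statement; on those versions, create the account with CREATE USER first.) Execute the creation script with: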

mysql -u root < creation.sql

With the database created, we can use the stream editor once more, this time to build the SQL INSERT statement:


uptime | sed -e 's/.* load average: \(.*\), \(.*\), \(.*\)/INSERT INTO loadAverage (load1,load5,load15) VALUES (\1,\2,\3);/'

If you know SQL, the output will look familiar: a complete INSERT statement. To store the CPU load numbers in the database, pipe that output to mysql.


uptime | sed -e 's/.* load average: \(.*\), \(.*\), \(.*\)/INSERT INTO loadAverage (load1,load5,load15) VALUES (\1,\2,\3);/' | mysql -u root system
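The pipeline above passes whatever sed emits straight to mysql. As a hedged alternative, the sketch below builds the same INSERT statement from three arguments and refuses anything that is not made up of digits and decimal points. The function name build_insert is hypothetical.

```shell
# Build the INSERT statement from three load values, rejecting any value
# that is empty or contains characters other than digits and '.', so that
# malformed text never reaches mysql.
# build_insert is a hypothetical helper, not part of the original howto.
build_insert() {
    for value in "$1" "$2" "$3"; do
        case "$value" in
            ''|*[!0-9.]*) echo "not a number: '$value'" >&2; return 1 ;;
        esac
    done
    printf 'INSERT INTO loadAverage (load1,load5,load15) VALUES (%s,%s,%s);\n' "$1" "$2" "$3"
}
```

Fed the three numbers printed by the first version of loadAverage, build_insert produces the same statement as the sed one-liner, and its output can be piped to mysql -u root system in the same way.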

You can test your database by viewing all the table contents, counting the number of records, and, if you choose, emptying the table:

mysql -u root system -e "SELECT * FROM loadAverage;"
mysql -u root system -e "SELECT COUNT(*) FROM loadAverage;"
mysql -u root system -e "DELETE FROM loadAverage;"
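Once the table has grown, viewing every row becomes unwieldy. The sketch below builds a SELECT limited to a recent window, relying on the `date` column's CURRENT_TIMESTAMP default. The function name recent_sql and the set of accepted windows are assumptions made for illustration.

```shell
# Print a SELECT for the rows recorded within the given window.
# Accepted windows: hour, day, week, month, year.
# recent_sql is a hypothetical helper, not part of the original howto.
recent_sql() {
    case "$1" in
        hour|day|week|month|year) ;;
        *) echo "unknown window: '$1'" >&2; return 1 ;;
    esac
    # MySQL interval units are written in upper case here for readability.
    unit=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
    printf 'SELECT * FROM loadAverage WHERE `date` >= NOW() - INTERVAL 1 %s ORDER BY `date`;\n' "$unit"
}
```

For example, recent_sql hour | mysql -u root system lists only the readings from the last hour.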

To take regular measurements, replace the contents of loadAverage with the pipeline above that ends in mysql, so that running the script stores a reading. Then use a text editor to create another file named monitor that calls loadAverage, with the following contents:

cd /path/to/your/script/
./loadAverage

Make monitor executable (chmod 755 monitor), then use crontab to call it every 5 minutes. Add the following line to your crontab; note that your path will vary.

*/5 * * * * /path/to/your/script/monitor >> /dev/null 2>&1
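Cron starts a new run every 5 minutes whether or not the previous one has finished. If that ever becomes a concern, one common guard is a lock directory, since mkdir either creates the directory or fails atomically. The sketch below is only an illustration; the function names and any lock path you pass in are hypothetical.

```shell
# Try to take a lock by creating a directory. Only one caller can succeed;
# returns non-zero if another run already holds the lock.
acquire_lock() {
    mkdir "$1" 2>/dev/null
}

# Release the lock taken by acquire_lock.
release_lock() {
    rmdir "$1"
}

# Run a command only if the lock is free, releasing the lock afterwards
# and passing the command's exit status back to the caller.
run_locked() {
    lockdir="$1"; shift
    acquire_lock "$lockdir" || { echo "previous run still active" >&2; return 1; }
    status=0
    "$@" || status=$?
    release_lock "$lockdir"
    return "$status"
}
```

Inside monitor, the call to ./loadAverage could then be written as run_locked /tmp/monitor.lock ./loadAverage, where /tmp/monitor.lock is an arbitrary example path.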

We're halfway now. The performance metrics are being stored in the database using a single line in a bash script that is called every 5 minutes via cron.

Generate Graphs on Stored Performance Metrics

All we need to do is retrieve the stored data and send it to gnuplot. There is a basic template for data retrieval already available. You'll want to save the file as gnuplot.php somewhere in your webserver's document root directory. The page requires a couple of parameters -- table name and duration of data to retrieve. If you want to see the data formatted for gnuplot, bring this page up in a web browser using the following syntax:

http://localhost/gnuplot.php?table=loadAverage&time=hour&c1=load1&c2=load5&c3=load15

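This retrieves the last hour of data from the MySQL table. Alternatively, you can set time to day, week, month, or year. Use a text editor to create a file named gnuplot containing the following text, adjusting the URLs to match where you saved gnuplot.php (the script below assumes a monitor/ subdirectory of the document root):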

cd /path/to/your/script/
lynx -dump "http://localhost/monitor/gnuplot.php?table=$1&time=hour&c1=$2&c2=$3&c3=$4" | gnuplot
lynx -dump "http://localhost/monitor/gnuplot.php?table=$1&time=day&c1=$2&c2=$3&c3=$4" | gnuplot
lynx -dump "http://localhost/monitor/gnuplot.php?table=$1&time=week&c1=$2&c2=$3&c3=$4" | gnuplot
if date "+%M" | grep "00"
  then lynx -dump "http://localhost/monitor/gnuplot.php?table=$1&time=month&c1=$2&c2=$3&c3=$4" | gnuplot
fi
if date "+%H%M" | grep "0005"
  then lynx -dump "http://localhost/monitor/gnuplot.php?table=$1&time=year&c1=$2&c2=$3&c3=$4" | gnuplot
  elif date "+%H%M" | grep "0006"
  then lynx -dump "http://localhost/monitor/gnuplot.php?table=$1&time=year&c1=$2&c2=$3&c3=$4" | gnuplot
fi
###
# Adjust the number of elif statements by the number of categories being graphed.
# hour/day/week = 2s/graph set
# +month = 8s/graph set
# +year = 48s/graph set
###

It looks a little complicated, but it only generates a graph for each timeframe. The "if" statements restrict the month graphs to the top of every hour and the year graphs to once a day, at 00:05. The "elif" statement ensures that, when several graph sets run one after another, a set that starts after the clock has rolled over to 00:06 still generates its year graphs. A complete set of graphs takes 48 seconds on my Linux host and I'm generating 3 sets (load, disk, interface), so the worst case is 3 x 48 = 144 seconds, or about two and a half minutes, which fits comfortably within the 5-minute interval. Since we're already calling loadAverage every 5 minutes, we can add the gnuplot calls there.

uptime | sed -e 's/.* load average: \(.*\), \(.*\), \(.*\)/INSERT INTO loadAverage (load1,load5,load15) VALUES (\1,\2,\3);/' | mysql -u root system
./gnuplot loadAverage load1 load5 load15
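The timing rules in the gnuplot script can be easier to follow, and to try out, when isolated in a function. The sketch below mirrors the same conditions (month at minute 00, year at 00:05 or 00:06) and accepts an optional HHMM argument so each case can be checked by hand. The name timeframes_for is made up for illustration.

```shell
# Print the timeframes to graph for a given time of day in HHMM form.
# Defaults to the current time when no argument is given.
# timeframes_for is a hypothetical helper, not part of the original howto.
timeframes_for() {
    hhmm="${1:-$(date '+%H%M')}"
    # Hour, day, and week graphs are generated on every run.
    frames="hour day week"
    # Month graphs: only at the top of the hour (minutes are 00).
    case "$hhmm" in
        ??00) frames="$frames month" ;;
    esac
    # Year graphs: once a day at 00:05, or at 00:06 if the run was delayed.
    case "$hhmm" in
        0005|0006) frames="$frames year" ;;
    esac
    printf '%s\n' "$frames"
}

timeframes_for 1200   # prints: hour day week month
timeframes_for 0005   # prints: hour day week year
```

With a helper like this, the separate lynx lines could collapse into a single loop over the words that timeframes_for prints, substituting each one for the time parameter in the URL.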

Basically, we're done! Every 5 minutes, performance metrics are stored in the database and the graphs are regenerated for the timeframes specified in the gnuplot script. You'll probably want to create a simple webpage to display these graphs. Check the summary to verify your configuration.

Want to graph another category? Create another table and script similar to those named loadAverage above and call the script from monitor. Voilà!
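As a rough sketch of what a second category might look like, the function below turns df -P output into an INSERT statement for disk usage. The table name diskUsage and its columns used and available are hypothetical; you would need to create a matching table first, following the pattern in creation.sql.

```shell
# Turn `df -P` output on stdin into an INSERT for a hypothetical diskUsage
# table with columns used and available (both in 1024-byte blocks).
# Line 1 of df -P is the header; line 2 is the first filesystem listed.
disk_insert() {
    awk 'NR == 2 { printf "INSERT INTO diskUsage (used,available) VALUES (%s,%s);\n", $3, $4 }'
}
```

A script built on this could run df -P / | disk_insert | mysql -u root system, and monitor would call it right after loadAverage.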