After a brief detour, it's time to get back to monitoring. This month, I'm looking at Nagios. Nagios is "an open source host, service and network monitoring program," according to the Nagios Web site. That's as good a description as any.
Nagios is a fairly popular monitoring system and is scalable up to hundreds of hosts. My needs are a bit more modest, but I've seen Nagios deployed to monitor quite a few systems.
Nagios source can be found at http://www.nagios.org/download/. There are also RPMs for Red Hat and tarballs and RPMs for Nagios plug-ins on the site.
If you're not running Red Hat, don't fret — most distributions will have packages pre-compiled if you don't feel like compiling Nagios from source. Note that I'm focusing on the Nagios 1.3 release here, because Nagios 2.0 is still a beta release. I'm using the Debian testing packages and have also tried the packages on SUSE Professional 9.2.
If you'd like to install Nagios from source, you can find the instructions on the Nagios site.
Nagios configuration is a bit of a chore, actually. There are several configuration files to contend with, but after you've got everything sorted out it's not too bad. Once you have Nagios configured to your liking, you won't need to go in and fuss with the configuration files again until you add new hosts, change service monitoring, or change users for notifications.
The standard location for Nagios configuration files is under /usr/local/nagios/etc when installing from source, but most Linux distros (at least, the ones that I've tried) will put Nagios configuration files under /etc/nagios.
The first file you will work with is the main Nagios configuration file. You could keep all of the Nagios configuration directives in one big file, but it's pretty common to split the directives out among assorted files. How the files are split up may differ depending on whether you're getting Nagios from the Nagios source or from a distribution's packages. I'm working with packages from Debian testing, so your mileage may vary if you're working with Nagios from SUSE packages or some other distro, although SUSE and Debian have largely the same configuration files.
The main configuration file specifies the location of other config files, log files, configuration directories, the Nagios system user name and group, lock file, temp file, and so forth. See the Nagios documentation for a full list of configuration directives and options.
The main configuration file is also where you specify whether you want Nagios to enable "flap detection." Basically, "flap detection" detects whether a host or service is "flapping," which is to say, rapidly changing between states. For example, a service would be "flapping" if it fails several checks, then passes, then fails again, then passes, and so on. I usually enable this, since "flapping" is a danger sign in and of itself.
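As a rough sketch, the flap detection settings in the main configuration file (nagios.cfg in a stock install) might look like the following. The threshold percentages are illustrative assumptions rather than recommendations from this article, so check the documentation for your Nagios version before relying on them:

```cfg
# Main configuration file: turn flap detection on (1) or off (0)
enable_flap_detection=1

# Illustrative thresholds (percent state change); tune for your site
low_service_flap_threshold=5.0
high_service_flap_threshold=20.0
low_host_flap_threshold=5.0
high_host_flap_threshold=20.0
```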
Next, you'll want to edit the CGI configuration (cgi.cfg) to specify where your files are and which users are able to access various sections of the Nagios Web-based interface. For example, you may want to allow a user to check service and host details under Nagios without allowing the user to view the Nagios configuration. There are several directives — the default syntax is "directive=value" — so, for example, the "authorized_for_all_services" directive would be specified like this:
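A minimal sketch of such a line, assuming two users named nagiosadmin and jdoe (both names are placeholders, not taken from a real setup):

```cfg
# cgi.cfg: users allowed to view every service (hypothetical user names)
authorized_for_all_services=nagiosadmin,jdoe
```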
Note that there is no space between the two values; multiple user names are separated by a comma only.
The cgi.cfg also specifies the location for the main Nagios configuration file, the path to the HTML files for Nagios, PING syntax, and so forth.
On Debian, access is controlled by HTTP authentication against an htpasswd file; by default, it's /etc/nagios/htpasswd.users. If you want to add users, you can use htpasswd to generate additional entries. For example:
htpasswd -nb username password
will print a "username:hash" entry to standard output so you can cut and paste that line into the htpasswd.users file. Or use "htpasswd htpasswd.users username" to be prompted for a password, and the file will be updated automagically.
Next, we'll configure Apache so that you can reach the Nagios Web-based interface. You'll probably need to edit the Apache configuration file to add the Nagios section, or simply add an include to point to the default configuration. On Debian, it's not added by default. The conf file can be found at /etc/nagios/apache.conf. To add this to your Apache configuration, the best way to go is usually just to add an Include directive, like so:
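Assuming the Debian path mentioned above, the line added to the main Apache configuration file would look something like this:

```cfg
# Pull in the Nagios section shipped with the Debian package
Include /etc/nagios/apache.conf
```

Apache needs to be reloaded or restarted before the change takes effect.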
You can achieve the same effect by pasting the entire contents of that file into your httpd.conf, but I prefer to use the Include directive.
Here is a sample host configuration that defines a default set of options as a template, and a specific host that inherits from it:
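A minimal sketch of what such a configuration might look like, modeled on the sample hosts.cfg that ships with Nagios 1.x. The host name, alias, and address are hypothetical, and the check-host-alive command and 24x7 time period are assumed to be defined elsewhere, as they are in the sample configuration files:

```cfg
# Template: default settings that other hosts can inherit
define host{
        name                            generic-host
        notifications_enabled           1
        event_handler_enabled           1
        flap_detection_enabled          1
        process_perf_data               1
        retain_status_information       1
        retain_nonstatus_information    1
        register                        0       ; template only, not a real host
        }

# A specific host that inherits from the template above
define host{
        use                     generic-host
        host_name               web1            ; hypothetical host
        alias                   Web Server 1
        address                 192.0.2.10      ; placeholder address
        check_command           check-host-alive
        max_check_attempts      10
        notification_interval   120
        notification_period     24x7
        notification_options    d,u,r
        }
```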
Note that "1" indicates a service or check is enabled; "0" indicates it is disabled. The first entry defines a default set of rules that can be inherited by other hosts. The second entry defines a specific host. For the most part, these configuration options are self-explanatory — host_name is the name of a host that's being monitored; notification_period is when you'd want notifications sent out. (See the docs for specifications on other time intervals.)
The notification options are "d" for "down," "u" for unreachable, and "r" for recovery. If you want no notifications, use "n" — but there's usually not much point in having Nagios configured to monitor a server and not send notifications. (However, it's useful if you've configured a server in Nagios and it's not deployed yet.)
See the Nagios documentation for more information on host entries.
Nagios's Web-based interface allows you to monitor hosts and services when you happen to be in front of a computer — but what happens when everyone's out of the office? Nagios will email and page admins to alert them to problems. While this isn't terribly fun when you're on the receiving end of the pages, it's a necessary evil. To configure a contact, follow this basic template:
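A sketch of a contact entry, modeled on the sample contacts.cfg from Nagios 1.x. The contact name and addresses are placeholders, and the notification commands are assumed to be defined as they are in the sample command files:

```cfg
define contact{
        contact_name                    jdoe            ; hypothetical admin
        alias                           John Doe
        service_notification_period     24x7
        host_notification_period        24x7
        service_notification_options    w,u,c,r
        host_notification_options       d,u,r
        service_notification_commands   notify-by-email,notify-by-epager
        host_notification_commands      host-notify-by-email,host-notify-by-epager
        email                           jdoe@example.com
        pager                           pager-jdoe@example.com
        }
```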
Note that you can omit "host-notify-by-epager" if your admins do not have pagers. You can also set up contact groups, assuming you have multiple admins or perhaps different admins for different departments and so on. For example, you might set up a group for Linux servers and a different group for Solaris servers.
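A contact group for the Linux admins might be sketched like this; the group name and member names are hypothetical, and each member must already exist as a contact:

```cfg
define contactgroup{
        contactgroup_name       linux-admins
        alias                   Linux Administrators
        members                 jdoe,asmith
        }
```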
If you're so inclined, you can also edit the look and feel of Nagios. On Debian, the stylesheets that control the Nagios interface are found under /etc/nagios/stylesheets. These are standard CSS stylesheets, so you can tweak the look and feel of the interface and even bring it in line with your corporate identity if necessary. I'm perfectly happy with the Nagios look and feel, so I've never bothered to do so myself.
Going further with Nagios
For delving deeper with Nagios, you might look into the Nagios FAQs and the Nagios mailing lists. For distro-specific questions, try your distribution's mailing lists as well. Don't forget the handy "Documentation" link in the Nagios interface!
Nagios doesn't have a Web-based management interface, but there are a few projects that provide one if you're looking for that sort of thing. Nagat is a PHP interface for managing Nagios. Nagmin is a plugin for Webmin that can be used to manage Nagios, and Nagiosweb is a PHP/MySQL frontend for Nagios configuration.
Copyright © 2008 Art Beckman. All rights reserved.
Last Modified: March 9, 2008