Installing and Configuring Nagios Core on Linux

0
12722
nagios

nagios

Here’s a comprehensive guide on the Nagios network and system monitoring software application, which covers the installation and configuration of Nagios Core on a Linux system. It also includes the creation of custom Nagios plugins using the Bash script.

A network monitoring system is an essential constituent of any enterprise, small or large. A substantial amount of money is spent on establishing the entire network infrastructure, mainly servers and network components, expanding it as per changing needs and most important, keeping it alive all the time. With the network infrastructure constantly growing, it becomes a tedious job for network and systems administrators to continuously keep an eye on every device, all the time. Hence, there’s a need for a network monitoring system that does this job for them.
A network monitoring system can show the live statuses of all the devices – whether they are up or down, the resources being consumed by them, services running on them, etc – in one place. Moreover, it notifies the administrators immediately in case of a malfunction by sending them emails, SMSs or even phone calls, so that admins are always aware of what has gone wrong and at which point in the network.
When it comes to selecting the best monitoring tool, there are many alternatives to choose from, but Nagios is one of the finest options. It is available in two versions – Nagios Core, which is open source and free, and Nagios XI, which comes with some added features and capabilities, but needs to be purchased. However, both deal with monitoring critical servers and network devices, both provide indications of problematic situations, and both report the issues to the admins through mails and/or messages to help minimise downtime, reduce commercial losses and improve SLA.
This article is about installing and configuring Nagios Core on a Linux system. To demonstrate this, I have used the framework mentioned below:
Nagios server: 192.168.0.109
Operating system: Red Hat Enterprise Linux 6.5
Nagios client: 192.168.0.108
Operating system: Red Hat Enterprise Linux 6.5
Nagios Core version: 4.0.8
Nagios plugins version: 2.0.3

Nagios Core installation
Installing the necessary packages and their dependencies:The required packages mainly include Apache, PHP, C compilers, C libraries and development libraries. These packages can be installed using the Yum install as shown below:

$ yum install wget httpd php gcc glibc glibc-common gd gd-devel make net-snmp

In order to make sure that the service ‘httpd’ starts after every system reboot, it should be added to the startup as follows:

$ chkconfig httpd on
$ service httpd start

Downloading the Nagios Core and Nagios plugins packages: After the installation of the required packages, it’s time to download Nagios Core and Nagios plugins, which are available in the form of tarballs, from the official website.

$ cd /root
$ wget http://prdownloads.sourceforge.net/sourceforge/nagios/nagios-4.0.8.tar.gz
$ wget http://nagios-plugins.org/download/nagios-plugins-2.0.3.tar.gz

Adding the ‘nagios’ user and ‘nagios’ group: In order to submit and process external commands from the graphical user interface, we need to create a group, say ‘nagios’, and add a user, say ‘nagios’, to that group.

$ groupadd nagios
$ useradd –g nagios nagios

Nagios Core installation: Having downloaded the Nagios Core tarball file from the source, we will have to extract it in order to proceed with the installation.

$ tar –xvzf nagios-4.0.8.tar.gz

Change to the nagios-4.0.8 directory to begin the configuration part. After executing the ‘configure’ script, we have to compile the source code and then install it.

$ cd nagios-4.0.8
$ ./configure --with-command-group=nagios
$ make all
$ make install
$ make install-init
$ make install-config
$ make install-commandmode
$ make install-webconf

Now, restart the httpd service for the changes to take effect.

$ service httpd restart

To make sure that the ‘nagios’ service starts after every system reboot, we must add it to the startup services as follows:

$ chkconfig nagios on

Nagios plugins installation: In order to monitor remote hosts and the services running on them, we need to install Nagios plugins. For this, we need to change to the ‘nagios-plugins-2.0.3’ directory, run the ‘configure’ script, make and install the binaries.

$ cd nagios-plugins-2.0.3
$ ./configure --with-nagios-user=nagios --with-nagios-group=nagios
$ make
$ make install
1-Nagios-Web_Access
Figure 1: Nagios web console

Nagios Web access: We are almost done with the installation part and to verify if it is correct, you can open your Web browser and enter http://<NAGIOS-SERVER-IP-ADDRESS>/nagios

You will find a dialogue box asking for your credentials in order to access the Nagios Web interface (Figure 1).
So far, we’ve not created any login to access the Nagios Web interface. Let’s create it now.

$ htpasswd -c /usr/local/nagios/etc/htpasswd.users <USERNAME>

You’ll be asked to enter the password and then confirm it before a login is created.
If you try logging in with the newly created credentials, you might encounter the following error displayed on the Nagios Core Web console: ‘Unable to get process status’ (Figure 2).

2-Nagios-Unable-to-get-process-status
Figure 2: Nagios unable to get process status

This error is displayed as SELinux is enabled on the system. It can be verified as shown below:

$ getenforce
Enforcing

In order to disable SELinux, you can execute the following command, verify the status again and restart the Nagios service:

$ setenforce 0
$ getenforce
Permissive
$ service nagios restart

And the outcome is shown in Figure 3.

3-Nagios-Daemon-running
Figure 3: Nagios Daemon running

Again, you can verify the Nagios configuration file for its correctness using the following command:

/usr/local/nagios/bin/nagios -v /usr/local/
nagios/etc/nagios.cfg

Nagios Core configuration
We have now installed Nagios Core and we can access it from a Web console. On the left pane, a link named ‘Hosts’ can be observed, which when clicked on displays the list of hosts currently being monitored (Figure 4).
Clearly, we have not added any hosts to Nagios, so it will just show ‘localhost’ in the list of hosts being monitored.
The same can be observed in the ‘Host Status Details’ shown in Figure 5. It gives us a brief How To of all the hosts configured to monitoring Nagios and their statuses – the total number of hosts, the number of hosts that are up/down/unreachable, etc.

4-Nagios-Hosts
Figure 4: Nagios hosts
5-Nagios-Host-Status-Details
Figure 5: Nagios host status details

NRPE installation on a Nagios client: In order to monitor any Linux hosts using Nagios, we need to install the NRPE (Nagios Remote Plugin Executor) client on the host, so that the Nagios server can execute Nagios plugins remotely to fetch the host/service information.
To proceed with the NRPE installation, we need to download the NRPE tarball from the sources, as follows:

$ wget http://liquidtelecom.dl.sourceforge.net/project/nagios/nrpe-2.x/nrpe-2.15/nrpe-2.15.tar.gz

In the next step, we have to extract the tarball, compile the binaries and install them. Prior to that, make sure that ‘xinetd’ is installed in the system; otherwise, install it manually using Yum.

$ yum install xinetd
$ tar –xvzf nrpe-2.15.tar.gz
$ cd nrpe-2.15
$ ./configure
$ make all
$ make install-plugin
$ make install-daemon
$ make install-daemon-config
$ make install-xinetd

In the next step, we need to open port 5666 for the NRPE service.

$ echo “nrpe 5666/tcp # NRPE” >> /etc/services

In the last step, we need to allow requests from the Nagios server. For this, edit the /etc/xinetd.d/nrpe file and mention the IP address of the Nagios server in front of ‘only_from’ as follows:

only_from = 127.0.0.1, <NAGIOS-SERVER-IP-ADDRESS>

For the changes to take effect, restart the xinetd service as follows:

$ service xinetd restart

To verify the installation of NRPE on the Nagios client, execute the following command from the Nagios server:

$ /usr/local/nagios/libexec/check_nrpe –H <REMOTE-HOST-IP-ADDRESS>
NRPE v2.15

The output should display the NRPE version installed on the remote host, as shown above.

Adding hosts to Nagios monitoring: In order to add new servers to Nagios monitoring, let’s create a directory with name servers inside the /usr/local/nagios/etc/objects directory. Configuration files for all the servers we add will reside in this directory.

$ mkdir /usr/local/nagios/etc/objects/servers

In order to add this directory in the configuration file, edit the /usr/local/nagios/etc/nagios.cfg file and uncomment the following line:

# cfg_dir=/usr/local/nagios/etc/servers

Now, change to the ‘servers’ directory and create a new configuration file for our new Nagios client. Add the following details to the configuration file, for which you can take help from the templates.cfg file:

$ cd usr/local/nagios/etc/servers
$ vi MyLinuxServer.cfg
define host {
name MyLinuxServer
address 192.168.0.110
use generic-host
check_period 24x7
check_interval 5
retry_interval 1
max_check_attempts 5
check_command check-host-alive
notification_period workhours
notification_interval 60
contact nagiosadmin
register 1
}

Adding services for the hosts: To add any service to the hosts, one might need to create a command to monitor that service. This can be done by editing the /usr/local/nagios/etc/objects/commands.cfg file.

define command{
command_name check_nrpe
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}

Similar to the hosts, we should create a directory called ‘services’ in order to keep custom services configuration files. Again, the necessary changes should be made to the /usr/local/nagios/etc/nagios.cfg file.

$ mkdir /usr/local/nagios/etc/objects/services
$ echo “cfg_dir=/usr/local/nagios/etc/services” >> /usr/local/nagios/etc/nagios.cfg

To add services for the host MyLinuxServer, let’s create a configuration file MyLinuxServer.cfg in the services directory as shown below:

$ cd /usr/local/nagios/etc/objects/services
$ vi MyLinuxServer.cfg
define service {
use generic-service
host_name MyLinuxServer
service_description Server Uptime
check_command check_nrpe!check_uptime
notifications_enabled 1
}

Note: Whenever changes are made to the /usr/local/nagios/etc/nagios.cfg file, execute the following command to check for any errors in the configuration:

/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

After adding the host to the Nagios configuration, restart the Nagios service once on the Nagios server, and observe the changes in the Nagios Web console (Figure 6).
In a similar manner, many more hosts and services can be added to the Nagios configuration.

6-Nagios-Host-Added
Figure 6: Nagios host added

Creating custom Nagios plugins using the Bash script:

Even though the Nagios plugins that we have already installed offer a wide variety, one might need to create one’s own Nagios plugin in order to monitor a specific service. The Nagios plugin supports a wide range of programming languages including Bash, Python, Perl, C/C++, etc. But one common fact that needs to be remembered while creating a Nagios plugin using any of the mentioned programming languages is the syntax of the output, which is as follows:

<STATUS> - <DATA TO BE DISPLAYED ON WEB CONSOLE> | <DATA NEEDED FOR GENERATING GRAPHS>

A service’s ‘STATUS’ can be ‘OK’, ‘WARNING’, ‘CRITICAL’, or ‘UNKNOWN’, which entirely depends upon the corresponding threshold values defined in the script.
Exit status 0 denotes an ‘OK’ status for the service, which is responsible for highlighting the service check with green on the Nagios Web console. Similarly, exit status 1 denotes the ‘WARNING’ status of the service highlighted with yellow; exit status 2 denotes the ‘CRITICAL’ service status highlighted with red; and exit status 3 denotes the ‘UNKNOWN’ service status highlighted with grey.
Consider that we have to create a custom Nagios plugin to monitor the load average on a Linux server. For this, we need to decide the WARNING and CRITICAL thresholds, which are usually 80 per cent and 90 per cent, respectively, of the number of processor cores available in the system.
The live status of the load average can be obtained from the ‘uptime’ command, which can then be compared with the threshold levels, to decide the status of the service. Thresholds can be set using the /proc/cpuinfo file, which possesses the necessary information related to the processor installed in the system. There are some conditional statements to compare live values with the threshold values. The echo statements display the service check in the proper format, and a few exit commands help to colour the service check on the Nagios Web console, accordingly.
The sample Bash script will look like what’s shown below:

#!/bin/bash

UPTIME=`uptime | awk -F “average:” ‘{print $2}’`
load1=`echo $UPTIME | awk -F, ‘{print $1}’ | xargs`
load5=`echo $UPTIME | awk -F, ‘{print $2}’ | xargs`
load15=`echo $UPTIME | awk -F, ‘{print $3}’ | xargs`

intload1=`echo “scale=1; $load1*100” | bc -l | cut -d “.” -f 1`
intload5=`echo “scale=1; $load5*100” | bc -l | cut -d “.” -f 1`
intload15=`echo “scale=1; $load15*100” | bc -l | cut -d “.” -f 1`

Nprocs=`grep “processor” /proc/cpuinfo | wc -l`

warn=`scale=1; echo “80 * $Nprocs” | bc -l | cut -d “.” -f 1`
crit=`scale=1; echo “90 * $Nprocs” | bc -l | cut -d “.” -f 1`

actwarn=`echo “scale=1; $warn/100” | bc -l`
actcrit=`echo “scale=1; $crit/100” | bc -l`

if [ “$intload1” -le “$warn” -a “$intload5” -le “$warn” -a “$intload15” -le “$warn” ]
then
STATUS=”OK”
EXIT=”0”
elif [ “$intload1” -gt “$warn” -o “$intload5” -gt “$warn” -o “$intload15” -gt “$warn” ]
then
if [ “$intload1” -gt “$crit” -o “$intload5” -gt “$crit” -o “$intload15” -gt “$crit” ]
then
STATUS=”CRITICAL”
EXIT=”2”
else
STATUS=”WARNING”
EXIT=”1”
fi
else
STATUS=”UNKNOWN”
EXIT=”3”
fi

echo “$STATUS- Load Average: $load1, $load5, $load15 |
load1=$load1;$actwarn;$actcrit;; load5=$load5;$actwarn;$actcrit;;
load15=$load15;$actwarn;$actcrit;;” && exit $EXIT

When the script is created, it has to be put in the /usr/local/nagios/libexec directory of the Nagios server as well as the clients to be monitored.
To create a command using the custom script, the commands.cfg file needs to be edited.

$ vi /usr/local/nagios/etc/objects/commands.cfg
define command{
command_name check_load_average
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c load_average
}

Now, add the above command to the remote host’s service configuration file.

$ vi /usr/local/nagios/etc/objects/services/MyLinuxServer.cfg
define service {
use generic-service
host_name MyLinuxServer
service_description Load Average
check_command check_load_average
}

Restart the Nagios service and perform a check from the command line:

$ /usr/local/nagios/libexec/check_nrpe -H 192.168.0.108 -c load_average

Your custom Nagios plugin has been created.

LEAVE A REPLY

Please enter your comment!
Please enter your name here