Monitoring vhosts with Icinga 2 and Icinga Web 2

Icinga 2 now runs stable and is considered mature (1.x still works, but is a pain to configure), so I decided to start using it on my production systems in 2015. The server that hosts legendiary.at also serves several other vhosts. It runs in the NETWAYS cloud (thanks!), one of those situations where you love your managed services guys and OpenNebula, Foreman and Puppet.

After all, I wanted to monitor at least some web and DNS services for all vhosts (more to come though). The web check is just a simple HTTP reachability check, while the DNS check verifies that the A record of the given vhost returns the server’s IP address. That is necessary to see whether the DNS records are still intact or whether some misconfiguration has crept in.
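
To make the idea behind the DNS check concrete, here is a minimal Python sketch of "does the vhost's A record contain the server's IP address". It is not how the shipped plugin works internally; the function name and the injectable `resolve` parameter are my own assumptions for illustration.

```python
import socket


def a_record_matches(vhost, expected_ip, resolve=socket.gethostbyname_ex):
    """Return True if one of the vhost's IPv4 addresses equals expected_ip.

    `resolve` is injectable (hypothetical parameter) so the logic can be
    exercised without network access; by default the system resolver is used.
    """
    try:
        _name, _aliases, addresses = resolve(vhost)
    except (socket.gaierror, socket.herror):
        # The name does not resolve at all: treat this as a failed check.
        return False
    return expected_ip in addresses
```

Calling `a_record_matches("www.legendiary.at", "185.11.252.96")` would perform a real lookup, so the result depends on the DNS records at that moment.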

So I’ve created a file ‘hosts.conf’ containing one host so far, which introduces the ‘vhosts’ dictionary as a custom attribute keyed by vhost name.

# cat hosts.conf

object Host "srv-mfriedrich" {
  check_command = "hostalive"

  address = "185.11.252.96"

  vars.vhosts["www.legendiary.at"] = {
  }
  vars.vhosts["www.teamobsession.at"] = {
  }
  vars.vhosts["www.freerunningacademy.at"] = {
  }
}

The services.conf file is rather generic: it takes all hosts with the custom attribute ‘vhosts’ and loops over that dictionary, creating one new service object per entry. Each service name is prefixed with either “http-” or “dns-”, depending on the generated check. Read more about apply-for rules.

I’m using the shipped Icinga 2 plugin check commands “http” and “dns” and only set the expected custom attributes.

# cat services.conf

apply Service "http-" for (http_vhost => config in host.vars.vhosts) {
  import "generic-service"

  check_command = "http"

  vars += config
  vars.http_vhost = http_vhost

  notes = "HTTP checks for " + http_vhost

  assign where host.vars.vhosts
}

apply Service "dns-" for (dns_lookup => config in host.vars.vhosts) {
  import "generic-service"

  check_command = "dns"

  vars += config
  vars.dns_lookup = dns_lookup

  notes = "DNS checks for " + dns_lookup

  assign where host.vars.vhosts
}

Note: Apply For was introduced in Icinga 2 v2.2.0 and is the preferred way of configuring the magic stuff.
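
If the looping is hard to picture, the following Python sketch mimics what the two rules above expand to. It only models the naming and the custom attributes; the data layout and the function name are assumptions for illustration, not Icinga 2 internals.

```python
def expand_vhost_services(hosts):
    """Mimic the two apply-for rules: one http- and one dns- service per vhost.

    `hosts` maps host name -> dict of attributes, where the optional
    "vhosts" entry is a dict keyed by vhost name (as in hosts.conf).
    """
    services = []
    for host_name, attrs in hosts.items():
        vhosts = attrs.get("vhosts") or {}  # hosts without vhosts are skipped
        for vhost, config in vhosts.items():
            for prefix, check, var in (("http-", "http", "http_vhost"),
                                       ("dns-", "dns", "dns_lookup")):
                service_vars = dict(config)  # vars += config
                service_vars[var] = vhost    # vars.http_vhost / vars.dns_lookup
                services.append({
                    "host": host_name,
                    "name": prefix + vhost,
                    "check_command": check,
                    "vars": service_vars,
                })
    return services


hosts = {
    "srv-mfriedrich": {
        "address": "185.11.252.96",
        "vhosts": {
            "www.legendiary.at": {},
            "www.teamobsession.at": {},
            "www.freerunningacademy.at": {},
        },
    },
}

for service in expand_vhost_services(hosts):
    print(service["host"] + "!" + service["name"])
```

For the single host from hosts.conf this yields six services, one “http-” and one “dns-” service for each of the three vhosts.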

[Screenshots: icinga2_legendiary_01, icinga2_legendiary_02, icinga2_legendiary_03]

I’ve also added some notifications and users. In this example, that is simply all notifications for vhosts, sent to myself. I don’t want to get notified when someone forgets to configure the ‘address’ attribute required for the checks, so no notifications are generated for such objects.

The ‘mail-{host,service}-notification’ templates are shipped with Icinga 2 in conf.d/templates.conf, as are the ‘generic-{host,service,user}’ templates. The notification templates reference the mail notification command, but I don’t need to care about that here. The only important thing is to set the User object’s ‘email’ attribute.

# cat users.conf 
object User "michi" {
  import "generic-user"

  email = "michael.friedrich@gmail.com"
}

# cat notifications.conf 

apply Notification "vhost-mail-host" to Host {
  import "mail-host-notification"

  users = [ "michi" ]

  assign where host.vars.vhosts
  ignore where !host.address //prevent wrong configuration being notified
}


apply Notification "vhost-mail-service" to Service {
  import "mail-service-notification"

  users = [ "michi" ]

  assign where host.vars.vhosts
  ignore where !host.address //prevent wrong configuration being notified
}
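
The assign/ignore pair in both rules boils down to a simple predicate. Here is a small Python sketch of it, assuming that an empty or missing dictionary and an empty or missing address both count as false; the dict-based host layout is made up for illustration.

```python
def notification_applies(host):
    """Model 'assign where host.vars.vhosts' plus 'ignore where !host.address'."""
    has_vhosts = bool(host.get("vars", {}).get("vhosts"))  # assign where ...
    has_address = bool(host.get("address"))                # ignore where !...
    return has_vhosts and has_address
```

A host with vhosts but without an address is matched by the assign expression and then dropped again by the ignore expression, so no notification object is created for it.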

[Screenshot: icinga2_legendiary_04]

Using the new dynamic Icinga 2 configuration language can become rather complex. Still, for simple vhost monitoring it saves you a lot of typing. And keep in mind that the notification rules are applied based on patterns: I don’t have to worry about contacts assigned to hosts/services, which I always struggled with in Icinga 1.x or Nagios.

Finally, I’m also using Icinga Web 2’s git master. It’s still beta, but works far better than Classic UI or Web 1.x. So you see, it’s time for Icinga 2* and a bright future. Next up: Graphite, Graylog2 and automated Puppet deployments of remote checker clients/satellites.

Playing with Icinga 2 and graphite

If you attended OSMC 2013 and the Icinga presentation you might have seen it already, but for all new readers: Icinga 2 has native support for writing metrics to Graphite’s carbon-cache. There’s not much more to do than:

  • have Icinga 2 installed & some checks configured
  • have graphite up & running
  • enable the GraphiteWriter feature

I’m using a Vagrant box for Graphite in which a Puppet module installs Graphite from source and patches it for realtime performance, so you might want to assign that box a little more disk space.

The Icinga 2 Vagrant box will install the latest and greatest snapshot RPMs built from git next, so we are on the bleeding edge here. If you encounter any bugs, please report them to https://dev.icinga.org

The Graphite Vagrant box listens on the forwarded port 20003 on localhost’s IP address. Feel free to modify the VirtualBox port forwarding; it’s just a different port so as not to interfere with any local installs.

Now get into the Icinga 2 Vagrant box and enable the GraphiteWriter feature.

$ vagrant ssh
$ sudo -i
# icinga2 feature enable graphite

Now uncomment host and port, and point them to your carbon-cache listener. Restart Icinga 2 to apply the changes.

# vim /etc/icinga2/features-available/graphite.conf

/**
 * The GraphiteWriter type writes check result metrics and
 * performance data to a graphite tcp socket.
 */

library "perfdata"

object GraphiteWriter "graphite" {
  host = "192.168.2.101",
  port = 20003
}

# service icinga2 restart

The Vagrant graphite box is accessible at http://localhost:8081.
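
If nothing shows up in Graphite, it can help to push a metric by hand. carbon-cache’s plaintext protocol is one line per metric, consisting of path, value and Unix timestamp. The Python sketch below assumes a plaintext TCP listener; the function names and the metric path in the usage note are made up for illustration.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Build one plaintext line: path, value and timestamp, newline-terminated."""
    if timestamp is None:
        timestamp = int(time.time())
    return "{} {} {}\n".format(path, value, int(timestamp))


def send_metric(host, port, line, timeout=5.0):
    """Send a single plaintext line to a carbon-cache TCP listener."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(line.encode("ascii"))
```

With the port forwarding described above, something like `send_metric("127.0.0.1", 20003, format_metric("test.icinga2.demo", 42))` should make a `test.icinga2.demo` metric appear in the Graphite tree, provided the listener is reachable.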

Home exercise: Set “check_interval = 1s” in your services, and watch Graphite in realtime (patched auto-refresh). If you need detailed insights into Graphite itself, you may check out my employer’s training courses.

Icinga 2 release steps for 0.x.y

$ cd icinga2-release/
$ git checkout next
$ git checkout master
$ git fetch
$ git merge origin/next
$ git push
$ cd ..
$ rm -rf release/ ; mkdir release
$ cd release/
$ cmake ../icinga2-release -DCPACK_SOURCE_GENERATOR=TGZ -DCPACK_SOURCE_PACKAGE_FILE_NAME=icinga2-0.0.6
$ make package_source
$ tar ztf icinga2-0.0.6.tar.gz | less
$ tar zfx icinga2-0.0.6.tar.gz -C ../
$ cd ../icinga2-0.0.6/
$ icinga2_normal
$ sudo /usr/sbin/icinga2 --help
$ sudo /usr/sbin/icinga2 --version

the inode problem

Remember the problem where a daemon writes files and rotates them, waiting for another daemon or cron job to process and remove them? Well, Icinga Core did write perfdata, but npcd was not running to populate the PNP RRDs. Normally you would notice that by running check_procs against that process, or check_disk for the available free space. The catch is that disk space is not the issue here: it’s the huge number of files exhausting the filesystem’s inodes (3,000,000 inodes used on my system).

$ df -iT
Filesystem     Type     Inodes IUsed IFree IUse% Mounted on
rootfs         rootfs     3.0M  3.0M   20K  100% /

While it does make sense to create a check within Icinga itself (with -W/-K thresholds, so that free inodes are covered as well as free space)…

$ /usr/lib/nagios/plugins/check_disk -w 20% -c 10% -W 20% -K 10% -p /
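
For a quick look without any plugin, the same numbers are available via statvfs. This is a minimal Python sketch, assuming thresholds of 80% and 90% used inodes (the counterpart of 20% and 10% free); the function names are my own.

```python
import os


def inode_usage_percent(path="/"):
    """Return the percentage of used inodes for the filesystem holding `path`.

    Returns None when the filesystem does not report inode totals.
    """
    stats = os.statvfs(path)
    if stats.f_files == 0:
        return None
    return 100.0 * (stats.f_files - stats.f_ffree) / stats.f_files


def inode_state(used_percent, warn=80.0, crit=90.0):
    """Map a usage percentage to a plugin-style state name."""
    if used_percent is None:
        return "UNKNOWN"
    if used_percent >= crit:
        return "CRITICAL"
    if used_percent >= warn:
        return "WARNING"
    return "OK"
```

`inode_state(inode_usage_percent("/"))` would have reported CRITICAL on the system shown above, long before the disk space thresholds were reached.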

… what if something else runs amok and your system becomes unavailable? There’s a nifty script here, to be run manually or via cron, which reports the number of files (and thus used inodes) per top-level directory, sorted so that the biggest consumer comes last.

find . -xdev -type f | cut -d "/" -f 2 | sort | uniq -c | sort -n
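
If you prefer something you can extend (thresholds, mail, whatever), here is a rough Python equivalent of that pipeline. It is a sketch under the assumption that counting regular files per top-level entry is good enough; the function name is made up.

```python
import os
from collections import Counter


def files_per_top_level_dir(root="."):
    """Count regular files below each top-level entry of `root`.

    Roughly mirrors: find . -xdev -type f | cut -d/ -f2 | sort | uniq -c | sort -n
    """
    root = os.path.abspath(root)
    root_dev = os.lstat(root).st_dev
    counts = Counter()
    for dirpath, dirnames, filenames in os.walk(root):
        # -xdev: do not descend into directories on other filesystems.
        dirnames[:] = [d for d in dirnames
                       if os.lstat(os.path.join(dirpath, d)).st_dev == root_dev]
        rel = os.path.relpath(dirpath, root)
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full) or not os.path.isfile(full):
                continue  # -type f: regular files only, no symlinks
            # Files directly in root are keyed by their own name, like cut -f2.
            top = name if rel == "." else rel.split(os.sep)[0]
            counts[top] += 1
    # Ascending order, so the biggest consumers end up at the bottom.
    return sorted(counts.items(), key=lambda item: item[1])
```

Run against `/`, the last entries of the returned list point at the directories worth a closer look.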

mkfifo, group write perms and umask

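Ever wondered why mkfifo accepts group write permissions, but the file gets created with 0644? umask is the root of all evil: mkfifo applies (mode & ~umask), so the process umask silently strips bits from the requested mode. When you reset it, don’t forget to save the old mask and restore it afterwards. Details on the issue are in Icinga2 #4444.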

        /*
         * process would override group write permissions
         * so reset them. man 3 mkfifo: (mode & ~umask)
         */
        mode_t oldMask = umask(S_IWOTH);

        if (!fifo_ok && mkfifo(commandPath.CStr(), S_IRUSR | S_IWUSR | S_IRGRP | S_IWGRP) < 0) {
                BOOST_THROW_EXCEPTION(posix_error()
                    << boost::errinfo_api_function("mkfifo")
                    << boost::errinfo_errno(errno)
                    << boost::errinfo_file_name(commandPath));
        }

        /* restore old umask */
        umask(oldMask);
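
The effect is easy to reproduce outside of Icinga 2. This Python sketch follows the same save/reset/restore pattern; it assumes a umask of 022 for the comparison, and the function and file names are invented for the demo.

```python
import os
import stat
import tempfile


def make_group_writable_fifo(path):
    """Create a FIFO with mode 0660 regardless of the caller's umask."""
    old_mask = os.umask(0o002)   # only mask "other write", like umask(S_IWOTH)
    try:
        os.mkfifo(path, 0o660)   # effective mode is 0o660 & ~umask
    finally:
        os.umask(old_mask)       # always restore the previous umask
    return stat.S_IMODE(os.stat(path).st_mode)


def demo():
    """Compare a plain mkfifo under umask 022 with the umask-resetting variant."""
    with tempfile.TemporaryDirectory() as tmp:
        previous = os.umask(0o022)  # assumed default umask for the demo
        try:
            naive = os.path.join(tmp, "naive.cmd")
            os.mkfifo(naive, 0o660)
            naive_mode = stat.S_IMODE(os.stat(naive).st_mode)
            fixed_mode = make_group_writable_fifo(os.path.join(tmp, "fixed.cmd"))
        finally:
            os.umask(previous)
    return naive_mode, fixed_mode
```

Under these assumptions the plain call loses the group write bit (0660 & ~022 gives 0640), while the variant that resets the umask keeps the requested 0660.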

Test the upcoming Icinga 1.10 Classic UI filters from GIT next

I’m currently developing Icinga2 @netways (when not on vacation), and for that we need to visualize Icinga2’s output. Grepping status.dat isn’t a very pleasant way to do that.

Since Ricardo did a magnificent job of adding on-demand filters to Classic UI, you may want to test them yourself side by side with your production environments and enjoy the new features as much as we do (and focus on the real stuff).

Installing Icinga Classic UI from GIT next is fairly easy: just follow the wiki entry I maintain for that purpose.

That way, you can enjoy the new feature set without touching the existing installations. Even better, you can decide to use it as your standalone dashboard later on, once Icinga 1.10 is officially released. As usual, send bug reports to https://dev.icinga.org 🙂