Remember the problem where a daemon writes files and rotates them, waiting for another daemon or cron job to process and remove them? Well, yeah: Icinga Core kept writing perfdata, but npcd was not running to feed it into the PNP RRDs. You would normally spot this by simply running check_procs against that process, or check_disk for the free space available. The problem here is that disk space is not the issue; it is the huge number of files that exhausts the filesystem's inodes (3,000,000 inodes used on my system).

$ df -i
Filesystem                                             Type     Inodes IUsed IFree IUse% Mounted on
rootfs                                                 rootfs     3.0M  3.0M   20K  100% /
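The IUse% column can also be read from a script, independent of the monitoring core. What follows is only a minimal sketch: the function names `inode_usage` and `check_inodes` and the 90% default limit are my own assumptions, and it relies on `df -iP` printing IUse% as the fifth column (some filesystems report `-` there, which is treated as unknown).

```shell
#!/bin/sh
# Print the inode usage of a mount point as a bare number (e.g. 100).
# Hypothetical helper; assumes IUse% is the 5th column of `df -iP`.
inode_usage() {
    df -iP "${1:-/}" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Compare the usage against a limit (default 90, an arbitrary choice).
# Returns 0 when below the limit, 1 when at or above it, 3 when unknown.
check_inodes() {
    mount=${1:-/}
    limit=${2:-90}
    used=$(inode_usage "$mount")
    case $used in
        ''|*[!0-9]*)
            echo "UNKNOWN: no inode figure for $mount"
            return 3
            ;;
    esac
    if [ "$used" -ge "$limit" ]; then
        echo "WARNING: $mount uses $used% of its inodes"
        return 1
    fi
    echo "OK: $mount uses $used% of its inodes"
}
```

Calling `check_inodes / 90` from a cron job would then print a single status line, which cron can mail to you.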

While it does make sense to create a check within Icinga itself (note that -w/-c only cover disk space; the inode thresholds are set with -W/-K)…

$ /usr/lib/nagios/plugins/check_disk -w 20% -c 10% -W 20% -K 10% -p /

… what if something else runs amok and your system becomes unavailable? There’s a nifty script here to be run manually or via cron. It reports the number of files below each top-level directory of the current path, sorted so that the largest consumer comes last.

find . -xdev -type f | cut -d "/" -f 2 | sort | uniq -c | sort -n
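For cron use it can help to wrap the one-liner in a small function that takes the directory as an argument and prints only the worst offenders. This is just a sketch under my own assumptions: the name `inode_report` and the default of ten entries are made up, and file names containing newlines would be miscounted.

```shell
#!/bin/sh
# inode_report DIR [N]: count regular files per top-level entry below DIR
# and print the N largest (default 10), biggest first.
# Hypothetical wrapper around the find pipeline above.
inode_report() {
    dir=${1:-.}
    top=${2:-10}
    # Run in a subshell so the caller's working directory is untouched;
    # paths then start with "./", making field 2 the top-level entry.
    ( cd "$dir" && find . -xdev -type f \
        | cut -d "/" -f 2 | sort | uniq -c | sort -rn | head -n "$top" )
}
```

For example, `inode_report / 5` would list the five top-level directories of the root filesystem holding the most files. Unlike the original, this variant sorts in descending order so that `head` keeps the largest entries.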