Monitoring systems use agents running on the monitored system to provide a consistent access mechanism for host and application data. Hyperic has an agent, Zabbix has an agent, Nagios has NRPE as an agent option, collectd has an agent. Maybe it’s just my flawed perception, but when I think of agent-based monitoring, SNMP isn’t one of the first things that comes to mind. Apparently I am not alone: in his blog on it.toolbox.com, Andrew Kramer goes so far as to call SNMP “agentless” monitoring.
Really though, I consider SNMP the most ubiquitous agent. I think it gets overlooked because it is so ubiquitous. SNMP is on network hardware, and ships by default in many operating systems and distributions. Just because it’s installed by default doesn’t diminish the fact that it’s a daemon that facilitates data collection in a unified, standard way. I mostly run Linux, so I’ll only speak to the net-snmp/ucd-snmp package.
In addition to the standard OIDs that provide access to the process table, memory, and network usage, net-snmp provides the exec and the more modern extend options to provide extended capabilities. Both exec and extend execute custom commands. “Note that the “relocatable” form of the ‘exec’ directive (exec OID ….) produces MIB output that is not strictly valid. For this reason, support for this has been deprecated in favour of extend OID … , which produces well-formed MIB results (as well as providing fuller functionality)” [1].
At any rate, it’s pretty easy to use. Just add a line like one of the following to your /etc/snmp/snmpd.conf or equivalent.
extend yesterday /bin/date --date=yesterday
extend sayhi /bin/echo hi
extend check_load /usr/lib64/nagios/plugins/check_load -w 2,2,2 -c 4,4,4
If we make these extend commands produce output in nagios plugin format, they can easily be integrated into nagios or zenoss. I am sure that other monitoring frameworks support the nagios plugin output format as well.
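For anyone who hasn’t written one, here is a rough sketch of what producing nagios plugin format can look like for a custom extend command. The function names, thresholds, and sample value are all made up for illustration; the only parts taken from the plugin convention are the single status line, the optional perfdata after a pipe, and the 0/1/2/3 exit codes.

```python
# Conventional Nagios plugin exit codes.
STATES = {"OK": 0, "WARNING": 1, "CRITICAL": 2, "UNKNOWN": 3}


def check_value(value, warn, crit):
    """Map a number onto a state using simple upper thresholds (hypothetical helper)."""
    if value >= crit:
        return "CRITICAL"
    if value >= warn:
        return "WARNING"
    return "OK"


def format_plugin_output(state, message, perfdata=None):
    """Build one line of the form 'STATE - message|label=value ...'."""
    line = "%s - %s" % (state, message)
    if perfdata:
        # Perfdata follows a pipe as space-separated label=value pairs.
        line += "|" + " ".join("%s=%s" % (label, value) for label, value in perfdata)
    return line


# Illustrative values only; a real check would measure something here.
load1 = 0.01
state = check_value(load1, warn=2, crit=4)
print(format_plugin_output(state, "load average: %.2f" % load1,
                           [("load1", "%.3f;2.000;4.000;0;" % load1)]))
```

A real script would finish with sys.exit(STATES[state]) so that the exit code carries the status as well as the text.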
At one point in the past I had found a nagios check script that was supposed to make it easy to query these extend monitors, but I can’t remember what I found or why I couldn’t get it working. Well, last night I found a new script (check_snmp_extend.py) on the centreon.com forums (random google result). It’s slightly annoying that registration was required to download the script, but I grabbed it. Luckily I don’t mind hacking on python. I went ahead and created a github repository for check_snmp_extend and fixed up a few things in it. Right now it only works with snmp extend, but adding snmp exec support would be a fairly easy addition, I think.
The script makes extending snmp easy to deal with because you don’t need to manage the namespace for unique OIDs: it looks up values based on the name set for the snmp extend command. Based on the above example additions, here is some example usage and output.
$ ./check_snmp_extend.py -H test.example.org
OK - ok objects: 3, not ok objects: 0 - check_load=OK, echohi=OK, yesterday=OK,
$ ./check_snmp_extend.py -H test.example.org -e check_load
OK - load average: 0.01, 0.12, 0.07|load1=0.010;2.000;4.000;0; load5=0.120;2.000;4.000;0; load15=0.070;2.000;4.000;0;
$ ./check_snmp_extend.py -H test.example.org -e yesterday
Wed Apr 27 22:58:20 CDT 2011
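To show why looking things up by name works at all, here is a small sketch of how an extend name can be turned into an OID. This is my understanding of how net-snmp indexes the extend tables (the name becomes a length-prefixed run of ASCII codes), not a description of what check_snmp_extend.py does internally. The base OIDs are what I believe NET-SNMP-EXTEND-MIB assigns to nsExtendOutputFull and nsExtendResult, so verify them against your own MIB files before relying on them; the function names are mine.

```python
# Assumed column OIDs from NET-SNMP-EXTEND-MIB (verify against your MIB files).
NS_EXTEND_OUTPUT_FULL = "1.3.6.1.4.1.8072.1.3.2.3.1.2"
NS_EXTEND_RESULT = "1.3.6.1.4.1.8072.1.3.2.3.1.4"


def extend_index(name):
    """Encode an extend name as a length-prefixed string index."""
    codes = list(name.encode("ascii"))  # one sub-identifier per character
    return ".".join(str(n) for n in [len(codes)] + codes)


def extend_oid(base, name):
    """Join a column OID and the encoded extend name."""
    return "%s.%s" % (base, extend_index(name))


print(extend_oid(NS_EXTEND_OUTPUT_FULL, "sayhi"))
# prints 1.3.6.1.4.1.8072.1.3.2.3.1.2.5.115.97.121.104.105
```

With the MIB loaded, the same object can usually be addressed by name as NET-SNMP-EXTEND-MIB::nsExtendOutputFull."sayhi", which is a lot friendlier than the numeric form.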
Of course it would be best if all of your snmp extends output in the nagios plugin format.