# Man page created with:
# pod2man -s 8 -r "`./check_openmanage -V | head -n 1`" -c 'Nagios plugin' check_openmanage.pod check_openmanage.8

check_openmanage - Nagios plugin for checking the hardware status on
Dell servers running OpenManage

check_openmanage [I<OPTION>]...

check_openmanage -H I<hostname> [I<OPTION>]...

check_openmanage is a plugin for Nagios which checks the hardware
health of Dell servers running OpenManage Server Administrator
(OMSA). The plugin checks the health of the storage subsystem, power
supplies, memory modules, temperature probes etc., and gives an alert
if any of the components are faulty or operate outside normal

check_openmanage is designed to be used either locally (using NRPE
or similar) or remotely (using SNMP). In either mode, the output is
(nearly) the same. Note that checking the alert log is not supported

=head1 GENERAL OPTIONS

=item -t, --timeout I<SECONDS>

The number of seconds after which the plugin will abort. Default
timeout is 30 seconds if the option is not present.

=item -p, --perfdata [I<multiline>]

Collect performance data. The performance data collected includes
temperatures (in Celsius) and fan speeds (in rpm). On systems that
support it, power consumption is also collected (in Watts).

If given the argument C<multiline>, the plugin will output the
performance data on multiple lines, for Nagios 3.x and above.

=item -w, --warning I<STRING> or I<FILE>

Override the machine-default temperature warning thresholds. Syntax is
C<id1=max[/min],id2=max[/min],...>. The following example sets warning
limits to max 50C for probe 0, and max 45C and min 10C for probe 1:

  check_openmanage -w 0=50,1=45/10

The minimum limit can be omitted, if desired. Most often, you are only
interested in setting the maximum thresholds.

This parameter can be either a string with the limits, or a file
containing the limits string. The option can be specified multiple

NOTE: This option should only be used to narrow the field of OK
temperatures relative to the OMSA defaults. To expand the field of OK
temperatures, increase the OMSA thresholds. See the plugin web page

=item -c, --critical I<STRING> or I<FILE>

Override the machine-default temperature critical thresholds. Syntax
and behaviour are the same as for warning thresholds described above.
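
As an illustration of the threshold syntax C<id1=max[/min],id2=max[/min],...>,
the following Python sketch parses a limits string into a maximum and an
optional minimum per probe. It is not taken from the plugin: the function
name C<parse_thresholds> and the use of floating-point values are
assumptions made for this example only.

```python
def parse_thresholds(spec):
    """Parse 'id1=max[/min],id2=max[/min],...' into a dict.

    Returns {probe_id: (max_limit, min_limit_or_None)}.
    Hypothetical helper; the plugin's own parsing may differ.
    """
    limits = {}
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate stray commas
        probe, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"missing '=' in {item!r}")
        max_part, _, min_part = value.partition("/")
        # The minimum limit is optional and may be left out.
        min_limit = float(min_part) if min_part else None
        limits[int(probe)] = (float(max_part), min_limit)
    return limits


limits = parse_thresholds("0=50,1=45/10")
```

With the example string from the text, probe 0 gets only a maximum of 50,
while probe 1 gets a maximum of 45 and a minimum of 10.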

=item -o, --ok-info I<NUMBER>

This option lets you define how much output you want the plugin to
give when everything is OK, i.e. the verbosity level. The default
value is 0 (one line of output). The output levels are cumulative.

- Only one line (default)

- BIOS and firmware info on a separate line

- Storage controller and enclosure info on separate lines

- OMSA version on a separate line

The OMSA version is separated from the rest because finding it
requires running a very slow omreport command when the plugin is run
locally via NRPE.

=item --omreport I<OMREPORT PATH>

Specify full path to omreport, if it is not installed in any of the
regular places. Usually this option is only needed on Windows, if
omreport is not installed on the C: drive.

Prefix any alerts with the service tag.

Display a short summary of system information (model and service tag)

=item -I, --htmlinfo [I<CODE>]

Using this option will make the servicetag and model name into
clickable HTML links in the output. The model name link will point to
the official Dell documentation for that model, while the servicetag
link will point to a website containing support info for that

This option takes an optional argument, which should be your country
code or C<me> for the middle east. If the country code is omitted the
servicetag link will still work, but it will not be specific to your
country or area. Example for Germany:

  check_openmanage --htmlinfo de

This option is particularly useful together with either the
I<--extinfo> or I<--info> options. Only the most common country codes
are supported at this time.

=item --postmsg I<STRING> or I<FILE>

User specified post message. Useful for displaying arbitrary or
various system information at the end of alerts. The argument is
either a string with the message, or a file containing that
string. You can control the format with the following interpreted

Operating system name

Operating system release

Number of physical drives

Number of logical drives

Line break. Will be a regular line break if run from a TTY, else an

Prefix each alert with its corresponding service state (i.e. warning,
critical etc.). This is useful in case of several alerts from the same

=item -S, --short-state

Same as the B<--state> option above, except that the state is
abbreviated to a single letter (W=warning, C=critical etc.).

=item --linebreak I<STRING>

check_openmanage will sometimes report more than one line, e.g. if
there are several alerts. If the script has a TTY, it will use regular
linebreaks. If not (which is the case with NRPE) it will use HTML
linebreaks. Sometimes it can be useful to control what the plugin uses
as a line separator, and this option provides that control.

The argument is the exact string to be used as the line
separator. There are two exceptions, i.e. two keywords that translate

Regular linebreaks, i.e. "\n".

HTML linebreaks, i.e. "<br/>".

This is a rather special option that is normally not needed. The
default behaviour should be sufficient for most users.
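
The default choice of line separator described for C<--linebreak> can be
sketched in Python as follows. This is only an illustration of the
TTY-versus-HTML rule, not the plugin's code: the function name
C<line_separator> is hypothetical, and the two special keywords
mentioned above are deliberately not handled here.

```python
import io
import sys


def line_separator(override=None, stream=sys.stdout):
    """Pick the string used between output lines.

    override: an explicit separator, used verbatim when given.
    stream:   the output stream; a TTY gets regular linebreaks,
              anything else (e.g. output captured by NRPE) gets HTML ones.
    """
    if override is not None:
        return override
    return "\n" if stream.isatty() else "<br/>"


# A captured (non-TTY) stream yields HTML linebreaks.
captured = line_separator(stream=io.StringIO())
```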

Debug output. Will report status on everything, even if status is
ok. Blacklisted or unchecked components are ignored (i.e. no output).

NOTE: This option is intended for diagnostics and debugging purposes
only. Do not use this option from within Nagios, i.e. in the Nagios

Display version info.

=item -H, --hostname I<HOSTNAME>

The transport address of the destination SNMP device. Using this
option triggers SNMP mode.

=item -P, --protocol I<PROTOCOL>

SNMP protocol version. This option is optional and expects a digit
(i.e. C<1>, C<2> or C<3>) to define the SNMP version. The default is
C<2>, i.e. SNMP version 2c.

=item -C, --community I<COMMUNITY>

This option expects a string that is to be used as the SNMP community
name when using SNMP version 1 or 2c. By default the community name
is set to C<public> if the option is not present.

SNMP port of the remote (monitored) system. Defaults to the well-known

=item -U, --username I<SECURITYNAME>

[SNMPv3] The User-based Security Model (USM) used by SNMPv3 requires
that a securityName be specified. This option is required when using
SNMP version 3, and expects a string 1 to 32 octets in length.

=item --authpassword I<PASSWORD>, --authkey I<KEY>

[SNMPv3] By default a securityLevel of C<noAuthNoPriv> is assumed. If
the --authpassword option is specified, the securityLevel becomes
C<authNoPriv>. The --authpassword option expects a string which is at
least 1 octet in length as argument.

Optionally, instead of the --authpassword option, the --authkey option
can be used so that a plain text password does not have to be
specified in a script. The --authkey option expects a hexadecimal
string produced by localizing the password with the
authoritativeEngineID for the specific destination device. The
C<snmpkey> utility included with the Net::SNMP distribution can be
used to create the hexadecimal string (see L<snmpkey>).

=item --authprotocol I<ALGORITHM>

[SNMPv3] Two different hash algorithms are defined by SNMPv3 which can
be used by the Security Model for authentication. These algorithms are
HMAC-MD5-96 C<MD5> (RFC 1321) and HMAC-SHA-96 C<SHA-1> (NIST FIPS PUB
180-1). The default algorithm used by the plugin is HMAC-MD5-96. This
behavior can be changed by using this option. The option expects
either the string C<md5> or C<sha> to be passed as argument to modify

=item --privpassword I<PASSWORD>, --privkey I<KEY>

[SNMPv3] By specifying the options --privkey or --privpassword, the
securityLevel associated with the object becomes
C<authPriv>. According to SNMPv3, privacy requires the use of
authentication. Therefore, if either of these two options is present
and the --authkey or --authpassword arguments are missing, the
creation of the object fails. The --privkey and --privpassword
options expect the same input as the --authkey and --authpassword
options respectively.

=item --privprotocol I<ALGORITHM>

[SNMPv3] The User-based Security Model described in RFC 3414 defines a
single encryption protocol to be used for privacy. This protocol,
CBC-DES C<DES> (NIST FIPS PUB 46-1), is used by default or if the
string C<des> is passed to the --privprotocol option. The Net::SNMP
module also supports RFC 3826 which describes the use of
CFB128-AES-128 C<AES> (NIST FIPS PUB 197) in the USM. The AES
encryption protocol can be selected by passing C<aes> or C<aes128> to
the --privprotocol option.

One of the following arguments is required: des, aes, aes128, 3des,
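
To show how the SNMPv3 options fit together, here is a small Python
sketch that assembles a check_openmanage command line and enforces two
rules stated above: the securityName must be 1 to 32 octets long, and
privacy requires authentication. The helper name C<snmpv3_args> and the
host and user names in the usage line are hypothetical; only the option
names come from this page.

```python
def snmpv3_args(host, username, authpassword=None, authprotocol=None,
                privpassword=None, privprotocol=None):
    """Build an argument list for an SNMPv3 check (illustrative only)."""
    # securityName: a string 1 to 32 octets in length.
    if not 1 <= len(username.encode("utf-8")) <= 32:
        raise ValueError("securityName must be 1 to 32 octets long")
    # Privacy requires the use of authentication.
    if privpassword and not authpassword:
        raise ValueError("--privpassword requires --authpassword")

    args = ["check_openmanage", "-H", host, "-P", "3", "-U", username]
    if authpassword:
        args += ["--authpassword", authpassword]
        if authprotocol:
            args += ["--authprotocol", authprotocol]
    if privpassword:
        args += ["--privpassword", privpassword]
        if privprotocol:
            args += ["--privprotocol", privprotocol]
    return args


# Hypothetical host, user and passwords, for illustration.
argv = snmpv3_args("server1.example.com", "monitor",
                   authpassword="authsecret", authprotocol="sha",
                   privpassword="privsecret", privprotocol="aes")
```

Passing the list to a process launcher rather than building a shell
string avoids quoting problems with passwords that contain spaces.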

=item --use-get_table

This option exists as a workaround when using check_openmanage with
SNMPv3 on Windows with net-snmp. Using this option will make
check_openmanage use the Net::SNMP function get_table() instead of
get_entries() while fetching values via SNMP. The latter is faster and

=item -b, --blacklist I<STRING> or I<FILE>

Blacklist missing and/or failed components, if you do not plan to fix
them. The parameter is either the blacklist string, or a file (that
may or may not exist) containing the string. The blacklist string
contains component names with component IDs separated by slash
(/). Blacklisted components are left unchecked.

TIP: Use the option C<-d> (or C<--debug>) to get the blacklist ID for
devices. The ID is listed in a separate column in the debug output.

NOTE: If blacklisting is in effect, the global health of the system is

  component1=id1[,id2,...]/component2=id1[,id2,...]/...

The ID part can also be C<all>, in which case all components of that type

  check_openmanage -b ps=0/fan=3,5/pdisk=1:0:0:1/ctrl_driver=all

In the example we blacklist power supply 0, fans 3 and 5, physical disk
1:0:0:1, and warnings about out-of-date drivers for all
controllers. Legal component names include:

Storage controller. Note that if a controller is blacklisted, all
components on that controller (such as physical and logical drives)
are blacklisted as well.

Suppress the special warning message about old controller
firmware. Use this if you cannot or will not upgrade the firmware.

Suppress the special warning message about old controller driver.
Particularly useful on systems where you cannot upgrade the driver.

Suppress the special warning message about old Storport driver on

This blacklisting keyword exists as a possible workaround for physical
drives with bad firmware which makes OpenManage choke. It takes the
controller number as argument. Use this option to blacklist all
physical drives on a specific controller. This blacklisting keyword is
only available in local mode, i.e. not with SNMP.

Logical drive (virtual disk)

Controller cache battery

Ignore warnings related to the controller cache battery charging
cycle, which happens approximately every 40 days on Dell servers. Note
that using this blacklist keyword makes check_openmanage ignore
non-critical cache battery errors.

Enclosure power supply

Enclosure temperature probe

Enclosure management module (EMM)

Amperage probe (power consumption monitoring)
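
The blacklist string format
C<component1=id1[,id2,...]/component2=id1[,id2,...]/...> can be
illustrated with the Python sketch below. It is a simplified model and
not the plugin's implementation: the names C<parse_blacklist> and
C<is_blacklisted> are hypothetical, IDs are kept as strings, and the
rule that blacklisting a controller also blacklists its drives is not
modelled.

```python
def parse_blacklist(spec):
    """Split a blacklist string into {component_name: [id, ...]}."""
    blacklist = {}
    for part in spec.split("/"):
        part = part.strip()
        if not part:
            continue
        name, sep, ids = part.partition("=")
        if not sep:
            raise ValueError(f"missing '=' in {part!r}")
        # IDs stay strings because some contain colons, e.g. 1:0:0:1.
        blacklist.setdefault(name, []).extend(i for i in ids.split(",") if i)
    return blacklist


def is_blacklisted(blacklist, component, component_id):
    """True if this ID, or the special ID 'all', is listed for the component."""
    ids = blacklist.get(component, [])
    return "all" in ids or component_id in ids


bl = parse_blacklist("ps=0/fan=3,5/pdisk=1:0:0:1/ctrl_driver=all")
```

For the example string used earlier, fans 3 and 5 are matched but fan 4
is not, and any controller ID matches C<ctrl_driver> because of C<all>.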

=item --only I<KEYWORD>

This option can be specified once and expects a keyword. The different
keywords and the behaviour of check_openmanage are described below.

Print only critical alerts. With this option any warning alerts are

Print only warning alerts. With this option any critical alerts are

Check all chassis components and nothing else.

Only check memory modules

Only check power supplies

Only check temperatures

Only check processors

Only check voltage probes

Only check power usage

Only check chassis intrusion

Only check ESM log overall health, i.e. fill grade

Only check the event log (ESM) content

Only check the alert log content

=item --check I<STRING> or I<FILE>

This parameter allows you to adjust which components are checked at
all. This is a rougher approach than blacklisting, which requires
that you specify component id or index. The parameter should
be either a string containing the adjustments, or a file containing
the string. No errors are raised if the file does not exist.

Note: This option is ignored with alternate basenames.

  check_openmanage --check storage=0,intrusion=1

Legal values are described below, along with the default value.

Check storage subsystem (controllers, disks etc.). Default: ON

Check memory (dimms). Default: ON

Check chassis fans. Default: ON

Check power supplies. Default: ON

Check temperature sensors. Default: ON

Check CPUs. Default: ON

Check voltage sensors. Default: ON

Check system batteries. Default: ON

Check amperage probes. Default: ON

Check chassis intrusion. Default: ON

Check the ESM log health, i.e. fill grade. Default: ON

Check the ESM log content. Default: OFF

Check the alert log content. Default: OFF
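
How a C<--check> string such as C<storage=0,intrusion=1> modifies the
defaults can be sketched in Python as below. This is an assumption-laden
illustration: the function name C<apply_check_adjustments> is
hypothetical, only the two keywords from the example above are used, and
accepting exactly C<0> and C<1> as values is inferred from that example.

```python
def apply_check_adjustments(defaults, spec):
    """Return a copy of `defaults` updated from a 'key=0|1,...' string."""
    checks = dict(defaults)
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        key, sep, value = item.partition("=")
        if not sep or value not in ("0", "1"):
            raise ValueError(f"expected key=0 or key=1, got {item!r}")
        checks[key] = (value == "1")  # 1 turns a check on, 0 turns it off
    return checks


# Both checks default to ON; the string switches storage off.
result = apply_check_adjustments({"storage": True, "intrusion": True},
                                 "storage=0,intrusion=1")
```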

The option C<--debug> (or C<-d>) can be specified to display all
monitored components.

If SNMP is requested, the perl module Net::SNMP is
required. Otherwise, only a regular perl distribution is required to
run the script. On the target (monitored) system, Dell OpenManage
Server Administrator (OMSA) must be installed and running.

If no errors are discovered, a value of 0 (OK) is returned. An exit
value of 1 (WARNING) signifies one or more non-critical errors, while
2 (CRITICAL) signifies one or more critical errors.

The exit value 3 (UNKNOWN) is reserved for errors within the script,
or errors getting values from Dell OMSA.
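
A wrapper script can turn these exit values back into state names. The
Python sketch below is one way to do that; the names C<state_name> and
C<run_plugin> are hypothetical, and treating any exit value outside 0 to
3 as UNKNOWN is an assumption of this example rather than documented
plugin behaviour.

```python
import subprocess

# Exit values as described above.
STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def state_name(exit_code):
    """Map a plugin exit value to its state name."""
    return STATES.get(exit_code, "UNKNOWN")


def run_plugin(argv):
    """Run a plugin command line and return (state name, output text)."""
    proc = subprocess.run(argv, capture_output=True, text=True)
    return state_name(proc.returncode), proc.stdout.strip()
```

For instance, C<run_plugin(["check_openmanage", "-t", "30"])> would
return the state together with the plugin output, provided
check_openmanage is installed and on the search path.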

Written by Trond H. Amundsen <t.h.amundsen@usit.uio.no>

=head1 BUGS AND LIMITATIONS

Storage info is not collected or checked on very old PowerEdge models
and/or old OMSA versions, due to limitations in OMSA. The overall
support on those models/versions by this plugin is not well tested.

=head1 INCOMPATIBILITIES

The plugin should work with the Nagios embedded perl interpreter
(ePN). However, this is not thoroughly tested.

=head1 REPORTING BUGS

Report bugs to <t.h.amundsen@usit.uio.no>

=head1 LICENSE AND COPYRIGHT

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or (at
your option) any later version.

This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program. If not, see L<http://www.gnu.org/licenses/>.

L<http://folk.uio.no/trondham/software/check_openmanage.html>