Monitor Your Data Center’s Health over IPMI Using CA Unified Infrastructure Management

April 30, 2019

The Intelligent Platform Management Interface (IPMI) is an abstract, standardized, message-based interface for hardware-based platform management systems that makes it possible to control and monitor servers centrally (Figure 1). IPMI operates independently of the operating system (OS), so administrators can manage a system remotely even when no operating system or system management software is present. This makes it an ideal choice for hardware health and failure monitoring. Originally developed by Intel in the late 1990s (in cooperation with Dell, Hewlett Packard and NEC), IPMI has grown into an industry-standard interface supported by more than 200 hardware vendors.

Figure 1: Interfaces to the baseboard management controller (BMC)

CA Unified Infrastructure Management (UIM) provides a comprehensive solution for data center monitoring through the CA ecoMeter probe, which retrieves energy, power and other hardware health data from the target devices. The probe collects health information from devices using their native protocols (such as IPMI, BACnet, Modbus, SNMP, WMI, RF Code and OPC) (Figure 2), and unifies this information using its internally stored MIB tables.

Figure 2: IPMI software stack

In this post, we will outline a simple, step-by-step process to configure and use the CA ecoMeter probe in your data center environment for IPMI-based devices, so you can start monitoring their health metrics effectively with a minimal amount of configuration.

Step 1: Prerequisites

One of the key prerequisites for connecting the ecoMeter probe to the target device is to enable IPMI over LAN in the device’s native configuration. In addition, the IPMI user needs administrator privileges, with access to all the relevant IPMI roles. Consult your target device’s vendor documentation for the exact settings. For example, for the Dell iDRAC 6 device family, the settings can be found under Remote Access > Network/Security > IPMI Settings (Figure 3 and Figure 4):

Figure 3: Accessing IPMI settings with Dell iDRAC
Figure 4: Assigning user privileges with Dell iDRAC
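
As a quick cross-check of the privilege assignment, you can also inspect the user list that the BMC reports from a shell. The sketch below is not part of the ecoMeter probe. It assumes ipmitool is installed, and both the helper name has_admin_priv and the sample text are hypothetical, illustrative choices (real output varies by vendor and firmware).

```shell
# has_admin_priv USER: reads `ipmitool user list <channel>` output on stdin
# and succeeds only if USER is listed with the ADMINISTRATOR privilege limit.
# (Hypothetical helper; it is not an ipmitool or ecoMeter command.)
has_admin_priv() {
    awk -v user="$1" '
        $2 == user && $NF == "ADMINISTRATOR" { found = 1 }
        END { exit !found }
    '
}

# Illustrative sample only; the column layout may differ on your hardware.
sample_users='ID  Name             Callin  Link Auth  IPMI Msg   Channel Priv Limit
2   root             true    true       true       ADMINISTRATOR
3   monitor          true    true       true       USER'

for u in root monitor; do
    if printf '%s\n' "$sample_users" | has_admin_priv "$u"; then
        echo "$u: administrator privilege present"
    else
        echo "$u: administrator privilege missing"
    fi
done

# Against a live BMC (placeholders as used elsewhere in this post):
#   ipmitool -I lan -H <IDRAC-IP> -U <username> -P <pwd> user list 1 | has_admin_priv <username>
```

The check only looks at the last column of the matching row, so it is a sanity test rather than a full audit of the user’s channel access.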

Next, configure the authentication type that the CA ecoMeter probe will use to connect to the target system (Figure 5). The “Simple” authentication type is recommended for automatically discovering a device, and “RMCPPLUS” is recommended for adding devices manually (navigate to Instances > Add Device in the CA ecoMeter UI).

Figure 5: Configure authentication settings with the ecoMeter probe

To find all authentication types supported by your target device, you can issue the following command from any Linux/Unix console that can reach the Dell iDRAC over the network. Please note that you’ll need the ipmitool utility for this operation; depending on your distribution, it is installed from its own package or alongside the OpenIPMI packages.

ipmitool -I lan -H <IDRAC-IP> -U <username> -P <pwd> lan print 1
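
The line of interest in that output is “Auth Type Support”. If you want to pull it out in a script, a minimal sketch could look like the following. The helper name parse_auth_types is hypothetical, and the sample text is illustrative only; the actual values depend on your device and firmware.

```shell
# parse_auth_types: reads `ipmitool lan print` output on stdin and prints the
# values of the "Auth Type Support" line, one per line.
# (Hypothetical helper; it is not an ipmitool or ecoMeter command.)
parse_auth_types() {
    awk -F':' '/^Auth Type Support/ {
        n = split($2, types, " ")
        for (i = 1; i <= n; i++) print types[i]
    }'
}

# Illustrative sample only; real output is longer and varies by device.
sample_output='Set in Progress         : Set Complete
Auth Type Support       : NONE MD2 MD5 PASSWORD
Auth Type Enable        : Callback : MD2 MD5
IP Address Source       : Static Address'

printf '%s\n' "$sample_output" | parse_auth_types

# Against a live BMC (placeholders as above):
#   ipmitool -I lan -H <IDRAC-IP> -U <username> -P <pwd> lan print 1 | parse_auth_types
# To keep the password out of the process list, ipmitool can read it from the
# IPMI_PASSWORD environment variable when -E is used instead of -P.
```

How these values correspond to the authentication types offered in the ecoMeter UI is not covered here, so treat the list as input for choosing the setting rather than as a direct mapping.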

Once these steps are successfully completed, we can proceed to configure the devices for individual health metrics and parameters.

Step 2: Define and Collect Health Metrics

Once you connect to the device’s IPMI manager with the correct authorization and required privileges, the manager uses IPMI over IP to connect to the BMC on the server motherboard. The BMC, in turn, uses the system bus to connect to the BIOS, CPU, OS, power supply and sensors, which allows you to administer CPU speeds, fan speeds, voltages, temperatures and event logs, and to reboot the server.
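
To see the kind of sensor data the BMC exposes before you select metrics in the probe, you can query it directly with ipmitool. The sketch below is independent of the ecoMeter probe. It filters `ipmitool sdr list` output for sensors that are not reporting a healthy status. The helper name failing_sensors and the sample readings are hypothetical.

```shell
# failing_sensors: reads `ipmitool sdr list` output on stdin and prints every
# sensor whose status column is neither "ok" nor "ns" (no reading).
# (Hypothetical helper; it is not an ipmitool or ecoMeter command.)
failing_sensors() {
    awk -F'|' 'NF >= 3 {
        name = $1; status = $3
        gsub(/^[ \t]+|[ \t]+$/, "", name)      # trim padding around the name
        gsub(/^[ \t]+|[ \t]+$/, "", status)    # trim padding around the status
        if (status != "ok" && status != "ns") print name ": " status
    }'
}

# Illustrative sample only; sensor names and values differ per server model.
sample_sdr='Ambient Temp     | 23 degrees C      | ok
FAN 1 RPM        | 3600 RPM          | ok
Planar Temp      | 71 degrees C      | nc
Riser Temp       | no reading        | ns'

printf '%s\n' "$sample_sdr" | failing_sensors

# Against a live BMC (placeholders as above):
#   ipmitool -I lan -H <IDRAC-IP> -U <username> -P <pwd> sdr list | failing_sensors
```

With the sample data, only the “Planar Temp” row is printed, because its status is “nc” (non-critical) rather than “ok”.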

You can easily add multiple metrics for each sensor family in the ecoMeter probe and start monitoring them immediately, as shown below (Figure 6):

Figure 6: Select metrics for monitoring

With IPMI being the protocol of choice across hardware vendors, CA UIM and the ecoMeter probe can support your hardware health monitoring through embedded IPMI support, backed by easy configuration and management options. To learn more about the ecoMeter probe and how it can augment your data center monitoring, please refer to our product documentation.