Enterprise Level Device Availability Report in DX Infrastructure Manager 9.1

by June 11, 2019

The job of a system administrator is to make sure the data center is always available for serving the business services IT supports. If device availability is the key KPI that the IT operator, system administrator or cloud administrator is tracking, there is a crucial need for a device availability report that tracks physical, virtual or cloud devices at the enterprise level.

The device availability report is composed of two parts. The first indicates whether the device is powered up, i.e. system uptime. The second is device reachability, which determines whether the device is connected to the network and reachable from endpoints. IT administrators often focus only on reachability, in other words connectivity: if the device is not reachable, they tend to mark it as unavailable. This approach is incorrect where users are not supposed, or not allowed, to connect to certain devices in the data center. In these cases, a device that is powered up should be marked as available, and reachability is secondary.

Both views are correct, depending on the situation and the role of the device. To cater to both schools of thought, the DX Infrastructure Manager (formerly CA Unified Infrastructure Management) device availability report shows both the availability percentage and the reachability percentage.

Availability Percentage is defined as the percentage of time in a selected period during which the device is powered up.

Reachability Percentage is defined as the percentage of time in a selected period during which the device is connected and accessible from the endpoints.

Devices can also have planned or scheduled maintenance, so the availability and reachability percentages are calculated with this maintenance time taken into account to keep the calculation accurate.
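The maintenance adjustment can be illustrated with a short Python sketch. The exact formula the product uses is not given here, so this assumes that maintenance time is removed from the measured period and that downtime falling inside a maintenance window is not counted against the device. The function and parameter names are hypothetical.

```python
from datetime import datetime


def overlap_seconds(a_start, a_end, b_start, b_end):
    """Seconds shared by intervals [a_start, a_end) and [b_start, b_end)."""
    start, end = max(a_start, b_start), min(a_end, b_end)
    return max((end - start).total_seconds(), 0.0)


def availability_percent(period_start, period_end, down_intervals, maintenance_windows):
    """Percentage of measured (non-maintenance) time the device was up.

    Assumes intervals within each list do not overlap one another.
    Returns None if the whole period is maintenance.
    """
    period = (period_start, period_end)
    maintenance = sum(overlap_seconds(*period, *m) for m in maintenance_windows)
    measured = (period_end - period_start).total_seconds() - maintenance
    if measured <= 0:
        return None

    counted_down = 0.0
    for d_start, d_end in down_intervals:
        # Clip the outage to the reporting period first.
        d_start, d_end = max(d_start, period_start), min(d_end, period_end)
        if d_end <= d_start:
            continue
        down = (d_end - d_start).total_seconds()
        # Assumption: downtime inside a maintenance window is excused.
        excused = sum(overlap_seconds(d_start, d_end, *m) for m in maintenance_windows)
        counted_down += down - excused

    return 100.0 * (measured - counted_down) / measured


day_start = datetime(2019, 6, 1)
day_end = datetime(2019, 6, 2)
maintenance = [(datetime(2019, 6, 1, 2), datetime(2019, 6, 1, 4))]
outages = [(datetime(2019, 6, 1, 3), datetime(2019, 6, 1, 5))]
print(round(availability_percent(day_start, day_end, outages, maintenance), 2))
```

In this example, a two-hour maintenance window leaves 22 measured hours, and only one hour of the two-hour outage falls outside maintenance, so 21 of 22 measured hours count as up (roughly 95.45%).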

In a hyper-converged data center, there can be three types of devices: physical, virtual and cloud compute instances. For each of these device types, the availability percentage is measured using device-specific metrics collected by DX Infrastructure Manager.

For traditional physical devices that have the DX Infrastructure Manager robot (agent) deployed, availability is based on the robot. The logic is that if the robot is up and responding to the hub, the device is up. The Sys Uptime % QoS does not accurately measure the time a device is down, because that metric is not reported while the device is down, and this causes inaccuracy in the availability calculation. The calculation based on robot availability is foolproof, since DX Infrastructure Manager expects the agent/robot to be running whenever the device is up. In DX Infrastructure Manager, device availability is calculated using the newer hub-level metric available in hub 7.96 and later.

For cloud compute or virtual devices, availability is calculated using the metric provided by the OEM itself. The power state (device up/down status) is read from the following metrics:

  1. Azure – QOS_AZURE_INSTANCE_POWER_STATE
  2. AWS – QOS_AWS_INSTANCE_POWER_STATE
  3. VMware – QOS_VMWARE_INSTANCE_POWER_STATE
  4. Nutanix – QOS_NUTANIX_INSTANCE_POWER_STATE
  5. Microsoft Hyper-V – QOS_HYPERV_INSTANCE_POWER_STATE
  6. OpenStack – QOS_OPENSTACK_INSTANCE_POWER_STATE
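The metric list above can be captured as a simple lookup table, which is handy when scripting against exported QoS data. This is only a sketch: the metric names come from the list, while the dictionary keys, the function name and the name normalisation are assumptions made for illustration.

```python
# QoS metric names as listed above; the lowercase keys are illustrative.
POWER_STATE_QOS = {
    "azure": "QOS_AZURE_INSTANCE_POWER_STATE",
    "aws": "QOS_AWS_INSTANCE_POWER_STATE",
    "vmware": "QOS_VMWARE_INSTANCE_POWER_STATE",
    "nutanix": "QOS_NUTANIX_INSTANCE_POWER_STATE",
    "hyperv": "QOS_HYPERV_INSTANCE_POWER_STATE",
    "openstack": "QOS_OPENSTACK_INSTANCE_POWER_STATE",
}


def power_state_metric(platform):
    """Return the power-state QoS name for a platform such as 'AWS' or 'Hyper-V'."""
    # Normalise case, hyphens and spaces so 'Hyper-V' matches the 'hyperv' key.
    key = platform.lower().replace("-", "").replace(" ", "")
    try:
        return POWER_STATE_QOS[key]
    except KeyError:
        raise ValueError(f"no power-state metric known for platform {platform!r}") from None


print(power_state_metric("Hyper-V"))
```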

Reachability is calculated using the QoS QOS_SERVICE_AVAILABILITY fetched through the ICMP ping probe, which runs the ping test to measure device connectivity.
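As a rough illustration of a ping-based reachability percentage, the sketch below treats each ping result as one sample and reports the share of successful samples. How the probe actually stores QOS_SERVICE_AVAILABILITY samples is not described here, so the boolean sample format and the function name are assumptions.

```python
def reachability_percent(ping_results):
    """Share of successful ping samples, as a percentage.

    ping_results: iterable of booleans, True when the device answered.
    Returns None when there are no samples, since no percentage can be stated.
    """
    results = list(ping_results)
    if not results:
        return None
    return 100.0 * sum(results) / len(results)


# Nine successful pings and one failure.
samples = [True] * 9 + [False]
print(reachability_percent(samples))
```

A sample-based figure like this is only as fine-grained as the ping interval; a time-weighted calculation would be needed if the samples are unevenly spaced.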

Figure 1: Availability report for physical devices.

Putting availability and reachability together gives comprehensive reporting for a device.

Since you may need to monitor thousands of devices, DX Infrastructure Manager provides the option to group devices. In big organizations, a dedicated resource pool is often responsible for the upkeep and availability of devices in a particular business group. DX Infrastructure Manager therefore lets you generate availability reports at the group level. The report also gives you the average availability of the devices in the group.

Figure 2: Group level availability report for physical, virtual, and cloud compute devices.

If the group contains a mix of device types, the group level report has a separate section for each device type.
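The group-level average and the per-type sections described above can be sketched as follows. This is a minimal illustration, not the report's actual implementation: the tuple layout, the function name and the choice of a plain per-device mean are all assumptions.

```python
from collections import defaultdict
from statistics import mean


def group_availability(devices):
    """Summarise availability for a device group.

    devices: iterable of (name, device_type, availability_percent) tuples.
    Returns (overall_average, {device_type: average}); overall is None
    for an empty group.
    """
    by_type = defaultdict(list)
    for _name, device_type, percent in devices:
        by_type[device_type].append(percent)
    all_values = [p for values in by_type.values() for p in values]
    overall = mean(all_values) if all_values else None
    sections = {t: mean(values) for t, values in by_type.items()}
    return overall, sections


group = [
    ("web-01", "physical", 100.0),
    ("web-02", "physical", 90.0),
    ("vm-01", "cloud", 80.0),
]
overall, sections = group_availability(group)
print(overall, sections)
```

Here every device carries equal weight in the overall average; a report could equally weight by device type or by measured time, which would give different numbers for mixed groups.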

These reports are generated using the Business Intelligence reporting tool, so they can be generated on a predetermined schedule and exported in multiple formats. For configuration details and information on accessing the reports, refer to the user documentation. You can access the device availability report from the reports section of the operator console in DX Infrastructure Manager 9.1.