Redis Datastore Monitoring through CA Unified Infrastructure Management

by March 6, 2019

Redis is one of the most popular in-memory databases and is well known for its high performance and its support for querying, replication, high availability and automatic partitioning. It supports a range of data structures, including strings, lists, hashes (maps), sets, streams and geospatial indexes. Redis is well suited to scenarios that require processing high-volume traffic from multiple sources and complex data sets in near real time.

Typically, Redis is used as a memory cache, message queue and database. Redis also has a built-in replication mechanism among nodes that provides high availability, and it supports automatic partitioning through Redis Cluster. The Redis implementation makes heavy use of the fork system call: the child process takes care of writing data to persistent storage while the parent process continues to service clients.

Since Redis can provide a large volume of metrics, it is critical to choose the right set of metrics that are essential for managing the overall system performance and health without over-burdening the monitoring tool.

CA UIM provides comprehensive monitoring of the Redis infrastructure, including standalone, clustered and remote deployments. Combined with CA Operational Intelligence, CA UIM offers predictive insights around performance anomalies, alarm filtering and predictive capacity analytics to help IT admins identify potential issues proactively.

The section below provides a high-level view of metric categories that CA UIM provides for monitoring the Redis application and its corresponding infrastructure.

Resource Utilization Metrics

Utilization metrics show how system resources such as CPU and memory are being used and help identify bottlenecks. The out-of-the-box monitoring configuration templates, which ship with standard thresholds applied, help triage anomalies.

With memory being the critical resource for Redis performance, metrics such as peak usage and fragmentation ratio help manage its overall performance.
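As a rough illustration of where such numbers come from, the sketch below parses the text returned by the Redis INFO command and derives a fragmentation ratio (resident memory divided by memory allocated by Redis) and current usage as a percentage of peak. The helper names and the sample values are hypothetical, and this is not how the CA UIM probe itself is implemented; only the INFO field names (used_memory, used_memory_rss, used_memory_peak) come from Redis.

```python
def parse_info(text):
    """Parse raw INFO output ("key:value" lines) into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and section headers
            continue
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


def memory_summary(fields):
    """Derive memory health figures from parsed INFO fields (hypothetical helper)."""
    used = int(fields["used_memory"])      # bytes allocated by Redis
    rss = int(fields["used_memory_rss"])   # bytes as seen by the operating system
    peak = int(fields["used_memory_peak"])  # highest used_memory observed
    return {
        "fragmentation_ratio": rss / used if used else 0.0,
        "usage_pct_of_peak": 100.0 * used / peak if peak else 0.0,
    }


# Made-up sample values, for illustration only.
SAMPLE_INFO = """\
# Memory
used_memory:1000000
used_memory_rss:1500000
used_memory_peak:2000000
"""

summary = memory_summary(parse_info(SAMPLE_INFO))
print(summary)
```

A ratio well above 1 suggests fragmentation, while a ratio below 1 usually means part of the data set has been swapped out by the operating system.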

Latency and Performance Metrics

Cache Hit Ratio is one of the critical parameters to monitor when Redis is being used for cache management. A low Cache Hit Ratio means that more requests miss the cache and fall through to the slower backing store, which increases latency and results in too many disk operations. In this case, one of the common recommendations is to increase the memory allocated to Redis.
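The hit ratio is commonly computed from two counters that Redis reports in the stats section of INFO, keyspace_hits and keyspace_misses, as hits / (hits + misses). A minimal sketch follows; the function name and the sample figures are hypothetical.

```python
def cache_hit_ratio(keyspace_hits, keyspace_misses):
    """Return hits / (hits + misses), or None when there have been no lookups."""
    total = keyspace_hits + keyspace_misses
    if total == 0:
        return None  # no lookups yet, so the ratio is undefined
    return keyspace_hits / total


# Made-up counters: 900 hits and 100 misses.
ratio = cache_hit_ratio(900, 100)
print(ratio)
```

Because both counters are cumulative since server start, a monitoring tool would typically compute the ratio over the difference between two polls rather than over the lifetime totals.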

Metrics such as total connections received and rejected connections help identify the load on the Redis server. Policy-based alarming lets you configure alarms and notifications that fire before the number of connections rejected at the maximum-connections limit reaches the configured threshold.
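The kind of check such a policy performs can be sketched as follows. This is an illustrative assumption, not CA UIM's policy engine: the function name, the 80% warning fraction and the alarm texts are hypothetical. The inputs correspond to the connected_clients and rejected_connections fields of INFO and to the maxclients configuration setting.

```python
def connection_alarms(connected_clients, maxclients,
                      rejected_before, rejected_now,
                      warn_fraction=0.8):
    """Return alarm messages for one polling interval (hypothetical policy)."""
    alarms = []
    # Warn early, before the server actually starts refusing clients.
    if maxclients > 0 and connected_clients >= warn_fraction * maxclients:
        alarms.append("connections near maxclients")
    # rejected_connections is cumulative, so compare against the previous poll.
    if rejected_now > rejected_before:
        alarms.append("connections rejected since last poll")
    return alarms


print(connection_alarms(8500, 10000, 0, 0))
print(connection_alarms(120, 10000, 3, 7))
```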

Persistence Metrics

When the Append Only File (AOF) feature is enabled, Redis nodes write all changes to the cached data to an AOF, which protects against data loss due to system restarts.
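A persistence check along these lines could look at the aof_enabled, aof_last_write_status and aof_last_bgrewrite_status fields from the persistence section of INFO. The sketch below assumes the fields have already been collected into a dict; the function name and messages are hypothetical.

```python
def aof_problems(info):
    """Return a list of AOF-related problems found in parsed INFO fields."""
    if int(info.get("aof_enabled", 0)) != 1:
        return ["AOF is disabled"]
    problems = []
    if info.get("aof_last_write_status") != "ok":
        problems.append("last AOF write failed")
    if info.get("aof_last_bgrewrite_status") != "ok":
        problems.append("last AOF background rewrite failed")
    return problems


# Made-up sample: AOF is on, but the most recent write reported an error.
sample = {
    "aof_enabled": "1",
    "aof_last_write_status": "err",
    "aof_last_bgrewrite_status": "ok",
}
print(aof_problems(sample))
```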

Since Redis uses a single thread to manage all network connections synchronously, the number of connections between the clients and the database can at times overwhelm the server's ability to handle requests.

Cluster Status and Replication Metrics

The Redis monitoring probe monitors the state and health of all master and slave nodes of a Redis cluster. It also tracks the replication status among all the cluster nodes.
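To give a sense of what a replication check involves, the sketch below inspects the role, connected_slaves and master_link_status fields from the replication section of INFO for a single node. It is a simplified assumption of such a check, not the probe's actual logic; the function name, the expected_replicas parameter and the messages are hypothetical.

```python
def replication_problems(info, expected_replicas=1):
    """Return replication problems for one node, based on parsed INFO fields."""
    role = info.get("role")
    if role == "master":
        connected = int(info.get("connected_slaves", 0))
        if connected < expected_replicas:
            return [f"only {connected} of {expected_replicas} replicas connected"]
        return []
    if role == "slave":
        # A replica reports whether its link to the master is up or down.
        if info.get("master_link_status") != "up":
            return ["replication link to master is down"]
        return []
    return [f"unexpected role: {role!r}"]


# Made-up samples for a master with no replicas and a healthy replica.
print(replication_problems({"role": "master", "connected_slaves": "0"}))
print(replication_problems({"role": "slave", "master_link_status": "up"}))
```

Running the same check against every node and combining the results gives a view of replication health across the whole deployment.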

With highly scaled commercial offerings from Amazon Web Services, Microsoft Azure and Google, Redis is gaining popularity among non-web-scale enterprises for many of the advantages cited above, which go well beyond scalability.

For more information on Redis datastore monitoring and to find a complete list of metrics that are supported with CA UIM monitoring for Redis, see our product documentation.