Rule Engine
Overview
In data centre operations, a rule engine that raises alerts on key metrics is essential for proactive monitoring and management of critical components and services. The following types of Rule Engine alerts are available for specific metrics in a data centre environment:
CPU and Memory Alerts
Fan and Power Supply Unit Alerts
Traffic Bandwidth
ASIC IPv4 & IPv6 Routes
BGP Neighbour Alerts
Health Services
Device Down Alerts
SSD Health, Temperature and Memory Usage Alerts
Device Queue counters
PFC counters
Traffic Errors and Discard Counters
CPU Utilization Status of the frr and syncd Services
Server Agent Based Metrics
  CPU Temperature and Utilization
  Down Status
  Fan Speed
  Memory Utilization
GPU
  Memory Utilization
  PSU Power Draw
  Temperature
  Utilization
Push Notification
When a device breaches the threshold value configured under a rule, Rule Engine pushes that rule's notification to one of the following:
Slack channel
Zendesk Support ticket
ServiceNow ticket
To use the Rule Engine Alert feature, the user must first set up a Slack channel integration, a Zendesk Support integration, or a ServiceNow integration.
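The threshold-and-notify behaviour described above can be illustrated with a minimal Python sketch. It is not the product's implementation: the Rule class, the metric name cpu_utilization, the device names, the channel labels, and the "greater than" comparison are all assumptions made for illustration. Some metrics, such as fan speed or down status, would likely need a different comparison.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Rule:
    """Hypothetical rule: alert when `metric` exceeds `threshold`."""
    name: str
    metric: str
    threshold: float
    channels: List[str]  # assumed labels, e.g. "slack", "zendesk", "servicenow"


def find_breaches(rule: Rule, readings: Dict[str, Dict[str, float]]) -> List[str]:
    """Return the devices whose reading for rule.metric is above the threshold."""
    return sorted(
        device
        for device, metrics in readings.items()
        if rule.metric in metrics and metrics[rule.metric] > rule.threshold
    )


def build_notifications(
    rule: Rule, readings: Dict[str, Dict[str, float]]
) -> List[Tuple[str, str]]:
    """Build one (channel, message) pair per breaching device and channel."""
    notifications = []
    for device in find_breaches(rule, readings):
        value = readings[device][rule.metric]
        text = (
            f"[{rule.name}] {device}: {rule.metric}={value} "
            f"exceeds threshold {rule.threshold}"
        )
        for channel in rule.channels:
            notifications.append((channel, text))
    return notifications


# Example data (made up): one device above the threshold, one below.
rule = Rule("High CPU", "cpu_utilization", 90.0, ["slack", "zendesk"])
readings = {
    "leaf-1": {"cpu_utilization": 95.5},
    "leaf-2": {"cpu_utilization": 40.0},
}

for channel, text in build_notifications(rule, readings):
    print(channel, text)
```

In this sketch only leaf-1 breaches the rule, so one message is produced for each configured channel. Actual delivery to Slack, Zendesk Support, or ServiceNow depends on the integration being set up first, as noted above.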