ONES Rule Engine
Overview
In data centre operations, a rule engine that raises alerts on key metrics is essential for proactive monitoring and management of critical components and services. Rule engine alerts are needed for the following metrics in a data centre environment:
CPU and Memory Utilisation
Fan and PSU LED status
SSD Memory Utilization, Health and Temperature Status
Traffic Bandwidth
ASIC Routes
Health Services
Device Down alerts
BGP Neighbour Down alerts
Component failure
Interface Flap Alerts
Traffic Errors and Discard Counters
PFC Counters
Device Queue Counters
Rule engine alerts ensure efficient resource utilization, timely troubleshooting, early detection of potential issues, and overall operational stability within the data centre environment.
Notification
When a threshold is breached, ONES-App can send a notification to:
Slack Channel
Zendesk Support
ServiceNow
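As an illustration of a breached-threshold notification, the sketch below builds a message body of the kind a Slack incoming webhook accepts (a JSON object with a `text` field). The function name `build_slack_payload`, its parameters, and the message wording are assumptions made for this example. The payload ONES-App actually sends is not described here.

```python
import json


def build_slack_payload(device, metric, measure, value, threshold):
    """Build a JSON body for a Slack incoming webhook (hypothetical message format)."""
    text = (
        f"[ONES alert] {device}: {measure} {metric} is {value}, "
        f"threshold {threshold}"
    )
    # Slack incoming webhooks accept a JSON object with a "text" field.
    return json.dumps({"text": text})


if __name__ == "__main__":
    # Example values are made up for illustration.
    print(build_slack_payload("leaf-1", "CPU Utilization", "AVG", 93.0, 85.0))
```

The body would be sent with an HTTP POST to the webhook URL configured for the channel. Zendesk and ServiceNow use their own ticket formats, which are not covered by this sketch.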
Rules are categorized by metric hierarchy:
Device Level
Interface Level
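One way to picture the two hierarchy levels is as a rule scope that always names a device and names an interface only for interface-level rules. The sketch below is a minimal model under that assumption. `Hierarchy`, `RuleScope`, and the field names are hypothetical and are not part of the ONES-App API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Hierarchy(Enum):
    DEVICE = "Device"
    INTERFACE = "Interface"


@dataclass(frozen=True)
class RuleScope:
    """Where a rule applies (hypothetical model, not the ONES-App schema)."""
    hierarchy: Hierarchy
    device: str
    interface: Optional[str] = None

    def __post_init__(self):
        # Interface-level rules need a target interface; device-level rules do not.
        if self.hierarchy is Hierarchy.INTERFACE and not self.interface:
            raise ValueError("interface-level rules need an interface name")
        if self.hierarchy is Hierarchy.DEVICE and self.interface:
            raise ValueError("device-level rules must not name an interface")


if __name__ == "__main__":
    print(RuleScope(Hierarchy.DEVICE, "leaf-1"))
    print(RuleScope(Hierarchy.INTERFACE, "leaf-1", "Ethernet0"))
```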
The rule engine supports the following metrics. For each metric, the unit, measure, and value a user can specify are listed:
| Hierarchy | Metric                          | Unit           | Measure        | Value         |
|-----------|---------------------------------|----------------|----------------|---------------|
| Device    | CPU Utilization                 | Percentage (%) | AVG/MIN/MAX    | 0 to 100      |
| Device    | Memory Utilization              | Percentage (%) | AVG/MIN/MAX    | 0 to 100      |
| Device    | Failed Fans                     | Count          | MIN/MAX        | Count         |
| Device    | Failed PSU                      | Count          | MIN/MAX        | Count         |
| Device    | CPU Core Temperature            | Celsius (°C)   | AVG/MIN/MAX    | Celsius       |
| Device    | PSU Temperature                 | Celsius (°C)   | AVG/MIN/MAX    | Celsius       |
| Device    | FAN Speed                       | Percentage (%) | AVG/MIN/MAX    | 0 to 100      |
| Device    | ASIC IPv4 Routes Utilization    | Percentage (%) | AVG/MIN/MAX    | 0 to 100      |
| Device    | ASIC IPv6 Routes Utilization    | Percentage (%) | AVG/MIN/MAX    | 0 to 100      |
| Device    | BGP Nbrs Operationally Down     | Count          | AVG/MIN/MAX    | Count of Nbrs |
| Device    | FRR Container CPU Utilization   | Percentage (%) | AVG/MIN/MAX    | 0 to 100      |
| Device    | Syncd Container CPU Utilization | Percentage (%) | AVG/MIN/MAX    | 0 to 100      |
| Device    | Device Down                     | NA             | NA             | NA            |
| Device    | Queue Counter                   | Count          | AVG/MIN/MAX    | Count         |
| Device    | SSD Health                      | Percentage (%) | Percentage (%) | 0 to 100      |
| Device    | SSD Temperature                 | Celsius (°C)   | AVG/MIN/MAX    | Celsius       |
| Device    | SSD Memory                      | Percentage (%) | Percentage (%) | 0 to 100      |
| Interface | Int Flap                        | NA             | NA             | NA            |
| Interface | PFC Counters                    | Count          | AVG/MIN/MAX    | Count         |
| Interface | Queue Counters                  | Count          | AVG/MIN/MAX    | Count         |
| Interface | TX Utilization                  | Percentage (%) | AVG/MIN/MAX    | 0 to 100      |
| Interface | RX Utilization                  | Percentage (%) | AVG/MIN/MAX    | 0 to 100      |
| Interface | In Errors                       | Count          | AVG/MIN/MAX    | User defined  |
| Interface | Out Errors                      | Count          | AVG/MIN/MAX    | User defined  |
| Interface | In Discards                     | Count          | AVG/MIN/MAX    | User defined  |
| Interface | Out Discards                    | Count          | AVG/MIN/MAX    | User defined  |
| Interface | Tranx TX Power                  | dBm            | AVG/MIN/MAX    | User defined  |
| Interface | Tranx RX Power                  | dBm            | AVG/MIN/MAX    | User defined  |
| Interface | Tranx Temperature               | Celsius (°C)   | AVG/MIN/MAX    | User defined  |
| Interface | Tranx Voltage                   | Volts (V)      | AVG/MIN/MAX    | User defined  |
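To show how a measure (AVG, MIN, or MAX) and a threshold value could combine into an alert decision, the sketch below aggregates a window of metric samples and compares the result with the threshold. The function `is_breached`, the sample window, and the "greater than" comparison are assumptions for illustration. The actual rule engine may sample and compare differently.

```python
from statistics import mean

# Map each measure name from the metrics list to an aggregation function.
AGGREGATORS = {"AVG": mean, "MIN": min, "MAX": max}


def is_breached(samples, measure, threshold):
    """Return True when the aggregated samples exceed the threshold.

    Hypothetical helper: assumes a breach means "aggregate > threshold".
    """
    if not samples:
        return False  # no data in the window, so nothing to alert on
    aggregate = AGGREGATORS[measure](samples)
    return aggregate > threshold


if __name__ == "__main__":
    cpu_samples = [72.0, 91.5, 88.0]  # made-up CPU Utilization readings (%)
    for measure in ("AVG", "MIN", "MAX"):
        print(measure, is_breached(cpu_samples, measure, 85.0))
```

With these made-up readings and a threshold of 85, only the MAX measure reports a breach, since the average (about 83.8) and the minimum (72.0) stay below the threshold. This is why the choice of measure matters: MAX reacts to a single spike, while AVG reacts only to sustained load.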