ONES Rule Engine

Overview

In data center operations, a rule engine that raises alerts on key metrics is essential for proactive monitoring and management of critical components and services. The ONES rule engine supports alerts for the following metric groups in a data center environment:

  1. CPU and Memory Utilization

  2. Fan and PSU LED status

  3. Traffic Bandwidth

  4. ASIC Routes

  5. Health Services

  6. Traffic Errors and Discard Counters

Rule engine alerts support efficient resource utilization, timely troubleshooting, early detection of potential issues, and overall operational stability within the data center environment.

Notification

When a threshold value is breached, ONES-App can send a notification to:

  • Slack Channel

  • Zendesk Support ticket
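
As a rough illustration of what such a notification might carry, the sketch below builds message bodies for the two targets. It is not ONES-App's implementation: the `Breach` type, the function names, and the device name `leaf-01` are hypothetical. The Slack body uses the `text` field accepted by Slack incoming webhooks, and the Zendesk body follows the `ticket` / `subject` / `comment.body` shape used when creating a ticket through the Zendesk Tickets API. Nothing is sent over the network.

```python
import json
from dataclasses import dataclass


@dataclass
class Breach:
    """Hypothetical record of a breached rule (not an ONES-App type)."""
    device: str
    metric: str
    observed: float
    threshold: float


def summary(b: Breach) -> str:
    """One-line description shared by both notification targets."""
    return f"{b.metric} on {b.device} is {b.observed}, threshold {b.threshold}"


def slack_payload(b: Breach) -> str:
    # Slack incoming webhooks accept a JSON body with a "text" field.
    return json.dumps({"text": f"ALERT: {summary(b)}"})


def zendesk_payload(b: Breach) -> str:
    # Zendesk ticket creation takes a "ticket" object with a subject
    # and a comment body.
    return json.dumps({
        "ticket": {
            "subject": f"Threshold breached: {b.metric} on {b.device}",
            "comment": {"body": summary(b)},
        }
    })


# Example values are made up for illustration.
breach = Breach(device="leaf-01", metric="CPU Utilization",
                observed=91.5, threshold=70.0)
print(slack_payload(breach))
print(zendesk_payload(breach))
```

Building the payload separately from sending it keeps the message format easy to inspect and test before a real webhook URL or API token is involved.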

Rules are categorized by metric hierarchy:

  1. Device Level

  2. Interface Level

The rule engine supports the metrics listed below. For each metric, the list gives the unit (with an example threshold), the measure, and the value a user can specify.

| Hierarchy | Metric | Unit | Measure | Value |
|-----------|--------|------|---------|-------|
| Device | CPU Utilization | Percentage (70%) | AVG/MIN/MAX | 0/100 |
| Device | Memory Utilization | Percentage (50%) | AVG/MIN/MAX | 0/100 |
| Device | Fan LED Status | String (RED or GREEN) | RED | RED/GREEN |
| Device | PSU Status | String (LED RED or Status=not OK) | RED or NOT OK | |
| Device | CPU Core Temperature | Celsius (30) | AVG/MIN/MAX | |
| Device | PSU Temperature | Celsius (30) | AVG/MIN/MAX | |
| Device | Fan Speed | Percentage (70%) | AVG/MIN/MAX | 0/100 |
| Device | ASIC IPv4 Routes Utilization | Percentage (70%) | AVG/MIN/MAX | 0/100 |
| Device | ASIC IPv6 Routes Utilization | Percentage (80%) | AVG/MIN/MAX | 0/100 |
| Device | BGP Neighbors Operationally Down | Percentage (20%) | AVG/MIN/MAX | 0/100 |
| Device | FRR Container CPU Utilization | Percentage (20%) | AVG/MIN/MAX | 0/100 |
| Device | Syncd Container CPU Utilization | Percentage (20%) | AVG/MIN/MAX | 0/100 |
| Interface | TX Utilization | Percentage (80%) | AVG/MIN/MAX | 0/100 |
| Interface | RX Utilization | Percentage (80%) | AVG/MIN/MAX | 0/100 |
| Interface | In Errors | Count (100) | AVG/MIN/MAX | User defined |
| Interface | Out Errors | Count (50) | AVG/MIN/MAX | User defined |
| Interface | In Discards | Count (100) | AVG/MIN/MAX | User defined |
| Interface | Out Discards | Count (50) | AVG/MIN/MAX | User defined |
| Interface | Tranx TX Power | dBm | AVG/MIN/MAX | User defined |
| Interface | Tranx RX Power | dBm | AVG/MIN/MAX | User defined |
| Interface | Tranx Temperature | Celsius (40) | AVG/MIN/MAX | User defined |
| Interface | Tranx Voltage | Volts (40) | AVG/MIN/MAX | User defined |
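
To illustrate how the hierarchy, metric, measure, and value fields listed above might fit together, here is a minimal sketch of evaluating a numeric rule: a measure (AVG, MIN, or MAX) is applied to a window of samples, and the result is compared with the user-supplied value. The `Rule` class, the `breached` function, the idea of a sample window, and the greater-than-or-equal comparison are all assumptions made for illustration; this document does not specify how ONES evaluates rules internally.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Dict, Sequence

# Measures listed above, mapped to plain Python aggregations.
MEASURES: Dict[str, Callable[[Sequence[float]], float]] = {
    "AVG": mean,
    "MIN": min,
    "MAX": max,
}


@dataclass
class Rule:
    """Hypothetical rule shape; field names are illustrative only."""
    hierarchy: str  # "Device" or "Interface"
    metric: str     # e.g. "CPU Utilization"
    measure: str    # "AVG", "MIN", or "MAX"
    value: float    # threshold in the metric's unit


def breached(rule: Rule, samples: Sequence[float]) -> bool:
    """Return True when the aggregated samples reach the threshold.

    The >= comparison is an assumption, not documented ONES behavior.
    """
    if not samples:
        return False  # nothing measured, nothing to alert on
    aggregate = MEASURES[rule.measure](samples)
    return aggregate >= rule.value


cpu_rule = Rule("Device", "CPU Utilization", "AVG", 70.0)
print(breached(cpu_rule, [65.0, 72.0, 80.0]))  # average is about 72.3
print(breached(cpu_rule, [40.0, 55.0, 60.0]))  # average is about 51.7
```

String-valued metrics such as Fan LED Status or PSU Status would need an equality check against the configured state (for example, RED) rather than a numeric comparison.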