ONES Server Agent Installation

Server Agent v3.0 support:

Only the Intel Gaudi platform is supported in ONESv3.0.

Supported Platform details

Device      | GPU           | GPU VENDOR ID       | GPU DEVICE ID        | NIC VENDOR ID
Intel Gaudi | GAUDI2        | HABANALABS (0x1DA3) | GAUDI2 (1020)        | HABANALABS (0x1da3)
Intel Gaudi | GAUDI2_HL2000 | HABANALABS (0x1DA3) | GAUDI2_HL2000 (1010) | MELLANOX TECHNOLOGIES (0x15b3)
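
The supported-platform table can be expressed as a small lookup, for example to pre-check a host before installing the agent. This is only a minimal sketch: `SUPPORTED_GPUS` and `is_supported` are hypothetical names that are not part of ONES, and the device IDs are kept as the strings shown in the table because the source does not say whether they are hexadecimal or decimal.

```python
# Hypothetical lookup built from the supported-platform table.
# Keys are (GPU vendor ID, GPU device ID); IDs are kept as strings as listed.
SUPPORTED_GPUS = {
    ("0x1da3", "1020"): "GAUDI2",
    ("0x1da3", "1010"): "GAUDI2_HL2000",
}


def is_supported(vendor_id: str, device_id: str) -> bool:
    """Return True if the vendor/device ID pair appears in the table."""
    key = (vendor_id.strip().lower(), device_id.strip().lower())
    return key in SUPPORTED_GPUS


print(is_supported("0x1DA3", "1020"))  # GAUDI2 row of the table
print(is_supported("0x10de", "1020"))  # vendor ID that is not in the table
```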

  1. The ONESv3.0 Agent supports auto-discovery.

  2. The ONESv3.0 Agent can send telemetry to multiple controllers (maximum 2).

  3. The Restrict IP feature can be enabled or disabled.

  4. Password-based authentication is supported.

  5. With the Restrict IP feature, the agent discovers the ONES Controller and updates its entry on the ONES App with all the feature metrics.

  6. A few inputs must be provided while installing the agent.

Installation

On the Application machine, go to the ONES-3.0/ones_t_server_agent directory:

root@ones-application:~$ cd /ONES-3.0/ones_t_server_agent

Installation (Agent Install on multiple switches at the same time)

  • Enter the device details (Management IP, Username and Password) in device_info.csv

root@ones-application/ONES-3.0/ones_t_server_agent:~$ vi device_info.csv

The user must add all the required details to the CSV file. The installer pushes this information to the agent.conf file (/etc/sonic/agent.conf) on every switch. The ones-agent on the switch reads the details from agent.conf and registers itself with the ONES controller using the given parameters. This lets a NetOps engineer supply a single CSV file containing all the details instead of adding devices to the controller one by one, which is time-consuming.

ip,user,passwd,layer,region,type,groupid,azid,brickid,rackid
"10.20.0.80","admin","YourPaSsWoRd","Server","San_Jose_Lab","Server",1,1,1,1
...
...
...

The user must maintain the layer names exactly as specified above (case-sensitive). If the user inputs names that differ from these, they may encounter issues when using the ONES application.

  • Save the File
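
Before running the installer, it can help to sanity-check device_info.csv. The following is a minimal sketch, not part of the ONES tooling: `validate_device_info` is a hypothetical helper, the expected header is copied from the sample above, and treating groupid, azid, brickid and rackid as plain numbers is an assumption based on that sample.

```python
import csv
import io
import ipaddress

# Column order taken from the sample header in this guide.
EXPECTED_HEADER = ["ip", "user", "passwd", "layer", "region", "type",
                   "groupid", "azid", "brickid", "rackid"]
# Assumption: these columns hold plain numbers, as in the sample row.
NUMERIC_FIELDS = ["groupid", "azid", "brickid", "rackid"]


def validate_device_info(text: str) -> list:
    """Return a list of human-readable problems found in the CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    if reader.fieldnames != EXPECTED_HEADER:
        return [f"unexpected header: {reader.fieldnames}"]
    problems = []
    # The header is line 1, so data rows start at line 2.
    for line_no, row in enumerate(reader, start=2):
        try:
            ipaddress.ip_address(row["ip"])
        except ValueError:
            problems.append(f"line {line_no}: invalid ip {row['ip']!r}")
        if not row["user"] or not row["passwd"]:
            problems.append(f"line {line_no}: user and passwd must not be empty")
        for field in NUMERIC_FIELDS:
            if not (row[field] or "").isdigit():
                problems.append(f"line {line_no}: {field} should be a number")
    return problems


sample = '''ip,user,passwd,layer,region,type,groupid,azid,brickid,rackid
"10.20.0.80","admin","YourPaSsWoRd","Server","San_Jose_Lab","Server",1,1,1,1
"10.20.0.999","admin","YourPaSsWoRd","Server","San_Jose_Lab","Server",1,1,1,x
'''
for problem in validate_device_info(sample):
    print(problem)
```

The second data row is deliberately broken (an out-of-range IP octet and a non-numeric rackid) to show what the check reports.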

The installation script installs the telemetry agent on one or more devices in the data centre.

The installer automatically detects whether to perform a fresh installation or an upgrade to the new version.

During an upgrade, all previous files on the switch are deleted automatically.

To use certificates for GNMI and auto-registration, place them in the gnmi-certs directory (for GNMI) and the auto-reg-certs directory (for Agent Auto Registration).

root@ones-application/ONES-3.0/ones_t_server_agent:~$ ./ones_agent_parallel_installer.sh

  1. Users can update only the password if the server already has an agent running.

Do you want to update the username/password for password authentication? (Yes/No): no
  1. Users can choose this option to add one more controller IP without performing the complete agent installation.

Do you want to add only Collector IP for auto-discovery and skip the agent installation ?(yes/no): no 
  1. The script asks for the Controller IP to use with the auto-discovery feature.

Enter the IP addresses of collectors to auto-discover (max 2, comma-separated, e.g., 10.1.1.10,10.2.2.5):10.20.0.93

A maximum of 2 Controller IPs can be added to restrict the telemetry streaming.
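
The answer to the collector prompt above is a comma-separated list of at most two IP addresses. A minimal sketch of how such an answer could be checked before it is entered is shown here. `parse_collector_ips` is a hypothetical helper, not the installer's own code, and requiring at least one address is an assumption.

```python
import ipaddress

MAX_COLLECTORS = 2  # limit stated in the installer prompt


def parse_collector_ips(raw: str) -> list:
    """Split a comma-separated answer and validate each address."""
    parts = [p.strip() for p in raw.split(",") if p.strip()]
    if not parts:
        # Assumption: an empty answer is treated as an error.
        raise ValueError("at least one collector IP is required")
    if len(parts) > MAX_COLLECTORS:
        raise ValueError(f"at most {MAX_COLLECTORS} collector IPs are allowed")
    # ipaddress.ip_address() raises ValueError on malformed input.
    return [str(ipaddress.ip_address(p)) for p in parts]


print(parse_collector_ips("10.1.1.10, 10.2.2.5"))
```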

  1. Users can choose to restrict telemetry so that it is sent to the collector IP only.

Do you want to restrict access only to provided collector ip?
Note: Providing Yes will restrict access to agent only with the provided collector IP Address
Enter Yes/No : Yes

It is important to answer No to the collector IP restriction if the network uses NAT translation from private to public IP for access to the ONES server from the device.
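
The interaction between the restriction and NAT can be illustrated with a simple allow-list check. This is a conceptual sketch only and does not reproduce the agent's actual implementation. `is_peer_allowed` is a hypothetical function, and 203.0.113.7 is a documentation-range address standing in for a translated IP.

```python
def is_peer_allowed(peer_ip: str, collectors: list, restrict: bool) -> bool:
    """Illustrative allow-list check; not the agent's actual implementation."""
    if not restrict:
        return True
    return peer_ip in collectors


collectors = ["10.20.0.93"]

# Without NAT the peer address matches the configured collector IP.
print(is_peer_allowed("10.20.0.93", collectors, restrict=True))

# With NAT the address seen differs from the configured one, so a
# restricted agent would reject it; answering No avoids this.
print(is_peer_allowed("203.0.113.7", collectors, restrict=True))
print(is_peer_allowed("203.0.113.7", collectors, restrict=False))
```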

  1. Users can enable password-based authentication between the agent and the collector.

    Do you want to enable password authentication? (yes/no): yes
    Enter the new username: admin
    Enter the new password: YourPaSsWoRd

    Users can choose no if they do not want to use password-based authentication between the agent and the collector.

    Do you want to enable password authentication? (yes/no): no

Installation Begins

Do you want to update the username/password for password authentication? (Yes/No): no
Do you want to add only Collector IP for auto-discovery and skip the agent installation? (yes/no): no
Enter the IP addresses of collectors to auto-discover (max 2, comma-separated, e.g., 10.1.1.10,10.2.2.5): 10.20.0.93
Do you want to restrict access only to the provided collector IP?
Note: Providing Yes will restrict access to agent only with the provided collector IP Address
Enter Yes/No: no
Do you want to enable password authentication? (Yes/No): no
f58d795dfab9: Loading layer [==================================================>]   2.56kB/2.56kB
ed46ea0f4e17: Loading layer [==================================================>]  31.74MB/31.74MB
1c38a701a3d6: Loading layer [==================================================>]   42.7MB/42.7MB
c4456c24c820: Loading layer [==================================================>]  1.421MB/1.421MB
Loaded image: avizdock/agent_installer:latest
Docker image 'avizdock/agent_installer' is loaded.
b17757c75cda3c71ff4d1311c116c6143893726ddce7dead02b0d77cc926fc5c
Docker container 'agent_installer' is running.
/usr/local/lib/python3.8/site-packages/paramiko/pkey.py:82: CryptographyDeprecationWarning: TripleDES has been moved to cryptography.hazmat.decrepit.ciphers.algorithms.TripleDES and will be removed from this module in 48.0.0.
  "cipher": algorithms.TripleDES,
/usr/local/lib/python3.8/site-packages/paramiko/transport.py:253: CryptographyDeprecationWarning: TripleDES has been moved to cryptography.hazmat.decrepit.ciphers.algorithms.TripleDES and will be removed from this module in 48.0.0.
  "class": algorithms.TripleDES,
Selecting ‘Yes’ will exclusively initiate the day-2 deployment of the Ones-Agent,
involving a reconfiguration of the existing agent to establish communication with the specified collector(s).
Choosing ‘No’ will initiate the deployment of the Ones-Agent as an independent third-party container.
[{'ip': '10.20.0.80', 'user': 'aviz', 'passwd': 'Aviz@123', 'layer': 'Server', 'region': 'San Jose', 'azid': '1', 'brickid': '1', 'rackid': '1', 'groupid': '8', 'type': 'Server', 'installation_instance': 1, 'agentip': '10.20.0.80', 'collectorip': '10.20.0.93', 'restrict_collector_ip': 'no', 'password_authentication': 'no'}]
###############Connecting to switch###############
Connection to switch 10.20.0.80 successful.....................
Looking for previous installation........................
Copying files to the switch........................
Verifying files on the remote switch........................
File /home/aviz/docker_packages.tar.gz exists on the remote server.
File /home/aviz/prerequisites.sh exists on the remote server.
File verification completed.
Untarring docker_packages.tar.gz on the remote server...
Successfully untarred docker_packages.tar.gz.
Setting execute permissions on prerequisites.sh...
Successfully set execute permissions on prerequisites.sh.
Executing prerequisites.sh on the remote server...

....................
....................
....................
....................
....................
(Reading database ... 116661 files and directories currently installed.)
Preparing to unpack .../sshpass_1.06-1_amd64.deb ...
Unpacking sshpass (1.06-1) over (1.09-1) ...
Setting up sshpass (1.06-1) ...
Processing triggers for man-db (2.10.2-1) ...
No previous installation found  on the device 10.20.0.80........
Creating work directory  on the device 10.20.0.80........
Work Directory ones-agent_1727157788_9940367 created successfully on the device 10.20.0.80 .............
Copying ones_agent_start.sh to directory ones-agent_1727157788_9940367 on the device 10.20.0.80 .............
Copying ones_agent_start.sh to directory ones-agent_1727157788_9940367 successful on the device 10.20.0.80 .............
ones_agent_start.sh file copied to /usr/bin successfully on the device 10.20.0.80........

Copying ones_agent_ip_rule.sh to directory ones-agent_1727157788_9940367 on the device 10.20.0.80 .............
Copying ones_agent_ip_rule.sh to directory ones-agent_1727157788_9940367 successful on the device 10.20.0.80 .............
ones_agent_ip_rule.sh file copied to /usr/bin successfully on the device 10.20.0.80........
Copying ones-agent.service to directory ones-agent_1727157788_9940367 on the device 10.20.0.80 .............
Copying ones-agent.service to directory ones-agent_1727157788_9940367 successful on the device 10.20.0.80 .............

Installation proceeding with NoTls mode
Copying agent.conf to directory ones-agent_1727157788_9940367 successful on the device 10.20.0.80 .............
agent.conf copied to /etc/ones successfully on the device 10.20.0.80........
Copying ones-agent.tar to directory ones-agent_1727157788_9940367 on the device 10.20.0.80 .............
Copying ones-agent.tar to directory ones-agent_1727157788_9940367 on the device 10.20.0.80 .............
Loading Docker image on the device 10.20.0.80 ###########################################
Docker image loaded successfully on the device 10.20.0.80........
Getting name of the loaded image
image = ##avizdock/ones-server-agent:v3.0.0##
Running docker.....................
docker run -it -v /usr/bin/hl-smi:/usr/bin/hl-smi -v /etc/ones:/etc/ones -v /etc/os-release:/etc/os-release-origin --cpu-period=100000 --cpu-quota=50000 --net=host --privileged -dt --name ones-agent avizdock/ones-server-agent:v3.0.0
b'8435e8edc34c90c3d378a89769dc5167e020095818ede8178ac675c96d37ecd3\n'
Service file loaded successfully on the device 10.20.0.80##################
Enabling ones-agent.service 10.20.0.80 ##################
Enabled ones-agent as service successfully on the device 10.20.0.80 ##################
Starting ones-agent service on the device 10.20.0.80........
Made ones-agent immune to booting on the device 10.20.0.80########################
Copying ones-agent.tar file
ones-agent.tar file copied successfully on the device 10.20.0.80........
Copying agent.conf file
agent.conf file copied successfully on the device 10.20.0.80........
Copying ones-agent.service file
ones-agent.service file copied successfully on the device 10.20.0.80........
Copying ones_agent_ip_rule.sh file
ones_agent_ip_rule.sh file copied successfully on the device 10.20.0.80........
Copying ones_agent_start.sh file
ones_agent_start.sh file copied successfully on the device 10.20.0.80........
##################################################################
Status of ones-agent.service is -      Active: active (running) since Tue 2024-09-24 06:06:03 UTC; 2min 34s ago

Deployment of ones-agent to switch 10.20.0.80 is successful
╒══════════════╤══════════╕
│ IP Address   │ Result   │
╞══════════════╪══════════╡
│ 10.20.0.80   │ Pass     │
╘══════════════╧══════════╛
agent_installer

Docker agent_installer has been stopped
agent_installer

Docker agent_installer has been removed
Untagged: avizdock/agent_installer:latest
Deleted: sha256:b115eb21a63518b47079a0f9b25ed56e8dd807a4aa054dc18efb1d5635b9728d
Deleted: sha256:a2052350dbedd8d19d573f1f81a333af50d33c157dd565c6fb3ea19ff32d7869
Deleted: sha256:403906165705c1c4c263865c7d2e8560424306ef76cc7dfd319565e1036a4b49
Deleted: sha256:71d4c516421d0cd5b08b0c7f7ddff68182ca799815c621e1bf1d7c2a247820f2
Deleted: sha256:ed5221ab4eb63334a3121c173d4b6e0fb882b13eb0de6f1daa781908da91a464

Docker avizdock/agent_installer image has been removed
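
When many devices are installed at once, the summary table near the end of the output above is the quickest way to spot failures. A minimal sketch of extracting it from saved installer output follows. `parse_results` is a hypothetical helper, and it assumes the table keeps the two-column IP Address / Result layout shown in the sample.

```python
def parse_results(output: str) -> dict:
    """Map each device IP to its result from the installer's summary table."""
    results = {}
    for line in output.splitlines():
        # Table rows use the light vertical bar as the cell separator.
        cells = [c.strip() for c in line.split("│") if c.strip()]
        if len(cells) == 2 and cells[0] != "IP Address":
            results[cells[0]] = cells[1]
    return results


sample_output = """Deployment of ones-agent to switch 10.20.0.80 is successful
╒══════════════╤══════════╕
│ IP Address   │ Result   │
╞══════════════╪══════════╡
│ 10.20.0.80   │ Pass     │
╘══════════════╧══════════╛
"""
results = parse_results(sample_output)
failed = [ip for ip, result in results.items() if result != "Pass"]
print(results, failed)
```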

The Server-Agent will now stream metrics only to the given controller and will auto-register on the ONES-App.

The user must make sure that the devices have unique names; otherwise, there will be issues when plotting the full topology view (Topology Page).
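
Duplicate device names can be caught before installation with a quick check over the planned hostnames. This is a minimal sketch: `duplicate_names` is a hypothetical helper, the hostnames are made-up examples, and comparing names case-insensitively is an assumption.

```python
from collections import Counter


def duplicate_names(hostnames: list) -> list:
    """Return hostnames that occur more than once, ignoring case."""
    counts = Counter(name.strip().lower() for name in hostnames)
    return sorted(name for name, n in counts.items() if n > 1)


# Hypothetical hostnames for illustration only.
print(duplicate_names(["gaudi-01", "gaudi-02", "Gaudi-01"]))
```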


Copyright © Aviz Networks, Inc.