Only the Intel Gaudi platform is supported in ONESv3.0.
Supported Platform details
Device      | GPU           | GPU VENDOR ID       | GPU DEVICE ID        | NIC VENDOR ID
Intel Gaudi | GAUDI2        | HABANALABS (0x1DA3) | GAUDI2 (1020)        | HABANALABS (0x1da3)
Intel Gaudi | GAUDI2_HL2000 | HABANALABS (0x1DA3) | GAUDI2_HL2000 (1010) | MELLANOX TECHNOLOGIES (0x15b3)
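The platform table can be expressed as a small lookup, which is handy when checking a host before installing the agent. This is only a sketch: the `SUPPORTED` list and the `find_platform` helper are hypothetical names, and the IDs are kept as the strings shown in the table because the table does not state which base the device IDs use.

```python
# Rows copied from the supported-platform table; IDs are kept as strings
# because the table does not say whether the device IDs are hex or decimal.
SUPPORTED = [
    {"device": "Intel Gaudi", "gpu": "GAUDI2", "gpu_vendor_id": "0x1DA3",
     "gpu_device_id": "1020", "nic_vendor_id": "0x1da3"},
    {"device": "Intel Gaudi", "gpu": "GAUDI2_HL2000", "gpu_vendor_id": "0x1DA3",
     "gpu_device_id": "1010", "nic_vendor_id": "0x15b3"},
]


def find_platform(gpu_vendor_id, gpu_device_id, nic_vendor_id):
    """Return the matching table row, or None if the combination is not listed.

    Comparison is case-insensitive, since the table mixes 0x1DA3 and 0x1da3.
    """
    key = (gpu_vendor_id.lower(), gpu_device_id.lower(), nic_vendor_id.lower())
    for row in SUPPORTED:
        row_key = (row["gpu_vendor_id"].lower(),
                   row["gpu_device_id"].lower(),
                   row["nic_vendor_id"].lower())
        if row_key == key:
            return row
    return None
```

A combination that is not in the table returns `None`, which a pre-install check could treat as "unsupported platform".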
The ONESv3.0 Agent supports the auto-discovery feature.
The ONESv3.0 Agent can send telemetry to multiple controllers (maximum 2).
The Restrict IP feature can be enabled or disabled.
Password-based authentication is supported.
With the Restrict IP feature, the agent discovers the ONES Controller and updates its entry in the ONES App with all the feature metrics.
A few inputs must be provided while installing the agent.
Installation
On the Application machine, go to the ONES-3.0/ones_t_server_agent directory:
root@ones-application:~$ cd /ONES-3.0/ones_t_server_agent
Installation (Agent Install on multiple switches at the same time)
Enter the device details (Management IP, Username, and Password) in device_info.csv:
root@ones-application/ONES-3.0/ones_t_server_agent:~$ vi device_info.csv
The user must add all the required details to the CSV file. The installer pushes this information to the agent.conf file (/etc/sonic/agent.conf) on every switch. The ones-agent on each switch reads the details from agent.conf and registers itself with the ONES controller using the given parameters.
This lets a NetOps engineer supply a single CSV file containing all the details instead of adding devices to the controller one by one, which is time-consuming.
The user must maintain the layer names exactly as specified above (case-sensitive). If the user inputs names that differ from these, they may encounter issues when using the ONES application.
Save the File
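Before running the installer, the CSV can be sanity-checked with a short script. The following is a minimal sketch, not part of the ONES tooling: the column names in `REQUIRED` are an assumption taken from the keys the installer prints in its log (`ip`, `user`, `passwd`, `layer`), and the real header row of device_info.csv may differ.

```python
import csv
import io
import ipaddress

# Assumed column names, taken from the keys printed in the installer log;
# the actual device_info.csv header may use different names.
REQUIRED = ["ip", "user", "passwd", "layer"]


def validate_rows(csv_text):
    """Return a list of (line_number, message) problems found in the CSV text."""
    problems = []
    seen_ips = set()
    reader = csv.DictReader(io.StringIO(csv_text))
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        # Every required column must be present and non-empty.
        for field in REQUIRED:
            if not (row.get(field) or "").strip():
                problems.append((line_no, f"missing {field}"))
        ip = (row.get("ip") or "").strip()
        if ip:
            try:
                ipaddress.ip_address(ip)
            except ValueError:
                problems.append((line_no, f"invalid ip {ip!r}"))
            # The same management IP listed twice is almost certainly a mistake.
            if ip in seen_ips:
                problems.append((line_no, f"duplicate ip {ip}"))
            seen_ips.add(ip)
    return problems
```

It could be called as `validate_rows(open("device_info.csv").read())`; an empty list means no problems were found by these checks. The sketch does not check layer names, because the list of allowed names is not reproduced in this section.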
Run the installation script to install the telemetry agent on one or more devices in the data centre.
The installer automatically detects whether to perform a fresh installation or an upgrade to the new version.
During an upgrade, all previous files on the switch are deleted automatically.
To use certificates for GNMI and Auto-Registration, place them in the gnmi-certs directory (for GNMI) and the auto-reg-certs directory (for Agent Auto-Registration).
If the server already has an agent running, users can update only the password.
Do you want to update the username/password for password authentication? (Yes/No): no
Users can choose this option to add one more controller IP without performing the complete agent installation.
Do you want to add only Collector IP for auto-discovery and skip the agent installation? (yes/no): no
The script asks for the Controller IP to use with the auto-discovery feature.
Enter the IP addresses of collectors to auto-discover (max 2, comma-separated, e.g., 10.1.1.10,10.2.2.5): 10.20.0.93
A maximum of 2 Controller IPs can be added to restrict telemetry streaming.
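The rule for the collector prompt (comma-separated, at most two addresses) can be sketched as follows. This illustrates the input format only and is not the installer's own code; `parse_collector_ips` is a hypothetical helper, and requiring at least one address is an assumption.

```python
import ipaddress

MAX_COLLECTORS = 2  # limit stated in the installer prompt


def parse_collector_ips(answer):
    """Split a comma-separated answer into a list of validated IP strings."""
    parts = [p.strip() for p in answer.split(",") if p.strip()]
    if not parts:
        # Assumption: the installer needs at least one collector address.
        raise ValueError("at least one collector IP is required")
    if len(parts) > MAX_COLLECTORS:
        raise ValueError(f"at most {MAX_COLLECTORS} collector IPs are allowed")
    # ip_address raises ValueError for anything that is not a valid address.
    return [str(ipaddress.ip_address(p)) for p in parts]
```

For example, `parse_collector_ips("10.1.1.10, 10.2.2.5")` accepts the two addresses from the prompt's own example, while a third address raises `ValueError`.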
Users can restrict the agent so that it sends telemetry to the provided collector IP only.
Do you want to restrict access only to provided collector ip?
Note: Providing Yes will restrict access to agent only with the provided collector IP Address
Enter Yes/No : Yes
It is important to answer No to the collector IP restriction if the network uses NAT translation from private to public IP for access from the device to the ONES server.
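The NAT caveat can be illustrated with a purely conceptual model. This is not how the agent implements the restriction (the mechanism is not described here); it only assumes that, with restriction enabled, a peer is accepted when its address matches one of the configured collector IPs. `is_collector_allowed` and the addresses are hypothetical.

```python
def is_collector_allowed(peer_ip, collector_ips, restrict):
    """Conceptual check: with restriction on, only configured collector IPs pass."""
    if not restrict:
        return True
    return peer_ip in collector_ips


# Collector address as entered during installation (example value).
configured = ["10.20.0.93"]

# Without NAT, the peer address matches the configured one.
direct = is_collector_allowed("10.20.0.93", configured, restrict=True)

# With NAT, the agent may see a translated address (198.51.100.7 is a
# documentation-range placeholder), so the match fails.
translated = is_collector_allowed("198.51.100.7", configured, restrict=True)

# Answering No to the restriction avoids the mismatch.
unrestricted = is_collector_allowed("198.51.100.7", configured, restrict=False)
```

Under this model, a translated address never matches the configured list, which is why the restriction should be left off in NAT environments.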
Users can enable password-based authentication between the agent and the collector.
Do you want to enable password authentication? (yes/no): yes
Enter the new username: admin
Enter the new password: YourPaSsWoRd
Users can answer no if they do not want to use password-based authentication between the agent and the collector.
Do you want to enable password authentication? (yes/no): no
Installation Begin
Do you want to update the username/password for password authentication? (Yes/No): no
Do you want to add only Collector IP for auto-discovery and skip the agent installation? (yes/no): no
Enter the IP addresses of collectors to auto-discover (max 2, comma-separated, e.g., 10.1.1.10,10.2.2.5): 10.20.0.93
Do you want to restrict access only to the provided collector IP?
Note: Providing Yes will restrict access to agent only with the provided collector IP Address
Enter Yes/No: no
Do you want to enable password authentication? (Yes/No): no
f58d795dfab9: Loading layer [==================================================>] 2.56kB/2.56kB
ed46ea0f4e17: Loading layer [==================================================>] 31.74MB/31.74MB
1c38a701a3d6: Loading layer [==================================================>] 42.7MB/42.7MB
c4456c24c820: Loading layer [==================================================>] 1.421MB/1.421MB
Loaded image: avizdock/agent_installer:latest
Docker image 'avizdock/agent_installer' is loaded.
b17757c75cda3c71ff4d1311c116c6143893726ddce7dead02b0d77cc926fc5c
Docker container 'agent_installer' is running.
/usr/local/lib/python3.8/site-packages/paramiko/pkey.py:82: CryptographyDeprecationWarning: TripleDES has been moved to cryptography.hazmat.decrepit.ciphers.algorithms.TripleDES and will be removed from this module in 48.0.0.
"cipher": algorithms.TripleDES,
/usr/local/lib/python3.8/site-packages/paramiko/transport.py:253: CryptographyDeprecationWarning: TripleDES has been moved to cryptography.hazmat.decrepit.ciphers.algorithms.TripleDES and will be removed from this module in 48.0.0.
"class": algorithms.TripleDES,
Selecting ‘Yes’ will exclusively initiate the day-2 deployment of the Ones-Agent,involving a reconfiguration of the existing agent to establish communication with the specified collector(s).
Choosing ‘No’ will initiate the deployment of the Ones-Agent as an independent third-party container.
[{'ip': '10.20.0.80', 'user': 'aviz', 'passwd': 'Aviz@123', 'layer': 'Server', 'region': 'San Jose', 'azid': '1', 'brickid': '1', 'rackid': '1', 'groupid': '8', 'type': 'Server', 'installation_instance': 1, 'agentip': '10.20.0.80', 'collectorip': '10.20.0.93', 'restrict_collector_ip': 'no', 'password_authentication': 'no'}]
###############Connecting to switch###############
Connection to switch 10.20.0.80 successful.....................
Looking for previous installation........................
Copying files to the switch........................
Verifying files on the remote switch........................
File /home/aviz/docker_packages.tar.gz exists on the remote server.
File /home/aviz/prerequisites.sh exists on the remote server.
File verification completed.
Untarring docker_packages.tar.gz on the remote server...
Successfully untarred docker_packages.tar.gz.
Setting execute permissions on prerequisites.sh...
Successfully set execute permissions on prerequisites.sh.
Executing prerequisites.sh on the remote server.......................................................................................................
(Reading database ... 116661 files and directories currently installed.)
Preparing to unpack .../sshpass_1.06-1_amd64.deb ...
Unpacking sshpass (1.06-1) over (1.09-1) ...
Setting up sshpass (1.06-1) ...
Processing triggers for man-db (2.10.2-1) ...
No previous installation found on the device 10.20.0.80........
Creating work directory on the device 10.20.0.80........
Work Directory ones-agent_1727157788_9940367 created successfully on the device 10.20.0.80 .............
Copying ones_agent_start.sh to directory ones-agent_1727157788_9940367 on the device 10.20.0.80 .............
Copying ones_agent_start.sh to directory ones-agent_1727157788_9940367 successful on the device 10.20.0.80 .............
ones_agent_start.sh file copied to /usr/bin successfully on the device 10.20.0.80........
Copying ones_agent_ip_rule.sh to directory ones-agent_1727157788_9940367 on the device 10.20.0.80 .............
Copying ones_agent_ip_rule.sh to directory ones-agent_1727157788_9940367 successful on the device 10.20.0.80 .............
ones_agent_ip_rule.sh file copied to /usr/bin successfully on the device 10.20.0.80........
Copying ones-agent.service to directory ones-agent_1727157788_9940367 on the device 10.20.0.80 .............
Copying ones-agent.service to directory ones-agent_1727157788_9940367 successful on the device 10.20.0.80 .............
Installation proceeding with NoTls mode
Copying agent.conf to directory ones-agent_1727157788_9940367 successful on the device 10.20.0.80 .............
agent.conf copied to /etc/ones successfully on the device 10.20.0.80........
Copying ones-agent.tar to directory ones-agent_1727157788_9940367 on the device 10.20.0.80 .............
Copying ones-agent.tar to directory ones-agent_1727157788_9940367 on the device 10.20.0.80 .............
Loading Docker image on the device 10.20.0.80 ###########################################
Docker image loaded successfully on the device 10.20.0.80........
Getting name of the loaded image
image = ##avizdock/ones-server-agent:v3.0.0##
Running docker.....................
docker run -it -v /usr/bin/hl-smi:/usr/bin/hl-smi -v /etc/ones:/etc/ones -v /etc/os-release:/etc/os-release-origin --cpu-period=100000 --cpu-quota=50000 --net=host --privileged -dt --name ones-agent avizdock/ones-server-agent:v3.0.0
b'8435e8edc34c90c3d378a89769dc5167e020095818ede8178ac675c96d37ecd3\n'
Service file loaded successfully on the device 10.20.0.80##################
Enabling ones-agent.service10.20.0.80 ##################
Enabled ones-agent as service successfully on the device 10.20.0.80 ##################
Starting ones-agent service on the device 10.20.0.80........
Made ones-agent immune to booting on the device 10.20.0.80########################
Copying ones-agent.tar file
ones-agent.tar file copied successfully on the device 10.20.0.80........
Copying agent.conf file
agent.conf file copied successfully on the device 10.20.0.80........
Copying ones-agent.service file
ones-agent.service file copied successfully on the device 10.20.0.80........
Copying ones_agent_ip_rule.sh file
ones_agent_ip_rule.sh file copied successfully on the device 10.20.0.80........
Copying ones_agent_start.sh file
ones_agent_start.sh file copied successfully on the device 10.20.0.80........
##################################################################
Status of ones-agent.service is - Active: active (running) since Tue 2024-09-24 06:06:03 UTC; 2min 34s ago
Deployment of ones-agent to switch 10.20.0.80 is successful
╒══════════════╤══════════╕
│ IP Address   │ Result   │
╞══════════════╪══════════╡
│ 10.20.0.80   │ Pass     │
╘══════════════╧══════════╛
agent_installer
Docker agent_installer has been stopped
agent_installer
Docker agent_installer has been removed
Untagged: avizdock/agent_installer:latest
Deleted: sha256:b115eb21a63518b47079a0f9b25ed56e8dd807a4aa054dc18efb1d5635b9728d
Deleted: sha256:a2052350dbedd8d19d573f1f81a333af50d33c157dd565c6fb3ea19ff32d7869
Deleted: sha256:403906165705c1c4c263865c7d2e8560424306ef76cc7dfd319565e1036a4b49
Deleted: sha256:71d4c516421d0cd5b08b0c7f7ddff68182ca799815c621e1bf1d7c2a247820f2
Deleted: sha256:ed5221ab4eb63334a3121c173d4b6e0fb882b13eb0de6f1daa781908da91a464
Docker avizdock/agent_installer image has been removed
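When the agent is installed on many devices at once, the summary table at the end of the installer output is the quickest place to spot failures. The sketch below assumes the output has been captured to a string and that every row has the form `│ <IPv4 address> │ <result> │`, as in the log above. The log shows only the value `Pass`, so anything else is treated as a failure; `failed_devices` is a hypothetical helper.

```python
import re

# Matches one data row of the summary table, e.g. "│ 10.20.0.80 │ Pass │".
# The header row does not match because "IP Address" is not an IPv4 address.
ROW = re.compile(r"│\s*(\d{1,3}(?:\.\d{1,3}){3})\s*│\s*(\w+)\s*│")


def failed_devices(installer_output):
    """Return the IPs whose result in the summary table is not 'Pass'."""
    results = dict(ROW.findall(installer_output))
    return [ip for ip, result in results.items() if result != "Pass"]
```

An empty list means every device listed in the table reported `Pass`.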
The Server-Agent will now stream metrics only to the given controller and will auto-register on the ONES-App.
The user must make sure the devices have unique names; otherwise, there will be issues while plotting the full topology view (Topology Page).
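A quick way to catch name clashes before they reach the topology view is to check the list of device names for duplicates. This is a sketch only: `duplicate_names` is a hypothetical helper, the example names are made up, and comparing names case-insensitively is an assumption about how the ONES App treats them.

```python
from collections import Counter


def duplicate_names(names):
    """Return the device names that occur more than once, sorted.

    Names are compared case-insensitively after trimming whitespace
    (an assumption; the application may compare them differently).
    """
    counts = Counter(name.strip().lower() for name in names)
    return sorted(name for name, n in counts.items() if n > 1)


# Hypothetical host names collected from the devices to be onboarded.
clashes = duplicate_names(["gaudi-01", "gaudi-02", "Gaudi-01"])
```

An empty result means all names are distinct under this comparison.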