Clusters
This section explains how to view and manage clusters.
The Clusters table shows the status of each cluster at a glance.
Clusters Table
The Clusters table can be found under Resources in the NVIDIA Run:ai platform.
The Clusters table lists the clusters added to the NVIDIA Run:ai platform, along with their status.

The clusters table consists of the following columns:
Cluster - The name of the cluster
Kubernetes distribution - The flavor of Kubernetes distribution
Kubernetes version - The version of Kubernetes installed
Status - The status of the cluster. For more information, see the Cluster Status table below. Hover over the information icon for a short description and links to troubleshooting.
Last connected - The most recent time the cluster successfully connected to the control plane. If the cluster is currently connected, the value is displayed as Now. If the cluster is disconnected or has experienced issues, the exact timestamp of the last successful connection is displayed.
Network topologies - The network topologies associated with the cluster
Creation time - The timestamp when the cluster was created
URL - The URL that was given to the cluster
NVIDIA Run:ai cluster version - The NVIDIA Run:ai version installed on the cluster
NVIDIA Run:ai cluster UUID - The unique ID of the cluster
Cluster Status
Waiting to connect - The cluster has never been connected.
Disconnected - There is no communication from the cluster to the control plane. This may be due to a network issue. See troubleshooting scenarios.
Missing prerequisites - Some prerequisites are missing from the cluster. As a result, some features may be impacted. See troubleshooting scenarios.
Service issues - At least one of the services is not working properly. You can view the list of nonfunctioning services for more information. See troubleshooting scenarios.
Connected - The NVIDIA Run:ai cluster is connected, and all NVIDIA Run:ai services are running.
Network Topologies Associated with the Cluster
Click one of the values in the Network topologies column to view the list of network topologies and their parameters.
Topology - The name of the topology
Labels - The ordered set of node label keys that define the topology hierarchy
Created by - The user who created the network topology
Creation time - The timestamp of when the network topology was created
Customizing the Table View
Filter - Click ADD FILTER, select the column to filter by, and enter the filter values
Search - Click SEARCH and type the value to search by
Sort - Click a column header to sort the table by that column
Column selection - Click COLUMNS and select the columns to display in the table
Download table - Click MORE and then click Download as CSV. Export to CSV is limited to 20,000 rows.
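Once the table is downloaded as CSV, it can be post-processed outside the UI. The following is a minimal Python sketch that lists every cluster whose status is not Connected. It assumes the export has header columns named Cluster and Status, matching the column names shown in the table; the actual header names in the exported file may differ, and the sample rows are invented for illustration.

```python
import csv
import io


def clusters_needing_attention(csv_text, status_column="Status", name_column="Cluster"):
    """Return (name, status) pairs for every row whose status is not 'Connected'.

    The column names are assumptions based on the table headers in the UI.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    flagged = []
    for row in reader:
        status = (row.get(status_column) or "").strip()
        if status.lower() != "connected":
            flagged.append((row.get(name_column, ""), status))
    return flagged


# Hypothetical sample standing in for the contents of a downloaded export.
sample = """Cluster,Status
prod-a,Connected
prod-b,Disconnected
staging,Service issues
"""

print(clusters_needing_attention(sample))
# prints [('prod-b', 'Disconnected'), ('staging', 'Service issues')]
```

To use it on a real export, read the file contents with open(path, newline="") and pass the text to clusters_needing_attention.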
Adding a New Cluster
To add a new cluster, see the installation guide.
Managing Network Topologies
Network topologies optimize placement and accelerate distributed workloads by keeping pods on nodes that are as close to each other as possible in the network. For more details, see Accelerating workloads with network topology-aware scheduling.
To add topologies that represent the cluster's network:
Select the cluster you want to add a network topology for
In the top action bar, click NETWORK TOPOLOGIES
In the Network Topologies Associated with <Cluster Name> modal, click + NETWORK TOPOLOGY
Enter a unique name for the topology. If the name already exists, you are prompted to enter a different name.
Click + LABEL to add the node label keys that represent the network hierarchy
Order labels from farthest (first) to closest (last)
Ensure the labels match the corresponding keys on the nodes. For example: cloud.provider.com/topology-block, cloud.provider.com/topology-rack, kubernetes.io/hostname
Drag labels to adjust their order if needed
Click SAVE NETWORK TOPOLOGY
To delete a topology, click the trash icon next to the topology entry in the modal.
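The label ordering used when adding a topology (farthest first, closest last) can be illustrated with a short Python sketch. It is not the scheduler's implementation. It only shows how an ordered list of label keys describes a hierarchy, and how to spot a node that lacks one of the keys. The function names, node names, and label values are hypothetical; the label keys are the example keys from the steps above.

```python
def missing_label_keys(node_labels, topology_keys):
    """Return the topology keys that are absent from a node's labels."""
    return [key for key in topology_keys if key not in node_labels]


def shared_levels(labels_a, labels_b, topology_keys):
    """Count how many leading hierarchy levels two nodes share.

    Keys are ordered farthest (first) to closest (last), so a higher count
    means the two nodes sit closer together in the network.
    """
    count = 0
    for key in topology_keys:
        if key not in labels_a or labels_a[key] != labels_b.get(key):
            break
        count += 1
    return count


# Example keys from the steps above, ordered farthest to closest.
BLOCK = "cloud.provider.com/topology-block"
RACK = "cloud.provider.com/topology-rack"
HOST = "kubernetes.io/hostname"
keys = [BLOCK, RACK, HOST]

# Hypothetical node labels for illustration only.
node1 = {BLOCK: "b1", RACK: "r1", HOST: "node-1"}
node2 = {BLOCK: "b1", RACK: "r1", HOST: "node-2"}
node3 = {BLOCK: "b1", RACK: "r2", HOST: "node-3"}
node4 = {BLOCK: "b1", HOST: "node-4"}  # rack label is missing

print(shared_levels(node1, node2, keys))  # prints 2 (same block and rack)
print(shared_levels(node1, node3, keys))  # prints 1 (same block only)
print(missing_label_keys(node4, keys))    # prints ['cloud.provider.com/topology-rack']
```

A check like missing_label_keys can help confirm that every node carries the keys before the topology is saved.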
Removing a Cluster
Select the cluster you want to remove
Click REMOVE
A dialog appears. Read the message carefully before removing the cluster
Click REMOVE to confirm the removal.
Using the API
Go to the Clusters API reference to view the available actions
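As a rough sketch of what a scripted call might look like, the Python snippet below builds and sends a request to list clusters. The endpoint path /api/v1/clusters, the bearer-token authentication, the host name, and the function names are all assumptions made for illustration. Confirm the actual paths, authentication flow, and response schema in the Clusters API reference before relying on it.

```python
import json
import urllib.request


def build_list_clusters_request(base_url, token, path="/api/v1/clusters"):
    """Build a GET request for the cluster list.

    The default path and the bearer-token header are assumptions; verify them
    against the Clusters API reference.
    """
    url = base_url.rstrip("/") + path
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}", "Accept": "application/json"},
        method="GET",
    )


def list_clusters(base_url, token):
    """Send the request and return the decoded JSON body."""
    request = build_list_clusters_request(base_url, token)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


# Hypothetical usage (requires a reachable control plane and a valid token):
# clusters = list_clusters("https://runai.example.com", "YOUR_API_TOKEN")
```

Keeping request construction separate from sending makes it possible to inspect the URL and headers without network access.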
Troubleshooting
Before you start, make sure you have access, with the necessary permissions, to the Kubernetes cluster where NVIDIA Run:ai is deployed
Troubleshooting Scenarios