# Metrics and Telemetry

Metrics are numeric measurements recorded **over time** and emitted from the NVIDIA Run:ai cluster. Telemetry is a numeric measurement recorded in real time as it is emitted from the NVIDIA Run:ai cluster.

## Scopes

NVIDIA Run:ai provides a control-plane API that supports and aggregates analytics at various levels.

| Level      | Description                                                                                                                                                                                           |
| ---------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Cluster    | A cluster is a set of node pools and nodes. Cluster metrics are aggregated at the cluster level. In the NVIDIA Run:ai user interface, these metrics are available in the Overview dashboard.          |
| Node       | Data is aggregated at the node level.                                                                                                                                                                 |
| Node pool  | Data is aggregated at the node pool level.                                                                                                                                                            |
| Workload   | Data is aggregated at the workload level. For some workloads, such as distributed workloads, these metrics aggregate data from all worker pods.                                                       |
| Pod        | The basic unit of execution.                                                                                                                                                                          |
| Project    | The basic organizational unit. Projects are used to implement resource allocation policies and to separate different initiatives.                                                                     |
| Department | Departments are a grouping of projects.                                                                                                                                                               |

## Supported Metrics

| Metric name in API               | Applicable API endpoint                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           | Metric name in UI per grid                                           | Applicable UI grid                                                                                                                                                                                                                                                |
| -------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `ALLOCATED_GPU`                  | <ul><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/clusters#get-api-v1-clusters-clusteruuid-metrics">Clusters</a></li><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/nodepools#get-api-v1-clusters-clusteruuid-nodepools-nodepoolname-metrics">Node pools</a></li></ul>                                                                                                                                                                                                                                                                                                                     | <ul><li>GPU devices (allocated)</li><li>Allocated GPUs</li></ul>     | <ul><li><a href="../before-you-start#ui-views">Overview dashboard</a></li><li><a href="../../aiinitiatives/resources/node-pools#show-hide-details">Node pools</a></li></ul>                                                                                       |
| `AVG_WORKLOAD_WAIT_TIME`         | <ul><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/clusters#get-api-v1-clusters-clusteruuid-metrics">Clusters</a></li><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/nodepools#get-api-v1-clusters-clusteruuid-nodepools-nodepoolname-metrics">Node pools</a></li></ul>                                                                                                                                                                                                                                                                                                                     |                                                                      |                                                                                                                                                                                                                                                                   |
| `CPU_LIMIT_CORES`                | [Workloads](https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/workloads/workloads#get-api-v1-workloads-workloadid-metrics)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           | CPU limit                                                            | [Workloads](https://run-ai-docs.nvidia.com/self-hosted/2.20/workloads-in-nvidia-run-ai/workloads#metrics)                                                                                                                                                         |
| `CPU_MEMORY_LIMIT_BYTES`         | [Workloads](https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/workloads/workloads#get-api-v1-workloads-workloadid-metrics)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           | CPU memory limit                                                     | [Workloads](https://run-ai-docs.nvidia.com/self-hosted/2.20/workloads-in-nvidia-run-ai/workloads#metrics)                                                                                                                                                         |
| `CPU_MEMORY_REQUEST_BYTES`       | [Workloads](https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/workloads/workloads#get-api-v1-workloads-workloadid-metrics)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           | CPU memory request                                                   | [Workloads](https://run-ai-docs.nvidia.com/self-hosted/2.20/workloads-in-nvidia-run-ai/workloads#metrics)                                                                                                                                                         |
| `CPU_MEMORY_USAGE_BYTES`         | <ul><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/workloads/workloads#get-api-v1-workloads-workloadid-metrics">Workloads</a></li><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/workloads/pods#get-api-v1-workloads-workloadid-pods-podid-metrics">Pods</a></li></ul>                                                                                                                                                                                                                                                                                                                                                  | CPU memory usage                                                     | [Workloads](https://run-ai-docs.nvidia.com/self-hosted/2.20/workloads-in-nvidia-run-ai/workloads#metrics)                                                                                                                                                         |
| `CPU_MEMORY_UTILIZATION`         | <ul><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/clusters#get-api-v1-clusters-clusteruuid-metrics">Clusters</a></li><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/nodepools#get-api-v1-clusters-clusteruuid-nodepools-nodepoolname-metrics">Node pools</a></li><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/nodes#get-api-v1-nodes-nodeid-metrics">Nodes</a></li></ul>                                                                                                                                                                                      | CPU memory utilization                                               | <ul><li><a href="../before-you-start#ui-views">Overview dashboard</a></li><li><a href="../../aiinitiatives/resources/node-pools#show-hide-details">Node pools</a></li><li><a href="../../aiinitiatives/resources/nodes#show-hide-details">Nodes</a></li></ul>     |
| `CPU_REQUEST_CORES`              | [Workloads](https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/workloads/workloads#get-api-v1-workloads-workloadid-metrics)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           | CPU request                                                          | [Workloads](https://run-ai-docs.nvidia.com/self-hosted/2.20/workloads-in-nvidia-run-ai/workloads)                                                                                                                                                                 |
| `CPU_USAGE_CORES`                | <ul><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/nodes#get-api-v1-nodes-nodeid-metrics">Nodes</a></li><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/workloads/workloads#get-api-v1-workloads-workloadid-metrics">Workloads</a></li><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/workloads/pods#get-api-v1-workloads-workloadid-pods-podid-metrics">Pods</a></li></ul>                                                                                                                                                                                                                   | CPU usage                                                            | [Workloads](https://run-ai-docs.nvidia.com/self-hosted/2.20/workloads-in-nvidia-run-ai/workloads#metrics)                                                                                                                                                         |
| `CPU_UTILIZATION`                | <ul><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/clusters#get-api-v1-clusters-clusteruuid-metrics">Clusters</a></li><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/nodepools#get-api-v1-clusters-clusteruuid-nodepools-nodepoolname-metrics">Node pools</a></li><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/nodes#get-api-v1-nodes-nodeid-metrics">Nodes</a></li></ul>                                                                                                                                                                                      | <ul><li>CPU compute utilization</li><li>CPU utilization</li></ul>    | <ul><li><a href="../before-you-start#ui-views">Overview dashboard</a> and <a href="../../aiinitiatives/resources/node-pools#show-hide-details">Node pools</a></li><li><a href="../../aiinitiatives/resources/nodes#show-hide-details">Nodes</a></li></ul>         |
| `GPU_ALLOCATION`                 | <ul><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/workloads/workloads#get-api-v1-workloads-workloadid-metrics">Workloads</a></li><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/projects#get-api-v1-org-unit-projects-projectid-metrics">Projects</a></li><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/departments#get-api-v1-org-unit-departments-departmentid-metrics">Departments</a></li></ul>                                                                                                                                                                          | GPU devices (allocated)                                              | [Overview dashboard](https://run-ai-docs.nvidia.com/self-hosted/2.20/platform-management/before-you-start#ui-views)                                                                                                                                               |
| `GPU_MEMORY_REQUEST_BYTES`       | [Workloads](https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/workloads/workloads#get-api-v1-workloads-workloadid-metrics)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           | GPU memory request                                                   | [Workloads](https://run-ai-docs.nvidia.com/self-hosted/2.20/workloads-in-nvidia-run-ai/workloads#metrics)                                                                                                                                                         |
| `GPU_MEMORY_USAGE_BYTES`         | <ul><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/workloads/workloads#get-api-v1-workloads-workloadid-metrics">Workloads</a></li><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/workloads/pods#get-api-v1-workloads-workloadid-pods-podid-metrics">Pods</a></li><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/nodes#get-api-v1-nodes-nodeid-metrics">Nodes</a></li></ul>                                                                                                                                                                                                                   | GPU memory usage                                                     | [Workloads](https://run-ai-docs.nvidia.com/self-hosted/2.20/workloads-in-nvidia-run-ai/workloads#metrics)                                                                                                                                                         |
| `GPU_MEMORY_USAGE_BYTES_PER_GPU` | <ul><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/nodes#get-api-v1-nodes-nodeid-metrics">Nodes</a></li><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/workloads/pods#get-api-v1-workloads-workloadid-pods-podid-metrics">Pods</a></li></ul>                                                                                                                                                                                                                                                                                                                                                              | GPU memory usage per GPU                                             | [Workloads per pod](https://run-ai-docs.nvidia.com/self-hosted/2.20/workloads-in-nvidia-run-ai/workloads#metrics)                                                                                                                                                 |
| `GPU_MEMORY_UTILIZATION`         | <ul><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/clusters#get-api-v1-clusters-clusteruuid-metrics">Clusters</a></li><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/nodepools#get-api-v1-clusters-clusteruuid-nodepools-nodepoolname-metrics">Node pools</a></li></ul>                                                                                                                                                                                                                                                                                                                     | GPU memory utilization                                               | <ul><li><a href="../before-you-start#ui-views">Overview dashboard</a></li><li><a href="../../aiinitiatives/resources/node-pools#show-hide-details">Node pools</a></li></ul>                                                                                       |
| `GPU_MEMORY_UTILIZATION_PER_GPU` | [Nodes](https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/nodes#get-api-v1-nodes-nodeid-metrics)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       | GPU memory utilization per GPU                                       | [Nodes](https://run-ai-docs.nvidia.com/self-hosted/2.20/aiinitiatives/resources/nodes#show-hide-details)                                                                                                                                                          |
| `GPU_QUOTA`                      | <ul><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/clusters#get-api-v1-clusters-clusteruuid-metrics">Clusters</a></li><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/nodepools#get-api-v1-clusters-clusteruuid-nodepools-nodepoolname-metrics">Node pools</a></li><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/projects#get-api-v1-org-unit-projects-projectid-metrics">Projects</a></li><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/departments#get-api-v1-org-unit-departments-departmentid-metrics">Departments</a></li></ul> | Quota                                                                | [Quota management](https://run-ai-docs.nvidia.com/self-hosted/2.20/platform-management/before-you-start#ui-views)                                                                                                                                                 |
| `GPU_UTILIZATION`                | <ul><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/clusters#get-api-v1-clusters-clusteruuid-metrics">Clusters</a></li><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/nodepools#get-api-v1-clusters-clusteruuid-nodepools-nodepoolname-metrics">Node pools</a></li><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/workloads/workloads#get-api-v1-workloads-workloadid-metrics">Workloads</a></li><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/workloads/pods#get-api-v1-workloads-workloadid-pods-podid-metrics">Pods</a></li></ul>                              | GPU compute utilization                                              | <ul><li><a href="../before-you-start#ui-views">Overview dashboard</a></li><li><a href="../../aiinitiatives/resources/node-pools#show-hide-details">Node pools</a></li><li><a href="../../../workloads-in-nvidia-run-ai/workloads#metrics">Workloads</a></li></ul> |
| `GPU_UTILIZATION_PER_GPU`        | <ul><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/nodes#get-api-v1-nodes-nodeid-metrics">Nodes</a></li><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/workloads/pods#get-api-v1-workloads-workloadid-pods-podid-metrics">Pods</a></li></ul>                                                                                                                                                                                                                                                                                                                                                              | GPU utilization per GPU                                              | [Nodes](https://run-ai-docs.nvidia.com/self-hosted/2.20/aiinitiatives/resources/nodes#show-hide-details)                                                                                                                                                          |
| `TOTAL_GPU`                      | <ul><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/clusters#get-api-v1-clusters-clusteruuid-metrics">Clusters</a></li><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/nodepools#get-api-v1-clusters-clusteruuid-nodepools-nodepoolname-metrics">Node pools</a></li></ul>                                                                                                                                                                                                                                                                                                                     | <ul><li>GPU devices total</li><li>Total GPUs</li></ul>               | <ul><li><a href="../before-you-start#ui-views">Overview dashboard</a></li><li><a href="../../aiinitiatives/resources/node-pools#show-hide-details">Node pools</a></li></ul>                                                                                       |
| `TOTAL_GPU_NODES`                | <ul><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/clusters#get-api-v1-clusters-clusteruuid-metrics">Clusters</a></li><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/nodepools#get-api-v1-clusters-clusteruuid-nodepools-nodepoolname-metrics">Node pools</a></li></ul>                                                                                                                                                                                                                                                                                                                     |                                                                      |                                                                                                                                                                                                                                                                   |
| `GPU_UTILIZATION_DISTRIBUTION`   | <ul><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/clusters#get-api-v1-clusters-clusteruuid-metrics">Clusters</a></li><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/nodepools#get-api-v1-clusters-clusteruuid-nodepools-nodepoolname-metrics">Node pools</a></li></ul>                                                                                                                                                                                                                                                                                                                     | GPU utilization distribution                                         | [Node pools](https://run-ai-docs.nvidia.com/self-hosted/2.20/aiinitiatives/resources/node-pools#show-hide-details)                                                                                                                                                |
| `UNALLOCATED_GPU`                | <ul><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/clusters#get-api-v1-clusters-clusteruuid-metrics">Clusters</a></li><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/nodepools#get-api-v1-clusters-clusteruuid-nodepools-nodepoolname-metrics">Node pools</a></li></ul>                                                                                                                                                                                                                                                                                                                     | <ul><li>GPU devices (unallocated)</li><li>Unallocated GPUs</li></ul> | <ul><li><a href="../before-you-start#ui-views">Overview dashboard</a></li><li><a href="../../aiinitiatives/resources/node-pools#show-hide-details">Node pools</a></li></ul>                                                                                       |
| `CPU_QUOTA_MILLICORES`           | <ul><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/projects#get-api-v1-org-unit-projects-projectid-metrics">Projects</a></li><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/departments#get-api-v1-org-unit-departments-departmentid-metrics">Departments</a></li></ul>                                                                                                                                                                                                                                                                                                                     |                                                                      |                                                                                                                                                                                                                                                                   |
| `CPU_MEMORY_QUOTA_MB`            | <ul><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/projects#get-api-v1-org-unit-projects-projectid-metrics">Projects</a></li><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/departments#get-api-v1-org-unit-departments-departmentid-metrics">Departments</a></li></ul>                                                                                                                                                                                                                                                                                                                     |                                                                      |                                                                                                                                                                                                                                                                   |
| `CPU_ALLOCATION_MILLICORES`      | <ul><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/projects#get-api-v1-org-unit-projects-projectid-metrics">Projects</a></li><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/departments#get-api-v1-org-unit-departments-departmentid-metrics">Departments</a></li></ul>                                                                                                                                                                                                                                                                                                                     |                                                                      |                                                                                                                                                                                                                                                                   |
| `CPU_MEMORY_ALLOCATION_MB`       | <ul><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/projects#get-api-v1-org-unit-projects-projectid-metrics">Projects</a></li><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/departments#get-api-v1-org-unit-departments-departmentid-metrics">Departments</a></li></ul>                                                                                                                                                                                                                                                                                                                     |                                                                      |                                                                                                                                                                                                                                                                   |
| `POD_COUNT`                      | [Workloads](https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/workloads/workloads#get-api-v1-workloads-workloadid-metrics)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |                                                                      |                                                                                                                                                                                                                                                                   |
| `RUNNING_POD_COUNT`              | [Workloads](https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/workloads/workloads#get-api-v1-workloads-workloadid-metrics)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |                                                                      |                                                                                                                                                                                                                                                                   |

### Advanced Metrics

NVIDIA provides extended metrics, described in the [DCGM profiling metrics documentation](https://docs.nvidia.com/datacenter/dcgm/latest/user-guide/feature-overview.html#profiling-metrics). To enable these metrics, contact NVIDIA Run:ai customer support.

| Metric name in API                         | Applicable API endpoint                                                                                                  | Metric name in UI        | Applicable UI table                                                                                                                                                                       |
| ------------------------------------------ | ------------------------------------------------------------------------------------------------------------------------ | ------------------------ | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `GPU_FP16_ENGINE_ACTIVITY_PER_GPU`         | [Pods](https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/workloads/pods#get-api-v1-workloads-workloadid-pods-podid-metrics) | GPU FP16 engine activity | <ul><li><a href="../../aiinitiatives/resources/nodes#show-hide-details">Nodes</a></li><li><a href="../../../workloads-in-nvidia-run-ai/workloads#metrics">Workloads per pod</a></li></ul> |
| `GPU_FP32_ENGINE_ACTIVITY_PER_GPU`         | [Pods](https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/workloads/pods#get-api-v1-workloads-workloadid-pods-podid-metrics) | GPU FP32 engine activity | <ul><li><a href="../../aiinitiatives/resources/nodes#show-hide-details">Nodes</a></li><li><a href="../../../workloads-in-nvidia-run-ai/workloads#metrics">Workloads per pod</a></li></ul> |
| `GPU_FP64_ENGINE_ACTIVITY_PER_GPU`         | [Pods](https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/workloads/pods#get-api-v1-workloads-workloadid-pods-podid-metrics) | GPU FP64 engine activity | <ul><li><a href="../../aiinitiatives/resources/nodes#show-hide-details">Nodes</a></li><li><a href="../../../workloads-in-nvidia-run-ai/workloads#metrics">Workloads per pod</a></li></ul> |
| `GPU_GRAPHICS_ENGINE_ACTIVITY_PER_GPU`     | [Pods](https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/workloads/pods#get-api-v1-workloads-workloadid-pods-podid-metrics) | Graphics engine activity | <ul><li><a href="../../aiinitiatives/resources/nodes#show-hide-details">Nodes</a></li><li><a href="../../../workloads-in-nvidia-run-ai/workloads#metrics">Workloads per pod</a></li></ul> |
| `GPU_MEMORY_BANDWIDTH_UTILIZATION_PER_GPU` | [Pods](https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/workloads/pods#get-api-v1-workloads-workloadid-pods-podid-metrics) |                          |                                                                                                                                                                                           |
| `GPU_NVLINK_RECEIVED_BANDWIDTH_PER_GPU`    | [Pods](https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/workloads/pods#get-api-v1-workloads-workloadid-pods-podid-metrics) |                          |                                                                                                                                                                                           |
| `GPU_NVLINK_TRANSMITTED_BANDWIDTH_PER_GPU` | [Pods](https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/workloads/pods#get-api-v1-workloads-workloadid-pods-podid-metrics) |                          |                                                                                                                                                                                           |
| `GPU_PCIE_RECEIVED_BANDWIDTH_PER_GPU`      | [Pods](https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/workloads/pods#get-api-v1-workloads-workloadid-pods-podid-metrics) |                          |                                                                                                                                                                                           |
| `GPU_PCIE_TRANSMITTED_BANDWIDTH_PER_GPU`   | [Pods](https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/workloads/pods#get-api-v1-workloads-workloadid-pods-podid-metrics) |                          |                                                                                                                                                                                           |
| `GPU_SM_ACTIVITY_PER_GPU`                  | [Pods](https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/workloads/pods#get-api-v1-workloads-workloadid-pods-podid-metrics) | GPU SM activity          | <ul><li><a href="../../aiinitiatives/resources/nodes#show-hide-details">Nodes</a></li><li><a href="../../../workloads-in-nvidia-run-ai/workloads#metrics">Workloads per pod</a></li></ul> |
| `GPU_SM_OCCUPANCY_PER_GPU`                 | [Pods](https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/workloads/pods#get-api-v1-workloads-workloadid-pods-podid-metrics) | GPU SM occupancy         | <ul><li><a href="../../aiinitiatives/resources/nodes#show-hide-details">Nodes</a></li><li><a href="../../../workloads-in-nvidia-run-ai/workloads#metrics">Workloads per pod</a></li></ul> |
| `GPU_TENSOR_ACTIVITY_PER_GPU`              | [Pods](https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/workloads/pods#get-api-v1-workloads-workloadid-pods-podid-metrics) | GPU tensor activity      | <ul><li><a href="../../aiinitiatives/resources/nodes#show-hide-details">Nodes</a></li><li><a href="../../../workloads-in-nvidia-run-ai/workloads#metrics">Workloads per pod</a></li></ul> |
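
As a rough sketch of how the metric names above might be requested from the Pods endpoint linked in the table, the following builds a request URL without sending it. The path follows the endpoint anchor (`/api/v1/workloads/{workloadId}/pods/{podId}/metrics`); the base URL, the IDs, and the query parameter names (`metricType`, `start`, `end`, `numberOfSamples`) are assumptions for illustration and should be checked against the API reference.

```python
from urllib.parse import urlencode


def pod_metrics_url(base_url, workload_id, pod_id, metric_types, start, end, samples=20):
    """Build a GET URL for the pod metrics endpoint (no request is sent)."""
    path = f"/api/v1/workloads/{workload_id}/pods/{pod_id}/metrics"
    # Assumed parameter names: one metricType entry per requested metric.
    query = [("metricType", m) for m in metric_types]
    query += [("start", start), ("end", end), ("numberOfSamples", samples)]
    return f"{base_url.rstrip('/')}{path}?{urlencode(query)}"


# Hypothetical host and IDs, used only to show the resulting URL shape.
url = pod_metrics_url(
    "https://runai.example.com",
    "workload-123",
    "pod-456",
    ["GPU_SM_ACTIVITY_PER_GPU", "GPU_TENSOR_ACTIVITY_PER_GPU"],
    "2025-01-01T00:00:00Z",
    "2025-01-01T01:00:00Z",
)
print(url)
```

Because metrics are recorded over time, the sketch includes a time range and a sample count; the actual request would also need an authorization header, which is omitted here.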

## Supported Telemetry

| Metric                              | Applicable API endpoint                                                                                                                                                                                                                                                                                                                                                                                                                              | Metric name in UI              | Applicable UI table                                                                                                                                                                                                      |
| ----------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `WORKLOADS_COUNT`                   | [Workloads](https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/workloads/workloads#get-api-v1-workloads-telemetry)                                                                                                                                                                                                                                                                                                                                       |                                |                                                                                                                                                                                                                          |
| `ALLOCATED_GPUS`                    | [Nodes](https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/nodes#get-api-v1-nodes-telemetry)                                                                                                                                                                                                                                                                                                                                               | Allocated GPUs                 | [Nodes](https://run-ai-docs.nvidia.com/self-hosted/2.20/platform-management/aiinitiatives/resources/nodes)                                                                                                               |
| `GPU_ALLOCATION`                    | <ul><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/workloads/workloads#get-api-v1-workloads-telemetry">Workloads</a></li><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/projects#get-api-v1-org-unit-projects-projectid-metrics">Projects</a></li><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/departments#get-api-v1-org-unit-departments-telemetry">Departments</a></li></ul> |                                |                                                                                                                                                                                                                          |
| `READY_GPU_NODES`                   | [Nodes](https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/nodes#get-api-v1-nodes-telemetry)                                                                                                                                                                                                                                                                                                                                               | Ready / Total GPU nodes        | [Overview dashboard](https://run-ai-docs.nvidia.com/self-hosted/2.20/platform-management/before-you-start#ui-views)                                                                                                      |
| `READY_GPUS`                        | [Nodes](https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/nodes#get-api-v1-nodes-telemetry)                                                                                                                                                                                                                                                                                                                                               | Ready / Total GPU devices      | [Overview dashboard](https://run-ai-docs.nvidia.com/self-hosted/2.20/platform-management/before-you-start#ui-views)                                                                                                      |
| `TOTAL_GPU_NODES`                   | [Nodes](https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/nodes#get-api-v1-nodes-telemetry)                                                                                                                                                                                                                                                                                                                                               | Ready / Total GPU nodes        | [Overview dashboard](https://run-ai-docs.nvidia.com/self-hosted/2.20/platform-management/before-you-start#ui-views)                                                                                                      |
| `TOTAL_GPUS`                        | [Nodes](https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/nodes#get-api-v1-nodes-telemetry)                                                                                                                                                                                                                                                                                                                                               | Ready / Total GPU devices      | [Overview dashboard](https://run-ai-docs.nvidia.com/self-hosted/2.20/platform-management/before-you-start#ui-views)                                                                                                      |
| `IDLE_ALLOCATED_GPUS`               | [Nodes](https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/nodes#get-api-v1-nodes-telemetry)                                                                                                                                                                                                                                                                                                                                               | Idle allocated GPU devices     | [Overview dashboard](https://run-ai-docs.nvidia.com/self-hosted/2.20/platform-management/before-you-start#ui-views)                                                                                                      |
| `FREE_GPUS`                         | [Nodes](https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/nodes#get-api-v1-nodes-telemetry)                                                                                                                                                                                                                                                                                                                                               | Free GPU devices               | [Nodes](https://run-ai-docs.nvidia.com/self-hosted/2.20/platform-management/aiinitiatives/resources/nodes)                                                                                                               |
| `TOTAL_CPU_CORES`                   | [Nodes](https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/nodes#get-api-v1-nodes-telemetry)                                                                                                                                                                                                                                                                                                                                               | CPU (Cores)                    | [Nodes](https://run-ai-docs.nvidia.com/self-hosted/2.20/platform-management/aiinitiatives/resources/nodes)                                                                                                               |
| `USED_CPU_CORES`                    | [Nodes](https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/nodes#get-api-v1-nodes-telemetry)                                                                                                                                                                                                                                                                                                                                               |                                |                                                                                                                                                                                                                          |
| `ALLOCATED_CPU_CORES`               | <ul><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/nodes#get-api-v1-nodes-telemetry">Nodes</a></li><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/projects#get-api-v1-org-unit-projects-projectid-metrics">Projects</a></li><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/departments#get-api-v1-org-unit-departments-telemetry">Departments</a></li></ul>         | Allocated CPU cores            | [Nodes](https://run-ai-docs.nvidia.com/self-hosted/2.20/platform-management/aiinitiatives/resources/nodes)                                                                                                               |
| `TOTAL_GPU_MEMORY_BYTES`            | [Nodes](https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/nodes#get-api-v1-nodes-telemetry)                                                                                                                                                                                                                                                                                                                                               | GPU memory                     | [Nodes](https://run-ai-docs.nvidia.com/self-hosted/2.20/platform-management/aiinitiatives/resources/nodes)                                                                                                               |
| `USED_GPU_MEMORY_BYTES`             | [Nodes](https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/nodes#get-api-v1-nodes-telemetry)                                                                                                                                                                                                                                                                                                                                               | Used GPU memory                | [Nodes](https://run-ai-docs.nvidia.com/self-hosted/2.20/platform-management/aiinitiatives/resources/nodes)                                                                                                               |
| `TOTAL_CPU_MEMORY_BYTES`            | [Nodes](https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/nodes#get-api-v1-nodes-telemetry)                                                                                                                                                                                                                                                                                                                                               | CPU memory                     | [Nodes](https://run-ai-docs.nvidia.com/self-hosted/2.20/platform-management/aiinitiatives/resources/nodes)                                                                                                               |
| `USED_CPU_MEMORY_BYTES`             | [Nodes](https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/nodes#get-api-v1-nodes-telemetry)                                                                                                                                                                                                                                                                                                                                               | Used CPU memory                | [Nodes](https://run-ai-docs.nvidia.com/self-hosted/2.20/platform-management/aiinitiatives/resources/nodes)                                                                                                               |
| `ALLOCATED_CPU_MEMORY_BYTES`        | <ul><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/nodes#get-api-v1-nodes-telemetry">Nodes</a></li><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/projects#get-api-v1-org-unit-projects-projectid-metrics">Projects</a></li><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/departments#get-api-v1-org-unit-departments-telemetry">Departments</a></li></ul>         | Allocated CPU memory           | <ul><li><a href="../aiinitiatives/resources/nodes">Nodes</a></li><li><a href="../aiinitiatives/organization/projects">Projects</a></li><li><a href="../aiinitiatives/organization/departments">Departments</a></li></ul> |
| `GPU_QUOTA`                         | <ul><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/projects#get-api-v1-org-unit-projects-projectid-metrics">Projects</a></li><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/departments#get-api-v1-org-unit-departments-telemetry">Departments</a></li></ul>                                                                                                                                   | GPU quota                      | <ul><li><a href="../aiinitiatives/organization/projects">Projects</a></li><li><a href="../aiinitiatives/organization/departments">Departments</a></li></ul>                                                              |
| `CPU_QUOTA`                         | <ul><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/projects#get-api-v1-org-unit-projects-projectid-metrics">Projects</a></li><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/departments#get-api-v1-org-unit-departments-telemetry">Departments</a></li></ul>                                                                                                                                   |                                |                                                                                                                                                                                                                          |
| `MEMORY_QUOTA`                      | <ul><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/projects#get-api-v1-org-unit-projects-projectid-metrics">Projects</a></li><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/departments#get-api-v1-org-unit-departments-telemetry">Departments</a></li></ul>                                                                                                                                   |                                |                                                                                                                                                                                                                          |
| `GPU_ALLOCATION_NON_PREEMPTIBLE`    | <ul><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/projects#get-api-v1-org-unit-projects-projectid-metrics">Projects</a></li><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/departments#get-api-v1-org-unit-departments-telemetry">Departments</a></li></ul>                                                                                                                                   |                                |                                                                                                                                                                                                                          |
| `CPU_ALLOCATION_NON_PREEMPTIBLE`    | <ul><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/projects#get-api-v1-org-unit-projects-projectid-metrics">Projects</a></li><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/departments#get-api-v1-org-unit-departments-telemetry">Departments</a></li></ul>                                                                                                                                   |                                |                                                                                                                                                                                                                          |
| `MEMORY_ALLOCATION_NON_PREEMPTIBLE` | <ul><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/projects#get-api-v1-org-unit-projects-projectid-metrics">Projects</a></li><li><a href="https://app.gitbook.com/s/b5QLzc5pV7wpXz3CDYyp/organizations/departments#get-api-v1-org-unit-departments-telemetry">Departments</a></li></ul>                                                                                                                                   |                                |                                                                                                                                                                                                                          |
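
The table pairs some telemetry values into a single UI field, for example `READY_GPU_NODES` and `TOTAL_GPU_NODES` both appear as "Ready / Total GPU nodes". The sketch below illustrates that pairing. The path comes from the Nodes endpoint anchor (`/api/v1/nodes/telemetry`); the base URL, the `telemetryType` parameter name, the one-request-per-type pattern, and the sample values are assumptions, not confirmed API behavior.

```python
from urllib.parse import urlencode


def nodes_telemetry_url(base_url, telemetry_type):
    """Build a GET URL for the nodes telemetry endpoint (no request is sent)."""
    # Assumed parameter name: telemetryType selects which value is returned.
    query = urlencode({"telemetryType": telemetry_type})
    return f"{base_url.rstrip('/')}/api/v1/nodes/telemetry?{query}"


def ready_total_label(ready, total):
    """Combine two telemetry values in the style of a 'Ready / Total' UI field."""
    return f"{ready} / {total}"


# Hypothetical host; one URL per telemetry type.
urls = [
    nodes_telemetry_url("https://runai.example.com", t)
    for t in ("READY_GPU_NODES", "TOTAL_GPU_NODES")
]
label = ready_total_label(7, 8)  # hypothetical values, not real cluster data
print(urls, label)
```

Unlike the metrics sketch, no time range is passed here, since telemetry is a real-time measurement.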
