# Metrics and Telemetry

Metrics are numeric measurements emitted from the NVIDIA Run:ai cluster and recorded **over time**. Telemetry is a numeric measurement recorded in real time as it is emitted from the NVIDIA Run:ai cluster.

## Scopes

NVIDIA Run:ai provides a control-plane API that supports and aggregates analytics at the following levels.

| Level      | Description                                                                                                                                                                                           |
| ---------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Cluster    | A cluster is a set of node pools and nodes. Cluster metrics are aggregated at the cluster level. In the NVIDIA Run:ai user interface, these metrics are available in the Overview dashboard.          |
| Node       | Data is aggregated at the node level.                                                                                                                                                                 |
| Node pool  | Data is aggregated at the node pool level.                                                                                                                                                            |
| Workload   | Data is aggregated at the workload level. For some workloads, such as distributed workloads, these metrics aggregate data from all worker pods.                                                       |
| Pod        | The basic unit of execution.                                                                                                                                                                          |
| Project    | The basic organizational unit. Projects are the tool to implement resource allocation policies as well as the segregation between different initiatives.                                              |
| Department | Departments are a grouping of projects.                                                                                                                                                               |

## Supported Metrics

| Metric name in API               | Applicable API endpoint                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               | Metric name in UI per grid                                           | Applicable UI grid                                                                                                                                                                                                                               |
| -------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `ALLOCATED_GPU`                  | <ul><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/clusters#get-api-v1-clusters-clusteruuid-metrics">Clusters</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/nodepools#get-api-v1-clusters-clusteruuid-nodepools-nodepoolname-metrics">Node pools</a></li></ul>                                                                                                                                                                                                                                                                                                       | <ul><li>GPU devices (allocated)</li><li>Allocated GPUs</li></ul>     | <ul><li><a href="/pages/AKfVOD6e5KaNNu3FhjWv#ui-views">Overview dashboard</a></li><li><a href="/pages/3mRZ5kWeocuSsGQCyj6L#show-hide-details">Node pools</a></li></ul>                                                                           |
| `AVG_WORKLOAD_WAIT_TIME`         | <ul><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/clusters#get-api-v1-clusters-clusteruuid-metrics">Clusters</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/nodepools#get-api-v1-clusters-clusteruuid-nodepools-nodepoolname-metrics">Node pools</a></li></ul>                                                                                                                                                                                                                                                                                                       |                                                                      |                                                                                                                                                                                                                                                  |
| `CPU_LIMIT_CORES`                | [Workloads](https://run-ai-docs.nvidia.com/api/2.22/workloads/workloads#get-api-v1-workloads-workloadid-metrics)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      | CPU limit                                                            | [Workloads](/self-hosted/2.22/workloads-in-nvidia-run-ai/workloads.md#metrics)                                                                                                                                                                   |
| `CPU_MEMORY_LIMIT_BYTES`         | [Workloads](https://run-ai-docs.nvidia.com/api/2.22/workloads/workloads#get-api-v1-workloads-workloadid-metrics)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      | CPU memory limit                                                     | [Workloads](/self-hosted/2.22/workloads-in-nvidia-run-ai/workloads.md#metrics)                                                                                                                                                                   |
| `CPU_MEMORY_REQUEST_BYTES`       | [Workloads](https://run-ai-docs.nvidia.com/api/2.22/workloads/workloads#get-api-v1-workloads-workloadid-metrics)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      | CPU memory request                                                   | [Workloads](/self-hosted/2.22/workloads-in-nvidia-run-ai/workloads.md#metrics)                                                                                                                                                                   |
| `CPU_MEMORY_USAGE_BYTES`         | <ul><li><a href="https://run-ai-docs.nvidia.com/api/2.22/workloads/workloads#get-api-v1-workloads-workloadid-metrics">Workloads</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/workloads/pods#get-api-v1-workloads-workloadid-pods-podid-metrics">Pods</a></li></ul>                                                                                                                                                                                                                                                                                                                                    | CPU memory usage                                                     | [Workloads](/self-hosted/2.22/workloads-in-nvidia-run-ai/workloads.md#metrics)                                                                                                                                                                   |
| `CPU_MEMORY_UTILIZATION`         | <ul><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/clusters#get-api-v1-clusters-clusteruuid-metrics">Clusters</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/nodepools#get-api-v1-clusters-clusteruuid-nodepools-nodepoolname-metrics">Node pools</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/nodes#get-api-v1-nodes-nodeid-metrics">Nodes</a></li></ul>                                                                                                                                                                               | CPU memory utilization                                               | <ul><li><a href="/pages/AKfVOD6e5KaNNu3FhjWv#ui-views">Overview dashboard</a></li><li><a href="/pages/3mRZ5kWeocuSsGQCyj6L#show-hide-details">Node pools</a></li><li><a href="/pages/TJ2fVx3mVH4dgeKdNKnr#show-hide-details">Nodes</a></li></ul> |
| `CPU_REQUEST_CORES`              | [Workloads](https://run-ai-docs.nvidia.com/api/2.22/workloads/workloads#get-api-v1-workloads-workloadid-metrics)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      | CPU request                                                          | [Workloads](/self-hosted/2.22/workloads-in-nvidia-run-ai/workloads.md)                                                                                                                                                                           |
| `CPU_USAGE_CORES`                | <ul><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/nodes#get-api-v1-nodes-nodeid-metrics">Nodes</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/workloads/workloads#get-api-v1-workloads-workloadid-metrics">Workloads</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/workloads/pods#get-api-v1-workloads-workloadid-pods-podid-metrics">Pods</a></li></ul>                                                                                                                                                                                                            | CPU usage                                                            | [Workloads](/self-hosted/2.22/workloads-in-nvidia-run-ai/workloads.md#metrics)                                                                                                                                                                   |
| `CPU_UTILIZATION`                | <ul><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/clusters#get-api-v1-clusters-clusteruuid-metrics">Clusters</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/nodepools#get-api-v1-clusters-clusteruuid-nodepools-nodepoolname-metrics">Node pools</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/nodes#get-api-v1-nodes-nodeid-metrics">Nodes</a></li></ul>                                                                                                                                                                               | <ul><li>CPU compute utilization</li><li>CPU utilization</li></ul>    | <ul><li><a href="/pages/AKfVOD6e5KaNNu3FhjWv#ui-views">Overview dashboard</a> and <a href="/pages/3mRZ5kWeocuSsGQCyj6L#show-hide-details">Node pools</a></li><li><a href="/pages/TJ2fVx3mVH4dgeKdNKnr#show-hide-details">Nodes</a></li></ul>     |
| `GPU_ALLOCATION`                 | <ul><li><a href="https://run-ai-docs.nvidia.com/api/2.22/workloads/workloads#get-api-v1-workloads-workloadid-metrics">Workloads</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/projects#get-api-v1-org-unit-projects-projectid-metrics">Projects</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/departments#get-api-v1-org-unit-departments-departmentid-metrics">Departments</a></li></ul>                                                                                                                                                                   | GPU devices (allocated)                                              | [Overview dashboard](/self-hosted/2.22/platform-management/monitor-performance/before-you-start.md#ui-views)                                                                                                                                     |
| `GPU_MEMORY_REQUEST_BYTES`       | [Workloads](https://run-ai-docs.nvidia.com/api/2.22/workloads/workloads#get-api-v1-workloads-workloadid-metrics)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      | GPU memory request                                                   | [Workloads](/self-hosted/2.22/workloads-in-nvidia-run-ai/workloads.md#metrics)                                                                                                                                                                   |
| `GPU_MEMORY_USAGE_BYTES`         | <ul><li><a href="https://run-ai-docs.nvidia.com/api/2.22/workloads/workloads#get-api-v1-workloads-workloadid-metrics">Workloads</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/workloads/pods#get-api-v1-workloads-workloadid-pods-podid-metrics">Pods</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/nodes#get-api-v1-nodes-nodeid-metrics">Nodes</a></li></ul>                                                                                                                                                                                                            | GPU memory usage                                                     | [Workloads](/self-hosted/2.22/workloads-in-nvidia-run-ai/workloads.md#metrics)                                                                                                                                                                   |
| `GPU_MEMORY_USAGE_BYTES_PER_GPU` | <ul><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/nodes#get-api-v1-nodes-nodeid-metrics">Nodes</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/workloads/pods#get-api-v1-workloads-workloadid-pods-podid-metrics">Pods</a></li></ul>                                                                                                                                                                                                                                                                                                                                                | GPU memory usage per GPU                                             | [Workloads per pod](/self-hosted/2.22/workloads-in-nvidia-run-ai/workloads.md#metrics)                                                                                                                                                           |
| `GPU_MEMORY_UTILIZATION`         | <ul><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/clusters#get-api-v1-clusters-clusteruuid-metrics">Clusters</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/nodepools#get-api-v1-clusters-clusteruuid-nodepools-nodepoolname-metrics">Node pools</a></li></ul>                                                                                                                                                                                                                                                                                                       | GPU memory utilization                                               | <ul><li><a href="/pages/AKfVOD6e5KaNNu3FhjWv#ui-views">Overview dashboard</a></li><li><a href="/pages/3mRZ5kWeocuSsGQCyj6L#show-hide-details">Node pools</a></li></ul>                                                                           |
| `GPU_MEMORY_UTILIZATION_PER_GPU` | [Nodes](https://run-ai-docs.nvidia.com/api/2.22/organizations/nodes#get-api-v1-nodes-nodeid-metrics)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  | GPU memory utilization per GPU                                       | [Nodes](/self-hosted/2.22/platform-management/aiinitiatives/resources/nodes.md#show-hide-details)                                                                                                                                                |
| `GPU_QUOTA`                      | <ul><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/clusters#get-api-v1-clusters-clusteruuid-metrics">Clusters</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/nodepools#get-api-v1-clusters-clusteruuid-nodepools-nodepoolname-metrics">Node pools</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/projects#get-api-v1-org-unit-projects-projectid-metrics">Projects</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/departments#get-api-v1-org-unit-departments-departmentid-metrics">Departments</a></li></ul> | Quota                                                                | [Quota management](/self-hosted/2.22/platform-management/monitor-performance/before-you-start.md#ui-views)                                                                                                                                       |
| `GPU_UTILIZATION`                | <ul><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/clusters#get-api-v1-clusters-clusteruuid-metrics">Clusters</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/nodepools#get-api-v1-clusters-clusteruuid-nodepools-nodepoolname-metrics">Node pools</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/workloads/workloads#get-api-v1-workloads-workloadid-metrics">Workloads</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/workloads/pods#get-api-v1-workloads-workloadid-pods-podid-metrics">Pods</a></li></ul>                              | GPU compute utilization                                              | <ul><li><a href="/pages/AKfVOD6e5KaNNu3FhjWv#ui-views">Overview dashboard</a></li><li><a href="/pages/3mRZ5kWeocuSsGQCyj6L#show-hide-details">Node pools</a></li><li><a href="/pages/UN0PmRIvSdWvwAIdpotf#metrics">Workloads</a></li></ul>       |
| `GPU_UTILIZATION_PER_GPU`        | <ul><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/nodes#get-api-v1-nodes-nodeid-metrics">Nodes</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/workloads/pods#get-api-v1-workloads-workloadid-pods-podid-metrics">Pods</a></li></ul>                                                                                                                                                                                                                                                                                                                                                | GPU utilization per GPU                                              | [Nodes](/self-hosted/2.22/platform-management/aiinitiatives/resources/nodes.md#show-hide-details)                                                                                                                                                |
| `TOTAL_GPU`                      | <ul><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/clusters#get-api-v1-clusters-clusteruuid-metrics">Clusters</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/nodepools#get-api-v1-clusters-clusteruuid-nodepools-nodepoolname-metrics">Node pools</a></li></ul>                                                                                                                                                                                                                                                                                                       | <ul><li>GPU devices total</li><li>Total GPUs</li></ul>               | <ul><li><a href="/pages/AKfVOD6e5KaNNu3FhjWv#ui-views">Overview dashboard</a></li><li><a href="/pages/3mRZ5kWeocuSsGQCyj6L#show-hide-details">Node pools</a></li></ul>                                                                           |
| `TOTAL_GPU_NODES`                | <ul><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/clusters#get-api-v1-clusters-clusteruuid-metrics">Clusters</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/nodepools#get-api-v1-clusters-clusteruuid-nodepools-nodepoolname-metrics">Node pools</a></li></ul>                                                                                                                                                                                                                                                                                                       |                                                                      |                                                                                                                                                                                                                                                  |
| `GPU_UTILIZATION_DISTRIBUTION`   | <ul><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/clusters#get-api-v1-clusters-clusteruuid-metrics">Clusters</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/nodepools#get-api-v1-clusters-clusteruuid-nodepools-nodepoolname-metrics">Node pools</a></li></ul>                                                                                                                                                                                                                                                                                                       | GPU utilization distribution                                         | [Node pools](/self-hosted/2.22/platform-management/aiinitiatives/resources/node-pools.md#show-hide-details)                                                                                                                                      |
| `UNALLOCATED_GPU`                | <ul><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/clusters#get-api-v1-clusters-clusteruuid-metrics">Clusters</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/nodepools#get-api-v1-clusters-clusteruuid-nodepools-nodepoolname-metrics">Node pools</a></li></ul>                                                                                                                                                                                                                                                                                                       | <ul><li>GPU devices (unallocated)</li><li>Unallocated GPUs</li></ul> | <ul><li><a href="/pages/AKfVOD6e5KaNNu3FhjWv#ui-views">Overview dashboard</a></li><li><a href="/pages/3mRZ5kWeocuSsGQCyj6L#show-hide-details">Node pools</a></li></ul>                                                                           |
| `CPU_QUOTA_MILLICORES`           | <ul><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/projects#get-api-v1-org-unit-projects-projectid-metrics">Projects</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/departments#get-api-v1-org-unit-departments-departmentid-metrics">Departments</a></li></ul>                                                                                                                                                                                                                                                                                                       |                                                                      |                                                                                                                                                                                                                                                  |
| `CPU_MEMORY_QUOTA_MB`            | <ul><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/projects#get-api-v1-org-unit-projects-projectid-metrics">Projects</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/departments#get-api-v1-org-unit-departments-departmentid-metrics">Departments</a></li></ul>                                                                                                                                                                                                                                                                                                       |                                                                      |                                                                                                                                                                                                                                                  |
| `CPU_ALLOCATION_MILLICORES`      | <ul><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/projects#get-api-v1-org-unit-projects-projectid-metrics">Projects</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/departments#get-api-v1-org-unit-departments-departmentid-metrics">Departments</a></li></ul>                                                                                                                                                                                                                                                                                                       |                                                                      |                                                                                                                                                                                                                                                  |
| `CPU_MEMORY_ALLOCATION_MB`       | <ul><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/projects#get-api-v1-org-unit-projects-projectid-metrics">Projects</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/departments#get-api-v1-org-unit-departments-departmentid-metrics">Departments</a></li></ul>                                                                                                                                                                                                                                                                                                       |                                                                      |                                                                                                                                                                                                                                                  |
| `POD_COUNT`                      | [Workloads](https://run-ai-docs.nvidia.com/api/2.22/workloads/workloads#get-api-v1-workloads-workloadid-metrics)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |                                                                      |                                                                                                                                                                                                                                                  |
| `RUNNING_POD_COUNT`              | [Workloads](https://run-ai-docs.nvidia.com/api/2.22/workloads/workloads#get-api-v1-workloads-workloadid-metrics)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |                                                                      |                                                                                                                                                                                                                                                  |
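
The table above links each metric to one or more scope-specific endpoints. The following is a minimal Python sketch of how a client might build the request URL for a given scope. The path templates are inferred from the anchors of the linked API reference pages, and the query parameter names (`metricType`, `start`, `end`), the helper name `build_metrics_url`, and the host `runai.example.com` are assumptions for illustration. Confirm all of them against the API reference before use.

```python
from urllib.parse import quote, urlencode

# Path templates inferred from the endpoint anchors linked in the table above
# (an assumption; verify against the API reference).
METRICS_PATHS = {
    "cluster": "/api/v1/clusters/{clusterUuid}/metrics",
    "nodepool": "/api/v1/clusters/{clusterUuid}/nodepools/{nodepoolName}/metrics",
    "node": "/api/v1/nodes/{nodeId}/metrics",
    "workload": "/api/v1/workloads/{workloadId}/metrics",
    "pod": "/api/v1/workloads/{workloadId}/pods/{podId}/metrics",
    "project": "/api/v1/org-unit/projects/{projectId}/metrics",
    "department": "/api/v1/org-unit/departments/{departmentId}/metrics",
}


def build_metrics_url(base_url, scope, ids, metric_types, start, end):
    """Return a metrics URL for one scope (hypothetical helper).

    `ids` maps the placeholders of the chosen template to their values.
    The query parameter names are assumptions, not confirmed by this page.
    """
    # Percent-encode each identifier so it is safe inside a URL path segment.
    safe_ids = {name: quote(str(value), safe="") for name, value in ids.items()}
    path = METRICS_PATHS[scope].format(**safe_ids)
    # Repeat the metricType parameter once per requested metric.
    params = [("metricType", metric) for metric in metric_types]
    params += [("start", start), ("end", end)]
    return f"{base_url.rstrip('/')}{path}?{urlencode(params)}"


url = build_metrics_url(
    "https://runai.example.com",  # hypothetical control-plane host
    "nodepool",
    {"clusterUuid": "abc", "nodepoolName": "default"},
    ["TOTAL_GPU", "ALLOCATED_GPU"],
    "2024-01-01T00:00:00Z",
    "2024-01-01T01:00:00Z",
)
print(url)
```

The sketch only constructs the URL. Authentication and the actual HTTP call are left out because this page does not describe them.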

### Advanced Metrics

NVIDIA provides extended metrics, which are described in the [DCGM profiling metrics documentation](https://docs.nvidia.com/datacenter/dcgm/latest/user-guide/feature-overview.html#profiling-metrics).

{% hint style="info" %}
**Note**

Advanced metrics are disabled by default. If they are unavailable, your administrator must enable them under **General Settings** → Analytics → Advanced metrics. Before enabling them, the administrator must configure GPU profiling through the DCGM Exporter and the NVIDIA Run:ai Prometheus integration. For configuration steps, see [Advanced metrics](/self-hosted/2.22/platform-management/monitor-performance/advanced-metrics.md).
{% endhint %}

| Metric name in API                          | Applicable API endpoint                                                                                                                                                                                                                                                | Metric name in UI             | Applicable UI table                                                                                                                                               |
| ------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `GPU_FP16_ENGINE_ACTIVITY_PER_GPU`          | <ul><li><a href="https://run-ai-docs.nvidia.com/api/2.22/workloads/pods#get-api-v1-workloads-workloadid-pods-podid-metrics">Pods</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/nodes#get-api-v1-nodes-nodeid-metrics">Nodes</a></li></ul> | GPU FP16 engine activity      | <ul><li><a href="/pages/UN0PmRIvSdWvwAIdpotf#show-hide-details">Workloads</a></li><li><a href="/pages/TJ2fVx3mVH4dgeKdNKnr#show-hide-details">Nodes</a></li></ul> |
| `GPU_FP32_ENGINE_ACTIVITY_PER_GPU`          | <ul><li><a href="https://run-ai-docs.nvidia.com/api/2.22/workloads/pods#get-api-v1-workloads-workloadid-pods-podid-metrics">Pods</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/nodes#get-api-v1-nodes-nodeid-metrics">Nodes</a></li></ul> | GPU FP32 engine activity      | <ul><li><a href="/pages/UN0PmRIvSdWvwAIdpotf#show-hide-details">Workloads</a></li><li><a href="/pages/TJ2fVx3mVH4dgeKdNKnr#show-hide-details">Nodes</a></li></ul> |
| `GPU_FP64_ENGINE_ACTIVITY_PER_GPU`          | <ul><li><a href="https://run-ai-docs.nvidia.com/api/2.22/workloads/pods#get-api-v1-workloads-workloadid-pods-podid-metrics">Pods</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/nodes#get-api-v1-nodes-nodeid-metrics">Nodes</a></li></ul> | GPU FP64 engine activity      | <ul><li><a href="/pages/UN0PmRIvSdWvwAIdpotf#show-hide-details">Workloads</a></li><li><a href="/pages/TJ2fVx3mVH4dgeKdNKnr#show-hide-details">Nodes</a></li></ul> |
| `GPU_GRAPHICS_ENGINE_ACTIVITY_PER_GPU`      | <ul><li><a href="https://run-ai-docs.nvidia.com/api/2.22/workloads/pods#get-api-v1-workloads-workloadid-pods-podid-metrics">Pods</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/nodes#get-api-v1-nodes-nodeid-metrics">Nodes</a></li></ul> | Graphics engine activity      | <ul><li><a href="/pages/UN0PmRIvSdWvwAIdpotf#show-hide-details">Workloads</a></li><li><a href="/pages/TJ2fVx3mVH4dgeKdNKnr#show-hide-details">Nodes</a></li></ul> |
| `GPU_MEMORY_BANDWIDTH_UTILIZATION_PER_GPU`  | <ul><li><a href="https://run-ai-docs.nvidia.com/api/2.22/workloads/pods#get-api-v1-workloads-workloadid-pods-podid-metrics">Pods</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/nodes#get-api-v1-nodes-nodeid-metrics">Nodes</a></li></ul> | Memory bandwidth utilization  | <ul><li><a href="/pages/UN0PmRIvSdWvwAIdpotf#show-hide-details">Workloads</a></li><li><a href="/pages/TJ2fVx3mVH4dgeKdNKnr#show-hide-details">Nodes</a></li></ul> |
| `GPU_NVLINK_RECEIVED_BANDWIDTH_PER_GPU`     | <ul><li><a href="https://run-ai-docs.nvidia.com/api/2.22/workloads/pods#get-api-v1-workloads-workloadid-pods-podid-metrics">Pods</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/nodes#get-api-v1-nodes-nodeid-metrics">Nodes</a></li></ul> | NVLink received bandwidth     | <ul><li><a href="/pages/UN0PmRIvSdWvwAIdpotf#show-hide-details">Workloads</a></li><li><a href="/pages/TJ2fVx3mVH4dgeKdNKnr#show-hide-details">Nodes</a></li></ul> |
| `GPU_NVLINK_TRANSMITTED_BANDWIDTH_PER_GPU`  | <ul><li><a href="https://run-ai-docs.nvidia.com/api/2.22/workloads/pods#get-api-v1-workloads-workloadid-pods-podid-metrics">Pods</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/nodes#get-api-v1-nodes-nodeid-metrics">Nodes</a></li></ul> | NVLink transmitted bandwidth  | <ul><li><a href="/pages/UN0PmRIvSdWvwAIdpotf#show-hide-details">Workloads</a></li><li><a href="/pages/TJ2fVx3mVH4dgeKdNKnr#show-hide-details">Nodes</a></li></ul> |
| `GPU_PCIE_RECEIVED_BANDWIDTH_PER_GPU`       | <ul><li><a href="https://run-ai-docs.nvidia.com/api/2.22/workloads/pods#get-api-v1-workloads-workloadid-pods-podid-metrics">Pods</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/nodes#get-api-v1-nodes-nodeid-metrics">Nodes</a></li></ul> | PCIe received bandwidth       | <ul><li><a href="/pages/UN0PmRIvSdWvwAIdpotf#show-hide-details">Workloads</a></li><li><a href="/pages/TJ2fVx3mVH4dgeKdNKnr#show-hide-details">Nodes</a></li></ul> |
| `GPU_PCIE_TRANSMITTED_BANDWIDTH_PER_GPU`    | <ul><li><a href="https://run-ai-docs.nvidia.com/api/2.22/workloads/pods#get-api-v1-workloads-workloadid-pods-podid-metrics">Pods</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/nodes#get-api-v1-nodes-nodeid-metrics">Nodes</a></li></ul> | PCIe transmitted bandwidth    | <ul><li><a href="/pages/UN0PmRIvSdWvwAIdpotf#show-hide-details">Workloads</a></li><li><a href="/pages/TJ2fVx3mVH4dgeKdNKnr#show-hide-details">Nodes</a></li></ul> |
| `GPU_SM_ACTIVITY_PER_GPU`                   | <ul><li><a href="https://run-ai-docs.nvidia.com/api/2.22/workloads/pods#get-api-v1-workloads-workloadid-pods-podid-metrics">Pods</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/nodes#get-api-v1-nodes-nodeid-metrics">Nodes</a></li></ul> | GPU SM activity               | <ul><li><a href="/pages/UN0PmRIvSdWvwAIdpotf#show-hide-details">Workloads</a></li><li><a href="/pages/TJ2fVx3mVH4dgeKdNKnr#show-hide-details">Nodes</a></li></ul> |
| `GPU_SM_OCCUPANCY_PER_GPU`                  | <ul><li><a href="https://run-ai-docs.nvidia.com/api/2.22/workloads/pods#get-api-v1-workloads-workloadid-pods-podid-metrics">Pods</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/nodes#get-api-v1-nodes-nodeid-metrics">Nodes</a></li></ul> | GPU SM occupancy              | <ul><li><a href="/pages/UN0PmRIvSdWvwAIdpotf#show-hide-details">Workloads</a></li><li><a href="/pages/TJ2fVx3mVH4dgeKdNKnr#show-hide-details">Nodes</a></li></ul> |
| `GPU_TENSOR_ACTIVITY_PER_GPU`               | <ul><li><a href="https://run-ai-docs.nvidia.com/api/2.22/workloads/pods#get-api-v1-workloads-workloadid-pods-podid-metrics">Pods</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/nodes#get-api-v1-nodes-nodeid-metrics">Nodes</a></li></ul> | GPU tensor activity           | <ul><li><a href="/pages/UN0PmRIvSdWvwAIdpotf#show-hide-details">Workloads</a></li><li><a href="/pages/TJ2fVx3mVH4dgeKdNKnr#show-hide-details">Nodes</a></li></ul> |
| `GPU_OOMKILL_SWAP_OUT_OF_RAM_COUNT_PER_GPU` | [Nodes](https://run-ai-docs.nvidia.com/api/2.22/organizations/nodes#get-api-v1-nodes-nodeid-metrics)                                                                                                                                                                   | OOMKill swap out of RAM count | [Nodes](/self-hosted/2.22/platform-management/aiinitiatives/resources/nodes.md#show-hide-details)                                                                 |
| `GPU_OOMKILL_BURST_COUNT_PER_GPU`           | [Nodes](https://run-ai-docs.nvidia.com/api/2.22/organizations/nodes#get-api-v1-nodes-nodeid-metrics)                                                                                                                                                                   | OOMKill burst count           | [Nodes](/self-hosted/2.22/platform-management/aiinitiatives/resources/nodes.md#show-hide-details)                                                                 |
| `GPU_OOMKILL_IDLE_COUNT_PER_GPU`            | [Nodes](https://run-ai-docs.nvidia.com/api/2.22/organizations/nodes#get-api-v1-nodes-nodeid-metrics)                                                                                                                                                                   | OOMKill idle count            | [Nodes](/self-hosted/2.22/platform-management/aiinitiatives/resources/nodes.md#show-hide-details)                                                                 |
| `GPU_SWAP_MEMORY_BYTES_PER_GPU`             | [Pods](https://run-ai-docs.nvidia.com/api/2.22/workloads/pods#get-api-v1-workloads-workloadid-pods-podid-metrics)                                                                                                                                                      | GPU swap memory               | [Workloads](/self-hosted/2.22/workloads-in-nvidia-run-ai/workloads.md#show-hide-details)                                                                          |
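
As a minimal sketch of how the advanced metric names above could be requested, the following Python helper builds a request URL for the pod metrics endpoint. The path is inferred from the endpoint anchor linked in the table (`get-api-v1-workloads-workloadid-pods-podid-metrics`). The query parameter names (`metricType`, `start`, `end`, `numberOfSamples`), the base URL, and the IDs are assumptions made for illustration, so confirm them against the API reference before use.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

# A subset of the advanced metric names listed in the table above.
ADVANCED_POD_METRICS = [
    "GPU_SM_ACTIVITY_PER_GPU",
    "GPU_SM_OCCUPANCY_PER_GPU",
    "GPU_TENSOR_ACTIVITY_PER_GPU",
]


def build_pod_metrics_url(base_url, workload_id, pod_id, metric_types,
                          start, end, samples=20):
    """Build a pod metrics URL. Query parameter names are assumptions."""
    path = f"/api/v1/workloads/{workload_id}/pods/{pod_id}/metrics"
    # One metricType entry per requested metric (assumed repeatable).
    query = [("metricType", name) for name in metric_types]
    # Timestamps are formatted as UTC; pass timezone-aware UTC datetimes.
    query += [
        ("start", start.strftime("%Y-%m-%dT%H:%M:%SZ")),
        ("end", end.strftime("%Y-%m-%dT%H:%M:%SZ")),
        ("numberOfSamples", str(samples)),
    ]
    return f"{base_url.rstrip('/')}{path}?{urlencode(query)}"


end = datetime(2025, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
start = end - timedelta(hours=1)
url = build_pod_metrics_url(
    "https://runai.example.com",  # hypothetical control-plane URL
    "workload-id",                # hypothetical workload ID
    "pod-id",                     # hypothetical pod ID
    ADVANCED_POD_METRICS,
    start,
    end,
)
print(url)
```

The helper only assembles the URL. Sending the request also requires an authenticated HTTP client, which is outside the scope of this sketch.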

## Supported Telemetry

| Metric                              | Applicable API endpoint                                                                                                                                                                                                                                                                                                                                                                                                         | Metric name in UI              | Applicable UI table                                                                                                                                                                        |
| ----------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `WORKLOADS_COUNT`                   | [Workloads](https://run-ai-docs.nvidia.com/api/2.22/workloads/workloads#get-api-v1-workloads-telemetry)                                                                                                                                                                                                                                                                                                                         |                                |                                                                                                                                                                                            |
| `ALLOCATED_GPUS`                    | [Nodes](https://run-ai-docs.nvidia.com/api/2.22/organizations/nodes#get-api-v1-nodes-telemetry)                                                                                                                                                                                                                                                                                                                                 | Allocated GPUs                 | [Nodes](/self-hosted/2.22/platform-management/aiinitiatives/resources/nodes.md)                                                                                                            |
| `GPU_ALLOCATION`                    | <ul><li><a href="https://run-ai-docs.nvidia.com/api/2.22/workloads/workloads#get-api-v1-workloads-telemetry">Workloads</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/projects#get-api-v1-org-unit-projects-projectid-metrics">Projects</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/departments#get-api-v1-org-unit-departments-telemetry">Departments</a></li></ul> |                                |                                                                                                                                                                                            |
| `READY_GPU_NODES`                   | [Nodes](https://run-ai-docs.nvidia.com/api/2.22/organizations/nodes#get-api-v1-nodes-telemetry)                                                                                                                                                                                                                                                                                                                                 | Ready / Total GPU nodes        | [Overview dashboard](/self-hosted/2.22/platform-management/monitor-performance/before-you-start.md#ui-views)                                                                               |
| `READY_GPUS`                        | [Nodes](https://run-ai-docs.nvidia.com/api/2.22/organizations/nodes#get-api-v1-nodes-telemetry)                                                                                                                                                                                                                                                                                                                                 | Ready / Total GPU devices      | [Overview dashboard](/self-hosted/2.22/platform-management/monitor-performance/before-you-start.md#ui-views)                                                                               |
| `TOTAL_GPU_NODES`                   | [Nodes](https://run-ai-docs.nvidia.com/api/2.22/organizations/nodes#get-api-v1-nodes-telemetry)                                                                                                                                                                                                                                                                                                                                 | Ready / Total GPU nodes        | [Overview dashboard](/self-hosted/2.22/platform-management/monitor-performance/before-you-start.md#ui-views)                                                                               |
| `TOTAL_GPUS`                        | [Nodes](https://run-ai-docs.nvidia.com/api/2.22/organizations/nodes#get-api-v1-nodes-telemetry)                                                                                                                                                                                                                                                                                                                                 | Ready / Total GPU devices      | [Overview dashboard](/self-hosted/2.22/platform-management/monitor-performance/before-you-start.md#ui-views)                                                                               |
| `IDLE_ALLOCATED_GPUS`               | [Nodes](https://run-ai-docs.nvidia.com/api/2.22/organizations/nodes#get-api-v1-nodes-telemetry)                                                                                                                                                                                                                                                                                                                                 | Idle allocated GPU devices     | [Overview dashboard](/self-hosted/2.22/platform-management/monitor-performance/before-you-start.md#ui-views)                                                                               |
| `FREE_GPUS`                         | [Nodes](https://run-ai-docs.nvidia.com/api/2.22/organizations/nodes#get-api-v1-nodes-telemetry)                                                                                                                                                                                                                                                                                                                                 | Free GPU devices               | [Nodes](/self-hosted/2.22/platform-management/aiinitiatives/resources/nodes.md)                                                                                                            |
| `TOTAL_CPU_CORES`                   | [Nodes](https://run-ai-docs.nvidia.com/api/2.22/organizations/nodes#get-api-v1-nodes-telemetry)                                                                                                                                                                                                                                                                                                                                 | CPU (Cores)                    | [Nodes](/self-hosted/2.22/platform-management/aiinitiatives/resources/nodes.md)                                                                                                            |
| `USED_CPU_CORES`                    | [Nodes](https://run-ai-docs.nvidia.com/api/2.22/organizations/nodes#get-api-v1-nodes-telemetry)                                                                                                                                                                                                                                                                                                                                 |                                |                                                                                                                                                                                            |
| `ALLOCATED_CPU_CORES`               | <ul><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/nodes#get-api-v1-nodes-telemetry">Nodes</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/projects#get-api-v1-org-unit-projects-projectid-metrics">Projects</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/departments#get-api-v1-org-unit-departments-telemetry">Departments</a></li></ul>         | Allocated CPU cores            | [Nodes](/self-hosted/2.22/platform-management/aiinitiatives/resources/nodes.md)                                                                                                            |
| `TOTAL_GPU_MEMORY_BYTES`            | [Nodes](https://run-ai-docs.nvidia.com/api/2.22/organizations/nodes#get-api-v1-nodes-telemetry)                                                                                                                                                                                                                                                                                                                                 | GPU memory                     | [Nodes](/self-hosted/2.22/platform-management/aiinitiatives/resources/nodes.md)                                                                                                            |
| `USED_GPU_MEMORY_BYTES`             | [Nodes](https://run-ai-docs.nvidia.com/api/2.22/organizations/nodes#get-api-v1-nodes-telemetry)                                                                                                                                                                                                                                                                                                                                 | Used GPU memory                | [Nodes](/self-hosted/2.22/platform-management/aiinitiatives/resources/nodes.md)                                                                                                            |
| `TOTAL_CPU_MEMORY_BYTES`            | [Nodes](https://run-ai-docs.nvidia.com/api/2.22/organizations/nodes#get-api-v1-nodes-telemetry)                                                                                                                                                                                                                                                                                                                                 | CPU memory                     | [Nodes](/self-hosted/2.22/platform-management/aiinitiatives/resources/nodes.md)                                                                                                            |
| `USED_CPU_MEMORY_BYTES`             | [Nodes](https://run-ai-docs.nvidia.com/api/2.22/organizations/nodes#get-api-v1-nodes-telemetry)                                                                                                                                                                                                                                                                                                                                 | Used CPU memory                | [Nodes](/self-hosted/2.22/platform-management/aiinitiatives/resources/nodes.md)                                                                                                            |
| `ALLOCATED_CPU_MEMORY_BYTES`        | <ul><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/nodes#get-api-v1-nodes-telemetry">Nodes</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/projects#get-api-v1-org-unit-projects-projectid-metrics">Projects</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/departments#get-api-v1-org-unit-departments-telemetry">Departments</a></li></ul>         | Allocated CPU memory           | <ul><li><a href="/pages/TJ2fVx3mVH4dgeKdNKnr">Nodes</a></li><li><a href="/pages/onqeIJP7FGHHTVpHBaPe">Projects</a></li><li><a href="/pages/s89HBzC5TlksOycyJ7Um">Departments</a></li></ul> |
| `GPU_QUOTA`                         | <ul><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/projects#get-api-v1-org-unit-projects-projectid-metrics">Projects</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/departments#get-api-v1-org-unit-departments-telemetry">Departments</a></li></ul>                                                                                                                            | GPU quota                      | <ul><li><a href="/pages/onqeIJP7FGHHTVpHBaPe">Projects</a></li><li><a href="/pages/s89HBzC5TlksOycyJ7Um">Departments</a></li></ul>                                                         |
| `CPU_QUOTA`                         | <ul><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/projects#get-api-v1-org-unit-projects-projectid-metrics">Projects</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/departments#get-api-v1-org-unit-departments-telemetry">Departments</a></li></ul>                                                                                                                            |                                |                                                                                                                                                                                            |
| `MEMORY_QUOTA`                      | <ul><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/projects#get-api-v1-org-unit-projects-projectid-metrics">Projects</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/departments#get-api-v1-org-unit-departments-telemetry">Departments</a></li></ul>                                                                                                                            |                                |                                                                                                                                                                                            |
| `GPU_ALLOCATION_NON_PREEMPTIBLE`    | <ul><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/projects#get-api-v1-org-unit-projects-projectid-metrics">Projects</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/departments#get-api-v1-org-unit-departments-telemetry">Departments</a></li></ul>                                                                                                                            |                                |                                                                                                                                                                                            |
| `CPU_ALLOCATION_NON_PREEMPTIBLE`    | <ul><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/projects#get-api-v1-org-unit-projects-projectid-metrics">Projects</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/departments#get-api-v1-org-unit-departments-telemetry">Departments</a></li></ul>                                                                                                                            |                                |                                                                                                                                                                                            |
| `MEMORY_ALLOCATION_NON_PREEMPTIBLE` | <ul><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/projects#get-api-v1-org-unit-projects-projectid-metrics">Projects</a></li><li><a href="https://run-ai-docs.nvidia.com/api/2.22/organizations/departments#get-api-v1-org-unit-departments-telemetry">Departments</a></li></ul>                                                                                                                            |                                |                                                                                                                                                                                            |
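
The telemetry types above are each tied to specific endpoints. The following Python sketch illustrates that pairing by building a telemetry URL and rejecting a type that the table does not list for the chosen scope. The paths are inferred from the link anchors in the table. The `telemetryType` query parameter name and the base URL are assumptions for illustration, and only a few telemetry types are included.

```python
from urllib.parse import urlencode

# Telemetry endpoint paths, inferred from the link anchors in the table above.
TELEMETRY_PATHS = {
    "nodes": "/api/v1/nodes/telemetry",
    "workloads": "/api/v1/workloads/telemetry",
    "departments": "/api/v1/org-unit/departments/telemetry",
}

# A few telemetry types per scope, copied from the table above.
TELEMETRY_TYPES = {
    "nodes": {"READY_GPUS", "TOTAL_GPUS", "FREE_GPUS", "ALLOCATED_GPUS"},
    "workloads": {"WORKLOADS_COUNT"},
    "departments": {"GPU_QUOTA", "CPU_QUOTA", "MEMORY_QUOTA"},
}


def build_telemetry_url(base_url, scope, telemetry_type):
    """Build a telemetry URL. The 'telemetryType' parameter name is assumed."""
    if telemetry_type not in TELEMETRY_TYPES[scope]:
        raise ValueError(f"{telemetry_type} is not listed for scope '{scope}'")
    query = urlencode({"telemetryType": telemetry_type})
    return f"{base_url.rstrip('/')}{TELEMETRY_PATHS[scope]}?{query}"


# Hypothetical control-plane URL.
telemetry_url = build_telemetry_url("https://runai.example.com", "nodes", "READY_GPUS")
print(telemetry_url)
```

Validating the type against the scope before building the URL catches a mismatched pairing, such as requesting `GPU_QUOTA` from the nodes endpoint, before any request is sent.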

