Supported Features

This page compares feature support across different workload types in NVIDIA Run:ai. Use it to understand which scheduling, resource management, and platform capabilities are available for each workload type before selecting a workload model or submission method.

Feature availability may vary across NVIDIA Run:ai versions and cluster deployments. Refer to the linked documentation for the most up-to-date support details.

Workload Submission

For each workload type (Workspace, Standard Training, Distributed Training, Inference, and Distributed Inference), this section compares:

- Supported workload types
- UI
- UI (via YAML)
- API (Workloads v1)
- API (Workloads v2)
- CLI
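Where the CLI is supported, a workload can be submitted with a single command. The following is a minimal sketch only: it follows the classic `runai submit` syntax, which may differ in newer CLI versions, and `train-demo`, the container image, and the `team-a` project are placeholder values.

```bash
# Minimal sketch: submit a single-GPU training workload to project "team-a".
# Exact commands and flags vary by CLI version; newer CLIs use per-type
# subcommands such as "runai training submit".
runai submit train-demo \
  --image nvcr.io/nvidia/pytorch:24.01-py3 \
  --gpu 1 \
  --project team-a

# Inspect the workload's status.
runai describe job train-demo --project team-a
```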

Scheduling and Resource Management

For each workload type (Workspace, Standard Training, Distributed Training, Inference, and Distributed Inference), this section compares:

- Supported workload types
- Elastic scaling

Operational and Platform Features

For each workload type (Workspace, Standard Training, Distributed Training, Inference, and Distributed Inference), this section compares:

- Supported workload types
- Workload awareness: workload-aware visibility, so that different pods are identified and treated as a single workload (for example, GPU utilization, the workload view, and dashboards)

Externally Submitted Kubernetes Workloads

Kubernetes workloads can be submitted outside of NVIDIA Run:ai, for example by using kubectl directly or Helm charts as part of an AI application. These workloads are scheduled by NVIDIA Run:ai and receive full monitoring support, along with a subset of scheduling capabilities.
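As a minimal sketch of external submission, the Job below is handed to the NVIDIA Run:ai scheduler by setting `schedulerName` in its pod spec. The scheduler name `runai-scheduler` and the `runai-team-a` project-namespace convention are assumptions that depend on how the cluster was installed; verify both against your deployment.

```bash
# Submit a plain Kubernetes Job outside of NVIDIA Run:ai via kubectl.
# Assumptions: the scheduler is installed as "runai-scheduler" and the
# target project's namespace follows the "runai-<project>" convention.
cat <<'EOF' | kubectl apply -f -
apiVersion: batch/v1
kind: Job
metadata:
  name: external-train
  namespace: runai-team-a
spec:
  template:
    spec:
      schedulerName: runai-scheduler   # hand scheduling to NVIDIA Run:ai
      restartPolicy: Never
      containers:
        - name: trainer
          image: nvcr.io/nvidia/pytorch:24.01-py3
          command: ["python", "-c", "print('hello from an external workload')"]
          resources:
            limits:
              nvidia.com/gpu: 1
EOF
```

Because the pod spec names the NVIDIA Run:ai scheduler, the Job is placed by it and appears in monitoring views, while only a subset of scheduling capabilities applies.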
