NVIDIA Run:ai is a GPU orchestration and optimization platform that helps organizations maximize compute utilization for AI workloads. By optimizing the use of expensive compute resources, NVIDIA Run:ai accelerates AI development cycles and drives faster time-to-market for AI-powered innovations.
Built on Kubernetes, NVIDIA Run:ai supports dynamic GPU allocation, workload submission, workload scheduling, and resource sharing, ensuring that AI teams get the compute power they need while IT teams maintain control over infrastructure efficiency.
NVIDIA Run:ai centralizes cluster management and optimizes infrastructure control, enabling administrators to:
- Manage all clusters from a single platform, ensuring consistency and control across environments.
- Gain real-time and historical insights into GPU consumption across clusters to optimize resource allocation and plan future capacity needs efficiently.
- Define and enforce security and usage policies to align GPU consumption with business and compliance requirements.
- Integrate with your organization's identity provider for streamlined authentication via single sign-on (SSO) and role-based access control (RBAC).
NVIDIA Run:ai simplifies AI infrastructure management by providing a structured approach to managing AI initiatives, resources, and user access, helping platform administrators maintain control, efficiency, and scalability across their infrastructure. Platform administrators can:
- Map and set up AI initiatives according to your organization's structure, ensuring clear resource allocation.
- Enable seamless sharing and pooling of GPUs across multiple users, reducing idle time and optimizing utilization.
- Assign users (AI practitioners, ML engineers) to specific projects and departments, using role-based access control (RBAC) to ensure permissions align with user roles and to enforce security policies.
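To illustrate how project-scoped access might be granted programmatically, the sketch below calls a control-plane REST endpoint from Python. The endpoint path, role name, project name, and payload fields are assumptions for illustration and may differ from the documented NVIDIA Run:ai API.

```python
import requests

# Hypothetical sketch: granting a user a role scoped to a single project via a
# control-plane REST call. Endpoint path, payload fields, and the role name are
# assumptions, not the documented NVIDIA Run:ai API schema.
CONTROL_PLANE = "https://my-tenant.run.ai"   # placeholder tenant URL
TOKEN = "<api-token>"                        # token issued by the control plane

payload = {
    "subjectId": "jane.doe@example.com",   # the AI practitioner being granted access
    "roleName": "L1 researcher",           # assumed role name
    "scopeType": "project",                # scope the permission to one project
    "scopeName": "nlp-research",           # assumed project name
}

resp = requests.post(
    f"{CONTROL_PLANE}/api/v1/authorization/access-rules",  # assumed endpoint
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=payload,
    timeout=30,
)
resp.raise_for_status()
print("Access rule created:", resp.json())
```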
NVIDIA Run:ai empowers data scientists and ML engineers to:
- Ensure high-priority jobs get GPU resources, with workloads dynamically receiving resources based on demand.
- Request and utilize only a fraction of a GPU's memory, ensuring efficient resource allocation and leaving room for other workloads (see the sketch after this list).
- Run the entire lifecycle of an AI initiative – Jupyter Notebooks, training jobs, and inference workloads – efficiently.
- Work in Jupyter Notebooks without interruption and without GPUs being taken away mid-session.
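A fractional-GPU request can be expressed on the workload itself. The sketch below uses the Kubernetes Python client to create a notebook pod that asks for half a GPU; the annotation key (`gpu-fraction`), scheduler name (`runai-scheduler`), and project namespace are assumptions based on common NVIDIA Run:ai conventions and may differ in your cluster version.

```python
from kubernetes import client, config

# Hedged sketch: a Jupyter notebook pod requesting 50% of a single GPU's memory.
# The annotation key, scheduler name, and namespace are assumptions; check the
# NVIDIA Run:ai documentation for the exact keys used by your cluster version.
config.load_kube_config()

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(
        name="notebook-half-gpu",
        namespace="runai-nlp-research",          # assumed project namespace
        annotations={"gpu-fraction": "0.5"},     # assumed fractional-GPU annotation
    ),
    spec=client.V1PodSpec(
        scheduler_name="runai-scheduler",        # hand the pod to the Run:ai scheduler
        containers=[
            client.V1Container(
                name="jupyter",
                image="jupyter/base-notebook",
                command=["start-notebook.sh"],
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="runai-nlp-research", body=pod)
```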
NVIDIA Run:ai is made up of two components, both installed over a Kubernetes cluster:
NVIDIA Run:ai cluster - Provides scheduling and workload management, extending native Kubernetes capabilities.
NVIDIA Run:ai control plane - Provides resource management, workload submission, and cluster monitoring and analytics.
The NVIDIA Run:ai cluster is responsible for scheduling AI workloads and efficiently allocating GPU resources across users and projects:
- Applies AI-aware rules to efficiently schedule workloads submitted by AI practitioners.
- Handles workload management, which includes the researcher's code running as a Kubernetes container and the system resources required to run it, such as storage, credentials, and network endpoints for accessing the container.
- Installed as a Kubernetes Operator to automate deployment, upgrades and configuration of NVIDIA Run:ai cluster services.
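A quick way to see the operator-managed components on a cluster is to list the pods in the Run:ai namespace and the Run:ai CustomResourceDefinitions, as in the minimal sketch below using the Kubernetes Python client. The namespace name ("runai") is an assumption; adjust it to wherever the cluster component was installed.

```python
from kubernetes import client, config

# Minimal sketch: inspecting the components deployed by the NVIDIA Run:ai
# cluster operator. The "runai" namespace is an assumption.
config.load_kube_config()

core = client.CoreV1Api()
for pod in core.list_namespaced_pod(namespace="runai").items:
    print(f"{pod.metadata.name:60s} {pod.status.phase}")

# The operator also registers CustomResourceDefinitions; listing CRDs whose
# name mentions "run.ai" is one quick sanity check.
ext = client.ApiextensionsV1Api()
runai_crds = [crd.metadata.name
              for crd in ext.list_custom_resource_definition().items
              if "run.ai" in crd.metadata.name]
print("Run:ai CRDs:", runai_crds)
```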
The NVIDIA Run:ai control plane provides a centralized management interface for organizations to oversee their GPU infrastructure across multiple locations and subnets, accessible via the Web UI and API. The control plane can be deployed in the cloud, or on premises (self-hosted) for organizations that require local control over their infrastructure.
- Manages multiple NVIDIA Run:ai clusters for a single tenant across different locations and subnets from a single unified interface.
- Allows administrators to define Projects, Departments and user roles, enforcing policies for fair resource distribution.
- Allows teams to submit workloads, track usage, and monitor GPU performance in real time.
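Usage tracking can also be scripted against the control plane. The sketch below pulls a workload list over HTTPS with Python's requests library; the endpoint path and response fields are assumptions for illustration rather than the documented API schema.

```python
import requests

# Hypothetical sketch: listing workloads and their GPU allocations from the
# control plane. Endpoint path and response fields are assumptions.
CONTROL_PLANE = "https://my-tenant.run.ai"   # placeholder tenant URL
HEADERS = {"Authorization": "Bearer <api-token>"}

workloads = requests.get(
    f"{CONTROL_PLANE}/api/v1/workloads",     # assumed endpoint
    headers=HEADERS,
    timeout=30,
).json()

for wl in workloads.get("workloads", []):    # assumed response shape
    print(wl.get("name"), wl.get("phase"), wl.get("allocatedResources", {}).get("gpu"))
```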
Key capabilities include:
Kubernetes-native application - Installed as a Kubernetes-native application, seamlessly extending Kubernetes with a cloud-native experience and standard operational workflows (install, upgrade, configure).
Monitoring and insights - Real-time and historical data on GPU usage to monitor resource consumption and optimize costs.
Scalability for training and inference - Supports distributed training across multiple GPUs and auto-scales inference workloads.
Integrations - Integrates with popular ML frameworks and tools - PyTorch, TensorFlow, XGBoost, Knative, Spark, Kubeflow Pipelines, Apache Airflow, Argo Workflows, Ray and more.
Flexible workload submission - Workloads can be submitted using the NVIDIA Run:ai UI, API, or CLI, or run as third-party workloads (see the API sketch after this list).
Secured communication - Uses an outbound-only, secured (SSL) connection to synchronize with the NVIDIA Run:ai control plane.
Private - NVIDIA Run:ai only synchronizes metadata and operational metrics (e.g., workloads, nodes) with the control plane. No proprietary data, model artifacts, or user data sets are ever transmitted, ensuring full data privacy and security.
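As an example of API-based submission (referenced in the workload submission item above), the hedged sketch below posts a training workload to the control plane. The endpoint path, payload fields, and identifiers are assumptions for illustration; the documented API or the runai CLI may use different names.

```python
import requests

# Hypothetical sketch: submitting a training workload through the control-plane
# API instead of the UI or CLI. Endpoint path and payload fields are assumptions.
CONTROL_PLANE = "https://my-tenant.run.ai"   # placeholder tenant URL
HEADERS = {"Authorization": "Bearer <api-token>"}

payload = {
    "name": "bert-finetune",
    "projectId": "nlp-research",             # assumed project identifier
    "clusterId": "<cluster-uuid>",           # target cluster managed by the control plane
    "spec": {
        "image": "nvcr.io/nvidia/pytorch:24.05-py3",
        "command": "python train.py",
        "compute": {"gpuDevicesRequest": 1}, # assumed field: one full GPU
    },
}

resp = requests.post(
    f"{CONTROL_PLANE}/api/v1/workloads/trainings",  # assumed endpoint
    headers=HEADERS,
    json=payload,
    timeout=30,
)
resp.raise_for_status()
print("Submitted:", resp.json())
```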
There are two main installation options:
SaaS
NVIDIA Run:ai is installed on the customer's data science GPU clusters. The cluster connects to the NVIDIA Run:ai control plane in the cloud (https://<tenant-name>.run.ai).
With this installation, the cluster requires an outbound connection to the NVIDIA Run:ai cloud.
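A simple way to confirm that outbound connectivity is in place before installing the cluster component is to probe the tenant URL from a node, as in the minimal sketch below (the tenant name is a placeholder).

```python
import requests

# Minimal sketch: verifying outbound HTTPS reachability to the cloud control
# plane. Replace "my-tenant" with your tenant name.
url = "https://my-tenant.run.ai"

try:
    resp = requests.get(url, timeout=10)
    print(f"Reached {url} (HTTP {resp.status_code}); outbound connectivity looks OK")
except requests.RequestException as exc:
    print(f"Could not reach {url}: {exc}")
```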
Self-hosted
The NVIDIA Run:ai control plane is also installed in the customer's data center.

