NVIDIA Run:ai Self-Hosted Product Documentation

  • Getting Started
    • Overview
    • What's New
      • What's New in Version 2.21
      • Hotfixes for Version 2.21
    • Installation
      • Control Plane System Requirements
      • Preparations
      • Network Requirements
      • Install the Control Plane
      • Cluster System Requirements
      • Install Using Helm
      • Customized Installation
      • Upgrade
      • Uninstall
  • Infrastructure Setup
    • Authentication and Authorization
      • Authentication and Authorization
      • Users
      • SSO
        • Set Up SSO with SAML
        • Set Up SSO with OpenID Connect
        • Set Up SSO with OpenShift
      • Roles
      • Applications
      • User Applications
      • Access Rules
    • Advanced Setup
      • Node Roles
      • Advanced Control Plane Configurations
      • Advanced Cluster Configurations
      • Service Mesh
      • Integrations
        • Interworking with Karpenter
    • Infrastructure Procedures
      • NVIDIA Run:ai at Scale
      • Monitoring and Maintenance
      • NVIDIA Run:ai System Monitoring
      • Clusters
      • Shared Storage
      • Node Maintenance
      • Cluster Restore
      • Secure Your Cluster
      • Logs Collection
      • Event History
  • Platform Management
    • Manage AI Initiatives
      • Adapting AI Initiatives to Your Organization
      • Managing Your Organization
        • Projects
        • Departments
      • Managing Your Resources
        • Nodes
        • Configuring NVIDIA MIG Profiles
        • Using GB200 NVL72 and Multi-Node NVLink Domains
        • Node Pools
    • Scheduling and Resource Optimization
      • Scheduling
        • The NVIDIA Run:ai Scheduler: Concepts and Principles
        • How the Scheduler Works
        • Workload Priority Control
        • Quick Starts
          • Over Quota, Fairness and Preemption
      • Resource Optimization
        • GPU Fractions
        • Dynamic GPU Fractions
        • Optimize Performance with Node Level Scheduler
        • GPU Time-Slicing
        • GPU Memory Swap
        • Quick Starts
          • Launching Workloads with GPU Fractions
          • Launching Workloads with Dynamic GPU Fractions
          • Launching Workloads with GPU Memory Swap
    • Policies
      • Policies and Rules
      • Workload Policies
      • Policy YAML Examples
      • Policy YAML Reference
      • Scheduling Rules
    • Monitor Performance and Health
      • Before You Start
      • Metrics and Telemetry
      • Reports
  • Workloads in NVIDIA Run:ai
    • Introduction to Workloads
    • NVIDIA Run:ai Workload Types
    • Workloads
    • Workload Assets
      • Workload Assets
      • Environments
      • Data Sources
      • Data Volumes
      • Compute Resources
      • Credentials
    • Workload Templates
      • Workspace Templates
    • Experiment Using Workspaces
      • Running Workspaces
      • Quick Starts
        • Running Jupyter Notebooks Using Workspaces
    • Train Models Using Training
      • Standard Training
        • Train Models Using a Standard Training Workload
        • Quick Starts
          • Run Your First Standard Training
      • Distributed Training
        • Train Models Using a Distributed Training Workload
        • Quick Starts
          • Run Your First Distributed Training
    • Deploy Models Using Inference
      • Deploy a Custom Inference Workload
      • Deploy Inference Workloads from Hugging Face
      • Deploy Inference Workloads with NVIDIA NIM
      • Deploy NVIDIA Cloud Functions (NVCF) in NVIDIA Run:ai
  • Reference
    • CLI Reference
      • Install and Configure CLI
      • Administrator CLI
      • Add NVIDIA Run:ai Authorization to Kubeconfig
      • CLI Commands Reference
        • runai cluster
        • runai config
        • runai inference
        • runai jax
        • runai kubeconfig
        • runai login
        • runai logout
        • runai mpi
        • runai node
        • runai nodepool
        • runai project
        • runai pvc
        • runai pytorch
        • runai report
        • runai tensorflow
        • runai training
        • runai upgrade
        • runai version
        • runai whoami
        • runai workload
        • runai workspace
        • runai xgboost
      • CLI Commands Examples
    • API Reference
      • How to Authenticate to the API
      • NVIDIA Run:ai REST API
        • Configuring Slack Notifications
  • Support Policy
    • Product Support Policy
    • Product Version Life Cycle

Install, set up and monitor

  • Install self-hosted
  • Set authenticated access
  • Set node roles and advanced cluster configurations
  • Monitor, manage and restore clusters
  • Monitor your platform

Manage organizations and resources

  • Map and set up your organizations
  • Set up and assign your resources
  • Manage permissions
  • Create and manage policies
  • Monitor performance and health

Build, train and deploy models

  • Learn more about workloads and workload types
  • Prepare workload assets
  • Build your model using workspaces
  • Train your model using standard or distributed training workloads
  • Deploy your model with inference workloads

Scheduling and resource optimization

  • Learn the NVIDIA Run:ai Scheduler concepts and principles
  • Understand how the Scheduler works
  • Explore resource optimization techniques

Develop with APIs

  • Set up API access
  • Use REST APIs
  • Consume metrics and telemetry
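
If you are new to the REST API, the general flow is: create an application for API access, exchange its credentials for an access token, then pass that token as a bearer token on each request. The sketch below illustrates this flow only; the token endpoint path, request fields, and response shape shown here are assumptions, so verify them against How to Authenticate to the API before use.

  # Exchange application credentials for an access token. The endpoint
  # path and JSON field names are assumptions; confirm them against
  # "How to Authenticate to the API". <control-plane-url>, <APP_ID> and
  # <APP_SECRET> are placeholders. Requires jq for JSON parsing.
  TOKEN=$(curl -s -X POST "https://<control-plane-url>/api/v1/token" \
      -H "Content-Type: application/json" \
      -d '{"grantType": "client_credentials", "clientId": "<APP_ID>", "clientSecret": "<APP_SECRET>"}' \
      | jq -r '.accessToken')

  # Call the REST API with the bearer token ("list clusters" is shown
  # only as an illustrative endpoint).
  curl -H "Authorization: Bearer $TOKEN" "https://<control-plane-url>/api/v1/clusters"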

Use the CLI

  • Install and configure the CLI
  • See the full list of commands and examples
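
If you just want to confirm that the CLI is working, a minimal first session might look like the sketch below. It uses only commands listed under CLI Commands Reference above; see each command's page for its flags and arguments.

  runai login      # authenticate to the NVIDIA Run:ai control plane
  runai whoami     # confirm which user you are logged in as
  runai version    # print the installed CLI version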

Quick starts

  • Run Jupyter Notebooks using workspaces
  • Run your first distributed training workload
  • Launch workloads with dynamic GPU fractions
