Quick Start for Infrastructure Administrators

This guide is for infrastructure administrators responsible for installing, configuring, and operating NVIDIA Run:ai.

The quick start walks through the initial infrastructure setup lifecycle, including platform installation and the essential post-installation configuration required to prepare the cluster for onboarding and workload execution. It focuses on infrastructure-level concerns such as cluster readiness, control plane behavior, security boundaries, and operational stability.

Prerequisites

Before you begin, ensure that:

  • A Kubernetes cluster is up and running.

  • Helm 3.14 or later is installed.

  • You have kubectl access to the cluster with admin-level permissions.
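The prerequisites above can be checked from a workstation before starting the installation. The following is a minimal sketch, not part of the NVIDIA Run:ai tooling: the function names are hypothetical, and it assumes that `helm version --short` prints a string such as `v3.14.0+g3fc9f4b` and that `kubectl auth can-i '*' '*' --all-namespaces` is an acceptable proxy for admin-level permissions in your cluster.

```python
import re
import subprocess

MIN_HELM = (3, 14)  # minimum Helm version taken from the prerequisites list


def parse_helm_version(text):
    """Extract (major, minor, patch) from output such as 'v3.14.0+g3fc9f4b'."""
    match = re.search(r"v?(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognized Helm version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def helm_is_supported(text, minimum=MIN_HELM):
    """Return True when the reported major.minor meets the minimum."""
    return parse_helm_version(text)[:2] >= minimum


def check_prerequisites():
    """Hypothetical helper: runs the local helm and kubectl binaries."""
    helm_out = subprocess.run(
        ["helm", "version", "--short"],
        capture_output=True, text=True, check=True,
    ).stdout
    # 'kubectl auth can-i' prints "yes" or "no" for the current user.
    can_i = subprocess.run(
        ["kubectl", "auth", "can-i", "*", "*", "--all-namespaces"],
        capture_output=True, text=True,
    ).stdout.strip()
    return {
        "helm_supported": helm_is_supported(helm_out),
        "cluster_admin": can_i == "yes",
    }
```

Calling `check_prerequisites()` on a machine with both tools installed returns a small dictionary; any `False` value points to a prerequisite to resolve before installing.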

Installation

To install using Helm charts, see Install using Helm. This is the standard installation method, which provides full control and flexibility over configuration and deployment.

Post-Installation Infrastructure Setup

After installing NVIDIA Run:ai, complete the following foundational infrastructure configuration steps to ensure the platform is production-ready and can safely support organizational onboarding and workloads. These steps focus on cluster readiness, control plane behavior, and operational guardrails, rather than day-to-day platform usage:

  • Validate node readiness and assign node roles as required

  • Configure advanced control plane and cluster settings based on your environment requirements

  • Enable required integrations and networking components

  • Apply security and operational best practices

  • Prepare the platform for scale, availability, and ongoing maintenance
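As an illustration of the first step in the list above, node readiness can be validated by inspecting the `Ready` condition that Kubernetes reports for each node. This is a rough sketch under stated assumptions: the function name is hypothetical, and the input is assumed to be the JSON produced by `kubectl get nodes -o json`. It does not cover node role assignment, which is described in the product's own procedures.

```python
import json


def not_ready_nodes(nodes_json):
    """Return the names of nodes whose Ready condition is not 'True'.

    nodes_json is assumed to be the text output of
    `kubectl get nodes -o json` (a NodeList object).
    """
    node_list = json.loads(nodes_json)
    failing = []
    for node in node_list.get("items", []):
        conditions = node.get("status", {}).get("conditions", [])
        ready = any(
            c.get("type") == "Ready" and c.get("status") == "True"
            for c in conditions
        )
        if not ready:
            failing.append(node["metadata"]["name"])
    return failing


# Small hand-written sample standing in for real kubectl output.
SAMPLE = json.dumps({
    "items": [
        {"metadata": {"name": "node-a"},
         "status": {"conditions": [{"type": "Ready", "status": "True"}]}},
        {"metadata": {"name": "node-b"},
         "status": {"conditions": [{"type": "Ready", "status": "False"}]}},
    ]
})
```

An empty result from `not_ready_nodes` suggests that every node reports `Ready`; any names returned are worth investigating before onboarding workloads.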

The exact configuration required depends on your environment, scale, and operational model. Detailed procedures and advanced options are documented in the Advanced setup and Infrastructure procedures sections.
