Customized Installation

This section explains the available configurations for customizing the NVIDIA Run:ai control plane and cluster installation.

Control Plane Helm Chart Values

The NVIDIA Run:ai control plane installation can be customized to support your environment via Helm values files or Helm install flags. See Advanced control plane configurations.

Cluster Helm Chart Values

The NVIDIA Run:ai cluster installation can be customized to support your environment via Helm values files or Helm install flags.

These configurations are saved in the runaiconfig Kubernetes object and can be edited post-installation as needed. For more information, see Advanced cluster configurations.
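As a minimal sketch of the values-file approach, the fragment below sets two of the keys listed in this section. Each dotted key name maps to nested YAML mappings. The registry host is a hypothetical placeholder, not a value taken from this documentation.

```yaml
# values.yaml (illustrative sketch; the registry host is a hypothetical placeholder)
global:
  image:
    registry: "registry.example.com"   # corresponds to global.image.registry
  customCA:
    enabled: true                      # corresponds to global.customCA.enabled
```

A file like this is passed to Helm with the -f (or --values) flag. A single key can instead be set inline with --set, for example --set global.customCA.enabled=true.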

The following table lists the available Helm chart values that can be configured to customize the NVIDIA Run:ai cluster installation.

Key
Description

global.image.registry (string)

Global Docker image registry. Default: ""

global.additionalImagePullSecrets (list)

List of image pull secret references. Default: []

global.additionalJobLabels (object)

Sets Pod Labels for NVIDIA Run:ai and third-party services, as key/value pairs. Default: ""

global.additionalJobAnnotations (object)

Sets Annotations for NVIDIA Run:ai and third-party services, as key/value pairs. Default: ""

spec.researcherService.ingress.tlsSecret (string)

Existing secret key where cluster TLS certificates are stored (non-OpenShift). Default: runai-cluster-domain-tls-secret

spec.researcherService.route.tlsSecret (string)

Existing secret key where cluster TLS certificates are stored (OpenShift only). Default: ""

spec.prometheus.spec.image (string)

Due to a known issue in the Prometheus Helm chart, the imageRegistry setting is ignored. To pull the image from a different registry, manually specify the Prometheus image reference. Default: quay.io/prometheus/prometheus

spec.prometheus.spec.imagePullSecrets (list)

List of image pull secret references in the runai namespace, used for pulling Prometheus images (relevant for air-gapped installations). Default: []

global.customCA.enabled

Enables the use of a custom Certificate Authority (CA) in your deployment. When set to true, the system is configured to trust a user-provided CA certificate for secure communication.

openShift.securityContextConstraints.create

Enables the deployment of Security Context Constraints (SCC). Disable for CIS compliance. Default: true

global.tolerations (list)

Configures Kubernetes tolerations for NVIDIA Run:ai system-level services.

global.affinity (object)

Sets the system nodes where NVIDIA Run:ai system-level services are scheduled. Using global.affinity will overwrite the node roles set using the Administrator CLI (runai-adm). Default: Prefer to schedule on nodes that are labeled with node-role.kubernetes.io/runai-system
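For the scheduling-related keys, global.tolerations and global.affinity, a values file might look like the following sketch. It assumes these keys accept the standard Kubernetes toleration and affinity structures. The taint key and value are hypothetical, and the affinity block simply mirrors the documented default preference for nodes labeled node-role.kubernetes.io/runai-system.

```yaml
# Illustrative sketch, assuming standard Kubernetes toleration and affinity schemas.
global:
  tolerations:
    # Hypothetical taint; replace with the taint applied to your system nodes.
    - key: "dedicated"
      operator: "Equal"
      value: "runai-system"
      effect: "NoSchedule"
  affinity:
    nodeAffinity:
      # Soft preference for nodes carrying the runai-system role label.
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 1
          preference:
            matchExpressions:
              - key: node-role.kubernetes.io/runai-system
                operator: Exists
```

Because global.affinity overwrites node roles set with the Administrator CLI (runai-adm), it is worth choosing one of the two mechanisms rather than combining them.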
