Customized Installation

This section explains the available configurations for customizing the NVIDIA Run:ai control plane and cluster installation.

Control Plane Helm Chart Values

The NVIDIA Run:ai control plane installation can be customized to support your environment via Helm values files or Helm install flags. See Advanced control plane configurations.

Cluster Helm Chart Values

The NVIDIA Run:ai cluster installation can be customized to support your environment via Helm values files or Helm install flags.

These configurations are saved in the runaiconfig Kubernetes object and can be edited post-installation as needed. For more information, see Advanced cluster configurations.
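As an illustration, a values file for a cluster that pulls images from a private registry might look like the sketch below. The keys come from the table in this section, with each dotted key path written as nested YAML in the usual Helm way. The file name, registry host, and secret name are placeholders, and the shape of the pull secret entries (a list of `name` references, as in Kubernetes `imagePullSecrets`) is an assumption rather than something this page confirms.

```yaml
# values-cluster.yaml (hypothetical file name)
global:
  image:
    # Placeholder private registry; replace with your own host.
    registry: registry.example.com
  additionalImagePullSecrets:
    # Assumed entry shape (Kubernetes-style name reference); placeholder secret name.
    - name: my-registry-secret

spec:
  researcherService:
    ingress:
      # Documented default secret name for non-OpenShift clusters.
      tlsSecret: runai-cluster-domain-tls-secret
  prometheus:
    spec:
      # Placeholder mirror of the default quay.io/prometheus/prometheus image.
      image: registry.example.com/prometheus/prometheus
      imagePullSecrets:
        # Assumed entry shape; the secret must exist in the runai namespace.
        - name: my-registry-secret
```

A file like this is passed to Helm with the standard `-f` (`--values`) flag. An individual key can instead be set on the command line with `--set`, for example `--set global.image.registry=registry.example.com`.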

The following table lists the available Helm chart values that can be configured to customize the NVIDIA Run:ai cluster installation.

| Key | Description |
| --- | --- |
| global.image.registry (string) | Global Docker image registry. Default: "" |
| global.additionalImagePullSecrets (list) | List of image pull secret references. Default: [] |
| global.additionalJobLabels (object) | Sets pod labels for NVIDIA Run:ai and third-party services, as key/value pairs. Default: "" |
| global.additionalJobAnnotations (object) | Sets annotations for NVIDIA Run:ai and third-party services, as key/value pairs. Default: "" |
| spec.researcherService.ingress.tlsSecret (string) | Existing secret key where cluster TLS certificates are stored (non-OpenShift). Default: runai-cluster-domain-tls-secret |
| spec.researcherService.route.tlsSecret (string) | Existing secret key where cluster TLS certificates are stored (OpenShift only). Default: "" |
| spec.prometheus.spec.image (string) | Due to a known issue in the Prometheus Helm chart, the imageRegistry setting is ignored. To pull the image from a different registry, specify the Prometheus image reference manually. Default: quay.io/prometheus/prometheus |
| spec.prometheus.spec.imagePullSecrets (string) | List of image pull secret references in the runai namespace to use for pulling Prometheus images (relevant for air-gapped installations). Default: [] |
| global.customCA.enabled | Enables the use of a custom Certificate Authority (CA) in your deployment. When set to true, the system is configured to trust a user-provided CA certificate for secure communication. |
| openShift.securityContextConstraints.create | Enables the deployment of Security Context Constraints (SCC). Disable for CIS compliance. Default: true |
| global.tolerations (list) | Configures Kubernetes tolerations for NVIDIA Run:ai system-level services. |
| global.affinity (object) | Sets the system nodes where NVIDIA Run:ai system-level services are scheduled. Using global.affinity overrides the node roles set using the Administrator CLI (runai-adm). Default: prefer scheduling on nodes labeled node-role.kubernetes.io/runai-system |
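
The scheduling keys can be sketched as follows, assuming that global.tolerations and global.affinity accept the standard Kubernetes toleration and affinity schemas (this page does not spell out the schema). The taint key and value are hypothetical. The node label is the one named in the default above. Note that this sketch requires the label, whereas the documented default only prefers it.

```yaml
global:
  tolerations:
    # Hypothetical taint applied to dedicated system nodes.
    - key: "dedicated"
      operator: "Equal"
      value: "runai-system"
      effect: "NoSchedule"
  affinity:
    # Assumed standard Kubernetes affinity object.
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              # Label named in the documented default for system nodes.
              - key: node-role.kubernetes.io/runai-system
                operator: Exists
```

With a hard requirement like this, system-level services stay unscheduled if no node carries the label, so a `preferredDuringSchedulingIgnoredDuringExecution` rule may be the safer choice on small clusters.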
