Service mesh
NVIDIA Run:ai supports service mesh implementations. When a service mesh is deployed with sidecar injection, specific configurations must be applied to ensure compatibility with NVIDIA Run:ai. This document outlines the required changes for the NVIDIA Run:ai control plane and cluster.
Control plane configuration
By default, NVIDIA Run:ai prevents Istio from injecting sidecar containers into system jobs in the control plane. For other service mesh solutions, users must manually add the relevant labels or annotations during installation.
To disable sidecar injection in the NVIDIA Run:ai control plane, modify the Helm values file by adding the required pod labels to the following components. See Advanced control plane configurations for more details.
Example for Open Service Mesh:
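A minimal sketch of such a values file, assuming each control plane component accepts a `podLabels` map. The component names `backend` and `frontend` are hypothetical placeholders, not a confirmed list; take the real names and keys from Advanced control plane configurations. Open Service Mesh documents `openservicemesh.io/sidecar-injection` as a pod annotation, so confirm that your mesh version also honors it as a label.

```yaml
# Hypothetical Helm values fragment for the control plane.
# "backend" and "frontend" are placeholder component names (assumption).
backend:
  podLabels:
    # Key Open Service Mesh uses to opt a pod out of sidecar injection.
    openservicemesh.io/sidecar-injection: disabled
frontend:
  podLabels:
    openservicemesh.io/sidecar-injection: disabled
```

Repeat the same `podLabels` entry for every component that should run without a sidecar.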
Cluster configuration
Installation phase
Sidecar containers injected by some service mesh solutions can prevent NVIDIA Run:ai installation hooks from completing. To avoid this, modify the Helm installation command to include the required labels or annotations:
Example for Istio Service Mesh:
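One way to express this is a small values file passed to the Helm installation command with `-f`. The sketch below assumes the cluster chart exposes a `global.additionalJobLabels` map for labels applied to installation hook jobs; that key and the file name are assumptions to verify against your chart version. `sidecar.istio.io/inject` is the standard Istio pod label for opting out of injection.

```yaml
# istio-values.yaml (hypothetical file name)
# Assumed key: global.additionalJobLabels. Verify it against your chart version.
global:
  additionalJobLabels:
    # Istio skips sidecar injection for pods carrying this label.
    # The value is quoted because Kubernetes label values must be strings.
    sidecar.istio.io/inject: "false"
```

The same setting can be supplied inline with `--set` or `--set-json` instead of a file, provided the dots in the label key are escaped correctly.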
Workloads
To prevent sidecar injection in workloads created at runtime (such as training workloads), update the runaiconfig resource. See Advanced cluster configurations for more details:
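A minimal sketch of the relevant fragment for Istio, assuming the runaiconfig resource accepts additional pod labels under `spec.workload-controller.additionalPodLabels`. That key path is an assumption; confirm it in Advanced cluster configurations before applying.

```yaml
# Fragment of the runaiconfig resource (key path is an assumption).
spec:
  workload-controller:
    additionalPodLabels:
      # Added to workload pods created at runtime so Istio skips injection.
      sidecar.istio.io/inject: "false"
```

For another service mesh, substitute the label or annotation that mesh uses to disable injection.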