Karpenter is an open-source Kubernetes cluster autoscaler built for cloud deployments. Karpenter optimizes the cloud cost of a customer's cluster by moving workloads between different node types, consolidating workloads into fewer nodes, using lower-cost nodes where possible, scaling up new nodes when needed, and shutting down unused nodes.
Karpenter's main goal is cost optimization. Unlike Karpenter, the NVIDIA Run:ai Scheduler optimizes for fairness and resource utilization. As a result, there are a few potential friction points when using both on the same cluster:
1. Karpenter looks for "unschedulable" pending workloads and may try to scale up new nodes to make those workloads schedulable. In some scenarios, however, these workloads exceed their quota parameters, and the NVIDIA Run:ai Scheduler keeps them in a pending state.
2. Karpenter is not aware of the NVIDIA Run:ai fractions mechanism and may try to interfere incorrectly.
3. Karpenter preempts any type of workload, so high-priority, non-preemptible workloads may be interrupted and moved to save cost.
4. Karpenter has no pod-group (i.e., workload) notion or gang-scheduling awareness, so it does not know that a set of "arbitrary" pods forms a single workload. This may cause Karpenter to schedule those pods into different node pools (in the case of multi-node-pool workloads) or to scale the wrong mix of nodes up or down.
The NVIDIA Run:ai Scheduler mitigates these friction points using the following techniques (each numbered item below corresponds to the friction point with the same number above):
1. Karpenter uses a "nominated node" to recommend a node to the Scheduler. The NVIDIA Run:ai Scheduler treats this as a "preferred" recommendation: it tries to use this node, but it is not required to and may choose another node.
2. Fractions - Karpenter does not consolidate nodes that run one or more pods that cannot be moved. The NVIDIA Run:ai reservation pod is marked 'do not evict', which keeps the NVIDIA Run:ai Scheduler in control of scheduling fractions.
3. Non-preemptible workloads - NVIDIA Run:ai marks non-preemptible workloads as 'do not evict', and Karpenter respects this annotation.
4. NVIDIA Run:ai node pools (single-node-pool workloads) - Karpenter respects the node affinity that NVIDIA Run:ai sets on a pod, so Karpenter uses that node affinity for its recommended node. For the gang-scheduling/pod-group (workload) notion, the NVIDIA Run:ai Scheduler treats Karpenter directives as preferred recommendations rather than mandatory instructions and overrides them where appropriate.
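For illustration, the sketch below shows what a 'do not evict' marker on a pod looks like. This is a minimal sketch, assuming the Kubernetes Python client; the exact Karpenter annotation key varies by Karpenter version, and NVIDIA Run:ai applies its equivalent marker automatically to non-preemptible workloads and reservation pods, as described in points 2 and 3 above.

```python
# Minimal sketch, assuming the Kubernetes Python client. The annotation key is
# Karpenter's documented pod-level marker ("karpenter.sh/do-not-disrupt" in
# recent releases, "karpenter.sh/do-not-evict" in older ones).
from kubernetes import client

pod_metadata = client.V1ObjectMeta(
    name="non-preemptible-training-pod",  # illustrative name
    annotations={
        # Karpenter will not voluntarily evict or consolidate away pods that
        # carry this marker; NVIDIA Run:ai sets the equivalent on non-preemptible
        # workloads and on its fraction reservation pods.
        "karpenter.sh/do-not-disrupt": "true",
    },
)
```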
Using multi-node-pool workloads
Workloads may include a list of optional node pools. Karpenter is not aware that only a single node pool should be selected from that list for the workload. It may therefore recommend placing pods of the same workload into different node pools, and may scale up nodes from several node pools to serve a "multi-node-pool" workload instead of nodes from the single selected node pool.
If this becomes an issue (i.e., if Karpenter scales up the wrong node types), users can set an inter-pod affinity that uses the node pool label, or another common label, as a 'topology' identifier (see the sketch below). This forces Karpenter to choose nodes from a single node pool per workload, selected from any of the node pools the workload allows.
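A minimal sketch of that inter-pod affinity, using the Kubernetes Python client; both label keys shown are illustrative assumptions, so substitute the node-pool and workload labels actually present in your cluster.

```python
# Minimal sketch, assuming the Kubernetes Python client. Both label keys below
# are illustrative; use the node-pool and workload labels set in your cluster.
from kubernetes import client

workload_affinity = client.V1Affinity(
    pod_affinity=client.V1PodAffinity(
        required_during_scheduling_ignored_during_execution=[
            client.V1PodAffinityTerm(
                # Match the other pods of the same workload (hypothetical label).
                label_selector=client.V1LabelSelector(
                    match_labels={"workload-name": "train-job-1"}
                ),
                # Using the node-pool label as the topology key means "place all
                # pods of this workload on nodes belonging to one node pool",
                # whichever allowed pool the scheduler picks.
                topology_key="run.ai/node-pool",
            )
        ]
    )
)
```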
An alternative approach is to use a single node pool per workload instead of multiple node pools.
Consolidation
To make Karpenter's consolidation function more effective, users should consider separating preemptible and non-preemptible workloads onto different nodes, using node pools, node affinities, taints and tolerations, or inter-pod anti-affinity (see the sketch below).
If preemptible and non-preemptible workloads are not separated onto different nodes, Karpenter's ability to consolidate (bin-pack) and shut down nodes is reduced, but it remains effective.
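One way to achieve the separation is a taint on a dedicated group of nodes that only non-preemptible workloads tolerate. A minimal sketch using the Kubernetes Python client follows, with a hypothetical taint key and value (not something NVIDIA Run:ai or Karpenter defines).

```python
# Minimal sketch, assuming the Kubernetes Python client and a hypothetical
# taint "workload-class=non-preemptible:NoSchedule" applied to a dedicated
# set of nodes. Only non-preemptible pods carry the matching toleration, so
# the remaining (preemptible) nodes stay free for Karpenter to bin-pack,
# consolidate, and shut down.
from kubernetes import client

non_preemptible_toleration = client.V1Toleration(
    key="workload-class",      # hypothetical taint key
    operator="Equal",
    value="non-preemptible",
    effect="NoSchedule",
)
```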
Conflicts between bin-packing and spread policies
If NVIDIA Run:ai is used with a scheduling spread policy, it clashes with Karpenter's default bin-packing/consolidation policy, and the outcome may be a deployment that is optimized for neither policy.
Spread is usually used for inference workloads, which are non-preemptible and therefore not controlled by Karpenter (the NVIDIA Run:ai Scheduler marks those workloads as 'do not evict'), so this should not present a real deployment issue for customers.
Third-party integrations
Support for third-party integrations varies. Integrations marked Supported below work out of the box with NVIDIA Run:ai. For integrations marked Community Support, our Customer Success team has prior experience assisting customers with setup. In many cases, the NVIDIA Enterprise Support Portal may include additional reference documentation, provided on an as-is basis.
Kubernetes has several built-in resources that encapsulate running Pods. These are called Kubernetes Workloads and should not be confused with NVIDIA Run:ai workloads.
Examples of such resources are a Deployment that manages a stateless application, or a Job that runs tasks to completion.
An NVIDIA Run:ai workload encapsulates all the resources needed to run and creates and deletes them together. Since NVIDIA Run:ai is an open platform, it allows the scheduling of any Kubernetes Workload.
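As an illustration of scheduling a standard Kubernetes Workload with the NVIDIA Run:ai Scheduler, here is a minimal sketch using the Kubernetes Python client; the scheduler name "runai-scheduler", the namespace, and the labels shown are assumptions to verify against your cluster configuration.

```python
# Minimal sketch, assuming the Kubernetes Python client. The scheduler name,
# namespace, and labels are assumptions; check the values configured in your cluster.
from kubernetes import client, config

config.load_kube_config()

job = client.V1Job(
    api_version="batch/v1",
    kind="Job",
    metadata=client.V1ObjectMeta(name="demo-job"),
    spec=client.V1JobSpec(
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"project": "team-a"}),  # illustrative project label
            spec=client.V1PodSpec(
                scheduler_name="runai-scheduler",  # assumed NVIDIA Run:ai scheduler name
                restart_policy="Never",
                containers=[
                    client.V1Container(
                        name="main",
                        image="python:3.11-slim",
                        command=["python", "-c", "print('hello from a Kubernetes Job')"],
                    )
                ],
            ),
        )
    ),
)

# Submit the Job like any other Kubernetes Workload; only schedulerName differs.
client.BatchV1Api().create_namespaced_job(namespace="team-a", body=job)
```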
| Tool | Category | Support level | Notes |
| --- | --- | --- | --- |
| Apache Airflow | Orchestration | Community Support | It is possible to schedule Airflow workflows with the NVIDIA Run:ai Scheduler. Sample code: How to integrate NVIDIA Run:ai with Apache Airflow. |
| Argo Workflows | Orchestration | Community Support | It is possible to schedule Argo workflows with the NVIDIA Run:ai Scheduler. Sample code: How to integrate NVIDIA Run:ai with Argo Workflows. |
| ClearML | Experiment tracking | Community Support | It is possible to schedule ClearML workloads with the NVIDIA Run:ai Scheduler. |
| Docker Registry | Repositories | Supported | NVIDIA Run:ai allows using a Docker registry as a Credentials asset. |
| GitHub | Storage | Supported | NVIDIA Run:ai communicates with GitHub by defining it as a data source asset. |
| Hugging Face | Repositories | Supported | NVIDIA Run:ai provides an out-of-the-box integration with Hugging Face. |
| JupyterHub | Development | Community Support | It is possible to submit NVIDIA Run:ai workloads via JupyterHub. |
| Jupyter Notebook | Development | Supported | NVIDIA Run:ai provides integrated support with Jupyter Notebooks. |
| Karpenter | Cost Optimization | Supported | NVIDIA Run:ai provides out-of-the-box support for Karpenter to save cloud costs; see the Karpenter integration notes above. |
| Kubeflow MPI Operator | Training | Supported | NVIDIA Run:ai provides out-of-the-box support for submitting MPI workloads via API, CLI, or UI. |
| Kubeflow notebooks | Development | Community Support | It is possible to launch a Kubeflow notebook with the NVIDIA Run:ai Scheduler. Sample code is available. |
| Kubeflow Pipelines | Orchestration | Community Support | It is possible to schedule Kubeflow pipelines with the NVIDIA Run:ai Scheduler. Sample code is available. |
| MLflow | Model Serving | Community Support | It is possible to use MLflow together with the NVIDIA Run:ai Scheduler. |
| PyCharm | Development | Supported | Containers created by NVIDIA Run:ai can be accessed via PyCharm. |
| PyTorch | Training | Supported | NVIDIA Run:ai provides out-of-the-box support for submitting PyTorch workloads via API, CLI, or UI. |
| Ray | Training, inference, data processing | Community Support | It is possible to schedule Ray jobs with the NVIDIA Run:ai Scheduler. Sample code is available. |
| Seldon Core | Orchestration | Community Support | It is possible to schedule Seldon Core workloads with the NVIDIA Run:ai Scheduler. |
| Spark | Orchestration | Community Support | It is possible to schedule Spark workflows with the NVIDIA Run:ai Scheduler. |
| S3 | Storage | Supported | NVIDIA Run:ai communicates with S3 by defining it as a data source asset. |
| TensorBoard | Experiment tracking | Supported | NVIDIA Run:ai comes with a preset TensorBoard asset. |
| TensorFlow | Training | Supported | NVIDIA Run:ai provides out-of-the-box support for submitting TensorFlow workloads via API, CLI, or UI. |
| Triton | Orchestration | Supported | Usage via a Docker base image. |
| VS Code | Development | Supported | Containers created by NVIDIA Run:ai can be accessed via Visual Studio Code. You can automatically launch Visual Studio Code web from the NVIDIA Run:ai console. |
| Weights & Biases | Experiment tracking | Community Support | It is possible to schedule W&B workloads with the NVIDIA Run:ai Scheduler. Sample code is available. |
| XGBoost | Training | Supported | NVIDIA Run:ai provides out-of-the-box support for submitting XGBoost workloads via API, CLI, or UI. |