Using the Scheduler with Third-Party Workloads
By default, Kubernetes uses its own native scheduler to determine pod placement. The NVIDIA Run:ai platform provides a custom scheduler, runai-scheduler, which is used by default for workloads submitted through the NVIDIA Run:ai platform. This section outlines how to configure third-party workloads, such as those submitted directly to Kubernetes, to use runai-scheduler instead of the default Kubernetes scheduler.
Specify the Scheduler in the Workload YAML
To use the NVIDIA Run:ai Scheduler for third-party workloads, specify it in the workload’s YAML file. This instructs Kubernetes to schedule the workload using the NVIDIA Run:ai Scheduler instead of the default one.
spec:
  schedulerName: runai-scheduler

For example:
apiVersion: v1
kind: Pod
metadata:
  annotations:
    user: test
    gpu-fraction: "0.5"
    gpu-fraction-num-devices: "2"
  labels:
    runai/queue: test
  name: multi-fractional-pod-job
  namespace: test
spec:
  containers:
  - image: gcr.io/run-ai-demo/quickstart-cuda
    imagePullPolicy: Always
    name: job
    env:
    - name: RUNAI_VERBOSE
      value: "1"
    resources:
      limits:
        cpu: 200m
        memory: 200Mi
      requests:
        cpu: 100m
        memory: 100Mi
    securityContext:
      capabilities:
        drop: ["ALL"]
  schedulerName: runai-scheduler
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 5

Enforce the Scheduler at the Namespace Level
If modifying the workload YAML is not possible, you can enforce the use of the NVIDIA Run:ai Scheduler for all workloads in a given namespace (i.e., NVIDIA Run:ai project) by applying an annotation. Once applied, all workloads submitted to the annotated namespace will automatically use the NVIDIA Run:ai Scheduler without requiring individual YAML modifications.
Annotate the namespace with:
runai/enforce-scheduler-name: true

For example, to annotate a project named proj-a, use the following command:
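The exact namespace name depends on your setup; as a sketch, assuming the project's namespace follows the common runai-<project name> convention (here, runai-proj-a):

# Assumption: the namespace for project proj-a is named runai-proj-a; adjust to your environment.
kubectl annotate namespace runai-proj-a runai/enforce-scheduler-name=true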
To verify that the annotation was applied, retrieve the namespace in YAML format by running the following:
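Again assuming the runai-proj-a namespace from the example above:

# Assumption: the project's namespace is runai-proj-a.
kubectl get namespace runai-proj-a -o yaml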
The following shows an example output:
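(Illustrative, abbreviated sketch only; the actual output includes additional system-generated fields such as creationTimestamp and resourceVersion, and the namespace name depends on your project.)

apiVersion: v1
kind: Namespace
metadata:
  annotations:
    runai/enforce-scheduler-name: "true"
  labels:
    kubernetes.io/metadata.name: runai-proj-a
  name: runai-proj-a
spec:
  finalizers:
  - kubernetes
status:
  phase: Active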