Using the Scheduler with Third-Party Workloads
By default, Kubernetes uses its own native scheduler to determine pod placement. The NVIDIA Run:ai platform provides a custom scheduler, runai-scheduler, which is used by default for workloads submitted through the NVIDIA Run:ai platform. This section outlines how to configure third-party workloads, such as those submitted directly to Kubernetes, to use runai-scheduler instead of the default Kubernetes scheduler.
Specify the Scheduler in the Workload YAML
To use the NVIDIA Run:ai Scheduler for third-party workloads, specify it in the workload’s YAML file. This instructs Kubernetes to schedule the workload using the NVIDIA Run:ai Scheduler instead of the default one.
spec:
  schedulerName: runai-scheduler

For example:
apiVersion: v1
kind: Pod
metadata:
  annotations:
    user: test
    gpu-fraction: "0.5"
    gpu-fraction-num-devices: "2"
  labels:
    runai/queue: test
  name: multi-fractional-pod-job
  namespace: test
spec:
  containers:
  - image: gcr.io/run-ai-demo/quickstart-cuda
    imagePullPolicy: Always
    name: job
    env:
    - name: RUNAI_VERBOSE
      value: "1"
    resources:
      limits:
        cpu: 200m
        memory: 200Mi
      requests:
        cpu: 100m
        memory: 100Mi
    securityContext:
      capabilities:
        drop: ["ALL"]
  schedulerName: runai-scheduler
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 5

Enforce the Scheduler at the Namespace Level
If modifying the workload YAML is not possible, you can enforce the use of the NVIDIA Run:ai Scheduler for all workloads in a given namespace (i.e., NVIDIA Run:ai project) by applying an annotation. Once applied, all workloads submitted to the annotated namespace will automatically use the NVIDIA Run:ai Scheduler without requiring individual YAML modifications.
Annotate the namespace with:
runai/enforce-scheduler-name: true

For example, to annotate a project named proj-a, use the following command:
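The exact namespace name depends on your setup; as a sketch, assuming the project's namespace follows the common runai-<project name> convention (here, runai-proj-a):

# Assumption: the namespace for project proj-a is named runai-proj-a; adjust to your environment.
kubectl annotate namespace runai-proj-a runai/enforce-scheduler-name=true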
To verify that the annotation was applied, retrieve the namespace in YAML format by running the following:
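Again assuming the runai-proj-a namespace from the example above:

# Assumption: the project's namespace is runai-proj-a.
kubectl get namespace runai-proj-a -o yaml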
The following shows an example output:
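(Illustrative, abbreviated sketch only; the actual output includes additional system-generated fields such as creationTimestamp and resourceVersion, and the namespace name depends on your project.)

apiVersion: v1
kind: Namespace
metadata:
  annotations:
    runai/enforce-scheduler-name: "true"
  labels:
    kubernetes.io/metadata.name: runai-proj-a
  name: runai-proj-a
spec:
  finalizers:
  - kubernetes
status:
  phase: Active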