Workload Priority Control
The workload priority management feature allows you to change the priority of a workload within a project. The priority determines the workload's position in the project scheduling queue managed by the NVIDIA Run:ai Scheduler. By adjusting the priority, you can increase the likelihood that a workload will be scheduled and preferred over others within the same project, ensuring that critical tasks are given higher priority and resources are allocated efficiently. The workload's priority also affects whether it can consume over-quota resources and whether it is subject to preemption by higher-priority workloads.
You can change the priority of a workload by selecting one of the predefined values from the NVIDIA Run:ai priority dictionary.
Priority Dictionary
Workload priority is defined by selecting a priority from a predefined list in the NVIDIA Run:ai priority dictionary. Each string corresponds to a specific Kubernetes PriorityClass, which in turn determines scheduling behavior, such as whether the workload is preemptible or allowed to run over quota.
| Priority | Value | Preemption | Over quota |
| --- | --- | --- | --- |
| very-low | 25 | Preemptible | Available |
| low | 40 | Preemptible | Available |
| medium-low | 65 | Preemptible | Available |
| medium | 80 | Preemptible | Available |
| medium-high | 90 | Preemptible | Available |
| high | 125 | Non-preemptible | Not available |
| very-high | 150 | Non-preemptible | Not available |
Preemptible vs Non-Preemptible Workloads
- Preemptible workloads can use opportunistic compute resources beyond the project’s quota but may be interrupted at any time.
- Non-preemptible workloads must run within the project’s deserved quota, cannot use over-quota resources, and will not be interrupted once scheduled.
Default Priority per Workload
NVIDIA Run:ai defines the following default mappings of workload types to priorities. To retrieve the default priority per workload type, refer to the List workload types API.
| Framework | Workload types | Default priority |
| --- | --- | --- |
| NVIDIA Run:ai native workloads | Workspaces, Standard training, Distributed training, Custom inference, Hugging Face inference, NVIDIA NIM inference | Workspaces = high; Training = low; Inference = very-high |
| NVIDIA | NIM services, NVIDIA Cloud Functions (NVCF) | very-high |
| Kubernetes | Deployment, StatefulSet, ReplicaSet, Pod, Service, CronJob, Job, JobSet | Most = very-high; JobSet = low; Job = high |
| Kubeflow | TFJob, PyTorchJob, MPIJob, XGBoostJob, Notebook, ScheduledWorkflow | Most = low; Notebook = high; ScheduledWorkflow = very-high |
| Ray | RayService, RayCluster, RayJob | RayService = very-high; RayCluster, RayJob = low |
| Tekton | PipelineRun, TaskRun | PipelineRun = very-high; TaskRun = high |
| Additional frameworks | SeldonDeployment, AMLJob, DevWorkspace, VirtualMachineInstance, KServe, Milvus, Workflow | Most = very-high; DevWorkspace = high; AMLJob, VirtualMachineInstance = low |
Setting Priority During Workload Submission
Set the priority when submitting NVIDIA Run:ai workloads via the UI, CLI, or API:
- UI - Set workload priority under General settings (flexible submission only)
- API - Set using the `PriorityClass` field
- CLI - Set using the `--priority` flag
Set the workload's priority by adding the following label under the `metadata.labels` section of your workload definition, using one of the following values: `very-low`, `low`, `medium-low`, `medium`, `medium-high`, `high`, `very-high`:

```yaml
metadata:
  labels:
    priorityClassName: <priority>
```
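For context, here is a sketch of where the label sits in a full manifest. This is an illustrative example only: the Pod name, image, and chosen priority are placeholders, not values from this documentation.

```yaml
# Illustrative Pod manifest; name and image are placeholders.
# The priorityClassName label value must be one of the strings
# from the priority dictionary above.
apiVersion: v1
kind: Pod
metadata:
  name: train-job                        # placeholder
  labels:
    priorityClassName: medium-high
spec:
  containers:
    - name: trainer                      # placeholder
      image: my-registry/trainer:latest  # placeholder
```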
Updating the Default Priority Mapping
Administrators can change the default priority assigned to a workload type by updating the priority mapping using the NVIDIA Run:ai API. To update the priority mapping:
1. Retrieve the list of workload types and their IDs using `GET /api/v1/workload-types`.
2. Identify the `workloadTypeId` of the workload type you want to modify.
3. Retrieve the list of available priorities and their IDs using `GET /api/v1/workload-priorities`.
4. Send a request to update the workload type with the new priority using `PUT /api/v1/workload-types/{workloadTypeId}` and include the `priorityId` in the request body.
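The steps above can be sketched in code. The endpoint paths come from the steps in this section, but the base URL, token handling, and request-body field names beyond `priorityId` are assumptions — consult the API reference for the exact schema:

```python
"""Sketch of updating a workload type's default priority via the
NVIDIA Run:ai REST API. BASE_URL and TOKEN are placeholders."""
import json
import urllib.request

BASE_URL = "https://my-cluster.run.ai"  # hypothetical control-plane URL
TOKEN = "<api-token>"                   # obtain via your normal auth flow


def build_request(method, path, body=None):
    """Construct an authenticated urllib Request for a Run:ai API call."""
    return urllib.request.Request(
        BASE_URL + path,
        data=None if body is None else json.dumps(body).encode(),
        method=method,
        headers={
            "Authorization": f"Bearer {TOKEN}",
            "Content-Type": "application/json",
        },
    )


# Steps 1-2: list workload types, then pick the workloadTypeId to change.
list_types = build_request("GET", "/api/v1/workload-types")

# Step 3: list the available priorities and their IDs.
list_priorities = build_request("GET", "/api/v1/workload-priorities")

# Step 4: assign the new priorityId to the chosen workload type.
workload_type_id = "<workloadTypeId>"  # placeholder from step 2
priority_id = "<priorityId>"           # placeholder from step 3
update = build_request(
    "PUT",
    f"/api/v1/workload-types/{workload_type_id}",
    body={"priorityId": priority_id},
)
# Each request would then be sent with urllib.request.urlopen(...).
```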
Using API
Go to the Workload priorities API reference to view the available actions.