Workload Priority Control

The workload priority management feature allows you to change the priority of a workload within a project. The priority determines the workload's position in the project scheduling queue managed by the NVIDIA Run:ai Scheduler. Raising a workload's priority increases the likelihood that it is scheduled ahead of other workloads in the same project, so that critical tasks are served first.

You can change the priority of a workload by selecting one of the predefined values from the NVIDIA Run:ai priority dictionary. This can be done using the NVIDIA Run:ai UI, API or CLI, depending on the workload type.

Note

Workload priority applies only within a single project. It does not affect the scheduling queues or workloads of other projects.

Priority Dictionary

Workload priority is defined by selecting a string name from a predefined list in the NVIDIA Run:ai priority dictionary. Each string corresponds to a specific Kubernetes PriorityClass, which in turn determines scheduling behavior, such as whether the workload is preemptible or allowed to run over quota.

Note

The numeric priority levels (1 = highest, 4 = lowest) are descriptive only and are not part of the NVIDIA Run:ai priority dictionary.

| Priority Level | Name (string) | Preemption | Over Quota |
| --- | --- | --- | --- |
| 1 | inference | Non-preemptible | Not available |
| 2 | build | Non-preemptible | Not available |
| 3 | interactive-preemptible | Preemptible | Available |
| 4 | train | Preemptible | Available |
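The priority dictionary above can be read as a simple lookup from name to scheduling properties. The following is a minimal sketch of that lookup, not part of the NVIDIA Run:ai CLI. The function name `priority_props` and its output strings are hypothetical and chosen only for illustration.

```shell
# Hypothetical helper: prints the preemption and over-quota properties
# that the priority dictionary associates with a priority name.
priority_props() {
    case "$1" in
        inference|build)
            echo "non-preemptible over-quota-not-available" ;;
        interactive-preemptible|train)
            echo "preemptible over-quota-available" ;;
        *)
            # Any other string is not in the priority dictionary.
            echo "unknown priority: $1" >&2
            return 1 ;;
    esac
}

priority_props build
priority_props train
```

A lookup like this can be useful in submission scripts that should reject a misspelled priority name before calling the CLI.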

Preemptible vs Non-Preemptible Workloads

Non-preemptible workloads must run within the project’s deserved quota, cannot use over-quota resources, and will not be interrupted once scheduled.
Preemptible workloads can use opportunistic compute resources beyond the project’s quota but may be interrupted at any time.
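The difference between the two classes can be illustrated with a deliberately simplified placement check. This sketch is an assumption-laden model of the rule stated above, not the NVIDIA Run:ai Scheduler's actual logic. The function `can_be_placed` and its `yes`/`no` arguments are hypothetical.

```shell
# Hypothetical model of the rule above.
#   $1 = priority name
#   $2 = "yes" if the project has free in-quota resources
#   $3 = "yes" if over-quota (opportunistic) resources are free
can_be_placed() {
    priority=$1; in_quota_free=$2; over_quota_free=$3
    case "$priority" in
        inference|build)
            # Non-preemptible: must fit within the project's deserved quota.
            [ "$in_quota_free" = "yes" ] ;;
        interactive-preemptible|train)
            # Preemptible: may also run on over-quota resources.
            [ "$in_quota_free" = "yes" ] || [ "$over_quota_free" = "yes" ] ;;
        *)
            echo "unknown priority: $priority" >&2
            return 2 ;;
    esac
}

# Project quota is exhausted, but over-quota capacity is free.
for p in train build; do
    if can_be_placed "$p" no yes; then
        echo "$p: can be placed"
    else
        echo "$p: must wait for in-quota resources"
    fi
done
```

In this model, a train workload is placed on over-quota resources while a build workload waits, which is the trade-off to weigh when choosing between a preemptible and a non-preemptible priority.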

Default Priority per Workload

Both NVIDIA Run:ai and third-party workloads are assigned a default priority. The table below shows the default priority per workload type:

| Workload Type | Default Priority |
| --- | --- |
| Workspaces | build |
| Training | train |
| Inference | inference |
| Third-party workloads | train |
| NVIDIA Cloud Functions (NVCF) | inference |
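As a quick reference, the defaults above can be expressed as a lookup. This is a minimal sketch. The function name `default_priority` and the short workload-type keys (`workspace`, `training`, `inference`, `third-party`, `nvcf`) are hypothetical labels for the rows of the table, not identifiers used by NVIDIA Run:ai.

```shell
# Hypothetical helper: maps a workload type label to its default priority name.
default_priority() {
    case "$1" in
        workspace)   echo "build" ;;
        training)    echo "train" ;;
        inference)   echo "inference" ;;
        third-party) echo "train" ;;
        nvcf)        echo "inference" ;;
        *)
            echo "unknown workload type: $1" >&2
            return 1 ;;
    esac
}

default_priority workspace
default_priority third-party
```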

Supported Priority Overrides per Workload

Note

Changing a workload’s priority may affect its ability to be scheduled. For example, switching a workload from train priority (which allows over-quota usage) to build priority (which requires in-quota resources) may reduce its chances of being scheduled when the required quota is unavailable.

The table below shows the default priority listed in the previous section and the supported override options per workload:

How to Override Priority

You can override the default priority when submitting a workload through the UI, API, or CLI depending on the workload type.

Workspaces

To use the override options:

UI: Enable "Allow the workload to exceed the project quota" when submitting a workspace
API: Set PriorityClass in the Workspaces API
CLI: Submit a workspace using the --priority flag
```
runai workspace submit --priority priority-class
```

Training Workloads

To use the override options:

API: Set PriorityClass in the Trainings API
CLI: Submit training using the --priority flag
```
runai training submit --priority priority-class
```
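In the CLI commands for workspaces and training, `priority-class` stands for one of the names in the priority dictionary. The sketch below checks the name against the dictionary and prints the resulting command instead of running it. `build_submit_cmd` is a hypothetical helper and does not check which overrides are supported for each workload type. Any other arguments a real submission needs are left out, mirroring the commands above.

```shell
# Hypothetical helper: validates the priority name and prints (does not run)
# the submit command with the --priority flag filled in.
#   $1 = workload type ("workspace" or "training")
#   $2 = priority name from the priority dictionary
build_submit_cmd() {
    workload_type=$1; priority=$2
    case "$workload_type" in
        workspace|training) ;;
        *)
            echo "unsupported workload type: $workload_type" >&2
            return 1 ;;
    esac
    case "$priority" in
        inference|build|interactive-preemptible|train) ;;
        *)
            echo "unknown priority: $priority" >&2
            return 1 ;;
    esac
    echo "runai $workload_type submit --priority $priority"
}

build_submit_cmd workspace interactive-preemptible
build_submit_cmd training train
```

Printing the command first makes it easy to review the chosen priority before submitting the workload.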

