Workload Priority Control

Workload priority control lets you change the priority of a workload within a project. The priority determines the workload's position in the project's scheduling queue, which is managed by the NVIDIA Run:ai Scheduler. Raising the priority makes a workload more likely to be scheduled ahead of other workloads in the same project, so that critical tasks receive resources first. The priority also determines whether the workload can consume over-quota resources and whether it is subject to preemption by higher-priority workloads.

You can change the priority of a workload by selecting one of the predefined values from the NVIDIA Run:ai priority dictionary.

Note

Workload priority applies only within a single project. It does not affect the scheduling queues or workloads of other projects.

Priority Dictionary

You set a workload's priority by selecting a name from the predefined list in the NVIDIA Run:ai priority dictionary. Each priority name corresponds to a specific Kubernetes PriorityClass, which in turn determines scheduling behavior, such as whether the workload is preemptible and whether it is allowed to run over quota.

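| Priority    | Kubernetes Value | Preemption      | Over Quota    |
|-------------|------------------|-----------------|---------------|
| very-low    | 25               | Preemptible     | Available     |
| low         | 40               | Preemptible     | Available     |
| medium-low  | 65               | Preemptible     | Available     |
| medium      | 80               | Preemptible     | Available     |
| medium-high | 90               | Preemptible     | Available     |
| high        | 125              | Non-preemptible | Not available |
| very-high   | 150              | Non-preemptible | Not available |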

Preemptible vs Non-Preemptible Workloads

  • Non-preemptible workloads must run within the project’s deserved quota, cannot use over-quota resources, and will not be interrupted once scheduled.

  • Preemptible workloads can use opportunistic compute resources beyond the project’s quota but may be interrupted at any time.
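As a quick reference, the priority dictionary can be transcribed into a small lookup structure. The sketch below is illustrative only: the names `PriorityInfo`, `PRIORITY_DICTIONARY`, and `describe` are hypothetical and not part of any NVIDIA Run:ai library. The values are copied from the table above.

```python
from typing import NamedTuple


class PriorityInfo(NamedTuple):
    value: int         # Kubernetes PriorityClass value from the table
    preemptible: bool  # True if the workload may be interrupted
    over_quota: bool   # True if the workload may use over-quota resources


# Transcribed from the priority dictionary table (hypothetical structure).
PRIORITY_DICTIONARY = {
    "very-low": PriorityInfo(25, True, True),
    "low": PriorityInfo(40, True, True),
    "medium-low": PriorityInfo(65, True, True),
    "medium": PriorityInfo(80, True, True),
    "medium-high": PriorityInfo(90, True, True),
    "high": PriorityInfo(125, False, False),
    "very-high": PriorityInfo(150, False, False),
}


def describe(priority: str) -> PriorityInfo:
    """Return the table row for a priority name, or raise ValueError."""
    try:
        return PRIORITY_DICTIONARY[priority]
    except KeyError:
        raise ValueError(f"unknown priority: {priority!r}") from None
```

For example, `describe("high").preemptible` is `False`, which matches the rule that high-priority workloads are not interrupted once scheduled but must fit within the project's quota.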

Default Priority per Workload

NVIDIA Run:ai defines the following default mappings of workload types to priorities. To retrieve the default priority per workload type, refer to the List workload types API.

| Category | Workload Types | Default Priority |
|----------|----------------|------------------|
| NVIDIA Run:ai native workloads | Workspaces, Standard training, Distributed training, Custom inference, Hugging Face inference, NVIDIA NIM inference | Workspaces = high; Training = low; Inference = very-high |
| NVIDIA | NIM services, NVIDIA Cloud Functions (NVCF) | very-high |
| Kubernetes | Deployment, StatefulSet, ReplicaSet, Pod, Service, CronJob, Job, JobSet | Most = very-high; JobSet = low; Job = high |
| Kubeflow | TFJob, PyTorchJob, MPIJob, XGBoostJob, Notebook, ScheduledWorkflow | Most = low; Notebook = high; ScheduledWorkflow = very-high |
| Ray | RayService, RayCluster, RayJob | RayService = very-high; RayCluster, RayJob = low |
| Tekton | PipelineRun, TaskRun | PipelineRun = very-high; TaskRun = high |
| Additional Frameworks | SeldonDeployment, AMLJob, DevWorkspace, VirtualMachineInstance, KServe, Milvus, Workflow | Most = very-high; DevWorkspace = high; AMLJob, VirtualMachineInstance = low |
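The "Most = ..." entries read as a default with per-type exceptions. A minimal sketch of that reading for a few of the rows follows; `DEFAULTS` and `default_priority` are hypothetical names, the data is transcribed from the table, and the List workload types API remains the authoritative source for a given cluster.

```python
# Each entry: (listed workload types, priorities), where "*" stands for "Most".
# Transcribed from the default-priority table; hypothetical structure.
DEFAULTS = {
    "Kubernetes": (
        ["Deployment", "StatefulSet", "ReplicaSet", "Pod",
         "Service", "CronJob", "Job", "JobSet"],
        {"*": "very-high", "JobSet": "low", "Job": "high"},
    ),
    "Kubeflow": (
        ["TFJob", "PyTorchJob", "MPIJob", "XGBoostJob",
         "Notebook", "ScheduledWorkflow"],
        {"*": "low", "Notebook": "high", "ScheduledWorkflow": "very-high"},
    ),
    "Ray": (
        ["RayService", "RayCluster", "RayJob"],
        {"RayService": "very-high", "RayCluster": "low", "RayJob": "low"},
    ),
    "Tekton": (
        ["PipelineRun", "TaskRun"],
        {"PipelineRun": "very-high", "TaskRun": "high"},
    ),
}


def default_priority(category: str, workload_type: str) -> str:
    """Look up the documented default priority for a listed workload type."""
    types, priorities = DEFAULTS[category]
    if workload_type not in types:
        raise KeyError(f"{workload_type!r} is not listed under {category!r}")
    # An explicit exception wins; otherwise fall back to the "Most" entry.
    return priorities.get(workload_type, priorities.get("*"))
```

Under this reading, `default_priority("Kubernetes", "Pod")` falls back to the "Most" value, while `default_priority("Kubernetes", "Job")` uses the explicit exception.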

Setting Priority During Workload Submission

Note

Changing a workload’s priority may impact its ability to be scheduled. For example, switching a workload from a low priority (which allows over-quota usage) to high priority (which requires in-quota resources) may reduce its chances of being scheduled in cases where the required quota is unavailable.
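To make the trade-off in this note concrete, here is a deliberately simplified model. It is not the NVIDIA Run:ai Scheduler's algorithm: it only encodes the rule that priorities up to medium-high may use over-quota resources while high and very-high must fit within the project's quota. The function name and all parameters are hypothetical.

```python
# Priorities whose workloads may use over-quota resources (from the dictionary).
OVER_QUOTA_ALLOWED = {"very-low", "low", "medium-low", "medium", "medium-high"}
# Priorities whose workloads must run within the project's quota.
IN_QUOTA_ONLY = {"high", "very-high"}


def could_fit(priority: str, requested: int,
              free_in_quota: int, free_over_quota: int) -> bool:
    """Simplified check: is there enough capacity the workload may use?

    Ignores queue order, preemption, fairness, and node placement.
    """
    if priority in OVER_QUOTA_ALLOWED:
        return requested <= free_in_quota + free_over_quota
    if priority in IN_QUOTA_ONLY:
        return requested <= free_in_quota
    raise ValueError(f"unknown priority: {priority!r}")
```

With 1 GPU free in quota and 4 idle GPUs available over quota, a 2-GPU request fits under this model at `low` priority but not at `high` priority.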

  • Set the priority when submitting NVIDIA Run:ai workloads via the UI, CLI, or API:

    • UI - Set workload priority under General settings (flexible submission only)

    • API - Set using the PriorityClass field

    • CLI - Set using the --priority flag

  • For workloads defined in YAML, set the priority by adding the following label under the metadata.labels section of the workload definition. Valid values are very-low, low, medium-low, medium, medium-high, high, and very-high:

    metadata:
      labels:
        priorityClassName: <priority>
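If manifests are generated programmatically, the same label can be applied to a manifest that has already been loaded into a dictionary. The helper below is a hypothetical sketch using only the standard library; it validates the value against the priority dictionary and returns a modified copy.

```python
import copy

VALID_PRIORITIES = (
    "very-low", "low", "medium-low", "medium",
    "medium-high", "high", "very-high",
)


def with_priority(manifest: dict, priority: str) -> dict:
    """Return a copy of the manifest with the priorityClassName label set."""
    if priority not in VALID_PRIORITIES:
        raise ValueError(f"unknown priority: {priority!r}")
    updated = copy.deepcopy(manifest)  # leave the caller's manifest untouched
    metadata = updated.setdefault("metadata", {})
    labels = metadata.get("labels") or {}  # tolerate a missing or null labels key
    labels["priorityClassName"] = priority
    metadata["labels"] = labels
    return updated
```

For example, `with_priority({"kind": "Job", "metadata": {"name": "train"}}, "medium")` returns a manifest whose `metadata.labels.priorityClassName` is `medium`.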

Updating the Default Priority Mapping

Administrators can change the default priority assigned to a workload type by updating the priority mapping using the NVIDIA Run:ai API. To update the priority mapping:

  1. Retrieve the list of workload types and their IDs using GET /api/v1/workload-types.

  2. Identify the workloadTypeId of the workload type you want to modify.

  3. Retrieve the list of available priorities and their IDs using GET /api/v1/workload-priorities.

  4. Send a request to update the workload type with the new priority using PUT /api/v1/workload-types/{workloadTypeId} and include the priorityId in the request body.
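The steps above can be sketched as request builders using Python's standard library. The endpoint paths and the priorityId body field come from the steps. The base URL, the bearer-token authentication scheme, and the function names are assumptions for illustration. Response shapes are not described here, so the sketch does not parse them.

```python
import json
import urllib.request


def _request(base_url, token, method, path, body=None):
    """Build (but do not send) an HTTP request against the API."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(
        base_url.rstrip("/") + path, data=data, method=method
    )
    # Assumption: the API accepts a bearer token in the Authorization header.
    req.add_header("Authorization", f"Bearer {token}")
    if data is not None:
        req.add_header("Content-Type", "application/json")
    return req


def list_workload_types_request(base_url, token):
    # Step 1: list workload types and their IDs.
    return _request(base_url, token, "GET", "/api/v1/workload-types")


def list_priorities_request(base_url, token):
    # Step 3: list available priorities and their IDs.
    return _request(base_url, token, "GET", "/api/v1/workload-priorities")


def update_priority_request(base_url, token, workload_type_id, priority_id):
    # Step 4: assign a new default priority to one workload type.
    return _request(
        base_url, token, "PUT",
        f"/api/v1/workload-types/{workload_type_id}",
        {"priorityId": priority_id},
    )
```

Each builder returns a `urllib.request.Request` that can be sent with `urllib.request.urlopen(req)`. Keeping construction separate from sending makes it easy to inspect the method, URL, and body before calling a live cluster.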

Using API

Go to the Workload priorities API reference to view the available actions.
