runai inference nim update

Update a NIM inference workload.

Synopsis

IMPORTANT: Update operations OVERWRITE entire field sections; they do not merge with existing values. For example, if a workload has 10 environment variables and you update it with -e NEW_VAR=value, the workload will have only that 1 environment variable after the update.

runai inference nim update [WORKLOAD_NAME] [flags]

Examples

# Update image and GPU count
runai inference nim update <workload-name> -i nvcr.io/nim/meta/llama3-8b-instruct:v2 -g 2

# Update compute resources (replaces entire compute section)
runai inference nim update <workload-name> --cpu-core-request 4 --cpu-memory-request 16Gi -g 2

# Update environment variables (replaces all existing env vars)
runai inference nim update <workload-name> -e LOG_LEVEL=debug

# Switch from autoscaling to fixed replicas (explicit removal required)
runai inference nim update <workload-name> --remove autoscaling --replicas 5

# Update autoscaling configuration
runai inference nim update <workload-name> --min-replicas 1 --max-replicas 5 --metric concurrency --metric-threshold 10
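
Because an update replaces the whole environment-variable section, one way to keep existing variables is to restate the complete desired set on every update. The sketch below is a hypothetical helper, not part of the runai CLI: it only prints the command it would run so you can review it first. It assumes that -e may be repeated once per variable, which this page does not confirm, and the workload name and EXISTING_VAR are placeholders.

```shell
# Hypothetical helper (not part of the runai CLI): build an update command
# that restates every environment variable the workload should keep.
# Assumption: -e can be passed once per variable.
build_update_cmd() {
  workload=$1
  shift
  cmd="runai inference nim update $workload"
  # Append one -e flag per KEY=VALUE argument.
  # Values containing spaces would need extra quoting before execution.
  for kv in "$@"; do
    cmd="$cmd -e $kv"
  done
  printf '%s\n' "$cmd"
}

# Print the command for review instead of executing it.
build_update_cmd my-nim EXISTING_VAR=keep LOG_LEVEL=debug
```

Printing rather than executing makes it easy to confirm that no variable was left out before the overwrite takes effect.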

Options

Options inherited from parent commands

SEE ALSO

  • runai inference nim - [Experimental] Runs NVIDIA NIM (NVIDIA Inference Microservices) workloads. Optimized for deploying foundation models.
