How to Deploy Dynamo Inference Pipelines
Learn how to deploy Dynamo, a disaggregated inference framework, using NVIDIA Run:ai.
Note
This video was recorded using NVIDIA Run:ai version 2.24.18. The user interface, features, and workflows may differ in newer releases. For the latest information, refer to the current documentation.
Deploy Dynamo inference workloads with NVIDIA Run:ai
Use YAML-based deployment workflows
Automatically recognize and manage Dynamo workloads
Place distributed inference services efficiently across the cluster
Use hierarchical gang scheduling to support distributed inference
Manage scalable inference services on the NVIDIA Run:ai platform