How to Deploy Dynamo Inference Pipelines

Dynamo Inference

Learn how to deploy Dynamo, a disaggregated inference framework, using NVIDIA Run:ai.

Note

This video was recorded using NVIDIA Run:ai version 2.24.18. The user interface, features, and workflows may differ in newer releases. For the latest information, refer to the current documentation.

What You'll Learn:

  • Deploy Dynamo inference workloads with NVIDIA Run:ai

  • Use YAML-based deployment workflows

  • See how NVIDIA Run:ai automatically recognizes and manages Dynamo workloads

  • Place distributed inference services efficiently across the cluster

  • Use hierarchical gang scheduling to support distributed inference

  • Manage scalable inference services on the Run:ai platform
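To make the gang scheduling idea from the list above concrete, here is a minimal Python sketch of all-or-nothing placement across two levels: each group of pods (for example, prefill and decode workers) must fit entirely, and the workload as a whole is placed only if every group fits. This is a toy model built on assumptions. The `place_gang` function, the `Group` tuple, the node names, and the first-fit strategy are all hypothetical and do not represent the NVIDIA Run:ai scheduler or any Dynamo API.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy model: names and structure are illustrative only,
# not the NVIDIA Run:ai scheduler or any Dynamo API.
Group = Tuple[str, int, int]  # (group name, pod count, GPUs per pod)


def place_gang(
    groups: List[Group], free_gpus: Dict[str, int]
) -> Optional[Dict[str, List[str]]]:
    """Place every pod of every group, or place nothing at all."""
    remaining = dict(free_gpus)  # work on a copy so a failure changes nothing
    placement: Dict[str, List[str]] = {}
    for name, pod_count, gpus_per_pod in groups:
        nodes: List[str] = []
        for _ in range(pod_count):
            # First-fit: pick the first node with enough free GPUs.
            node = next(
                (n for n, free in remaining.items() if free >= gpus_per_pod),
                None,
            )
            if node is None:
                return None  # one pod cannot fit, so the whole gang is rejected
            remaining[node] -= gpus_per_pod
            nodes.append(node)
        placement[name] = nodes
    # Commit the allocation only after every group has been placed.
    free_gpus.clear()
    free_gpus.update(remaining)
    return placement


if __name__ == "__main__":
    cluster = {"node-a": 4, "node-b": 2}
    workload = [("prefill", 2, 2), ("decode", 1, 2)]
    print(place_gang(workload, cluster))
    print(cluster)
```

The point of the sketch is the commit step: capacity is deducted from the cluster only when the entire workload fits, so a distributed inference service never ends up partially scheduled and holding GPUs it cannot use. First-fit is a simplification and can reject a workload that a smarter placement strategy would fit.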
