# runai inference

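Manage inference workloads. The subcommands listed under SEE ALSO submit, list, describe, update, and delete inference workloads, and give access to their logs, shells, and ports.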

## Options

```
  -h, --help   help for inference
```

## Options inherited from parent commands

```
      --config-file string   config file name; can be set by environment variable RUNAI_CLI_CONFIG_FILE (default "config.json")
      --config-path string   config path; can be set by environment variable RUNAI_CLI_CONFIG_PATH
  -d, --debug                enable debug mode
  -q, --quiet                enable quiet mode, suppress all output except error messages
      --verbose              enable verbose mode
```
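
The inherited options above can be supplied as flags or, for the config location, through the environment variables named in their descriptions. The following is a minimal sketch, not an authoritative recipe: the directory `/opt/runai-example/config` is a hypothetical path chosen for illustration, and the `command -v` guard is only there so the snippet does nothing on a machine where the `runai` CLI is not installed.

```shell
# Hypothetical config directory (an assumption, not a documented default).
export RUNAI_CLI_CONFIG_PATH="/opt/runai-example/config"
# File name inside that directory; "config.json" is the documented default.
export RUNAI_CLI_CONFIG_FILE="config.json"

# Run the CLI only if it is available on PATH.
if command -v runai >/dev/null 2>&1; then
  # Config location picked up from the environment variables set above.
  runai inference --help

  # Same location passed explicitly with the inherited flags.
  runai inference \
    --config-path "$RUNAI_CLI_CONFIG_PATH" \
    --config-file "$RUNAI_CLI_CONFIG_FILE" \
    --help
fi
```

Because these options are inherited from the parent command, the same flags should also be accepted by the `runai inference` subcommands listed under SEE ALSO.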

## SEE ALSO

* [runai](/saas/reference/cli/runai.md) - Run:ai Command-line Interface
* [runai inference bash](/saas/reference/cli/runai/runai-inference-bash.md) - open a bash shell in an inference workload
* [runai inference delete](/saas/reference/cli/runai/runai_inference_delete.md) - delete an inference workload
* [runai inference describe](/saas/reference/cli/runai/runai_inference_describe.md) - describe an inference workload
* [runai inference distributed](/saas/reference/cli/runai/runai-inference-distributed.md) - Runs multiple coordinated inference processes across multiple nodes. Required for models too large to run on a single node.
* [runai inference exec](/saas/reference/cli/runai/runai-inference-exec.md) - execute a command in an inference workload
* [runai inference list](/saas/reference/cli/runai/runai_inference_list.md) - list inference workloads
* [runai inference logs](/saas/reference/cli/runai/runai-inference-logs.md) - view logs of an inference workload
* runai inference nim - \[Experimental] Runs NVIDIA NIM (NVIDIA Inference Microservices) workloads. Optimized for deploying foundation models.
* [runai inference port-forward](/saas/reference/cli/runai/runai-inference-port-forward.md) - forward one or more local ports to an inference workload
* [runai inference standard](/saas/reference/cli/runai/runai-inference-standard.md) - Runs a single inference process on one node. Suitable for smaller models or simpler inference tasks.
* [runai inference submit](/saas/reference/cli/runai/runai_inference_submit.md) - submit an inference workload
* [runai inference update](/saas/reference/cli/runai/runai_inference_update.md) - update an inference workload

