# Upgrade

## Before Upgrade

Before proceeding with the upgrade, it's crucial to apply the specific prerequisites associated with your current version of NVIDIA Run:ai and every version in between up to the version you are upgrading to.

To ensure a smooth and supported upgrade process:

* **Align control plane and cluster versions** - For best results, upgrade the control plane and cluster components to the same NVIDIA Run:ai version during the same maintenance window. Keeping versions aligned helps avoid unexpected behavior caused by version mismatches and ensures full compatibility across platform components.
* **Upgrade order** - When performing an upgrade:
  * Upgrade the control plane Helm chart first
  * Upgrade the cluster Helm chart only after the control plane upgrade completes successfully

## Helm

NVIDIA Run:ai requires [Helm](https://helm.sh/) 3.14 or later. Before you continue, validate your installed helm client version. To install or upgrade Helm, see [Installing Helm](https://helm.sh/docs/intro/install/). If you are installing an air-gapped version of NVIDIA Run:ai, the NVIDIA Run:ai tar file contains the helm binary.

{% hint style="info" %}
**Note**

Helm 4 defaults to [server-side apply](https://helm.sh/docs/overview/#server-side-apply) when installing a new chart release, which can conflict with resources managed by the NVIDIA Run:ai operator. Append `--server-side=false` to your `helm upgrade` command. NVIDIA Run:ai clusters originally installed with Helm 3.x are unaffected.
{% endhint %}

## Software Files

{% tabs %}
{% tab title="Connected" %}
Run the following commands to add the NVIDIA Run:ai Helm repository and browse the available versions:

```bash
helm repo add runai-backend https://runai.jfrog.io/artifactory/cp-charts-prod
helm repo update
helm search repo -l runai-backend
```

{% endtab %}

{% tab title="Air-gapped" %}
Run the following command to browse all available air-gapped packages using the token provided by NVIDIA Run:ai.

To download and extract a specific version, and to upload the container images to your private registry, see the [Preparations](/self-hosted/2.23/getting-started/installation/install-using-helm/preparations.md) section.

```bash
curl -H "Authorization: Bearer <token>" "https://runai.jfrog.io/artifactory/api/storage/runai-airgapped-prod/?list"
```

{% endtab %}
{% endtabs %}

## Upgrade Control Plane

### System and Network Requirements

Before upgrading the NVIDIA Run:ai control plane, validate that the latest [system requirements](/self-hosted/2.23/getting-started/installation/install-using-helm/cp-system-requirements.md) and [network requirements](/self-hosted/2.23/getting-started/installation/install-using-helm/network-requirements.md) are met, as they can change from time to time.

### Upgrade

{% hint style="info" %}
**Note**

To upgrade to a specific version, modify the `--version` flag by specifying the desired `<VERSION>`. You can find all available versions by using the `helm search repo runai-backend/control-plane --versions` command.
{% endhint %}

**Upgrading from Version 2.16**

You must perform a two-step upgrade:

1. Upgrade to version 2.18:

{% tabs %}
{% tab title="Connected" %}

```bash
helm get values runai-backend -n runai-backend > runai_control_plane_values.yaml
helm upgrade runai-backend -n runai-backend runai-backend/control-plane --version "2.18.0" -f runai_control_plane_values.yaml --reset-then-reuse-values
```

{% endtab %}

{% tab title="Air-gapped" %}

```bash
helm get values runai-backend -n runai-backend > runai_control_plane_values.yaml
helm upgrade runai-backend control-plane-2.18.0.tgz -n runai-backend -f runai_control_plane_values.yaml --reset-then-reuse-values
```

{% endtab %}
{% endtabs %}

2. Then upgrade to the required version:

{% tabs %}
{% tab title="Connected" %}

```bash
helm get values runai-backend -n runai-backend > runai_control_plane_values.yaml
helm upgrade runai-backend -n runai-backend runai-backend/control-plane --version "<VERSION>" -f runai_control_plane_values.yaml --reset-then-reuse-values
```

{% endtab %}

{% tab title="Air-gapped" %}

```bash
helm get values runai-backend -n runai-backend > runai_control_plane_values.yaml
helm upgrade runai-backend control-plane-<NEW-VERSION>.tgz -n runai-backend -f runai_control_plane_values.yaml --reset-then-reuse-values
```

{% endtab %}
{% endtabs %}

**Upgrading from Version 2.17 or Later**

If your current version is 2.17 or higher, you can upgrade directly to the required version:

{% tabs %}
{% tab title="Connected" %}

```bash
helm get values runai-backend -n runai-backend > runai_control_plane_values.yaml
helm upgrade runai-backend -n runai-backend runai-backend/control-plane --version "<VERSION>" -f runai_control_plane_values.yaml --reset-then-reuse-values
```

{% endtab %}

{% tab title="Air-gapped" %}

```bash
helm get values runai-backend -n runai-backend > runai_control_plane_values.yaml
helm upgrade runai-backend control-plane-<NEW-VERSION>.tgz -n runai-backend -f runai_control_plane_values.yaml --reset-then-reuse-values
```

{% endtab %}
{% endtabs %}

## Upgrade Cluster

### System and Network Requirements

Before upgrading the NVIDIA Run:ai cluster, validate that the latest [system requirements](/self-hosted/2.23/getting-started/installation/install-using-helm/system-requirements.md) and [network requirements](/self-hosted/2.23/getting-started/installation/install-using-helm/network-requirements.md) are met, as they can change from time to time.

{% hint style="info" %}
**Note**

It is highly recommended to upgrade the Kubernetes version together with the NVIDIA Run:ai cluster version, to ensure compatibility with latest supported version of your [Kubernetes distribution](/self-hosted/2.23/getting-started/installation/install-using-helm/system-requirements.md#kubernetes-distribution).
{% endhint %}

### Getting Installation Instructions

Follow the setup and installation instructions below to get the installation instructions to upgrade the NVIDIA Run:ai cluster.

{% hint style="info" %}
**Note**

To upgrade to a specific version, modify the `--version` flag by specifying the desired `<VERSION>`. You can find all available versions by using the `helm search repo runai/runai-cluster --versions` command.
{% endhint %}

#### Setup

1. In the NVIDIA Run:ai UI, go to **Clusters**
2. Select the cluster you want to upgrade
3. Click **INSTALLATION INSTRUCTIONS**
4. Optional: Select the NVIDIA Run:ai cluster version (latest, by default)
5. Click **CONTINUE**

#### Installation Instructions

1. Follow the installation instructions. Run the Helm commands provided on your Kubernetes cluster. See the below if [installation fails](#installation-fails).
2. Click **DONE**
3. Once installation is complete, validate the cluster is **Connected** and listed with the new cluster version (see the [cluster troubleshooting scenarios](/self-hosted/2.23/infrastructure-setup/procedures/clusters.md#troubleshooting-scenarios)). Once you have done this, the cluster is upgraded to the latest version.

### Troubleshooting

If you encounter an issue with the cluster upgrade, use the troubleshooting scenarios below.

#### Installation Fails

If the NVIDIA Run:ai cluster installation failed, check the installation logs to identify the issue. Run the following script to print the installation logs:

{% file src="/files/2BCXbwVvSn6F1NwIjGZ5" %}

#### Cluster Status

If the NVIDIA Run:ai cluster upgrade completes, but the cluster status does not show as **Connected**, refer to [Troubleshooting scenarios](/self-hosted/2.23/infrastructure-setup/procedures/clusters.md#troubleshooting-scenarios).


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://run-ai-docs.nvidia.com/self-hosted/2.23/getting-started/installation/install-using-helm/upgrade.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
