# Cluster Restore

This section explains how to restore an NVIDIA Run:ai cluster on a different Kubernetes environment.

If Kubernetes fails critically, or if you want to migrate an NVIDIA Run:ai cluster to a new Kubernetes environment, reinstall the NVIDIA Run:ai cluster. Once the cluster is reinstalled and reconnected, projects, workloads, and other cluster data are synced automatically.

Backing up and restoring NVIDIA Run:ai [advanced cluster configurations](https://run-ai-docs.nvidia.com/saas/infrastructure-setup/advanced-setup/cluster-config), which are stored locally on the Kubernetes cluster, is optional and is done separately.

## Back Up the Cluster

Because cluster data is synced automatically, backing up data is not required. The backup procedure below is optional and applies only to advanced deployments, as explained above.

### Save Cluster Configurations

To back up the NVIDIA Run:ai cluster configurations, save both the Helm values and the runtime configuration (`runaiconfig`).

1. **Back up Helm values** - Run the following command to export the Helm values used for deployment:

   ```bash
   helm get values runai-cluster -n runai > runai_cluster_values_backup.yaml
   ```
2. **Back up the runtime configuration (`runaiconfig`)** - Run the following command to export the active runtime configuration:

   ```bash
   kubectl get runaiconfig runai -n runai -o=jsonpath='{.spec}' > runaiconfig_backup.yaml
   ```
3. Save both backup files (`runai_cluster_values_backup.yaml` and `runaiconfig_backup.yaml`) externally so they can be retrieved later if needed.
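
The two exports above can be combined into one small helper so that both files always land in the same directory. The following is a minimal sketch, not an official tool: the function name `backup_runai_config` and the dated directory name are assumptions made for illustration, while the release name, namespace, and file names are the ones used in the steps above.

```bash
# Sketch: export both configuration sources into one directory.
# backup_runai_config and the default directory name are hypothetical.
backup_runai_config() (
  backup_dir="${1:-runai-backup-$(date +%Y%m%d-%H%M%S)}"
  mkdir -p "$backup_dir" || exit 1

  # Helm values used for the deployment (same export as step 1).
  helm get values runai-cluster -n runai \
    > "$backup_dir/runai_cluster_values_backup.yaml" || exit 1

  # Active runtime configuration (same export as step 2).
  kubectl get runaiconfig runai -n runai -o=jsonpath='{.spec}' \
    > "$backup_dir/runaiconfig_backup.yaml" || exit 1

  # Refuse to report success if either export came back empty.
  for f in runai_cluster_values_backup.yaml runaiconfig_backup.yaml; do
    [ -s "$backup_dir/$f" ] || { echo "empty backup: $f" >&2; exit 1; }
  done

  # Print the directory so the caller can copy it to external storage.
  echo "$backup_dir"
)
```

Calling `backup_runai_config /path/to/dir` writes both files to that directory and prints its path. Copying the directory to external storage remains a separate, manual step.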

## Restore the Cluster

Follow the steps below to restore the NVIDIA Run:ai cluster on a new Kubernetes environment.

### Prerequisites

Before restoring the NVIDIA Run:ai cluster, verify that the existing cluster is uninstalled and shows as disconnected.

1. If the Kubernetes cluster is still available, [uninstall](https://run-ai-docs.nvidia.com/saas/getting-started/installation/install-using-helm/uninstall) the NVIDIA Run:ai cluster. Make sure not to remove the cluster from the control plane.
2. Navigate to the **Clusters** grid in the NVIDIA Run:ai UI.
3. Locate the cluster and verify its status is **Disconnected**.
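
If the old Kubernetes cluster is still reachable, the uninstall can also be checked from the command line. The sketch below assumes the Helm release is named `runai-cluster` in the `runai` namespace, as in the backup commands; the helper name `runai_is_uninstalled` is hypothetical. It does not replace the **Disconnected** check in the UI, and a failing `helm status` can also mean that the cluster itself is unreachable.

```bash
# Sketch: succeed only when no runai-cluster Helm release remains.
# runai_is_uninstalled is a hypothetical helper name.
runai_is_uninstalled() (
  # `helm status` exits non-zero when the release does not exist.
  if helm status runai-cluster -n runai >/dev/null 2>&1; then
    echo "runai-cluster release is still installed" >&2
    exit 1
  fi
  echo "runai-cluster release not found"
)
```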

### Re-install the Cluster

1. Follow the NVIDIA Run:ai cluster [installation](https://run-ai-docs.nvidia.com/saas/getting-started/installation/install-using-helm/helm-install) instructions and ensure all [prerequisites](https://run-ai-docs.nvidia.com/saas/getting-started/installation/install-using-helm/system-requirements) are met.
2. If you have a backup of the cluster configurations, reload it once the installation is complete:

   ```bash
   kubectl apply -f runaiconfig_backup.yaml -n runai
   ```
3. Navigate to the **Clusters** grid in the NVIDIA Run:ai UI.
4. Locate the cluster and verify its status is **Connected**.
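
After reloading a backup, it can be useful to confirm that the live runtime configuration matches what was saved. The following sketch compares the live `runaiconfig` spec with the backup file; the helper name `runaiconfig_matches_backup` and its exit codes are assumptions made for illustration. A mismatch is only a prompt to inspect the configuration, because a fresh installation may add defaults or order fields differently.

```bash
# Sketch: report whether the live runaiconfig spec matches the backup file.
# runaiconfig_matches_backup is a hypothetical helper name.
# Exit codes: 0 = match, 1 = differs, 2 = could not compare.
runaiconfig_matches_backup() (
  backup_file="${1:-runaiconfig_backup.yaml}"
  [ -s "$backup_file" ] || { echo "missing backup: $backup_file" >&2; exit 2; }

  # Read the live spec in the same format used for the backup.
  live=$(kubectl get runaiconfig runai -n runai -o=jsonpath='{.spec}') || exit 2

  # Command substitution strips trailing newlines on both sides,
  # so a final newline in the file does not affect the comparison.
  if [ "$live" = "$(cat "$backup_file")" ]; then
    echo "runaiconfig spec matches backup"
  else
    echo "runaiconfig spec differs from backup" >&2
    exit 1
  fi
)
```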

### Restore Namespace and RoleBindings

If your cluster configuration disables automatic namespace creation for projects, you must manually:

* Re-create each project namespace
* Reapply the required role bindings for access control
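
Re-creating the namespaces can be scripted when there are many projects. The sketch below is illustrative only: the helper name `recreate_namespaces` and the input file (one namespace name per line, `#` for comments) are assumptions, and it does not cover role bindings or any labels your configuration requires, since those depend on the deployment.

```bash
# Sketch: re-create project namespaces listed one per line in a file.
# recreate_namespaces and the file format are hypothetical.
recreate_namespaces() (
  list_file="$1"
  while IFS= read -r ns; do
    # Skip blank lines and comments.
    case "$ns" in ''|'#'*) continue ;; esac

    # Render the manifest client-side, then apply it, so that
    # re-running against an existing namespace is not an error.
    # </dev/null keeps kubectl from consuming the list file.
    kubectl create namespace "$ns" --dry-run=client -o yaml </dev/null \
      | kubectl apply -f - || exit 1
  done < "$list_file"
)
```

Because the manifest is applied rather than created directly, the helper can be run again safely after a partial failure.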

For more information, see [Advanced cluster configurations](https://run-ai-docs.nvidia.com/saas/infrastructure-setup/advanced-setup/cluster-config).
