# Data Volumes

Data volumes offer a powerful solution for storing, managing, and sharing AI training data within the NVIDIA Run:ai platform. They promote collaboration, simplify data access control, and streamline the AI development lifecycle.

Acting as a central repository for organizational data resources, data volumes can represent datasets or raw data stored in Kubernetes Persistent Volume Claims (PVCs).

## Why Use a Data Volume?

1. **Sharing with multiple scopes**\
   Unlike other NVIDIA Run:ai data sources, data volumes can be shared across projects, departments, or clusters, encouraging data reuse and collaboration within the organization.
2. **Storage savings**\
   A single copy of the data can be used across multiple [scopes](/self-hosted/2.20/workloads-in-nvidia-run-ai/assets/overview.md#asset-scope), so there is no need to duplicate it for each one.

## Typical Use Cases

1. **Sharing large datasets**\
   In large organizations, data is often stored in a remote location, which can be a barrier to large model training. Even after the data is transferred into the cluster, sharing it with multiple users remains challenging. Data volumes let you share that data across scopes while keeping access under control.
2. **Sharing data with colleagues**\
   When you need to share training results, generated datasets, or other artifacts with team members, data volumes make that data easily available.

![data-volumes-architecture](/files/3urObgSF7zCNACqcDqXA)

## Prerequisites

To create a data volume, you need a [project](/self-hosted/2.20/platform-management/aiinitiatives/organization/projects.md) that has a PVC in its namespace.

Data volumes are currently managed through the API. To view the available actions, go to the [Data volumes](https://run-ai-docs.nvidia.com/api/2.20/datavolumes/datavolumes) API reference.
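Because data volumes are handled through the API, a client typically starts by preparing an authenticated JSON request. The following is a minimal sketch using only the Python standard library. It is not taken from the NVIDIA Run:ai documentation: the helper name `build_api_request`, the host `runai.example.com`, the placeholder path, and the use of a bearer token are all assumptions for illustration. Take the real endpoint paths, request bodies, and authentication details from the API reference.

```python
import json
import urllib.request


def build_api_request(base_url, path, token, payload=None, method="GET"):
    """Build (but do not send) a JSON request carrying a bearer token.

    Hypothetical helper: `path` must come from the Data volumes API
    reference; nothing here encodes a real endpoint.
    """
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    data = None
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(url, data=data, method=method)
    # Assumption: the API accepts a bearer token in the Authorization header.
    request.add_header("Authorization", f"Bearer {token}")
    request.add_header("Accept", "application/json")
    if data is not None:
        request.add_header("Content-Type", "application/json")
    return request


# Placeholder values only; replace them with your control plane URL,
# an endpoint path from the API reference, and a valid token.
example_request = build_api_request(
    base_url="https://runai.example.com",
    path="/path/from/api/reference",
    token="<token>",
)
```

To send the request, pass the returned object to `urllib.request.urlopen` and parse the response body as JSON.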

## Adding a New Data Volume

Data volume creation is limited to [specific roles](/self-hosted/2.20/workloads-in-nvidia-run-ai/assets/overview.md#who-can-create-an-asset).

## Adding Scopes for a Data Volume

Data volume sharing (adding scopes) is limited to [specific roles](/self-hosted/2.20/workloads-in-nvidia-run-ai/assets/overview.md#who-can-create-an-asset).

Once created, the data volume is available to its originating project (see the prerequisites above).

Data volumes can be shared with additional scopes in the organization.
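The availability rule described here can be illustrated with a small conceptual model. This is only a sketch, not how NVIDIA Run:ai implements sharing: the class names `Scope`, `Project`, and `DataVolume` are hypothetical, and the model assumes that each project belongs to one department and one cluster and that project names are unique.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Scope:
    kind: str  # "project", "department", or "cluster"
    name: str


@dataclass(frozen=True)
class Project:
    name: str
    department: str
    cluster: str

    def enclosing_scopes(self):
        """Return every scope that contains this project (assumed hierarchy)."""
        return {
            Scope("project", self.name),
            Scope("department", self.department),
            Scope("cluster", self.cluster),
        }


@dataclass
class DataVolume:
    name: str
    origin_project: str
    shared_scopes: set = field(default_factory=set)

    def share_with(self, scope):
        """Add a scope to the set of scopes the volume is shared with."""
        self.shared_scopes.add(scope)

    def is_available_to(self, project):
        """Available to the originating project or to any shared scope."""
        if project.name == self.origin_project:
            return True
        return bool(self.shared_scopes & project.enclosing_scopes())


# Hypothetical example data.
team_a = Project("team-a", department="research", cluster="cluster-1")
team_b = Project("team-b", department="research", cluster="cluster-1")

volume = DataVolume("training-set", origin_project="team-a")
available_before = volume.is_available_to(team_b)  # not shared yet
volume.share_with(Scope("department", "research"))
available_after = volume.is_available_to(team_b)   # shared via department
```

In this model, sharing with a department or cluster scope makes the volume available to every project under that scope without copying the underlying data.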

## Who Can Use a Data Volume?

Data volumes are used when [submitting workloads](/self-hosted/2.20/workloads-in-nvidia-run-ai/workloads.md#adding-new-workload). Any user, application or SSO group with a [role](/self-hosted/2.20/infrastructure-setup/authentication/roles.md) that has permissions to create workloads can also use data volumes.

Researchers can list available data volumes within their permitted scopes for easy selection.


