Running Workspaces

This section explains how to create a workspace via the NVIDIA Run:ai UI.

A workspace contains the setup and configuration needed for building your model, including the container, images, data sets, and resource requests, as well as the required tools for the research, all in a single place.

To learn more about the workspace workload type in NVIDIA Run:ai and determine that it is the most suitable workload type for your goals, see Workload types.

Before You Start

Make sure you have created a project or have one created for you.

Workload Priority

By default, workspaces in NVIDIA Run:ai are assigned a priority of high, which is non-preemptible. You can select a different priority when submitting a workspace. For more details on the available options, see Workload priority control.

Workload Policies

When creating a new workload, fields and assets may have limitations or defaults. These rules and defaults are derived from a policy your administrator set.

Policies allow you to control, standardize, and simplify the workload submission process. For additional information, see Policies and rules.

The effects of the policy are reflected in the workspace creation form:

Defaults derived from the policy will be displayed automatically for specific fields.
Disabled actions and permitted value ranges for values will be visibly explained per field.
Rules and defaults for entire sections (such as environments, compute resources, or data sources) may prevent selection and will appear on the entire library card with an option for additional information via an external modal.

Submission Form Options

You can create a new workspace using either the Flexible or Original submission form. The Flexible submission form offers greater customization and is the recommended method. Within the Flexible form, you have two options:

Load from an existing setup - You can select an existing setup to populate the workspace form with predefined values. While the Original submission form also allows you to select an existing setup, with the Flexible submission you can customize any of the populated fields for a one-time configuration. These changes will apply only to this workspace and will not modify the original setup. If needed, you can reset the configuration to the original setup at any time.
Provide your own settings - Manually fill in the workspace configuration fields. This is a one-time setup that applies only to the current workspace and will not be saved for future use.

Note

Flexible workload submission is disabled by default. If unavailable, your administrator must enable it under General settings → Workloads → Flexible workload submission.
The Original submission form will be deprecated in a future release.

Creating a New Workspace

To add a new workspace, go to Workload manager → Workloads.
Click +NEW WORKLOAD and select Workspace from the dropdown menu.
Within the new workspace form, select the cluster and project. To create a new project, click +NEW PROJECT and refer to Projects for a step-by-step guide.
Select a template or Start from scratch to launch a new workspace quickly. You can use a workload template to populate the workspace form with predefined configuration values. You can still modify the populated fields before submitting the workspace. Any changes you make will apply only to the current workspace and will not be saved back to the original template.
Enter a unique name for the workspace. If the name already exists in the project, you will be requested to submit a different name.
Click CONTINUE

Setting Up an Environment

Load from existing setup

Click the load icon. A side pane appears, displaying a list of available environments. Select an environment from the list.
Optionally, customize any of the environment’s predefined fields as shown below. The changes will apply to this workspace only and will not affect the selected environment.
Alternatively, click the ➕ icon in the side pane to create a new environment. For step-by-step instructions, see Environments.

Provide your own settings

Manually configure the settings below as needed. The changes will apply to this workspace only.

Configure environment

Add the Image URL or update the URL of the existing setup.
Set the image pull policy:
- Set the condition for pulling the image. It is recommended to pull the image only if it's not already present on the host.
- Set the secret for pulling the image. Provide a Kubernetes secret that contains the required Docker registry authentication credentials. The secret must already exist in the same namespace as the workload and can be selected from the Secret name dropdown. This field appears only if you previously created Docker registry credentials in the User settings.
Set the connection for your tool(s). If you are loading from existing setup, the tools are configured as part of the environment.
- Select the connection type - External URL or NodePort:
  - Auto generate - A unique URL / port is automatically created for each workload using the environment.
  - Custom URL / Custom port - Manually define the URL or port. For custom port, make sure to enter a port between 30000 and 32767. If the node port is already in use, the workload will fail and display an error message.
- Modify who can access the tool:
  - By default, All authenticated users is selected giving access to everyone within the organization’s account.
  - For Specific group(s), enter group names as they appear in your identity provider. You must be a member of one of the groups listed to have access to the tool.
  - For Specific user(s), enter a valid email address or username. If you remove yourself, you will lose access to the tool.
Set the command and arguments for the container running the workspace. If no command is added, the container will use the image’s default command (entry-point):
- Modify the existing command or click +COMMAND & ARGUMENTS to add a new command.
- Set multiple arguments separated by spaces, using the following format (e.g.: --arg1=val1).
Set the environment variable(s):
- Modify the existing environment variable(s) if you are loading from an existing setup. The existing environment variables may include instructions to guide you with entering the correct values.
- To add a new variable, click + ENVIRONMENT VARIABLE.
- You can either select Custom to define your own variable, or choose from a predefined list of Secrets or ConfigMaps.
- You can also select My credentials and enter your key to inject an appropriate value from a generic secret you created in the User settings. The credentials must already exist in the same namespace as the workload and can be selected from the Credential name dropdown.
Enter a path pointing to the container's working directory.
Set where the UID, GID, and supplementary groups for the container should be taken from. If you select Custom, you’ll need to manually enter the UID, GID and Supplementary groups values.
Select additional Linux capabilities for the container from the dropdown menu. This grants certain privileges to a container without granting all the root user's privileges.

Select an environment or click +NEW ENVIRONMENT to add a new environment to the gallery. For a step-by-step guide on adding environments to the gallery, see Environments. Once created, the new environment will be automatically selected.
Set the connection for your tool(s). If you are loading from existing setup, the tools are configured as part of the environment.
- Select the connection type - External URL or NodePort:
  - Auto generate - A unique URL / port is automatically created for each workload using the environment.
  - Custom URL / Custom port - Manually define the URL or port. For custom port, make sure to enter a port between 30000 and 32767. If the node port is already in use, the workload will fail and display an error message.
- Optional: Modify who can access the tool:
  - By default, All authenticated users is selected giving access to everyone within the organization’s account.
  - For Specific group(s), enter group names as they appear in your identity provider. You must be a member of one of the groups listed to have access to the tool.
  - For Specific user(s), enter a valid email address or username. If you remove yourself, you will lose access to the tool.
- Set the User ID (UID), Group ID (GID) and the Supplementary groups that can run commands in the container.
Optional: Set the command and arguments for the container running the workload. If no command is added, the container will use the image’s default command (entry-point):
- Modify the existing command or click +COMMAND & ARGUMENTS to add a new command.
- Set multiple arguments separated by spaces, using the following format (e.g.: --arg1=val1).
Set the environment variable(s):
- Modify the existing environment variable(s). The existing environment variables may include instructions to guide you in entering the correct values.
- Optional: To add a new variable, click + ENVIRONMENT VARIABLE.
- You can either select Custom to define your own variable, or choose from a predefined list of Credentials or ConfigMaps.

Setting Up Compute Resources

Note

GPU memory limit is disabled by default. If unavailable, your administrator must enable it under General settings → Resources → GPU resource optimization.

Load from existing setup

Click the load icon. A side pane appears, displaying a list of available compute resources. Select a compute resource from the list.
Optionally, customize any of the compute resource's predefined fields. The changes will apply to this workspace only and will not affect the selected compute resource.
Alternatively, click the ➕ icon in the side pane to create a new compute resource. For step-by-step instructions, see Compute resources.

Provide your own settings

Manually configure the settings below as needed. The changes will apply to this workspace only.

Configure compute resources

Set the number of GPU devices per pod (physical GPUs).
Enable GPU fractioning to set the GPU memory per device using either a fraction of a GPU device’s memory (% of device) or a GPU memory unit (MB/GB):
- Request - The minimum GPU memory allocated per device. Each pod in the workspace receives at least this amount per device it uses.
- Limit - The maximum GPU memory allocated per device. Each pod in the workspace receives at most this amount of GPU memory for each device(s) the pod utilizes. This is disabled by default, to enable see the above note.
Set the CPU resources
- Set CPU compute resources per pod by choosing the unit (cores or millicores):
  - Request - The minimum amount of CPU compute provisioned per pod. Each running pod receives this amount of CPU compute.
  - Limit - The maximum amount of CPU compute a pod can use. Each pod receives at most this amount of CPU compute. By default, the limit is set to Unlimited which means that the pod may consume all the node's free CPU compute resources.
- Set the CPU memory per pod by selecting the unit (MB or GB):
  - Request - The minimum amount of CPU memory provisioned per pod. Each running pod receives this amount of CPU memory.
  - Limit - The maximum amount of CPU memory a pod can use. Each pod receives at most this amount of CPU memory. By default, the limit is set to Unlimited which means that the pod may consume all the node's free CPU memory resources.
Set extended resource(s)
- Enable Increase shared memory size to allow the shared memory size available to the pod to increase from the default 64MB to the node's total available memory or the CPU memory limit, if set above.
- Click +EXTENDED RESOURCES to add resource/quantity pairs. For more information on how to set extended resources, see the Extended resources and Quantity guides.
Set the order of priority for the node pools on which the Scheduler tries to run the workspace. When a workspace is created, the Scheduler will try to run it on the first node pool on the list. If the node pool doesn't have free resources, the Scheduler will move on to the next one until it finds one that is available:
- Drag and drop them to change the order, remove unwanted ones, or reset to the default order defined in the project.
- Click +NODE POOL to add a new node pool from the list of node pools that were defined on the cluster. To configure a new node pool and for additional information, see Node pools.
Select a node affinity to schedule the workspace on a specific node type. If the administrator added a ‘node type (affinity)’ scheduling rule to the project/department, then this field is mandatory. Otherwise, entering a node type (affinity) is optional. Nodes must be tagged with a label that matches the node type key and value.
Click +TOLERATION to allow the workspace to be scheduled on a node with a matching taint. Select the operator and the effect:
- If you select Exists, the effect will be applied if the key exists on the node.
- If you select Equals, the effect will be applied if the key and the value set match the value on the node.

Setting Up Data & Storage

Note

Data volumes is disabled by default. If unavailable, your administrator must enable it under General settings → Workloads → Data volumes. Data volumes are available for flexible workload submission only.
Flexible - If Data volumes is not enabled, Data & storage appears as Data sources only, and no data volumes will be available.
Original - This tab outlines how to set Volumes and Data sources.

Load from existing setup

Click the load icon. A side pane appears, displaying a list of available data sources/volumes. Select a data source/volume from the list.
Optionally, customize any of the data source's predefined fields as shown below. The changes will apply to this workspace only and will not affect the selected data source.
Alternatively, click the ➕ icon in the side pane to create a new data source/data volume. For step-by-step instructions, see Data sources or Data volumes.

Provide your own settings

Manually configure the settings below as needed. The changes will apply to this workspace only.

Note: Secrets, ConfigMaps and Data volumes cannot be added as a one-time configuration.

Configure data sources

Click the ➕ icon and choose the data source from the dropdown menu. You can add multiple data sources.
Once selected, set the data origin according to the required fields and enter the container path to set the data target location.
- For Git and S3, select a Secret. This option is relevant for private buckets/repositories based on existing secrets that were created for the scope.
- For ConfigMap, set a Sub path. This refers to the specific file (key) inside the ConfigMap to mount (e.g., app.properties), allowing you to mount a specific file from the ConfigMap.
Select Volume to allocate a storage space to your workspace that is persistent across restarts:
- Set the Storage class to None or select an existing storage class from the list. To add new storage classes, and for additional information, see Kubernetes storage classes. If the administrator defined the storage class configuration, the rest of the fields will appear accordingly.
- Select one or more access mode(s) and define the claim size and its units.
- Select the volume mode. If you select Filesystem (default), the volume will be mounted as a filesystem, enabling the usage of directories and files. If you select Block, the volume is exposed as a block storage, which can be formatted or used directly by applications without a filesystem.
- Set the Container path with the volume target location.
- Set the volume persistency to Persistent if the volume and its data should be deleted when the workspace is deleted or Ephemeral if the volume and its data should be deleted every time the workspace’s status changes to “Stopped”.

Setting Up General Settings

Note

The following general settings are optional.

Set the workload priority. Choose the appropriate priority level for the workload. Higher-priority workloads are scheduled before lower-priority ones.
Set the backoff limit before workload failure. The backoff limit is the maximum number of retry attempts for failed workloads. After reaching the limit, the workload status will change to "Failed." Enter a value between 0 and 100.
Set the timeframe for auto-deletion after workload completion or failure. The time after which a completed or failed workload is deleted; if this field is set to 0 seconds, the workload will be deleted automatically.
Set annotations(s). Kubernetes annotations are key-value pairs attached to the workload. They are used for storing additional descriptive metadata to enable documentation, monitoring and automation.
Set labels(s). Kubernetes labels are key-value pairs attached to the workload. They are used for categorizing to enable querying.

Completing the Workspace

Before finalizing your workspace, review your configurations and make any necessary adjustments.
Click CREATE WORKSPACE

Managing and Monitoring

After the workspace is created, it is added to the Workloads table, where it can be managed and monitored.

Using CLI

To view the available actions on workspaces, see the Workspaces CLI v2 reference or the CLI v1 reference.

Using API

To view the available actions on workspaces, see the Workspaces API reference.

PreviousExperiment Using Workspaces NextQuick Starts

Last updated 1 day ago