Launching Workloads with Dynamic GPU Fractions


This quick start provides a step-by-step walkthrough for running a Jupyter Notebook with dynamic GPU fractions.

NVIDIA Run:ai’s dynamic GPU fractions optimize GPU utilization by letting workloads adjust their resource usage dynamically. Users specify a guaranteed fraction of GPU memory and compute resources (the request), along with a higher limit that the workload can use when additional resources are available.
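
For example, the compute resource used throughout this quick start guarantees 4GB of GPU memory and allows the workload to burst up to 12GB when free GPU memory is available. In CLI terms, this is an abbreviated form of the full submit command shown in Step 2:

# Guarantee 4GB of GPU memory; allow bursting up to 12GB ("workload-name" is a placeholder)
runai workspace submit "workload-name" \
--image gcr.io/run-ai-lab/pytorch-example-jupyter \
--gpu-memory-request 4G --gpu-memory-limit 12G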

Prerequisites

Before you start, make sure:

  • You have created a project or have one created for you.

  • The project has an assigned quota of at least 0.5 GPU.

  • Dynamic GPU fractions is enabled.

Note

  • Flexible workload submission is disabled by default. If unavailable, your administrator must enable it under General Settings → Workloads → Flexible workload submission.

  • Dynamic GPU fractions is disabled by default in the NVIDIA Run:ai UI. To use dynamic GPU fractions, your administrator must enable it under General Settings → Resources → GPU resource optimization.

Step 1: Logging In

Browse to the provided NVIDIA Run:ai user interface and log in with your credentials.

Run the following command to see the available login options, and log in according to your setup:

runai login --help

To use the API, you will need to obtain a token as shown in API authentication.
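
For example, if you authenticate with an application created under Applications, a token can be requested with a call similar to the one below. This is a minimal sketch: the endpoint and field names are taken from the application-token flow and may differ in your version, so verify them against the API authentication guide before use.

# Request an API token using an application client ID and secret (placeholders; field names may vary by version)
curl -X POST 'https://<COMPANY-URL>/api/v1/token' \
-H 'Content-Type: application/json' \
-d '{
    "grantType": "app_token",
    "AppId": "<APPLICATION-CLIENT-ID>",
    "AppSecret": "<APPLICATION-CLIENT-SECRET>"
}'

The access token returned in the response is used as the Bearer <TOKEN> value in the API calls later in this guide.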

Step 2: Submitting the First Workspace

  1. Go to the Workload manager → Workloads

  2. Click +NEW WORKLOAD and select Workspace

  3. Select under which cluster to create the workload

  4. Select the project in which your workspace will run

  5. Select Start from scratch to launch a new workspace quickly

  6. Enter a name for the workspace (if the name already exists in the project, you will be requested to submit a different name)

  7. Click CONTINUE

    In the next step:

  8. Click the load icon. A side pane appears, displaying a list of available environments. To add a new environment:

    • Click the + icon to create a new environment

    • Enter quick-start as the name for the environment. The name must be unique.

    • Enter the Image URL - gcr.io/run-ai-lab/pytorch-example-jupyter

    • Tools - Set the connection for your tool:

      • Click +TOOL

      • Select Jupyter tool from the list

    • Set the runtime settings for the environment. Click +COMMAND & ARGUMENTS and add the following:

      • Enter the command - start-notebook.sh

      • Enter the arguments - --NotebookApp.base_url=/${RUNAI_PROJECT}/${RUNAI_JOB_NAME} --NotebookApp.token=''

      Note: If host-based routing is enabled on the cluster, enter only --NotebookApp.token='' as the arguments.

    • Click CREATE ENVIRONMENT

    • Select the newly created environment from the side pane

  9. Click the load icon. A side pane appears, displaying a list of available compute resources. To add a new compute resource:

    • Click the + icon to create a new compute resource

    • Enter request-limit as the name for the compute resource. The name must be unique.

    • Set GPU devices per pod - 1

    • Enable GPU fractioning to set the GPU memory per device:

      • Select GB - Fraction of a GPU device’s memory

      • Set the memory Request - 4GB (the workload will allocate 4GB of the GPU memory)

      • Set the memory Limit - 12GB

    • Optional: set the CPU compute per pod - 0.1 cores (default)

    • Optional: set the CPU memory per pod - 100 MB (default)

    • Select More settings and toggle Increase shared memory size

    • Click CREATE COMPUTE RESOURCE

    • Select the newly created compute resource from the side pane

  10. Click CREATE WORKSPACE

  1. Go to the Workload manager → Workloads

  2. Click +NEW WORKLOAD and select Workspace

  3. Select under which cluster to create the workload

  4. Select the project in which your workspace will run

  5. Select Start from scratch to launch a new workspace quickly

  6. Enter a name for the workspace (if the name already exists in the project, you will be requested to submit a different name)

  7. Click CONTINUE

    In the next step:

  8. Create an environment for your workspace

    • Click +NEW ENVIRONMENT

    • Enter quick-start as the name for the environment. The name must be unique.

    • Enter the Image URL - gcr.io/run-ai-lab/pytorch-example-jupyter

    • Tools - Set the connection for your tool

      • Click +TOOL

      • Select Jupyter tool from the list

    • Set the runtime settings for the environment. Click +COMMAND & ARGUMENTS and add the following:

      • Enter the command - start-notebook.sh

      • Enter the arguments - --NotebookApp.base_url=/${RUNAI_PROJECT}/${RUNAI_JOB_NAME} --NotebookApp.token=''

      Note: If host-based routing is enabled on the cluster, enter only --NotebookApp.token='' as the arguments.

    • Click CREATE ENVIRONMENT

    The newly created environment will be selected automatically

  9. Create a new “request-limit” compute resource for your workspace

    • Click +NEW COMPUTE RESOURCE

    • Enter request-limit as the name for the compute resource. The name must be unique.

    • Set GPU devices per pod - 1

    • Enable GPU fractioning to set the GPU memory per device:

      • Select GB - Fraction of a GPU device’s memory

      • Set the memory Request - 4GB (the workload will allocate 4GB of the GPU memory)

      • Set the memory Limit - 12GB

    • Optional: set the CPU compute per pod - 0.1 cores (default)

    • Optional: set the CPU memory per pod - 100 MB (default)

    • Select More settings and toggle Increase shared memory size

    • Click CREATE COMPUTE RESOURCE

    The newly created compute resource will be selected automatically

  10. Click CREATE WORKSPACE

Copy the following command to your terminal, replacing "project-name" and "workload-name" with the names of your project and workload. For more details, see CLI reference:

runai project set "project-name"
runai workspace submit "workload-name" \
--image gcr.io/run-ai-lab/pytorch-example-jupyter \
--gpu-memory-request 4G --gpu-memory-limit 12G --large-shm \
--external-url container=8888 --name-prefix jupyter  \
--command -- start-notebook.sh \
--NotebookApp.base_url=/${RUNAI_PROJECT}/${RUNAI_JOB_NAME} --NotebookApp.token=

Copy the following command to your terminal, making sure to replace the placeholder parameters described below. For more details, see Workspaces API:

curl -L 'https://<COMPANY-URL>/api/v1/workloads/workspaces' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer <TOKEN>' \
-d '{
    "name": "workload-name",
    "projectId": "<PROJECT-ID>",
    "clusterId": "<CLUSTER-UUID>",
    "spec": {
        "command" : "start-notebook.sh",
        "args" : "--NotebookApp.base_url=/${RUNAI_PROJECT}/${RUNAI_JOB_NAME} --NotebookApp.token=''",
        "image": "gcr.io/run-ai-lab/pytorch-example-jupyter",
        "compute": {
            "gpuDevicesRequest": 1,
            "gpuMemoryRequest": "4G",
            "gpuMemoryLimit": "12G",
            "largeShmRequest": true
        },
        "exposedUrls" : [
            {
                "container" : 8888,
                "toolType": "jupyter-notebook",
                "toolName": "Jupyter"
            }
        ]
    }
}'
  • <COMPANY-URL> - The link to the NVIDIA Run:ai user interface

  • <TOKEN> - The API access token obtained in Step 1

  • <PROJECT-ID> - The ID of the Project the workload is running on. You can get the Project ID via the Get Projects API.

  • <CLUSTER-UUID> - The unique identifier of the Cluster. You can get the Cluster UUID via the Get Clusters API.

  • toolType will show the Jupyter icon when connecting to the Jupyter tool via the user interface.

  • toolName will show when connecting to the Jupyter tool via the user interface.
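
To fill in <PROJECT-ID> and <CLUSTER-UUID>, you can call the Get Clusters and Get Projects APIs with the same token. The paths below are a sketch based on the current REST API and may differ in your version, so confirm them in the API reference:

# List clusters and their UUIDs
curl -L 'https://<COMPANY-URL>/api/v1/clusters' \
-H 'Authorization: Bearer <TOKEN>'

# List projects and their IDs (path assumed; check the Get Projects API reference)
curl -L 'https://<COMPANY-URL>/api/v1/org-unit/projects' \
-H 'Authorization: Bearer <TOKEN>'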

Note

The API snippet above works only with NVIDIA Run:ai clusters version 2.18 and above.

Step 3: Submitting the Second Workspace

  1. Go to the Workload manager → Workloads

  2. Click +NEW WORKLOAD and select Workspace

  3. Select the cluster where the previous workspace was created

  4. Select the project where the previous workspace was created

  5. Select Start from scratch to launch a new workspace quickly

  6. Enter a name for the workspace (if the name already exists in the project, you will be requested to submit a different name)

  7. Click CONTINUE

    In the next step:

  8. Click the load icon. A side pane appears, displaying a list of available environments. Select the environment created in Step 2.

  9. Click the load icon. A side pane appears, displaying a list of available compute resources. Select the compute resources created in Step 2.

  10. Click CREATE WORKSPACE

  1. Go to the Workload manager → Workloads

  2. Click +NEW WORKLOAD and select Workspace

  3. Select the cluster where the previous workspace was created

  4. Select the project where the previous workspace was created

  5. Select Start from scratch to launch a new workspace quickly

  6. Enter a name for the workspace (if the name already exists in the project, you will be requested to submit a different name)

  7. Click CONTINUE

    In the next step:

  8. Select the environment created in Step 2

  9. Select the compute resource created in Step 2

  10. Click CREATE WORKSPACE

Copy the following command to your terminal, replacing "project-name" and "workload-name" with the names of your project and workload. For more details, see CLI reference:

runai project set "project-name"
runai workspace submit "workload-name" \
--image gcr.io/run-ai-lab/pytorch-example-jupyter --gpu-memory-request 4G \
--gpu-memory-limit 12G --large-shm --external-url container=8888 \
--name-prefix jupyter --command -- start-notebook.sh \
--NotebookApp.base_url=/${RUNAI_PROJECT}/${RUNAI_JOB_NAME} --NotebookApp.token=

Copy the following command to your terminal, making sure to replace the placeholder parameters described below. For more details, see Workspaces API:

curl -L 'https://<COMPANY-URL>/api/v1/workloads/workspaces' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer <TOKEN>' \
-d '{
    "name": "workload-name",
    "projectId": "<PROJECT-ID>",
    "clusterId": "<CLUSTER-UUID>",
    "spec": {
        "command" : "start-notebook.sh",
        "args" : "--NotebookApp.base_url=/${RUNAI_PROJECT}/${RUNAI_JOB_NAME} --NotebookApp.token=''",
        "image": "gcr.io/run-ai-lab/pytorch-example-jupyter",
        "compute": {
            "gpuDevicesRequest": 1,
            "gpuMemoryRequest": "4G",
            "gpuMemoryLimit": "12G",
            "largeShmRequest": true
        },
        "exposedUrls" : [
            {
                "container" : 8888,
                "toolType": "jupyter-notebook",
                "toolName": "Jupyter"
            }
        ]
    }
}'
The placeholders (<COMPANY-URL>, <TOKEN>, <PROJECT-ID>, <CLUSTER-UUID>) and the toolType/toolName fields are the same as described in Step 2.

Note

The API snippet above works only with NVIDIA Run:ai clusters version 2.18 and above.

Step 4: Connecting to the Jupyter Notebook

  1. Select the newly created workspace with the Jupyter application that you want to connect to

  2. Click CONNECT

  3. Select the Jupyter tool. The selected tool opens in a new browser tab.

  4. Open a terminal and run watch nvidia-smi to get a continuously updated reading of the memory consumed by the pod. Note that the number shown in the memory box is the Limit, not the Request (the guaranteed amount).

  5. Open the file Untitled.ipynb and move the frame so you can see both tabs

  6. Execute both cells in Untitled.ipynb. This consumes about 3 GB of GPU memory, well below the 4GB GPU memory Request value.

  7. In the second cell, edit the value after --image-size from 100 to 200 and run the cell. This increases the GPU memory utilization to about 11.5 GB, which is above the Request value but still within the 12GB Limit.

  1. To connect to the Jupyter Notebook, browse directly to https://<COMPANY-URL>/<PROJECT-NAME>/<WORKLOAD-NAME>

  2. Open a terminal and run watch nvidia-smi to get a continuously updated reading of the memory consumed by the pod. Note that the number shown in the memory box is the Limit, not the Request (the guaranteed amount).

  3. Open the file Untitled.ipynb and move the frame so you can see both tabs

  4. Execute both cells in Untitled.ipynb. This consumes about 3 GB of GPU memory, well below the 4GB GPU memory Request value.

  5. In the second cell, edit the value after --image-size from 100 to 200 and run the cell. This increases the GPU memory utilization to about 11.5 GB, which is above the Request value but still within the 12GB Limit.

Next Steps

Manage and monitor your newly created workload using the Workloads table.
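
If you prefer the terminal, the workspaces can also be listed with the CLI. A minimal sketch, assuming the runai workspace subcommands described in the CLI commands reference:

# List workspaces in the project (assumes the runai workspace list subcommand is available in your CLI version)
runai project set "project-name"
runai workspace list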