Launching Workloads with Dynamic GPU Fractions
This quick start provides a step-by-step walkthrough for running a Jupyter Notebook with dynamic GPU fractions.
NVIDIA Run:ai’s dynamic GPU fractions feature optimizes GPU utilization by enabling workloads to adjust their resource usage dynamically. Users specify a guaranteed fraction of GPU memory and compute resources (the request) together with a higher limit that the workload can consume when additional resources are available. For example, a workspace can be guaranteed 4 GB of GPU memory while being allowed to grow to 12 GB whenever spare memory exists on the device.
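To see what this means from inside a workload, the short PyTorch sketch below queries the GPU memory the device exposes. This is a minimal illustration, not part of the quick start; it assumes a CUDA-enabled PyTorch environment such as the image used later in this guide, and under a dynamic fraction the reported total typically reflects the Limit rather than the guaranteed Request (Step 4 demonstrates the same thing with nvidia-smi).

```python
import torch

# Query the GPU memory visible to this workload. Under a dynamic GPU
# fraction, the "total" reported here typically reflects the memory Limit
# (e.g. 12 GB), not the guaranteed Request (e.g. 4 GB) -- see Step 4.
free_bytes, total_bytes = torch.cuda.mem_get_info()
print(f"free: {free_bytes / 2**30:.1f} GiB, total: {total_bytes / 2**30:.1f} GiB")
```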
Prerequisites
Before you start, make sure:
You have created a project or have one created for you.
The project has an assigned quota of at least 0.5 GPU.
Dynamic GPU fractions is enabled.
Note
Flexible workload submission is disabled by default. If unavailable, your administrator must enable it under General Settings → Workloads → Flexible workload submission.
Dynamic GPU fractions is disabled by default in the NVIDIA Run:ai UI. To use dynamic GPU fractions, your administrator must enable it under General Settings → Resources → GPU resource optimization.
Step 1: Logging In
Browse to the provided NVIDIA Run:ai user interface and log in with your credentials.
Step 2: Submitting the First Workspace
Go to the Workload manager → Workloads
Click +NEW WORKLOAD and select Workspace
Select the cluster in which to create the workload
Select the project in which your workspace will run
Select Start from scratch to launch a new workspace quickly
Enter a name for the workspace (if the name already exists in the project, you will be prompted to enter a different name)
Click CONTINUE
In the next step:
Click the load icon. A side pane appears, displaying a list of available environments. To add a new environment:
Click the + icon to create a new environment
Enter quick-start as the name for the environment. The name must be unique.
Enter the Image URL -
gcr.io/run-ai-lab/pytorch-example-jupyter
Tools - Set the connection for your tool:
Click +TOOL
Select the Jupyter tool from the list
Set the runtime settings for the environment. Click +COMMAND & ARGUMENTS and add the following:
Enter the command -
start-notebook.sh
Enter the arguments -
--NotebookApp.base_url=/${RUNAI_PROJECT}/${RUNAI_JOB_NAME} --NotebookApp.token=''
Note: If host-based routing is enabled on the cluster, enter only the
--NotebookApp.token=''
argument. A short sketch at the end of this step shows what the base_url argument resolves to inside the container.
Click CREATE ENVIRONMENT
Select the newly created environment from the side pane
Click the load icon. A side pane appears, displaying a list of available compute resources. To add a new compute resource:
Click the + icon to create a new compute resource
Enter request-limit as the name for the compute resource. The name must be unique.
Set GPU devices per pod - 1
Enable GPU fractioning to set the GPU memory per device:
Select GB - Fraction of a GPU device’s memory
Set the memory Request - 4 GB (the workload is guaranteed 4 GB of GPU memory)
Set the memory Limit - 12 GB (the workload can consume up to 12 GB of GPU memory when it is available on the device)
Optional: set the CPU compute per pod - 0.1 cores (default)
Optional: set the CPU memory per pod - 100 MB (default)
Select More settings and toggle Increase shared memory size
Click CREATE COMPUTE RESOURCE
Select the newly created compute resource from the side pane
Click CREATE WORKSPACE
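Before submitting the second workspace, it may help to see what the --NotebookApp.base_url argument from the environment above evaluates to at runtime. The sketch below is a minimal illustration, assuming it runs inside the workspace container where NVIDIA Run:ai injects the RUNAI_PROJECT and RUNAI_JOB_NAME variables referenced in that argument:

```python
import os

# NVIDIA Run:ai injects these variables into the workspace container;
# they are the same ones referenced in --NotebookApp.base_url above.
project = os.environ["RUNAI_PROJECT"]
job_name = os.environ["RUNAI_JOB_NAME"]

# Jupyter serves the notebook under a path unique to this workspace,
# which is what the CONNECT button resolves to with path-based routing.
base_url = f"/{project}/{job_name}"
print(base_url)  # e.g. /my-project/my-workspace
```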
Step 3: Submitting the Second Workspace
Go to the Workload manager → Workloads
Click +NEW WORKLOAD and select Workspace
Select the cluster where the previous workspace was created
Select the project where the previous workspace was created
Select Start from scratch to launch a new workspace quickly
Enter a name for the workspace (if the name already exists in the project, you will be prompted to enter a different name)
Click CONTINUE
In the next step:
Click the load icon. A side pane appears, displaying a list of available environments. Select the environment created in Step 2.
Click the load icon. A side pane appears, displaying a list of available compute resources. Select the compute resource created in Step 2.
Click CREATE WORKSPACE
Step 4: Connecting to the Jupyter Notebook
Select the newly created workspace running the Jupyter application you want to connect to
Click CONNECT
Select the Jupyter tool. The selected tool opens in a new browser tab.
Open a terminal and run the
watch nvidia-smi
command to get a continuous reading of the memory consumed by the pod. Note that the number shown in the memory box is the Limit, not the Request (the guarantee).
Open the file
Untitled.ipynb
and position the frame so you can see both tabs.
Execute both cells in
Untitled.ipynb
. This consumes about 3 GB of GPU memory, well below the 4 GB GPU memory Request value.
In the second cell, edit the value after
--image-size
from 100 to 200 and run the cell again. This increases the GPU memory utilization to about 11.5 GB, which is above the Request value but still within the 12 GB Limit (a rough PyTorch approximation of this behavior follows this list).
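The cells in Untitled.ipynb are not reproduced here, but their effect on GPU memory can be approximated with the PyTorch sketch below. The allocation sizes are illustrative assumptions chosen to mirror the ~3 GB and ~11.5 GB readings above; this is not the notebook's actual code:

```python
import torch

def allocate_gb(gb: float) -> torch.Tensor:
    # A float32 element is 4 bytes, so `gb` gigabytes is gb * 2**30 / 4 elements.
    n_elements = int(gb * 2**30 / 4)
    return torch.empty(n_elements, device="cuda")

# Stay below the 4 GB Request: this memory is guaranteed to the workload.
small = allocate_gb(3)
print(f"allocated: {torch.cuda.memory_allocated() / 2**30:.1f} GiB")

# Grow above the Request but stay below the 12 GB Limit: this succeeds
# only while the GPU has spare memory the workload can borrow dynamically.
large = allocate_gb(8)
print(f"allocated: {torch.cuda.memory_allocated() / 2**30:.1f} GiB")
```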
Next Steps
Manage and monitor your newly created workload using the Workloads table.