Cluster Authentication

To allow users to securely submit workloads using kubectl, you must configure the Kubernetes API server to authenticate users via the NVIDIA Run:ai identity provider. This is done by adding OpenID Connect (OIDC) flags to the Kubernetes API server configuration on each cluster.

Retrieve Required OIDC Flags

  1. Go to General settings

  2. Navigate to Cluster authentication. The required OIDC flags are displayed there, in the following form:

  containers:
  - command:
    ...
    - --oidc-client-id=runai
    - --oidc-issuer-url=https://<HOST>/auth/realms/runai
    - --oidc-username-prefix=-

  • --oidc-client-id - The client ID that all tokens must be issued for.

  • --oidc-issuer-url - The URL of the NVIDIA Run:ai identity provider.

  • --oidc-username-prefix - Prefix prepended to username claims to prevent clashes with existing names. The value - disables prefixing.

Note

These flags must be configured in the API server startup parameters for each cluster in your environment.
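
As a rough illustration of how the three flags fit together, the following shell sketch prints them for a given host. The function name runai_oidc_flags and the example host are hypothetical; the flag values follow the format shown above, with <HOST> replaced by the argument.

```shell
# Hypothetical helper: print the three OIDC flags for a given Run:ai host.
# The host argument replaces <HOST> in the issuer URL shown above.
runai_oidc_flags() {
  host="$1"
  if [ -z "$host" ]; then
    echo "usage: runai_oidc_flags <HOST>" >&2
    return 1
  fi
  printf '%s\n' \
    "--oidc-client-id=runai" \
    "--oidc-issuer-url=https://${host}/auth/realms/runai" \
    "--oidc-username-prefix=-"
}
```

For example, runai_oidc_flags runai.example.com prints one flag per line, ready to be adapted to the format each distribution below expects.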

Kubernetes Distribution-Specific Configuration

Note

  • Azure Kubernetes Service (AKS) is not supported.

  • For other Kubernetes distributions, refer to specific instructions in the documentation.

Vanilla Kubernetes
  1. Locate the Kubernetes API server configuration file, typically at /etc/kubernetes/manifests/kube-apiserver.yaml.

  2. Edit the file. Under the command section, add the required OIDC flags.

  3. Verify that the changes have been applied. After you save the file, the API server restarts automatically because it is managed as a static pod. Confirm that the kube-apiserver-<master-node-name> pod in the kube-system namespace has restarted and is running with the new configuration. To check the pod status, run kubectl get pods -n kube-system kube-apiserver-<master-node-name> -o yaml.
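
The verification step can be sketched as a small shell function. This is a minimal sketch, assuming kubectl is configured with access to the cluster; the function name check_apiserver_oidc is hypothetical. It dumps the API server pod spec and reports whether each of the three OIDC flags appears in it.

```shell
# Sketch: confirm the kube-apiserver static pod is running with the OIDC flags.
# Argument: the control-plane node name (the <master-node-name> placeholder above).
check_apiserver_oidc() {
  node="$1"
  # Dump the pod spec; the flags appear in the container command list.
  spec="$(kubectl get pod "kube-apiserver-${node}" -n kube-system -o yaml)" || return 1
  missing=0
  for flag in --oidc-client-id --oidc-issuer-url --oidc-username-prefix; do
    if printf '%s\n' "$spec" | grep -q -- "${flag}="; then
      echo "found:   ${flag}"
    else
      echo "missing: ${flag}"
      missing=1
    fi
  done
  return "$missing"
}
```

Calling check_apiserver_oidc <master-node-name> returns a non-zero status if any flag is absent, which makes it usable in a script after editing the manifest.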

OpenShift Container Platform (OCP)

No additional configuration is required.

Rancher Kubernetes Engine (RKE1)
  1. Edit the cluster.yml file used by RKE1. If you're using the Rancher UI, follow the instructions in the Rancher documentation.

  2. Add the required OIDC flags under the kube-api section.

  3. Verify the flags are applied by inspecting the running API server container:

    • Follow the Rancher documentation to locate the API server container ID.

    • Run docker inspect <kube-api-server-container-id>.

    • Confirm that the OIDC flags appear in the container's configuration.
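
The RKE1 steps above can be sketched in shell as follows. Both function names are hypothetical. The first prints a cluster.yml fragment, assuming the usual RKE1 layout in which API server flags are written under services > kube-api > extra_args as key: value pairs without the leading dashes. The second filters the docker inspect output down to the OIDC arguments.

```shell
# Hypothetical helper: print a cluster.yml fragment carrying the OIDC flags.
# Assumption: flags go under services > kube-api > extra_args,
# written as key: value pairs without the leading dashes.
rke1_oidc_fragment() {
  host="$1"
  cat <<EOF
services:
  kube-api:
    extra_args:
      oidc-client-id: runai
      oidc-issuer-url: https://${host}/auth/realms/runai
      oidc-username-prefix: "-"
EOF
}

# Sketch: list the OIDC arguments the running API server container was started with.
# Argument: the container ID located via the Rancher documentation.
rke1_show_oidc_args() {
  docker inspect "$1" | grep -- 'oidc-'
}
```

The fragment is meant to be merged by hand into an existing cluster.yml rather than appended, since the file normally already contains a services section.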

Rancher Kubernetes Engine 2 (RKE2)

If you're using the RKE2 Quickstart:

  1. Edit /etc/rancher/rke2/config.yaml.

  2. Add the required OIDC flags under kube-apiserver-arg as a list of <key>=<value> entries without the leading dashes (e.g. oidc-username-prefix=-).
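
A minimal sketch of the resulting config.yaml entries, printed by a hypothetical helper named rke2_oidc_fragment. Each flag becomes one quoted <key>=<value> list item under kube-apiserver-arg.

```shell
# Hypothetical helper: print the kube-apiserver-arg entries for
# /etc/rancher/rke2/config.yaml, with <HOST> replaced by the argument.
rke2_oidc_fragment() {
  host="$1"
  cat <<EOF
kube-apiserver-arg:
  - "oidc-client-id=runai"
  - "oidc-issuer-url=https://${host}/auth/realms/runai"
  - "oidc-username-prefix=-"
EOF
}
```

If config.yaml already has a kube-apiserver-arg key, add the three items to the existing list instead of creating a second key. RKE2 reads this file at startup, so the rke2-server service typically needs a restart before the flags take effect.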

If you're using Rancher UI, add the required flags during the cluster provisioning process:

  1. Navigate to Cluster Management > Create, select RKE2, and choose your platform.

  2. In the Cluster Configuration screen, go to Advanced > Additional API Server Args.

  3. Add the required OIDC flags as <key>=<value> (e.g. oidc-username-prefix=-).

Google Kubernetes Engine (GKE)

To configure researcher authentication on GKE, use Anthos Identity Service and apply the appropriate OIDC configuration.

  1. Install Anthos Identity Service by enabling it on the cluster (gcloud container clusters update with the --enable-identity-service flag).

  2. Install the yq utility.

  3. Configure the OIDC provider for username-password authentication, using the required OIDC flags.

  4. Alternatively, configure the OIDC provider for single sign-on, using the required OIDC flags.

  5. Update the runaiconfig with the Anthos Identity Service endpoint. First, get the external IP of the gke-oidc-envoy service.

  6. Then, patch the runaiconfig to use this endpoint, using the actual IP address of the gke-oidc-envoy service.
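
Two of the GKE steps above can be sketched in shell. This is a sketch under stated assumptions, not the exact commands from the product documentation: the function names are hypothetical, the cluster is assumed to be zonal, and the gke-oidc-envoy service is assumed to live in the anthos-identity-service namespace. The OIDC provider configuration and the runaiconfig patch are not shown.

```shell
# Sketch: enable Identity Service on an existing GKE cluster (step 1).
# Arguments: cluster name, GCP project, zone (placeholders you supply).
gke_enable_identity_service() {
  gcloud container clusters update "$1" \
    --enable-identity-service \
    --project="$2" \
    --zone="$3"
}

# Sketch: print the external IP of the gke-oidc-envoy LoadBalancer service (step 5).
# Assumption: the service lives in the anthos-identity-service namespace.
gke_oidc_envoy_ip() {
  kubectl get svc gke-oidc-envoy -n anthos-identity-service \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
}
```

The IP printed by gke_oidc_envoy_ip is the value to use when patching the runaiconfig in step 6.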

Amazon Elastic Kubernetes Service (EKS)
  1. In the AWS Console, under EKS, find your cluster.

  2. Go to Configuration and then to Authentication.

  3. Associate a new identity provider. Use the required OIDC flags.

The process can take up to 30 minutes.
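
The same association can also be scripted with the AWS CLI instead of the console. The following is a minimal sketch, assuming the AWS CLI is installed and authenticated; the function names and the provider configuration name runai are hypothetical choices, and the OIDC values mirror the required flags.

```shell
# Sketch: associate the Run:ai identity provider with an EKS cluster.
# Arguments: cluster name, Run:ai host.
# The configuration name "runai" is a hypothetical choice.
eks_associate_runai_oidc() {
  aws eks associate-identity-provider-config \
    --cluster-name "$1" \
    --oidc "identityProviderConfigName=runai,issuerUrl=https://$2/auth/realms/runai,clientId=runai,usernamePrefix=-"
}

# Sketch: print the association status (for example CREATING or ACTIVE).
# Argument: cluster name.
eks_runai_oidc_status() {
  aws eks describe-identity-provider-config \
    --cluster-name "$1" \
    --identity-provider-config type=oidc,name=runai \
    --query 'identityProviderConfig.oidc.status' \
    --output text
}
```

Polling eks_runai_oidc_status <cluster-name> until it reports ACTIVE is one way to wait out the association period.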

NVIDIA Base Command Manager (BCM)
  1. Locate the Kubernetes API server configuration file, typically at /etc/kubernetes/manifests/kube-apiserver.yaml.

  2. Edit the file. Under the command section, add the required OIDC flags.

  3. Verify that the changes have been applied. After you save the file, the API server restarts automatically because it is managed as a static pod. Confirm that the kube-apiserver-<master-node-name> pod in the kube-system namespace has restarted and is running with the new configuration. To check the pod status, run kubectl get pods -n kube-system kube-apiserver-<master-node-name> -o yaml.
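
Before checking the pod, it can help to confirm that the edit itself landed in the manifest. The following sketch, with the hypothetical function name manifest_has_oidc_flags, scans the manifest file on the control-plane node and reports any of the three flags that is missing from the command list.

```shell
# Sketch: report which OIDC flags are missing from the API server manifest.
# Argument: manifest path (defaults to the typical location named above).
# Returns 0 if all flags are present, 1 if any is missing, 2 if unreadable.
manifest_has_oidc_flags() {
  manifest="${1:-/etc/kubernetes/manifests/kube-apiserver.yaml}"
  [ -r "$manifest" ] || { echo "cannot read $manifest" >&2; return 2; }
  status=0
  for flag in --oidc-client-id --oidc-issuer-url --oidc-username-prefix; do
    # Flags are expected as YAML list items, e.g. "- --oidc-client-id=runai".
    if ! grep -q -- "- ${flag}=" "$manifest"; then
      echo "missing: ${flag}"
      status=1
    fi
  done
  return "$status"
}
```

Run without arguments, it checks the default manifest path; reading that file usually requires root privileges on the node.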
