Hotfixes for Version 2.23
This section provides details on all hotfixes available for version 2.23. Hotfixes are critical updates released between our major and minor versions to address specific issues or vulnerabilities. These updates ensure the system remains secure, stable, and optimized without requiring a full version upgrade.
2.23.29
11/01/2026
RUN-32181
Fixed a security vulnerability related to CVE-2025-32988 with severity HIGH.
2.23.29
11/01/2026
RUN-33448
Fixed an issue where switching between workloads in the workload Details drawer displayed incorrect data, particularly the workload lifespan value.
2.23.29
11/01/2026
RUN-34379
Fixed an issue where image names longer than the display limit were truncated without providing access to the full name.
2.23.29
11/01/2026
RUN-34381
Fixed an issue where the Node column displayed a sort icon but did not actually sort results in the Running / Requested Pods modal.
2.23.29
11/01/2026
RUN-34607
Fixed issues where readiness probes did not work correctly with serving port authorization in single-node Knative inference workloads.
2.23.29
11/01/2026
RUN-34613
Fixed an issue where the Project GET API returned missing limit fields instead of an explicit unlimited value when CPU quotas were enabled.
2.23.29
11/01/2026
RUN-34620
Fixed an issue where, in rare cases, sessions could disconnect due to token refresh handling.
2.23.29
11/01/2026
RUN-34639
Fixed an issue where the Fully free GPU devices column displayed - instead of 0 when no fully free GPU devices were available under fractional GPU allocations.
2.23.29
11/01/2026
RUN-34680
Fixed a security vulnerability related to CVE-2025-58183 with severity HIGH.
2.23.29
11/01/2026
RUN-34720
Fixed a security vulnerability related to CVE-2025-65637 with severity HIGH.
2.23.29
11/01/2026
RUN-35189
Fixed an issue where the --working-dir parameter was ignored for Knative-based inference workloads, causing containers to start in / instead of the specified directory.
2.23.25
21/12/2025
RUN-34791
Fixed a GPU memory swap issue where, under certain circumstances, the GPU OOM killer could fail to select and preempt a GPU-consuming workload during out-of-memory or out-of-system-RAM errors.
2.23.24
21/12/2025
RUN-34631
Fixed an issue where the identity manager failed to start when the notification service was disabled.
2.23.24
21/12/2025
RUN-34633
Fixed an issue where department administrators could not include cluster-scope templates in workloads due to incorrect validation of permitted scopes.
2.23.24
21/12/2025
RUN-34758
Fixed an issue where setting a GPU memory limit caused workload creation to fail.
2.23.24
21/12/2025
RUN-34712
Fixed a security vulnerability related to CVE-2025-61729 with severity HIGH.
2.23.23
09/12/2025
RUN-34233
Fixed an issue with refresh-token handling in legacy Grafana dashboards that caused unexpected session logouts.
2.23.23
09/12/2025
RUN-33806
Fixed an issue where containers ran as root instead of a non-privileged user.
2.23.22
08/12/2025
RUN-31856
Fixed a security vulnerability related to CVE-2025-47907 with severity HIGH.
2.23.22
08/12/2025
RUN-33516
Fixed an issue where access rules created or deleted in a batch action were not each audited in the events history.
2.23.22
08/12/2025
RUN-33780
Fixed an issue where apps with the “Viewer” role could not access node metrics, even when they had read permissions at the cluster scope.
2.23.22
08/12/2025
RUN-34429
2.23.21
01/12/2025
RUN-33313
Fixed an issue where the log viewer for distributed workloads displayed only a partial and unsorted list of pods.
2.23.21
01/12/2025
RUN-33802
Fixed an issue that caused distributed inference workloads to become unsynchronized.
2.23.21
01/12/2025
RUN-33862
Fixed an issue where the workloads service could enter a CrashLoopBackOff during upgrade.
2.23.21
01/12/2025
RUN-33947
Fixed an issue where SMTP configurations using the “none” option still sent empty username/password fields. Added the auth_none type to ensure no credentials are sent for passwordless SMTP servers.
2.23.20
19/11/2025
RUN-33642
Fixed an issue where the external-workload-integrator on OpenShift entered a constant reconcile loop, causing high CPU utilization.
2.23.18
18/11/2025
RUN-33613
Fixed missing validations for CPU resources when the CPU quota feature flag was disabled, which caused project and department updates to skip required CPU checks.
2.23.18
18/11/2025
RUN-33634
Fixed an issue where resource name validation failed for hugepage resources by enhancing validation rules to properly support hugepages.
2.23.18
18/11/2025
RUN-32680
Fixed an issue where logs were not displayed in the UI for workloads submitted using the Workloads v2 submission API.
2.23.17
04/11/2025
RUN-33091
Fixed an issue where workload logs initially loaded older logs instead of the most recent ones.
2.23.17
04/11/2025
RUN-33365
Fixed an issue where selecting an environment asset template in the flexible workload form did not present the capabilities field correctly.
2.23.17
04/11/2025
RUN-33418
Fixed an issue where the master spec was not inherited when creating a distributed workload from a template.
2.23.16
30/10/2025
RUN-32449
Fixed an issue where a race condition between the NVIDIA Run:ai operator and upgrade/install post hooks caused the upgrade to fail.
2.23.16
30/10/2025
RUN-32989
Fixed an issue where the NVIDIA Run:ai operator experienced unusually high CPU utilization after upgrade.
2.23.16
30/10/2025
RUN-33127
Fixed an issue where workload submission in the CLI failed when commands contained special characters.
2.23.16
30/10/2025
RUN-33144
Fixed a security vulnerability related to CVE-2025-62156 with severity HIGH.
2.23.16
30/10/2025
RUN-33235
Fixed a security vulnerability in the Valkey dependency.
2.23.16
30/10/2025
RUN-33388
Fixed an issue where dependency checks did not run properly for clusters installed with a remote control plane.
2.23.16
30/10/2025
RUN-33447
Fixed an issue where the API allowed creating a PVC asset without a claimName when existingPVC=false.
2.23.15
29/10/2025
RUN-33176
Fixed an issue where pagination in the Node Pool page did not respond.
2.23.15
29/10/2025
RUN-33006
Fixed an issue in the CLI installer where the PATH was not configured for all shells. The installer now correctly configures PATH for both zsh and bash.
2.23.14
27/10/2025
RUN-31803
Fixed an issue where the Quota management dashboard occasionally displayed incorrect GPU quota values.
2.23.14
27/10/2025
RUN-32159
Fixed an issue where the updatedBy field of a policy did not show the latest user who updated it.
2.23.14
27/10/2025
RUN-32730
Fixed an issue where incorrect average GPU utilization per project and workload type was displayed in the Projects view charts and tables.
2.23.14
27/10/2025
RUN-33036
Fixed an issue where the grace period preemption field in the UI was limited to 5 minutes, even when the workload policy allowed longer durations.
2.23.14
27/10/2025
RUN-33039
Fixed an issue where setting uid or gid to 0 during environment creation was not allowed.
2.23.14
27/10/2025
RUN-33053
Fixed an issue that caused conflicts with additional built-in Prometheus Operator deployments in OpenShift.
2.23.14
27/10/2025
RUN-33147
Fixed an issue where users with expired refresh tokens (after 24 hours) could not log in, as the token endpoint returned a 400 error.
2.23.14
27/10/2025
RUN-33168
Fixed an issue where certain policy calls failed when at least one unconfigured cluster existed in the system.
2.23.14
27/10/2025
RUN-33177
Fixed an issue where removing the logo in Branding settings displayed an empty square.
2.23.10
08/10/2025
RUN-31738
Fixed an issue where GPU fraction requests were not applied when submitting distributed workloads.
2.23.10
08/10/2025
RUN-32876
Fixed an issue where running a NIM inference workload on a fractional GPU prevented the Triton server from starting, causing inference endpoint requests to fail.
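When checking whether a given installation already contains one of the fixes listed above, the hotfix versions need to be compared numerically rather than as strings (for example, 2.23.10 is later than 2.23.9). The following is a minimal sketch of such a check. It assumes that hotfixes are cumulative within the 2.23 line, which this page does not state explicitly, and the function names and the small lookup table are hypothetical helpers written for illustration, not part of any product API.

```python
# Illustrative only: helper names are hypothetical, and the check assumes
# that each 2.23.x hotfix release includes all earlier 2.23.x fixes.

def parse_version(text: str) -> tuple[int, ...]:
    """Turn a dotted version such as '2.23.14' into (2, 23, 14)."""
    return tuple(int(part) for part in text.strip().split("."))


def includes_hotfix(installed: str, fixed_in: str) -> bool:
    """Return True if `installed` is at or beyond the release carrying the fix."""
    installed_v = parse_version(installed)
    fixed_v = parse_version(fixed_in)
    # Compare only within the same major.minor line (here, 2.23).
    if installed_v[:2] != fixed_v[:2]:
        raise ValueError("versions belong to different release lines")
    return installed_v >= fixed_v


# A few entries copied from the list above (ticket ID -> hotfix version).
FIXED_IN = {
    "RUN-33039": "2.23.14",
    "RUN-34791": "2.23.25",
    "RUN-35189": "2.23.29",
}

if __name__ == "__main__":
    installed = "2.23.16"
    for ticket, version in FIXED_IN.items():
        status = "included" if includes_hotfix(installed, version) else "not included"
        print(f"{ticket} (fixed in {version}): {status} in {installed}")
```

Under the cumulative assumption, an installation at 2.23.16 would contain the fix for RUN-33039 (2.23.14) but not the fixes for RUN-34791 (2.23.25) or RUN-35189 (2.23.29).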