Kubernetes v1.36: Introducing Alpha Pod-Level Resource Managers
The recent release of Kubernetes v1.36 introduces Pod-Level Resource Managers, marking a significant shift in resource management strategies tailored for demanding workloads. This new alpha feature transforms the foundation of how resources are allocated by going beyond the traditional container-based model. It now allows for pod-centric resource specifications through the kubelet's enhanced Topology, CPU, and Memory Managers.
Understanding the Need for Pod-Level Resource Management
For applications where performance is paramount—like machine learning (ML) training and low-latency trading—it’s not just about having enough resources; those resources must be allocated in a way that maximizes performance. Achieving exclusive, NUMA-aligned resources for essential application containers is often critical in these contexts.
However, a typical Kubernetes pod often comprises multiple containers, such as sidecars for logging or monitoring functions. This mix has traditionally presented a dilemma: to ensure exclusive, NUMA-aligned resources for the primary application, you’d often find yourself compelled to allocate those exclusive resources to every container in the pod, which could lead to inefficiencies—especially with lightweight sidecars that consume minimal resources. Failing to provide such allocations would mean sacrificing the pod’s Guaranteed Quality of Service (QoS), thereby diminishing performance outcomes.
Features of Pod-Level Resource Managers
With pod-level resource management, Kubernetes now supports a more efficient and adaptable resource allocation strategy. By enabling the `PodLevelResourceManagers` and `PodLevelResources` feature gates, administrators can establish hybrid resource allocation models that balance flexibility with the NUMA alignment that high-performance workloads require.
Practical Applications of This Feature
Let's look at real-world scenarios to understand the practical advantages of this new feature, particularly focusing on the configured Topology Manager's scope:
1. Database Pods with Tightly Coupled Sidecars
Imagine a latency-sensitive database pod that integrates a primary database container alongside auxiliary sidecars, like a metrics exporter and backup agent. When configured using the pod Topology Manager scope, the kubelet aligns resources to the entire pod's budget in one go. The database container can then claim exclusive CPU and memory from a single NUMA node while the remaining resources are pooled for shared use among the sidecars. This ensures that the sidecars benefit from shared resources without encroaching on the dedicated allocation of the primary database application. This setup allows for efficient use of CPU cores without unnecessary waste.
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: tightly-coupled-database
spec:
  # Pod-level resources establish the overall budget and NUMA alignment size.
  resources:
    requests:
      cpu: "8"
      memory: "16Gi"
    limits:
      cpu: "8"
      memory: "16Gi"
  initContainers:
  - name: metrics-exporter
    image: metrics-exporter:v1
    restartPolicy: Always
  - name: backup-agent
    image: backup-agent:v1
    restartPolicy: Always
  containers:
  - name: database
    image: database:v1
    # Guaranteed container gets an exclusive 6-CPU slice from the pod's budget.
    # The remaining 2 CPUs and 4Gi of memory form the pod shared pool for the sidecars.
    resources:
      requests:
        cpu: "6"
        memory: "12Gi"
      limits:
        cpu: "6"
        memory: "12Gi"
```
2. Machine Learning Workloads with Sidecars
Consider a scenario where a pod is tasked with running a GPU-accelerated machine learning training job alongside a generic service mesh sidecar. In this context, using the container Topology Manager scope means the kubelet evaluates each container separately. The ML training container can receive exclusive, NUMA-aligned CPU and memory resources while the service mesh sidecar operates within a broader, general node-wide shared pool. This limits the overall resource consumption while ensuring that only the required containers receive the performance boost they need.
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: ml-workload
spec:
  # Pod-level resources establish the overall budget constraint.
  resources:
    requests:
      cpu: "4"
      memory: "8Gi"
    limits:
      cpu: "4"
      memory: "8Gi"
  initContainers:
  - name: service-mesh-sidecar
    image: service-mesh:v1
    restartPolicy: Always
  containers:
  - name: ml-training
    image: ml-training:v1
    # Under the 'container' scope, this Guaranteed container receives exclusive,
    # NUMA-aligned resources, while the sidecar runs in the node's shared pool.
    resources:
      requests:
        cpu: "3"
        memory: "6Gi"
      limits:
        cpu: "3"
        memory: "6Gi"
```
Resource Isolation and Quotas
The implementation of mixed workloads within a pod also introduces a new framework for resource isolation. There are two distinct scenarios:
- Exclusive containers: These containers enjoy the benefit of their CPU slices without being throttled by CFS quota enforcements, allowing them to operate at peak efficiency.
- Containers in the shared pool: In contrast, containers within the pod shared pool are bound by pod-level CFS quotas, ensuring they do not exceed their budgetary constraints.
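To make the quota split concrete, here is a minimal sketch of the arithmetic, using the CPU figures from the database example above. The function name is hypothetical, and the 100ms period is the Linux CFS default, an assumption about the runtime rather than something this feature specifies:

```python
# Sketch of the CFS quota left for a pod's shared pool once exclusive CPUs
# are carved out for Guaranteed containers. The 100ms period is the Linux
# CFS default (an assumption here); exclusive containers themselves are
# not subject to CFS quota under this feature.

CFS_PERIOD_US = 100_000  # default CFS bandwidth period (100ms)

def shared_pool_quota_us(pod_cpu_limit: float, exclusive_cpus: float) -> int:
    """Microseconds of runtime per period available to the pod shared pool."""
    shared_cpus = pod_cpu_limit - exclusive_cpus
    return int(shared_cpus * CFS_PERIOD_US)

# Database pod: 8 CPUs at the pod level, 6 exclusive for the database
# container, leaving the sidecars throttled to 2 CPUs' worth of runtime.
print(shared_pool_quota_us(8, 6))  # → 200000
```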
Activating Pod-Level Resource Managers
Pod-level resource managers require Kubernetes v1.36 or later. Here's how to enable them:
- Switch on the `PodLevelResources` and `PodLevelResourceManagers` feature gates.
- Configure the Topology Manager with a policy such as `best-effort`, `restricted`, or `single-numa-node`.
- Set the Topology Manager scope to `pod` or `container` via the `topologyManagerScope` field in the `KubeletConfiguration`.
- Set the CPU Manager to the `static` policy.
- Configure the Memory Manager with the `Static` policy.
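Putting those steps together, a kubelet configuration might look like the following sketch. The field values are illustrative only; note that the Static Memory Manager policy additionally requires reserved-memory settings, which are omitted here:

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  PodLevelResources: true
  PodLevelResourceManagers: true
topologyManagerPolicy: single-numa-node  # or best-effort / restricted
topologyManagerScope: pod                # or container
cpuManagerPolicy: static
memoryManagerPolicy: Static              # also needs reservedMemory, omitted here
```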
Monitoring and Observability
To help cluster operators keep an eye on these new allocation strategies, Kubernetes has rolled out several new metrics when this feature is enabled:
- `resource_manager_allocations_total`: Tracks the total number of exclusive resource allocations, distinguishing between pod-level and node-level allocation sources.
- `resource_manager_allocation_errors_total`: Counts errors during exclusive resource allocation, labeled with the intended allocation source.
- `resource_manager_container_assignments`: Shows how workload assignments are distributed among containers, broken down by assignment type.
Limitations to Keep in Mind
While pod-level resource management opens up exciting avenues, it’s still in the alpha stage. It's essential to familiarize yourself with the limitations and caveats outlined in the official documentation to understand compatibility and potential downgrades.
Next Steps and Feedback Channels
For further insight into this feature's technical aspects and configuration, as well as the broader pod-level resources feature and its allocation techniques, refer to the official Kubernetes documentation.
Your experiences and feedback on this new feature are invaluable as it progresses through its alpha phase. Share any issues or insights via standard Kubernetes channels:
- Join us on Slack: #sig-node
- Participate in the Mailing list
- Check out current community issues on GitHub.