Kubernetes v1.35: Job Managed By Goes GA
In Kubernetes v1.35, the ability to specify an external Job controller (through .spec.managedBy) graduates to General Availability.
This feature allows external controllers to take full responsibility for Job reconciliation, unlocking powerful scheduling patterns like multi-cluster dispatching with MultiKueue.
Why delegate Job reconciliation?
The primary motivation for this feature is to support multi-cluster batch scheduling architectures, such as MultiKueue.
The MultiKueue architecture distinguishes between a Management Cluster and a pool of Worker Clusters:
- The Management Cluster is responsible for dispatching Jobs but not executing them. It needs to accept Job objects to track status, but it skips the creation and execution of Pods.
- The Worker Clusters receive the dispatched Jobs and execute the actual Pods.
- Users usually interact with the Management Cluster. Because the status is automatically propagated back, they can observe the Job's progress "live" without accessing the Worker Clusters.
- In the Worker Clusters, the dispatched Jobs run as regular Jobs managed by the built-in Job controller, with no .spec.managedBy set.
By using .spec.managedBy, the MultiKueue controller on the Management Cluster can take over the reconciliation of a Job. It copies the status from the "mirror" Job running on the Worker Cluster back to the Management Cluster.
Why not just disable the Job controller? While one could theoretically achieve this by disabling the built-in Job controller entirely, this is often impossible or impractical for two reasons:
- Managed Control Planes: In many cloud environments, the Kubernetes control plane is locked, and users cannot modify controller manager flags.
- Hybrid Cluster Role: Users often need a "hybrid" mode where the Management Cluster dispatches some heavy workloads to remote clusters but still executes smaller or control-plane-related Jobs in the Management Cluster.
.spec.managedBy allows this granularity on a per-Job basis.
How .spec.managedBy works
The .spec.managedBy field indicates which controller is responsible for the Job. There are two modes of operation:
- Standard: if the field is unset or set to the reserved value kubernetes.io/job-controller, the built-in Job controller reconciles the Job as usual.
- Delegation: if the field is set to any other value, the built-in Job controller skips reconciliation of that Job entirely.
To prevent orphaned Pods or resource leaks, this field is immutable. You cannot transfer a running Job from one controller to another.
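For illustration, here is a minimal Job manifest that delegates reconciliation to an external controller. The managedBy value shown is the one MultiKueue uses; any non-reserved value designates an external controller, and the image is just a placeholder:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: sample-training
spec:
  managedBy: kueue.x-k8s.io/multikueue  # the built-in Job controller skips this Job
  parallelism: 2
  completions: 2
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: worker
        image: registry.k8s.io/pause:3.9
```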
If you are looking into implementing an external controller, be aware that your controller needs to conform to the definitions of the Job API. To enforce that conformance, a significant part of the effort went into introducing extensive Job status validation rules. See the How can you learn more? section for more details.
Ecosystem Adoption
The .spec.managedBy field is rapidly becoming the standard interface for delegating control in the Kubernetes batch ecosystem.
Various custom workload controllers are adding this field (or an equivalent) to allow MultiKueue to take over their reconciliation and orchestrate them across clusters.
While it is possible to use .spec.managedBy to implement a custom Job controller from scratch, we haven't observed that yet. The feature is specifically designed to support delegation patterns, like MultiKueue, without reinventing the wheel.
How can you learn more?
If you want to dig deeper:
Read the user-facing documentation for the Job API.
Deep dive into the design history:
- The Kubernetes Enhancement Proposal (KEP) for the Job's managed-by mechanism, including the introduction of the extensive Job status validation rules.
- The Kueue KEP for MultiKueue.
Explore how MultiKueue uses .spec.managedBy in practice in the task guide for running Jobs across clusters.
Acknowledgments
As with any Kubernetes feature, a lot of people helped shape this one through design discussions, reviews, test runs, and bug reports.
We would like to thank, in particular:
- Maciej Szulik - for guidance, mentorship, and reviews.
- Filip Křepinský - for guidance, mentorship, and reviews.
Get involved
This work was sponsored by the Kubernetes Batch Working Group in close collaboration with SIG Apps, and with strong input from the SIG Scheduling community.
If you are interested in batch scheduling, multi-cluster solutions, or further improving the Job API:
- Join us in the Batch WG and SIG Apps meetings.
- Subscribe to the WG Batch Slack channel.
Kubernetes v1.35: Timbernetes (The World Tree Release)
Editors: Aakanksha Bhende, Arujjwal Negi, Chad M. Crowell, Graziano Casto, Swathi Rao
Similar to previous releases, the release of Kubernetes v1.35 introduces new stable, beta, and alpha features. The consistent delivery of high-quality releases underscores the strength of our development cycle and the vibrant support from our community.
This release consists of 60 enhancements, including 17 stable, 19 beta, and 22 alpha features.
There are also some deprecations and removals in this release; make sure to read about those.
Release theme and logo
2025 began in the shimmer of Octarine: The Color of Magic (v1.33) and rode the gusts Of Wind & Will (v1.34). We close the year with our hands on the World Tree, inspired by Yggdrasil, the tree of life that binds many realms. Like any great tree, Kubernetes grows ring by ring and release by release, shaped by the care of a global community.
At its center sits the Kubernetes wheel wrapped around the Earth, grounded by the resilient maintainers, contributors and users who keep showing up. Between day jobs, life changes, and steady open-source stewardship, they prune old APIs, graft new features and keep one of the world’s largest open source projects healthy.
Three squirrels guard the tree: a wizard holding the LGTM scroll for reviewers, a warrior with an axe and Kubernetes shield for the release crews who cut new branches, and a rogue with a lantern for the triagers who bring light to dark issue queues.
Together, they stand in for a much larger adventuring party. Kubernetes v1.35 adds another growth ring to the World Tree, a fresh cut shaped by many hands, many paths and a community whose branches reach higher as its roots grow deeper.
Spotlight on key updates
Kubernetes v1.35 is packed with new features and improvements. Here are a few select updates the Release Team would like to highlight!
Stable: In-place update of Pod resources
Kubernetes has graduated in-place updates for Pod resources to General Availability (GA). This feature allows users to adjust CPU and memory resources without restarting Pods or Containers. Previously, such modifications required recreating Pods, which could disrupt workloads, particularly for stateful or batch applications. Earlier Kubernetes releases allowed you to change only infrastructure resource settings (requests and limits) for existing Pods. The new in-place functionality allows for smoother, nondisruptive vertical scaling, improves efficiency, and can also simplify development.
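As a quick illustration (names and values are placeholders), the per-container resizePolicy controls whether changing a given resource restarts the container:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: resize-demo
spec:
  containers:
  - name: app
    image: registry.k8s.io/pause:3.9
    resizePolicy:
    - resourceName: cpu
      restartPolicy: NotRequired      # CPU can be resized in place
    - resourceName: memory
      restartPolicy: RestartContainer # memory changes restart only this container
    resources:
      requests:
        cpu: "500m"
        memory: "256Mi"
      limits:
        cpu: "1"
        memory: "512Mi"
```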
This work was done as part of KEP #1287 led by SIG Node.
Beta: Pod certificates for workload identity and security
Previously, delivering certificates to pods required external controllers (cert-manager, SPIFFE/SPIRE), CRD orchestration, and Secret management, with rotation handled by sidecars or init containers. Kubernetes v1.35 enables native workload identity with automated certificate rotation, drastically simplifying service mesh and zero-trust architectures.
Now, the kubelet generates keys, requests certificates via PodCertificateRequest, and writes credential bundles directly to the Pod's filesystem. The kube-apiserver enforces node restriction at admission time, eliminating the most common pitfall for third-party signers: accidentally violating node isolation boundaries. This enables pure mTLS flows with no bearer tokens in the issuance path.
This work was done as part of KEP #4317 led by SIG Auth.
Alpha: Node declared features before scheduling
When control planes enable new features but nodes lag behind (permitted by Kubernetes skew policy), the scheduler can place pods requiring those features onto incompatible older nodes.
The node declared features framework allows nodes to declare their supported Kubernetes features. With the new alpha feature enabled, a Node reports the features it supports, publishing this information to the control plane via a new .status.declaredFeatures field. Then, the kube-scheduler, admission controllers, and third-party components can use these declarations. For example, you can enforce scheduling and API validation constraints to ensure that Pods run only on compatible nodes.
This work was done as part of KEP #5328 led by SIG Node.
Features graduating to Stable
This is a selection of some of the improvements that are now stable following the v1.35 release.
PreferSameNode traffic distribution
The trafficDistribution field for Services has been updated to provide more explicit control over traffic routing. A new option, PreferSameNode, has been introduced to let services strictly prioritize endpoints on the local node if available, falling back to remote endpoints otherwise.
Simultaneously, the existing PreferClose option has been renamed to PreferSameZone. This change makes the API self-explanatory by explicitly indicating that traffic is preferred within the current availability zone. While PreferClose is preserved for backward compatibility, PreferSameZone is now the standard for zonal routing, ensuring that both node-level and zone-level preferences are clearly distinguished.
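For example, a Service that prefers same-node endpoints might look like this (names are placeholders):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: local-first
spec:
  selector:
    app: demo
  ports:
  - port: 80
    targetPort: 8080
  trafficDistribution: PreferSameNode  # route to an endpoint on the same node when one exists
```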
This work was done as part of KEP #3015 led by SIG Network.
Job API managed-by mechanism
The Job API now includes a managedBy field that allows an external controller to handle Job status synchronization. This feature, which graduates to stable in Kubernetes v1.35, is primarily driven by MultiKueue, a multi-cluster dispatching system where a Job created in a management cluster is mirrored and executed in a worker cluster, with status updates propagated back. To enable this workflow, the built-in Job controller must not act on a particular Job resource so that the Kueue controller can manage status updates instead.
The goal is to allow clean delegation of Job synchronization to another controller. It does not aim to pass custom parameters to that controller or modify CronJob concurrency policies.
This work was done as part of KEP #4368 led by SIG Apps.
Reliable Pod update tracking with .metadata.generation
Historically, the Pod API lacked the metadata.generation field found in other Kubernetes objects such as Deployments.
Because of this omission, controllers and users had no reliable way to verify whether the kubelet had actually processed the latest changes to a Pod's specification. This ambiguity was particularly problematic for features like In-Place Pod Vertical Scaling, where it was difficult to know exactly when a resource resize request had been enacted.
Kubernetes v1.33 added .metadata.generation fields for Pods, as an alpha feature. That field is now stable in the v1.35 Pod API, which means that every time a Pod's spec is updated, the .metadata.generation value is incremented. As part of this improvement, the Pod API also gained a .status.observedGeneration field, which reports the generation that the kubelet has successfully seen and processed. Pod conditions also each contain their own individual observedGeneration field that clients can report and / or observe.
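An illustrative fragment of a Pod after its third spec update shows how the fields relate:

```yaml
metadata:
  generation: 3            # incremented by the API server on every spec change
status:
  observedGeneration: 3    # the latest generation the kubelet has observed and acted on
  conditions:
  - type: Ready
    status: "True"
    observedGeneration: 3  # each condition tracks the generation it describes
```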
Because this feature has graduated to stable in v1.35, it is available for all workloads.
This work was done as part of KEP #5067 led by SIG Node.
Configurable NUMA node limit for topology manager
The topology manager historically used a hard-coded limit of 8 for the maximum number of NUMA nodes it can support, preventing state explosion during affinity calculation. (There's an important detail here; a NUMA node is not the same as a Node in the Kubernetes API.) This limit on the number of NUMA nodes prevented Kubernetes from fully utilizing modern high-end servers, which increasingly feature CPU architectures with more than 8 NUMA nodes.
Kubernetes v1.31 introduced a new, beta max-allowable-numa-nodes option to the topology manager policy configuration. In Kubernetes v1.35, that option is stable. Cluster administrators who enable it can use servers with more than 8 NUMA nodes.
Although the configuration option is stable, the Kubernetes community is aware of the poor performance for large NUMA hosts, and there is a proposed enhancement (KEP-5726) that aims to improve on it. You can learn more about this by reading Control Topology Management Policies on a node.
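A sketch of the corresponding kubelet configuration, assuming a host with 12 NUMA nodes (the policy and value are examples):

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
topologyManagerPolicy: best-effort
topologyManagerPolicyOptions:
  max-allowable-numa-nodes: "12"   # raise the default cap of 8
```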
This work was done as part of KEP #4622 led by SIG Node.
New features in Beta
This is a selection of some of the improvements that are now beta following the v1.35 release.
Expose node topology labels via Downward API
Accessing node topology information, such as region and zone, from within a Pod has typically required querying the Kubernetes API server. While functional, this approach creates complexity and security risks by necessitating broad RBAC permissions or sidecar containers just to retrieve infrastructure metadata. Kubernetes v1.35 promotes the capability to expose node topology labels directly via the Downward API to beta.
The kubelet can now inject standard topology labels, such as topology.kubernetes.io/zone and topology.kubernetes.io/region, into Pods as environment variables or projected volume files. The primary benefit is a safer and more efficient way for workloads to be topology-aware. This allows applications to natively adapt to their availability zone or region without dependencies on the API server, strengthening security by upholding the principle of least privilege and simplifying cluster configuration.
Note: Kubernetes now injects available topology labels into every Pod so that they can be used as inputs to the downward API. After upgrading to v1.35, most cluster administrators will see several new labels added to each Pod; this is expected as part of the design.
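Because the labels are mirrored onto the Pod, a container can read them through the standard downward API. A minimal sketch (the environment variable name is arbitrary):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: zone-aware
spec:
  containers:
  - name: app
    image: registry.k8s.io/pause:3.9
    env:
    - name: NODE_ZONE
      valueFrom:
        fieldRef:
          fieldPath: "metadata.labels['topology.kubernetes.io/zone']"  # injected topology label
```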
This work was done as part of KEP #4742 led by SIG Node.
Native support for storage version migration
In Kubernetes v1.35, the native support for storage version migration graduates to beta and is enabled by default. This move integrates the migration logic directly into the core Kubernetes control plane ("in-tree"), eliminating the dependency on external tools.
Historically, administrators relied on manual "read/write loops"—often piping kubectl get into kubectl replace—to update schemas or re-encrypt data at rest. This method was inefficient and prone to conflicts, especially for large resources like Secrets. With this release, the built-in controller automatically handles update conflicts and consistency tokens, providing a safe, streamlined, and reliable way to ensure stored data remains current with minimal operational overhead.
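A migration is requested declaratively. The sketch below uses the alpha API version; the group version served by the beta may differ:

```yaml
apiVersion: storagemigration.k8s.io/v1alpha1
kind: StorageVersionMigration
metadata:
  name: secrets-migration
spec:
  resource:
    group: ""          # core API group
    version: v1
    resource: secrets  # rewrite all stored Secrets, e.g. after rotating encryption keys
```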
This work was done as part of KEP #4192 led by SIG API Machinery.
Mutable Volume attach limits
A CSI (Container Storage Interface) driver is a Kubernetes plugin that provides a consistent way for storage systems to be exposed to containerized workloads. The CSINode object records details about all CSI drivers installed on a node. However, a mismatch can arise between the reported and actual attachment capacity on nodes. When volume slots are consumed after a CSI driver starts up, the kube-scheduler may assign stateful pods to nodes without sufficient capacity, where they get stuck in a ContainerCreating state.
Kubernetes v1.35 makes CSINode.spec.drivers[*].allocatable.count mutable so that a node’s available volume attachment capacity can be updated dynamically. It also allows CSI drivers to control how frequently the allocatable.count value is updated on all nodes by introducing a configurable refresh interval, defined through the CSIDriver object. Additionally, it automatically updates CSINode.spec.drivers[*].allocatable.count on detecting a failure in volume attachment due to insufficient capacity. Although this feature graduated to beta in v1.34 with the MutableCSINodeAllocatableCount feature gate disabled by default, it remains in beta for v1.35 to allow time for feedback; the feature gate is now enabled by default.
This work was done as part of KEP #4876 led by SIG Storage.
Opportunistic batching
Historically, the Kubernetes scheduler has processed pods sequentially, with time complexity of O(num pods × num nodes), which can result in redundant computation for compatible pods. This enhancement introduces an opportunistic batching mechanism that aims to improve performance by identifying such compatible Pods via a Pod scheduling signature and batching them together, allowing filtering and scoring results to be shared across them.
The pod scheduling signature ensures that two pods with the same signature are “the same” from a scheduling perspective. It takes into account not only the pod and node attributes, but also the other pods in the system and global data about the pod placement. This means that any pod with the given signature will get the same scores/feasibility results from any arbitrary set of nodes.
The batching mechanism consists of two operations that can be invoked whenever needed: create and nominate. Create builds a new set of batch information from the scheduling results of Pods that have a valid signature. Nominate uses the batching information from create to set the nominated node name for a new Pod whose signature matches the canonical Pod’s signature.
This work was done as part of KEP #5598 led by SIG Scheduling.
maxUnavailable for StatefulSets
A StatefulSet runs a group of Pods and maintains a sticky identity for each of those Pods. This is critical for stateful workloads requiring stable network identifiers or persistent storage. When a StatefulSet's .spec.updateStrategy.type is set to RollingUpdate, the StatefulSet controller will delete and recreate each Pod in the StatefulSet. It will proceed in the same order as Pod termination (from the largest ordinal to the smallest), updating each Pod one at a time.
Kubernetes v1.24 added a new alpha field to a StatefulSet's rollingUpdate configuration settings, called maxUnavailable. That field wasn't part of the Kubernetes API unless your cluster administrator explicitly opted in.
In Kubernetes v1.35 that field is beta and is available by default. You can use it to define the maximum number of pods that can be unavailable during an update. This setting is most effective in combination with .spec.podManagementPolicy set to Parallel. You can set maxUnavailable as either a positive number (example: 2) or a percentage of the desired number of Pods (example: 10%). If this field is not specified, it will default to 1, to maintain the previous behavior of only updating one Pod at a time. This improvement allows stateful applications (that can tolerate more than one Pod being down) to finish updating faster.
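A sketch of the relevant part of a StatefulSet spec (names, image, and counts are placeholders):

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: web
  replicas: 6
  podManagementPolicy: Parallel        # pairs well with maxUnavailable
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: registry.k8s.io/pause:3.9
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 2                # update up to two Pods at a time
```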
This work was done as part of KEP #961 led by SIG Apps.
Configurable credential plugin policy in kuberc
The optional kuberc file is a way to separate server configurations and cluster credentials from user preferences without disrupting already running CI pipelines with unexpected outputs.
As part of the v1.35 release, kuberc gains additional functionality that allows users to configure a credential plugin policy. This change introduces two fields: credentialPluginPolicy, which allows or denies all plugins, and credentialPluginAllowlist, which specifies a list of allowed plugins.
This work was done as part of KEP #3104 as a cooperation between SIG Auth and SIG CLI.
KYAML
YAML is a human-readable format of data serialization. In Kubernetes, YAML files are used to define and configure resources, such as Pods, Services, and Deployments. However, complex YAML is difficult to read. YAML's significant whitespace requires careful attention to indentation and nesting, while its optional string-quoting can lead to unexpected type coercion (see: The Norway Bug). While JSON is an alternative, it lacks support for comments and has strict requirements for trailing commas and quoted keys.
KYAML is a safer and less ambiguous subset of YAML designed specifically for Kubernetes. Introduced as an opt-in alpha feature in v1.34, this feature graduated to beta in Kubernetes v1.35 and has been enabled by default. It can be disabled by setting the environment variable KUBECTL_KYAML=false.
KYAML addresses challenges pertaining to both YAML and JSON. All KYAML files are also valid YAML files. This means you can write KYAML and pass it as an input to any version of kubectl. This also means that you don’t need to write in strict KYAML for the input to be parsed.
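To give a feel for the format, here is a small illustrative ConfigMap in the KYAML style (explicit braces, always-quoted string values, trailing commas); it is also plain valid YAML. kubectl can emit this style via its kyaml output format, for example kubectl get configmap demo -o kyaml:

```yaml
{
  apiVersion: "v1",
  kind: "ConfigMap",
  metadata: {
    name: "demo",
    namespace: "default",
  },
  data: {
    country: "NO",   # always quoted, so no accidental boolean coercion
  },
}
```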
This work was done as part of KEP #5295 led by SIG CLI.
Configurable tolerance for HorizontalPodAutoscalers
The Horizontal Pod Autoscaler (HPA) has historically relied on a fixed, global 10% tolerance for scaling actions. A drawback of this hardcoded value was that workloads requiring high sensitivity, such as those needing to scale on a 5% load increase, were often blocked from scaling, while others might oscillate unnecessarily.
With Kubernetes v1.35, the configurable tolerance feature graduates to beta and is enabled by default. This enhancement allows users to define a custom tolerance window on a per-resource basis within the HPA behavior field. By setting a specific tolerance (e.g., lowering it to 0.05 for 5%), operators gain precise control over autoscaling sensitivity, ensuring that critical workloads react quickly to small metric changes, without requiring cluster-wide configuration adjustments.
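For example, an HPA that reacts to a 5% deviation when scaling up (the target and metrics are placeholders; tolerance is the new field):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 2
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  behavior:
    scaleUp:
      tolerance: "0.05"   # scale up on a 5% deviation instead of the global 10% default
```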
This work was done as part of KEP #4951 led by SIG Autoscaling.
Support for user namespaces in Pods
Kubernetes is adding support for user namespaces, allowing pods to run with isolated user and group ID mappings instead of sharing host IDs. This means containers can operate as root internally while actually being mapped to an unprivileged user on the host, reducing the risk of privilege escalation in the event of a compromise. The feature improves pod-level security and makes it safer to run workloads that need root inside the container. Over time, support has expanded to both stateless and stateful Pods through id-mapped mounts.
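Opting a Pod into a user namespace is a one-line change (the image is a placeholder):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: userns-demo
spec:
  hostUsers: false   # give this Pod its own user namespace; in-container root maps to an unprivileged host UID
  containers:
  - name: app
    image: registry.k8s.io/pause:3.9
```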
This work was done as part of KEP #127 led by SIG Node.
VolumeSource: OCI artifact and/or image
When creating a Pod, you often need to provide data, binaries, or configuration files for your containers. This traditionally meant baking the content into the main container image or using a custom init container to download and unpack files into an emptyDir. Both of these approaches are still valid. Kubernetes v1.31 added support for the image volume type, allowing Pods to declaratively pull and unpack OCI container image artifacts into a volume. This lets you package and deliver data-only artifacts such as configs, binaries, or machine learning models using standard OCI registry tools.
With this feature, you can fully separate your data from your container image and remove the need for extra init containers or startup scripts. The image volume type has been in beta since v1.33 and is enabled by default in v1.35. Please note that using this feature requires a compatible container runtime, such as containerd v2.1 or later.
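A minimal sketch of an image volume; the artifact reference is hypothetical:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: model-server
spec:
  containers:
  - name: app
    image: registry.k8s.io/pause:3.9
    volumeMounts:
    - name: model
      mountPath: /models
      readOnly: true
  volumes:
  - name: model
    image:
      reference: registry.example.com/models/demo:v1  # hypothetical data-only OCI artifact
      pullPolicy: IfNotPresent
```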
This work was done as part of KEP #4639 led by SIG Node.
Enforced kubelet credential verification for cached images
The imagePullPolicy: IfNotPresent setting currently allows a Pod to use a container image that is already cached on a node, even if the Pod itself does not possess the credentials to pull that image. A drawback of this behavior is that it creates a security vulnerability in multi-tenant clusters: if a Pod with valid credentials pulls a sensitive private image to a node, a subsequent unauthorized Pod on the same node can access that image simply by relying on the local cache.
This KEP introduces a mechanism where the kubelet enforces credential verification for cached images. Before allowing a Pod to use a locally cached image, the kubelet checks if the Pod has the valid credentials to pull it. This ensures that only authorized workloads can use private images, regardless of whether they are already present on the node, significantly hardening the security posture for shared clusters.
In Kubernetes v1.35, this feature has graduated to beta and is enabled by default. Users can still disable it by setting the KubeletEnsureSecretPulledImages feature gate to false. Additionally, the imagePullCredentialsVerificationPolicy flag allows operators to configure the desired security level, ranging from a mode that prioritizes backward compatibility to a strict enforcement mode that offers maximum security.
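A hedged sketch of the corresponding kubelet configuration; the policy value shown comes from the enhancement proposal and may differ in your release:

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
imagePullCredentialsVerificationPolicy: NeverVerifyPreloadedImages  # skip checks only for images preloaded onto the node
```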
This work was done as part of KEP #2535 led by SIG Node.
Fine-grained Container restart rules
Historically, the restartPolicy field was defined strictly at the Pod level, forcing the same behavior on all containers within a Pod. A drawback of this global setting was the lack of granularity for complex workloads, such as AI/ML training jobs. These often required restartPolicy: Never for the Pod to manage job completion, yet individual containers would benefit from in-place restarts for specific, retriable errors (like network glitches or GPU init failures).
Kubernetes v1.35 addresses this by enabling restartPolicy and restartPolicyRules within the container API itself. This allows users to define restart strategies for individual regular and init containers that operate independently of the Pod's overall policy. For example, a container can now be configured to restart automatically only if it exits with a specific error code, avoiding the expensive overhead of rescheduling the entire Pod for a transient failure.
In this release, the feature has graduated to beta and is enabled by default. Users can immediately leverage restartPolicyRules in their container specifications to optimize recovery times and resource utilization for long-running workloads, without altering the broader lifecycle logic of their Pods.
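A sketch of a container that is restarted in place only on a specific, retriable exit code (the code 42 is arbitrary):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: trainer
spec:
  restartPolicy: Never            # the Pod itself is not restarted by this policy
  containers:
  - name: worker
    image: registry.k8s.io/pause:3.9
    restartPolicy: Never
    restartPolicyRules:
    - action: Restart
      exitCodes:
        operator: In
        values: [42]              # restart this container in place only for this exit code
```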
This work was done as part of KEP #5307 led by SIG Node.
CSI driver opt-in for service account tokens via secrets field
Providing ServiceAccount tokens to Container Storage Interface (CSI) drivers has traditionally relied on injecting them into the volume_context field. This approach presents a significant security risk because volume_context is intended for non-sensitive configuration data and is frequently logged in plain text by drivers and debugging tools, potentially leaking credentials.
Kubernetes v1.35 introduces an opt-in mechanism for CSI drivers to receive ServiceAccount tokens via the dedicated secrets field in the NodePublishVolume request. Drivers can now enable this behavior by setting the serviceAccountTokenInSecrets field to true in their CSIDriver object, instructing the kubelet to populate the token securely.
The primary benefit is the prevention of accidental credential exposure in logs and error messages. This change ensures that sensitive workload identities are handled via the appropriate secure channels, aligning with best practices for secret management while maintaining backward compatibility for existing drivers.
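A hedged sketch of a CSIDriver object opting in; the driver name is hypothetical and the new field name is taken from the enhancement description:

```yaml
apiVersion: storage.k8s.io/v1
kind: CSIDriver
metadata:
  name: example.csi.vendor.io
spec:
  tokenRequests:
  - audience: "example-audience"      # request a ServiceAccount token for this audience
  serviceAccountTokenInSecrets: true  # deliver the token via the NodePublishVolume secrets field
```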
This work was done as part of KEP #5538 led by SIG Auth in cooperation with SIG Storage.
Deployment status: count of terminating replicas
Historically, the Deployment status provided details on available and updated replicas but lacked explicit visibility into Pods that were in the process of shutting down. A drawback of this omission was that users and controllers could not easily distinguish between a stable Deployment and one that still had Pods executing cleanup tasks or adhering to long grace periods.
Kubernetes v1.35 promotes the terminatingReplicas field within the Deployment status to beta. This field provides a count of Pods that have a deletion timestamp set but have not yet been removed from the system. This feature is a foundational step in a larger initiative to improve how Deployments handle Pod replacement, laying the groundwork for future policies regarding when to create new Pods during a rollout.
The primary benefit is improved observability for lifecycle management tools and operators. By exposing the number of terminating Pods, external systems can now make more informed decisions such as waiting for a complete shutdown before proceeding with subsequent tasks without needing to manually query and filter individual Pod lists.
This work was done as part of KEP #3973 led by SIG Apps.
New features in Alpha
This is a selection of some of the improvements that are now alpha following the v1.35 release.
Gang scheduling support in Kubernetes
Scheduling interdependent workloads, such as AI/ML training jobs or HPC simulations, has traditionally been challenging because the default Kubernetes scheduler places Pods individually. This often leads to partial scheduling where some Pods start while others wait indefinitely for resources, resulting in deadlocks and wasted cluster capacity.
Kubernetes v1.35 introduces native support for so-called "gang scheduling" via the new Workload API and PodGroup concept. This feature implements an "all-or-nothing" scheduling strategy, ensuring that a defined group of Pods is scheduled only if the cluster has sufficient resources to accommodate the entire group simultaneously.
The primary benefit is improved reliability and efficiency for batch and parallel workloads. By preventing partial deployments, it eliminates resource deadlocks and ensures that expensive cluster capacity is utilized only when a complete job can run, significantly optimizing the orchestration of large-scale data processing tasks.
This work was done as part of KEP #4671 led by SIG Scheduling.
Constrained impersonation
Historically, the impersonate verb in Kubernetes RBAC functioned on an all-or-nothing basis: once a user was authorized to impersonate a target identity, they gained all associated permissions. A drawback of this broad authorization was that it violated the principle of least privilege, preventing administrators from restricting impersonators to specific actions or resources.
Kubernetes v1.35 introduces a new alpha feature, constrained impersonation, which adds a secondary authorization check to the impersonation flow. When enabled via the ConstrainedImpersonation feature gate, the API server verifies not only the basic impersonate permission but also checks if the impersonator is authorized for the specific action using new verb prefixes (e.g., impersonate-on:<mode>:<verb>). This allows administrators to define fine-grained policies—such as permitting a support engineer to impersonate a cluster admin solely to view logs, without granting full administrative access.
This work was done as part of KEP #5284 led by SIG Auth.
Flagz for Kubernetes components
Verifying the runtime configuration of Kubernetes components, such as the API server or kubelet, has traditionally required privileged access to the host node or process arguments. To address this, the /flagz endpoint was introduced to expose command-line options via HTTP. However, its output was initially limited to plain text, making it difficult for automated tools to parse and validate configurations reliably.
In Kubernetes v1.35, the /flagz endpoint has been enhanced to support structured, machine-readable JSON output. Authorized users can now request a versioned JSON response using standard HTTP content negotiation, while the original plain text format remains available for human inspection. This update significantly improves observability and compliance workflows, allowing external systems to programmatically audit component configurations without fragile text parsing or direct infrastructure access.
This work was done as part of KEP #4828 led by SIG Instrumentation.
Statusz for Kubernetes components
Troubleshooting Kubernetes components like the kube-apiserver or kubelet has traditionally involved parsing unstructured logs or text output, which is brittle and difficult to automate. While a basic /statusz endpoint existed previously, it lacked a standardized, machine-readable format, limiting its utility for external monitoring systems.
In Kubernetes v1.35, the /statusz endpoint has been enhanced to support structured, machine-readable JSON output. Authorized users can now request this format using standard HTTP content negotiation to retrieve precise status data—such as version information and health indicators—without relying on fragile text parsing. This improvement provides a reliable, consistent interface for automated debugging and observability tools across all core components.
This work was done as part of KEP #4827 led by SIG Instrumentation.
CCM: watch-based route controller reconciliation using informers
Managing network routes within cloud environments has traditionally relied on the Cloud Controller Manager (CCM) periodically polling the cloud provider's API to verify and update route tables. This fixed-interval reconciliation approach can be inefficient, often generating a high volume of unnecessary API calls and introducing latency between a node state change and the corresponding route update.
For the Kubernetes v1.35 release, the cloud-controller-manager library introduces a watch-based reconciliation strategy for the route controller. Instead of relying on a timer, the controller now utilizes informers to watch for specific Node events, such as additions, deletions, or relevant field updates and triggers route synchronization only when a change actually occurs.
The primary benefit is a significant reduction in cloud provider API usage, which lowers the risk of hitting rate limits and reduces operational overhead. Additionally, this event-driven model improves the responsiveness of the cluster's networking layer by ensuring that route tables are updated immediately following changes in cluster topology.
This work was done as part of KEP #5237 led by SIG Cloud Provider.
Extended toleration operators for threshold-based placement
Kubernetes v1.35 introduces SLA-aware scheduling by enabling workloads to express reliability requirements. The feature adds numeric comparison operators to tolerations, allowing pods to match or avoid nodes based on SLA-oriented taints such as service guarantees or fault-domain quality.
The primary benefit is enhancing the scheduler with more precise placement. Critical workloads can demand higher-SLA nodes, while lower priority workloads can opt into lower SLA ones. This improves utilization and reduces cost without compromising reliability.
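A hedged sketch of the alpha API; the taint key and threshold are examples:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: critical-workload
spec:
  containers:
  - name: app
    image: registry.k8s.io/pause:3.9
  tolerations:
  - key: example.com/sla
    operator: Gt         # tolerate the taint only when its numeric value exceeds 950
    value: "950"
    effect: NoExecute    # evict if the node's value no longer satisfies the comparison
```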
This work was done as part of KEP #5471 led by SIG Scheduling.
Mutable container resources when Job is suspended
Running batch workloads often involves trial and error with resource limits. Currently, the Job specification is immutable, meaning that if a Job fails due to an Out of Memory (OOM) error or insufficient CPU, the user cannot simply adjust the resources; they must delete the Job and create a new one, losing the execution history and status.
Kubernetes v1.35 introduces the capability to update resource requests and limits for Jobs that are in a suspended state. Enabled via the MutableJobPodResourcesForSuspendedJobs feature gate, this enhancement allows users to pause a failing Job, modify its Pod template with appropriate resource values, and then resume execution with the corrected configuration.
The primary benefit is a smoother recovery workflow for misconfigured jobs. By allowing in-place corrections during suspension, users can resolve resource bottlenecks without disrupting the Job's lifecycle identity or losing track of its completion status, significantly improving the developer experience for batch processing.
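A sketch of the workflow under the assumed feature gate: suspend the Job, correct the Pod template resources, then resume by setting suspend back to false:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: tune-me
spec:
  suspend: true                 # while suspended, the resource values below can be corrected
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: worker
        image: registry.k8s.io/pause:3.9
        resources:
          requests:
            memory: "2Gi"       # raised after the previous run hit an OOM error
```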
This work was done as part of KEP #5440 led by SIG Apps.
Other notable changes
Continued innovation in Dynamic Resource Allocation (DRA)
The core functionality graduated to stable in v1.34, with the ability to turn it off; in v1.35 it is always enabled. Several alpha features have also been significantly improved and are ready for testing. We encourage users to provide feedback on these capabilities to help clear the path for their promotion to beta in upcoming releases.
Extended Resource Requests via DRA
Several functional gaps compared to Extended Resource requests via Device Plugins were addressed, for example scoring and reuse of devices in init containers.
Device Taints and Tolerations
The new "None" effect can be used to report a problem without immediately affecting scheduling or running pod. DeviceTaintRule now provides status information about an ongoing eviction. The "None" effect can be used for a "dry run" before actually evicting pods:
- Create DeviceTaintRule with "effect: None".
- Check the status to see how many pods would be evicted.
- Replace "effect: None" with "effect: NoExecute".
Partitionable Devices
Devices belonging to the same partitionable device may now be defined in different ResourceSlices. You can read more in the official documentation.
Consumable Capacity, Device Binding Conditions
Several bugs were fixed and/or more tests added. You can learn more about Consumable Capacity and Binding Conditions in the official documentation.
Comparable resource version semantics
Kubernetes v1.35 changes the way that clients are allowed to interpret resource versions.
Before v1.35, the only supported comparison that clients could make was to check for string equality: if two resource versions were equal, they were the same. Clients could also provide a resource version to the API server and ask the control plane to do internal comparisons, such as streaming all events since a particular resource version.
In v1.35, all in-tree resource versions meet a new stricter definition: the values are a special form of decimal number. And, because they can be compared, clients can do their own operations to compare two different resource versions. For example, this means that a client reconnecting after a crash can detect when it has lost updates, as distinct from the case where there has been an update but no lost changes in the meantime.
This change in semantics enables other important use cases such as storage version migration, performance improvements to informers (a client helper concept), and controller reliability. All of those cases require knowing whether one resource version is newer than another.
This work was done as part of KEP #5504 led by SIG API Machinery.
Graduations, deprecations, and removals in v1.35
Graduations to stable
This lists all the features that graduated to stable (also known as general availability). For a full list of updates including new features and graduations from alpha to beta, see the release notes.
This release includes a total of 15 enhancements promoted to stable:
- Add CPUManager policy option to restrict reservedSystemCPUs to system daemons and interrupt processing
- Pod Generation
- Invariant Testing
- In-Place Update of Pod Resources
- Fine-grained SupplementalGroups control
- Add support for a drop-in kubelet configuration directory
- Remove gogo protobuf dependency for Kubernetes API types
- kubelet image GC after a maximum age
- Kubelet limit of Parallel Image Pulls
- Add a TopologyManager policy option for MaxAllowableNUMANodes
- Include kubectl command metadata in http request headers
- PreferSameNode Traffic Distribution (formerly PreferLocal traffic policy / Node-level topology)
- Job API managed-by mechanism
- Transition from SPDY to WebSockets
Deprecations, removals and community updates
As Kubernetes develops and matures, features may be deprecated, removed, or replaced with better ones to improve the project's overall health. See the Kubernetes deprecation and removal policy for more details on this process. Kubernetes v1.35 includes a couple of deprecations.
Ingress NGINX retirement
For years, the Ingress NGINX controller has been a popular choice for routing traffic into Kubernetes clusters. It was flexible, widely adopted, and served as the standard entry point for countless applications.
However, maintaining the project has become unsustainable. With a severe shortage of maintainers and mounting technical debt, the community recently made the difficult decision to retire it. This isn't strictly part of the v1.35 release, but it's such an important change that we wanted to highlight it here.
Consequently, the Kubernetes project announced that Ingress NGINX will receive only best-effort maintenance until March 2026. After this date, it will be archived with no further updates. The recommended path forward is to migrate to the Gateway API, which offers a more modern, secure, and extensible standard for traffic management.
You can find more in the official blog post.
Removal of cgroup v1 support
When it comes to managing resources on Linux nodes, Kubernetes has historically relied on cgroups (control groups). While the original cgroup v1 was functional, it was often inconsistent and limited. That is why Kubernetes introduced support for cgroup v2 back in v1.25, offering a much cleaner, unified hierarchy and better resource isolation.
Because cgroup v2 is now the modern standard, Kubernetes is ready to retire the legacy cgroup v1 support in v1.35. This is an important notice for cluster administrators: if you are still running nodes on older Linux distributions that don't support cgroup v2, your kubelet will fail to start. To avoid downtime, you will need to migrate those nodes to systems where cgroup v2 is enabled.
To learn more, read about cgroup v2;
you can also track the switchover work via KEP-5573: Remove cgroup v1 support.
Deprecation of ipvs mode in kube-proxy
Years ago, Kubernetes adopted the ipvs mode in kube-proxy to provide faster load balancing than the standard iptables. While it offered a performance boost, keeping it in sync with evolving networking requirements created too much technical debt and complexity.
Because of this maintenance burden, Kubernetes v1.35 deprecates ipvs mode. Although the mode remains available in this release, kube-proxy will now emit a warning on startup when configured to use it. The goal is to streamline the codebase and focus on modern standards. For Linux nodes, you should begin transitioning to nftables, which is now the recommended replacement.
You can find more in KEP-5495: Deprecate ipvs mode in kube-proxy.
Final call for containerd v1.X
While Kubernetes v1.35 still supports containerd 1.7 and other LTS releases, this is the final version with such support. The SIG Node community has designated v1.35 as the last release to support the containerd v1.X series.
This serves as an important reminder: before upgrading to the next Kubernetes version, you must switch to containerd 2.0 or later. To help identify which nodes need attention, you can monitor the kubelet_cri_losing_support metric within your cluster.
You can find more in the official blog post or in KEP-4033: Discover cgroup driver from CRI.
Improved Pod stability during kubelet restarts
Previously, restarting the kubelet service often caused a temporary disruption in Pod status. During a restart, the kubelet would reset container states, causing healthy Pods to be marked as NotReady and removed from load balancers, even if the application itself was still running correctly.
To address this reliability issue, this behavior has been corrected to ensure seamless node maintenance. The kubelet now properly restores the state of existing containers from the runtime upon startup. This ensures that your workloads remain Ready and traffic continues to flow uninterrupted during kubelet restarts or upgrades.
You can find more in KEP-4781: Fix inconsistent container ready state after kubelet restart.
Release notes
Check out the full details of the Kubernetes v1.35 release in our release notes.
Availability
Kubernetes v1.35 is available for download on GitHub or on the Kubernetes download page.
To get started with Kubernetes, check out these interactive tutorials or run local Kubernetes clusters using minikube. You can also easily install v1.35 using kubeadm.
Release team
Kubernetes is only possible with the support, commitment, and hard work of its community. Each release team is made up of dedicated community volunteers who work together to build the many pieces that make up the Kubernetes releases you rely on. This requires the specialized skills of people from all corners of our community, from the code itself to its documentation and project management.
We honor the memory of Han Kang, a long-time contributor and respected engineer whose technical excellence and infectious enthusiasm left a lasting impact on the Kubernetes community. Han was a significant force within SIG Instrumentation and SIG API Machinery, earning a 2021 Kubernetes Contributor Award for his critical work and sustained commitment to the project's core stability. Beyond his technical contributions, Han was deeply admired for his generosity as a mentor and his passion for building connections among people. He was known for "opening doors" for others, whether guiding new contributors through their first pull requests or supporting colleagues with patience and kindness. Han’s legacy lives on through the engineers he inspired, the robust systems he helped build, and the warm, collaborative spirit he fostered within the cloud native ecosystem.
We would like to thank the entire Release Team for the hours spent hard at work to deliver the Kubernetes v1.35 release to our community. The Release Team's membership ranges from first-time shadows to returning team leads with experience forged over several release cycles. We are incredibly grateful to our Release Lead, Drew Hagen, whose hands-on guidance and vibrant energy not only navigated us through complex challenges but also fueled the community spirit behind this successful release.
Project velocity
The CNCF K8s DevStats project aggregates a number of interesting data points related to the velocity of Kubernetes and various sub-projects. This includes everything from individual contributions to the number of companies that are contributing and is an illustration of the depth and breadth of effort that goes into evolving this ecosystem.
During the v1.35 release cycle, which spanned 14 weeks from 15th September 2025 to 17th December 2025, Kubernetes received contributions from as many as 85 different companies and 419 individuals. In the wider cloud native ecosystem, the figure goes up to 281 companies, counting 1769 total contributors.
Note that "contribution" counts when someone makes a commit, code review, comment, creates an issue or PR, reviews a PR (including blogs and documentation) or comments on issues and PRs.
If you are interested in contributing, visit Getting Started on our contributor website.
Sources for this data:
Events update
Explore upcoming Kubernetes and cloud native events, including KubeCon + CloudNativeCon, KCD, and other notable conferences worldwide. Stay informed and get involved with the Kubernetes community!
February 2026
- KCD - Kubernetes Community Days: New Delhi: Feb 21, 2026 | New Delhi, India
- KCD - Kubernetes Community Days: Guadalajara: Feb 23, 2026 | Guadalajara, Mexico
March 2026
- KubeCon + CloudNativeCon Europe 2026: Mar 23-26, 2026 | Amsterdam, Netherlands
May 2026
- KCD - Kubernetes Community Days: Toronto: May 13, 2026 | Toronto, Canada
- KCD - Kubernetes Community Days: Helsinki: May 20, 2026 | Helsinki, Finland
June 2026
- KubeCon + CloudNativeCon China 2026: Jun 10-11, 2026 | Hong Kong
- KubeCon + CloudNativeCon India 2026: Jun 18-19, 2026 | Mumbai, India
- KCD - Kubernetes Community Days: Kuala Lumpur: Jun 27, 2026 | Kuala Lumpur, Malaysia
July 2026
- KubeCon + CloudNativeCon Japan 2026: Jul 29-30, 2026 | Yokohama, Japan
You can find the latest event details here.
Upcoming release webinar
Join members of the Kubernetes v1.35 Release Team on Wednesday, January 14, 2026, at 5:00 PM (UTC) to learn about the release highlights of this release. For more information and registration, visit the event page on the CNCF Online Programs site.
Get involved
The simplest way to get involved with Kubernetes is by joining one of the many Special Interest Groups (SIGs) that align with your interests. Have something you’d like to broadcast to the Kubernetes community? Share your voice at our weekly community meeting, and through the channels below. Thank you for your continued feedback and support.
- Follow us on Bluesky @Kubernetesio for the latest updates
- Join the community discussion on Discuss
- Join the community on Slack
- Post questions (or answer questions) on Stack Overflow
- Share your Kubernetes story
- Read more about what’s happening with Kubernetes on the blog
- Learn more about the Kubernetes Release Team
Kubernetes v1.35 Sneak Peek
As the release of Kubernetes v1.35 approaches, the Kubernetes project continues to evolve. Features may be deprecated, removed, or replaced to improve the project's overall health. This blog post outlines planned changes for the v1.35 release that the release team believes you should be aware of to ensure the continued smooth operation of your Kubernetes cluster(s), and to keep you up to date with the latest developments. The information below is based on the current status of the v1.35 release and is subject to change before the final release date.
Deprecations and removals for Kubernetes v1.35
cgroup v1 support
On Linux nodes, container runtimes typically rely on cgroups (short for "control groups").
Support for using cgroup v2 has been stable in Kubernetes since v1.25, providing an alternative to the original v1 cgroup support. While cgroup v1 provided the initial resource control mechanism, it suffered from well-known
inconsistencies and limitations. Adding support for cgroup v2 allowed use of a unified control group hierarchy, improved resource isolation, and served as the foundation for modern features, making legacy cgroup v1 support ready for removal.
The removal of cgroup v1 support will only impact cluster administrators running nodes on older Linux distributions that do not support cgroup v2; on those nodes, the kubelet will fail to start. Administrators must migrate their nodes to systems with cgroup v2 enabled. More details on compatibility requirements will be available in a blog post soon after the v1.35 release.
To learn more, read about cgroup v2;
you can also track the switchover work via KEP-5573: Remove cgroup v1 support.
Deprecation of ipvs mode in kube-proxy
Many releases ago, the Kubernetes project implemented an ipvs mode in kube-proxy. It was adopted as a way to provide high-performance service load balancing, with better performance than the existing iptables mode. However, maintaining feature parity between ipvs and other kube-proxy modes became difficult, due to technical complexity and diverging requirements. This created significant technical debt and made the ipvs backend impractical to support alongside newer networking capabilities.
The Kubernetes project intends to deprecate kube-proxy ipvs mode in the v1.35 release, to streamline the kube-proxy codebase. For Linux nodes, the recommended kube-proxy mode is already nftables.
You can find more in KEP-5495: Deprecate ipvs mode in kube-proxy
Kubernetes is deprecating containerd v1.y support
While Kubernetes v1.35 still supports containerd 1.7 and other LTS releases of containerd, as a consequence of automated cgroup driver detection, the Kubernetes SIG Node community has formally agreed upon a final support timeline for containerd v1.X. Kubernetes v1.35 is the last release to offer this support (aligned with containerd 1.7 EOL).
This is a final warning that if you are using containerd 1.X, you must switch to 2.0 or later before upgrading Kubernetes to the next version. You are able to monitor the kubelet_cri_losing_support metric to determine if any nodes in your cluster are using a containerd version that will soon be unsupported.
You can find more in the official blog post or in KEP-4033: Discover cgroup driver from CRI
Featured enhancements of Kubernetes v1.35
The following enhancements are some of those likely to be included in the v1.35 release. This is not a commitment, and the release content is subject to change.
Node declared features
When scheduling Pods, Kubernetes uses node labels, taints, and tolerations to match workload requirements with node capabilities. However, managing feature compatibility becomes challenging during cluster upgrades due to version skew between the control plane and nodes. This can lead to Pods being scheduled on nodes that lack required features, resulting in runtime failures.
The node declared features framework will introduce a standard mechanism for nodes to declare their supported Kubernetes features. With the new alpha feature enabled, a Node reports the features it can support, publishing this information to the control plane through a new .status.declaredFeatures field. Then, the kube-scheduler, admission controllers and third-party components can use these declarations. For example, you can enforce scheduling and API validation constraints, ensuring that Pods run only on compatible nodes.
This approach reduces manual node labeling, improves scheduling accuracy, and prevents incompatible pod placements proactively. It also integrates with the Cluster Autoscaler for informed scale-up decisions. Feature declarations are temporary and tied to Kubernetes feature gates, enabling safe rollout and cleanup.
Targeting alpha in v1.35, node declared features aims to solve version skew scheduling issues by making node capabilities explicit, enhancing reliability and cluster stability in heterogeneous version environments.
To learn more about this before the official documentation is published, you can read KEP-5328.
In-place update of Pod resources
Kubernetes is graduating in-place updates for Pod resources to General Availability (GA). This feature allows users to adjust cpu and memory resources without restarting Pods or Containers. Previously, such modifications required recreating Pods, which could disrupt workloads, particularly for stateful or batch applications.
Previous Kubernetes releases already allowed you to change infrastructure resource settings (requests and limits) for existing Pods. This allows for smoother vertical scaling, improves efficiency, and can also simplify solution development.
The Container Runtime Interface (CRI) has also been improved, extending the UpdateContainerResources API for Windows and future runtimes while allowing ContainerStatus to report real-time resource configurations. Together, these changes make scaling in Kubernetes faster, more flexible, and disruption-free.
The feature was introduced as alpha in v1.27, graduated to beta in v1.33, and is targeting graduation to stable in v1.35.
You can find more in KEP-1287: In-place Update of Pod Resources
Pod certificates
When running microservices, Pods often require a strong cryptographic identity to authenticate with each other using mutual TLS (mTLS). While Kubernetes provides Service Account tokens, these are designed for authenticating to the API server, not for general-purpose workload identity.
Before this enhancement, operators had to rely on complex, external projects like SPIFFE/SPIRE or cert-manager to provision and rotate certificates for their workloads. But what if you could issue a unique, short-lived certificate to your Pods natively and automatically? KEP-4317 is designed to enable such native workload identity. It opens up various possibilities for securing pod-to-pod communication by allowing the kubelet to request and mount certificates for a Pod via a projected volume.
This provides a built-in mechanism for workload identity, complete with automated certificate rotation, significantly simplifying the setup of service meshes and other zero-trust network policies. This feature was introduced as alpha in v1.34 and is targeting beta in v1.35.
You can find more in KEP-4317: Pod Certificates
Numeric values for taints
Kubernetes is enhancing taints and tolerations by adding numeric comparison operators, such as Gt (Greater Than) and Lt (Less Than).
Previously, tolerations supported only exact (Equal) or existence (Exists) matches, which were not suitable for numeric properties such as reliability SLAs.
With this change, a Pod can use a toleration to "opt-in" to nodes that meet a specific numeric threshold. For example, a Pod can require a Node with an SLA taint value greater than 950 (operator: Gt, value: "950").
This approach is more powerful than Node Affinity because it supports the NoExecute effect, allowing Pods to be automatically evicted if a node's numeric value drops below the tolerated threshold.
You can find more in KEP-5471: Enable SLA-based Scheduling
User namespaces
When running Pods, you can use securityContext to drop privileges, but containers inside the pod often still run as root (UID 0). This simplicity poses a significant challenge, as that container UID 0 maps directly to the host's root user.
Before this enhancement, a container breakout vulnerability could grant an attacker full root access to the node. But what if you could dynamically remap the container's root user to a safe, unprivileged user on the host? KEP-127 specifically allows such native support for Linux User Namespaces. It opens up various possibilities for pod security by isolating container and host user/group IDs. This allows a process to have root privileges (UID 0) within its namespace, while running as a non-privileged, high-numbered UID on the host.
Released as alpha in v1.25 and beta in v1.30, this feature continues to progress through beta maturity, paving the way for truly "rootless" containers that drastically reduce the attack surface for a whole class of security vulnerabilities.
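Opting in is a one-line change in the Pod spec; a minimal sketch, assuming the feature gate and a compatible runtime are enabled:
apiVersion: v1
kind: Pod
metadata:
  name: userns-demo   # illustrative name
spec:
  hostUsers: false    # run the Pod in a new user namespace; UID 0 inside maps to an unprivileged host UID
  containers:
  - name: app
    image: registry.k8s.io/pause:3.9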
You can find more in KEP-127: User Namespaces
Support for mounting OCI images as volumes
When provisioning a Pod, you often need to bundle data, binaries, or configuration files for your containers.
Before this enhancement, people often included that kind of data directly into the main container image, or required a custom init container to download and unpack files into an emptyDir. You can still take either of those approaches, of course.
But what if you could populate a volume directly from a data-only artifact in an OCI registry, just like pulling a container image? Kubernetes v1.31 added support for the image volume type, allowing Pods to pull and unpack OCI container image artifacts into a volume declaratively.
This allows for seamless distribution of data, binaries, or ML models using standard registry tooling, completely decoupling data from the container image and eliminating the need for complex init containers or startup scripts. This volume type has been in beta since v1.33 and will likely be enabled by default in v1.35.
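A minimal sketch of what this looks like in a Pod spec (the artifact reference is hypothetical):
apiVersion: v1
kind: Pod
metadata:
  name: image-volume-demo   # illustrative name
spec:
  containers:
  - name: app
    image: nginx
    volumeMounts:
    - name: model-data
      mountPath: /data
      readOnly: true
  volumes:
  - name: model-data
    image:
      reference: example.com/artifacts/model:v1   # hypothetical data-only OCI artifact
      pullPolicy: IfNotPresent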
You can try out the beta version of image volumes, or you can learn more about the plans from KEP-4639: OCI Volume Source.
Want to know more?
New features and deprecations are also announced in the Kubernetes release notes. We will formally announce what's new in Kubernetes v1.35 as part of the CHANGELOG for that release.
The Kubernetes v1.35 release is planned for December 17, 2025. Stay tuned for updates!
You can also see the announcements of changes in the release notes for:
Get involved
The simplest way to get involved with Kubernetes is by joining one of the many Special Interest Groups (SIGs) that align with your interests. Have something you’d like to broadcast to the Kubernetes community? Share your voice at our weekly community meeting, and through the channels below. Thank you for your continued feedback and support.
- Follow us on Bluesky @kubernetes.io for the latest updates
- Join the community discussion on Discuss
- Join the community on Slack
- Post questions (or answer questions) on Server Fault or Stack Overflow
- Share your Kubernetes story
- Read more about what’s happening with Kubernetes on the blog
- Learn more about the Kubernetes Release Team
Kubernetes Configuration Good Practices
Configuration is one of those things in Kubernetes that seems small until it's not. Configuration is at the heart of every Kubernetes workload. A missing quote, a wrong API version or a misplaced YAML indent can ruin your entire deploy.
This blog brings together tried-and-tested configuration best practices. The small habits that make your Kubernetes setup clean, consistent and easier to manage. Whether you are just starting out or already deploying apps daily, these are the little things that keep your cluster stable and your future self sane.
This blog is inspired by the original Configuration Best Practices page, which has evolved through contributions from many members of the Kubernetes community.
General configuration practices
Use the latest stable API version
Kubernetes evolves fast. Older APIs eventually get deprecated and stop working. So, whenever you are defining resources, make sure you are using the latest stable API version. You can always check with
kubectl api-resources
This simple step saves you from future compatibility issues.
Store configuration in version control
Never apply manifest files directly from your desktop. Always keep them in a version control system like Git; it's your safety net. If something breaks, you can instantly roll back to a previous commit, compare changes or recreate your cluster setup without panic.
Write configs in YAML not JSON
Write your configuration files using YAML rather than JSON. Both work technically, but YAML is just easier for humans: it's cleaner to read, less noisy, and widely used in the community.
YAML has some sneaky gotchas with boolean values:
- Use only true or false.
- Don't write yes, no, on or off.
They might work in one version of YAML but break in another. To be safe, quote anything that looks like a Boolean (for example "yes").
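A quick illustration (the key names are made up):
# Safe: explicit booleans and quoted strings
enabled: true
autoRestart: false
featureFlag: "yes"   # quoted, so it stays the string "yes" rather than becoming a boolean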
Keep configuration simple and minimal
Avoid setting default values that are already handled by Kubernetes. Minimal manifests are easier to debug, cleaner to review and less likely to break things later.
Group related objects together
If your Deployment, Service and ConfigMap all belong to one app, put them in a single manifest file.
It's easier to track changes and apply them as a unit.
See the Guestbook all-in-one.yaml file for an example of this syntax.
You can even apply entire directories with:
kubectl apply -f configs/
One command and, boom, everything in that folder gets deployed.
Add helpful annotations
Manifest files are not just for machines, they are for humans too. Use annotations to describe why something exists or what it does. A quick one-liner can save hours when debugging later and also allows better collaboration.
The most helpful annotation to set is kubernetes.io/description. It's like using a comment, except that it gets copied into the API so that everyone else can see it even after you deploy.
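For example (the description text itself is just an illustration):
metadata:
  annotations:
    kubernetes.io/description: "Frontend Deployment for the checkout flow; owned by the payments team"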
Managing Workloads: Pods, Deployments, and Jobs
A common early mistake in Kubernetes is creating Pods directly. Pods work, but they don't reschedule themselves if something goes wrong.
Naked Pods (Pods not managed by a controller, such as a Deployment or a StatefulSet) are fine for testing, but in real setups, they are risky.
Why? Because if the node hosting that Pod dies, the Pod dies with it and Kubernetes won't bring it back automatically.
Use Deployments for apps that should always be running
A Deployment, which both creates a ReplicaSet to ensure that the desired number of Pods is always available, and specifies a strategy to replace Pods (such as RollingUpdate), is almost always preferable to creating Pods directly. You can roll out a new version, and if something breaks, roll back instantly.
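A minimal Deployment might look like this (names and image are illustrative):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webapp
  labels:
    app.kubernetes.io/name: webapp
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
  selector:
    matchLabels:
      app.kubernetes.io/name: webapp
  template:
    metadata:
      labels:
        app.kubernetes.io/name: webapp
    spec:
      containers:
      - name: web
        image: nginx:1.27   # pin a specific version rather than :latest
        ports:
        - containerPort: 80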
Use Jobs for tasks that should finish
A Job is perfect when you need something to run once and then stop, like a database migration or a batch processing task. It will retry if the Pod fails and report success when it's done.
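A sketch of such a Job (the migration image is hypothetical):
apiVersion: batch/v1
kind: Job
metadata:
  name: db-migration   # illustrative name
spec:
  backoffLimit: 4       # retry a few times if the Pod fails
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: migrate
        image: example.com/db-migrator:1.2.0   # hypothetical image that runs the migration and exits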
Service Configuration and Networking
Services are how your workloads talk to each other inside (and sometimes outside) your cluster. Without them, your pods exist but can't reach anyone. Let's make sure that doesn't happen.
Create Services before workloads that use them
When Kubernetes starts a Pod, it automatically injects environment variables for existing Services. So, if a Pod depends on a Service, create a Service before its corresponding backend workloads (Deployments or StatefulSets), and before any workloads that need to access it.
For example, if a Service named foo exists, all containers will get the following variables in their initial environment:
FOO_SERVICE_HOST=<the host the Service runs on>
FOO_SERVICE_PORT=<the port the Service runs on>
DNS based discovery doesn't have this problem, but it's a good habit to follow anyway.
Use DNS for Service discovery
If your cluster has the DNS add-on (most do), every Service automatically gets a DNS entry. That means you can access it by name instead of IP:
curl http://my-service.default.svc.cluster.local
It's one of those features that makes Kubernetes networking feel magical.
Avoid hostPort and hostNetwork unless absolutely necessary
You'll sometimes see these options in manifests:
hostPort: 8080
hostNetwork: true
But here's the thing:
They tie your Pods to specific nodes, making them harder to schedule and scale, because each <hostIP, hostPort, protocol> combination must be unique. If you don't specify the hostIP and protocol explicitly, Kubernetes will use 0.0.0.0 as the default hostIP and TCP as the default protocol.
Unless you're debugging or building something like a network plugin, avoid them.
If you just need local access for testing, try kubectl port-forward:
kubectl port-forward deployment/web 8080:80
See Use Port Forwarding to access applications in a cluster to learn more.
Or if you really need external access, use a type: NodePort Service. That's the safer, Kubernetes-native way.
Use headless Services for internal discovery
Sometimes, you don't want Kubernetes to load balance traffic. You want to talk directly to each Pod. That's where headless Services come in.
You create one by setting clusterIP: None.
Instead of a single IP, DNS gives you the list of individual Pod IPs, perfect for apps that manage connections themselves.
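A minimal headless Service could look like this (names and port are illustrative):
apiVersion: v1
kind: Service
metadata:
  name: myapp-headless
spec:
  clusterIP: None   # headless: DNS returns the individual Pod IPs instead of a single virtual IP
  selector:
    app.kubernetes.io/name: myapp
  ports:
  - port: 5432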
Working with labels effectively
Labels are key/value pairs that are attached to objects such as Pods. Labels help you organize, query and group your resources. They don't do anything by themselves, but they make everything else from Services to Deployments work together smoothly.
Use semantic labels
Good labels help you understand what's what, even months later. Define and use labels that identify semantic attributes of your application or Deployment. For example:
labels:
  app.kubernetes.io/name: myapp
  app.kubernetes.io/component: web
  tier: frontend
  phase: test
- app.kubernetes.io/name: what the app is
- tier: which layer it belongs to (frontend/backend)
- phase: which stage it's in (test/prod)
You can then use these labels to make powerful selectors. For example:
kubectl get pods -l tier=frontend
This will list all frontend Pods across your cluster, no matter which Deployment they came from. Basically you are not manually listing Pod names; you are just describing what you want. See the guestbook app for examples of this approach.
Use common Kubernetes labels
Kubernetes actually recommends a set of common labels. It's a standardized way to name things across your different workloads or projects. Following this convention makes your manifests cleaner, and it means that tools such as Headlamp, dashboard, or third-party monitoring systems can all automatically understand what's running.
Manipulate labels for debugging
Since controllers (like ReplicaSets or Deployments) use labels to manage Pods, you can remove a label to “detach” a Pod temporarily.
Example:
kubectl label pod mypod app-
The app- part removes the label key app.
Once that happens, the controller won’t manage that Pod anymore.
It’s like isolating it for inspection, a “quarantine mode” for debugging. To interactively remove or add labels, use kubectl label.
You can then check logs, exec into it and once done, delete it manually. That’s a super underrated trick every Kubernetes engineer should know.
Handy kubectl tips
These small tips make life much easier when you are working with multiple manifest files or clusters.
Apply entire directories
Instead of applying one file at a time, apply the whole folder:
# Using server-side apply is also a good practice
kubectl apply -f configs/ --server-side
This command looks for .yaml, .yml and .json files in that folder and applies them all together.
It's faster, cleaner and helps keep things grouped by app.
Use label selectors to get or delete resources
You don't always need to type out resource names one by one. Instead, use selectors to act on entire groups at once:
kubectl get pods -l app=myapp
kubectl delete pod -l phase=test
It's especially useful in CI/CD pipelines, where you want to clean up test resources dynamically.
Quickly create Deployments and Services
For quick experiments, you don't always need to write a manifest. You can spin up a Deployment right from the CLI:
kubectl create deployment webapp --image=nginx
Then expose it as a Service:
kubectl expose deployment webapp --port=80
This is great when you just want to test something before writing full manifests. Also, see Use a Service to Access an Application in a cluster for an example.
Conclusion
Cleaner configuration leads to calmer cluster administrators. If you stick to a few simple habits: keep configuration simple and minimal, version-control everything, use consistent labels, and avoid relying on naked Pods, you'll save yourself hours of debugging down the road.
The best part? Clean configurations stay readable. Even after months, you or anyone on your team can glance at them and know exactly what’s happening.
Ingress NGINX Retirement: What You Need to Know
To prioritize the safety and security of the ecosystem, Kubernetes SIG Network and the Security Response Committee are announcing the upcoming retirement of Ingress NGINX. Best-effort maintenance will continue until March 2026. Afterward, there will be no further releases, no bugfixes, and no updates to resolve any security vulnerabilities that may be discovered. Existing deployments of Ingress NGINX will continue to function and installation artifacts will remain available.
We recommend migrating to one of the many alternatives. Consider migrating to Gateway API, the modern replacement for Ingress. If you must continue using Ingress, many alternative Ingress controllers are listed in the Kubernetes documentation. Continue reading for further information about the history and current state of Ingress NGINX, as well as next steps.
About Ingress NGINX
Ingress is the original user-friendly way to direct network traffic to workloads running on Kubernetes. (Gateway API is a newer way to achieve many of the same goals.) In order for an Ingress to work in your cluster, there must be an Ingress controller running. There are many Ingress controller choices available, which serve the needs of different users and use cases. Some are cloud-provider specific, while others have more general applicability.
Ingress NGINX was an Ingress controller, developed early in the history of the Kubernetes project as an example implementation of the API. It became very popular due to its tremendous flexibility, breadth of features, and independence from any particular cloud or infrastructure provider. Since those days, many other Ingress controllers have been created within the Kubernetes project by community groups, and by cloud native vendors. Ingress NGINX has continued to be one of the most popular, deployed as part of many hosted Kubernetes platforms and within innumerable independent users’ clusters.
History and Challenges
The breadth and flexibility of Ingress NGINX has caused maintenance challenges. Changing expectations about cloud native software have also added complications. What were once considered helpful options have sometimes come to be considered serious security flaws, such as the ability to add arbitrary NGINX configuration directives via the "snippets" annotations. Yesterday’s flexibility has become today’s insurmountable technical debt.
Despite the project’s popularity among users, Ingress NGINX has always struggled with insufficient or barely-sufficient maintainership. For years, the project has had only one or two people doing development work, on their own time, after work hours and on weekends. Last year, the Ingress NGINX maintainers announced their plans to wind down Ingress NGINX and develop a replacement controller together with the Gateway API community. Unfortunately, even that announcement failed to generate additional interest in helping maintain Ingress NGINX or develop InGate to replace it. (InGate development never progressed far enough to create a mature replacement; it will also be retired.)
Current State and Next Steps
Currently, Ingress NGINX is receiving best-effort maintenance. SIG Network and the Security Response Committee have exhausted our efforts to find additional support to make Ingress NGINX sustainable. To prioritize user safety, we must retire the project.
In March 2026, Ingress NGINX maintenance will be halted, and the project will be retired. After that time, there will be no further releases, no bugfixes, and no updates to resolve any security vulnerabilities that may be discovered. The GitHub repositories will be made read-only and left available for reference.
Existing deployments of Ingress NGINX will not be broken. Existing project artifacts such as Helm charts and container images will remain available.
In most cases, you can check whether you use Ingress NGINX by running kubectl get pods --all-namespaces --selector app.kubernetes.io/name=ingress-nginx with cluster administrator permissions.
We would like to thank the Ingress NGINX maintainers for their work in creating and maintaining this project–their dedication remains impressive. This Ingress controller has powered billions of requests in datacenters and homelabs all around the world. In a lot of ways, Kubernetes wouldn’t be where it is without Ingress NGINX, and we are grateful for so many years of incredible effort.
SIG Network and the Security Response Committee recommend that all Ingress NGINX users begin migration to Gateway API or another Ingress controller immediately. Many options are listed in the Kubernetes documentation: Gateway API, Ingress. Additional options may be available from vendors you work with.
Announcing the 2025 Steering Committee Election Results
The 2025 Steering Committee Election is now complete. The Kubernetes Steering Committee consists of 7 seats, 4 of which were up for election in 2025. Incoming committee members serve a term of 2 years, and all members are elected by the Kubernetes Community.
The Steering Committee oversees the governance of the entire Kubernetes project. With that great power comes great responsibility. You can learn more about the steering committee’s role in their charter.
Thank you to everyone who voted in the election; your participation helps support the community’s continued health and success.
Results
Congratulations to the elected committee members whose two year terms begin immediately (listed in alphabetical order by GitHub handle):
- Kat Cosgrove (@katcosgrove), Minimus
- Paco Xu (@pacoxu), DaoCloud
- Rita Zhang (@ritazh), Microsoft
- Maciej Szulik (@soltysh), Defense Unicorns
They join continuing members:
- Antonio Ojea (@aojea), Google
- Benjamin Elder (@BenTheElder), Google
- Sascha Grunert (@saschagrunert), Red Hat
Maciej Szulik and Paco Xu are returning Steering Committee Members.
Big thanks!
Thank you and congratulations on a successful election to this round’s election officers:
- Christoph Blecker (@cblecker)
- Nina Polshakova (@npolshakova)
- Sreeram Venkitesh (@sreeram-venkitesh)
Thanks to the Emeritus Steering Committee Members. Your service is appreciated by the community:
- Stephen Augustus (@justaugustus), Bloomberg
- Patrick Ohly (@pohly), Intel
And thank you to all the candidates who came forward to run for election.
Get involved with the Steering Committee
This governing body, like all of Kubernetes, is open to all. You can follow along with Steering Committee meeting notes and weigh in by filing an issue or creating a PR against their repo. They have an open meeting on the first Monday of every month at 8am PT. They can also be contacted at their public mailing list [email protected].
You can see what the Steering Committee meetings are all about by watching past meetings on the YouTube Playlist.
This post was adapted from one written by the Contributor Comms Subproject. If you want to write stories about the Kubernetes community, learn more about us.
Gateway API 1.4: New Features
Ready to rock your Kubernetes networking? The Kubernetes SIG Network community has announced the General Availability (GA) release of Gateway API v1.4.0! Released on October 6, 2025, version 1.4.0 reinforces the path for modern, expressive, and extensible service networking in Kubernetes.
Gateway API v1.4.0 brings three new features to the Standard channel (Gateway API's GA release channel):
- BackendTLSPolicy for TLS between gateways and backends
- supportedFeatures in GatewayClass status
- Named rules for Routes
and introduces three new experimental features:
- Mesh resource for service mesh configuration
- Default gateways to ease configuration burden
- externalAuth filter for HTTPRoute
Graduations to Standard Channel
Backend TLS policy
Leads: Candace Holman, Norwin Schnyder, Katarzyna Łach
GEP-1897: BackendTLSPolicy
BackendTLSPolicy is a new Gateway API type for specifying the TLS configuration of the connection from the Gateway to backend Pod(s). Prior to the introduction of BackendTLSPolicy, there was no API specification that allowed encrypted traffic on the hop from Gateway to backend.
The BackendTLSPolicy validation configuration requires a hostname. This hostname serves two purposes: it is used as the SNI header when connecting to the backend, and, for authentication, the certificate presented by the backend must match this hostname, unless subjectAltNames is explicitly specified.
If subjectAltNames (SANs) are specified, the hostname is only used for SNI, and authentication is performed against the SANs instead. If you still need to authenticate against the hostname value in this case, you MUST add it to the subjectAltNames list.
BackendTLSPolicy validation configuration also requires either caCertificateRefs or wellKnownCACertificates.
caCertificateRefs refers to one or more (up to 8) PEM-encoded TLS certificate bundles. If there are no specific certificates to use, then, depending on your implementation, you may set wellKnownCACertificates to "System" to tell the Gateway to use an implementation-specific set of trusted CA certificates.
In this example, the BackendTLSPolicy is configured to use certificates defined in the auth-cert ConfigMap
to connect with a TLS-encrypted upstream connection where pods backing the auth service are expected to serve a
valid certificate for auth.example.com. It uses subjectAltNames with a Hostname type, but you may also use a URI type.
apiVersion: gateway.networking.k8s.io/v1
kind: BackendTLSPolicy
metadata:
  name: tls-upstream-auth
spec:
  targetRefs:
  - kind: Service
    name: auth
    group: ""
    sectionName: "https"
  validation:
    caCertificateRefs:
    - group: "" # core API group
      kind: ConfigMap
      name: auth-cert
    subjectAltNames:
    - type: "Hostname"
      hostname: "auth.example.com"
In this example, the BackendTLSPolicy is configured to use system certificates to connect with a TLS-encrypted backend connection where Pods backing the dev Service are expected to serve a valid certificate for dev.example.com.
apiVersion: gateway.networking.k8s.io/v1
kind: BackendTLSPolicy
metadata:
  name: tls-upstream-dev
spec:
  targetRefs:
  - kind: Service
    name: dev
    group: ""
    sectionName: "btls"
  validation:
    wellKnownCACertificates: "System"
    hostname: dev.example.com
More information on the configuration of TLS in Gateway API can be found in Gateway API - TLS Configuration.
Status information about the features that an implementation supports
Leads: Lior Lieberman, Beka Modebadze
GEP-2162: Supported features in GatewayClass Status
GatewayClass status has a new field, supportedFeatures.
This addition allows implementations to declare the set of features they support. This provides a clear way for users and tools to understand the capabilities of a given GatewayClass.
This feature's name for conformance tests (and GatewayClass status reporting) is SupportedFeatures.
Implementations must populate the supportedFeatures field in the .status of the GatewayClass before the GatewayClass
is accepted, or in the same operation.
Here’s an example of a supportedFeatures published under GatewayClass' .status:
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
...
status:
  conditions:
  - lastTransitionTime: "2022-11-16T10:33:06Z"
    message: Handled by Foo controller
    observedGeneration: 1
    reason: Accepted
    status: "True"
    type: Accepted
  supportedFeatures:
  - HTTPRoute
  - HTTPRouteHostRewrite
  - HTTPRoutePortRedirect
  - HTTPRouteQueryParamMatching
The graduation of SupportedFeatures to Standard helped improve the conformance testing process for Gateway API. The conformance test suite will now automatically run tests based on the features populated in the GatewayClass' status. This creates a strong, verifiable link between an implementation's declared capabilities and the test results, making it easier for implementers to run the correct conformance tests and for users to trust the conformance reports.
This means that when the supportedFeatures field is populated in the GatewayClass status, there is no need for additional conformance test flags like --supported-features, --exempt-features, or --all-features.
It's important to note that Mesh features are an exception to this and can be tested for conformance by using
Conformance Profiles, or by manually providing any combination of features related flags until the dedicated resource
graduates from the experimental channel.
Named rules for Routes
Leads: Guilherme Cassolato
GEP-995: Adding a new name field to all xRouteRule types (HTTPRouteRule, GRPCRouteRule, etc.)
This enhancement enables route rules to be explicitly identified and referenced across the Gateway API ecosystem. Some of the key use cases include:
- Status: Allowing status conditions to reference specific rules directly by name.
- Observability: Making it easier to identify individual rules in logs, traces, and metrics.
- Policies: Enabling policies (GEP-713) to target specific route rules via the sectionName field in their targetRef[s].
- Tooling: Simplifying filtering and referencing of route rules in tools such as gwctl, kubectl, and general-purpose utilities like jq and yq.
- Internal configuration mapping: Facilitating the generation of internal configurations that reference route rules by name within gateway and mesh implementations.
This follows the same well-established pattern already adopted for Gateway listeners, Service ports, Pods (and containers), and many other Kubernetes resources.
While the new name field is optional (so existing resources remain valid), its use is strongly encouraged. Implementations are not expected to assign a default value, but they may enforce constraints such as immutability.
Finally, keep in mind that the name format is validated,
and other fields (such as sectionName)
may impose additional, indirect constraints.
Experimental channel changes
Enabling external Auth for HTTPRoute
Giving Gateway API the ability to enforce authentication (and possibly authorization as well) at the Gateway or HTTPRoute level has been a highly requested feature for a long time. (See the GEP-1494 issue for some background.)
This Gateway API release adds an Experimental filter in HTTPRoute that tells the Gateway API implementation to call out to an external service to authenticate (and, optionally, authorize) requests.
This filter is based on the Envoy ext_authz API, and allows talking to an Auth service that uses either gRPC or HTTP for its protocol.
Both methods allow the configuration of what headers to forward to the Auth service, with the HTTP protocol allowing some extra information like a prefix path.
An HTTP example might look like this (noting that this example requires the Experimental channel to be installed, and an implementation that supports External Auth, in order to actually understand the config):
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: require-auth
  namespace: default
spec:
  parentRefs:
  - name: your-gateway-here
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /admin
    filters:
    - type: ExternalAuth
      externalAuth:
        protocol: HTTP
        backendRef:
          name: auth-service
        http:
          # These headers are always sent for the HTTP protocol,
          # but are included here for illustrative purposes
          allowedHeaders:
          - Host
          - Method
          - Path
          - Content-Length
          - Authorization
    backendRefs:
    - name: admin-backend
      port: 8080
This allows the backend Auth service to use the supplied headers to make a determination about the authentication for the request.
When a request is allowed, the external Auth service will respond with a 200 HTTP response code, and optionally extra headers to be included in the request that is forwarded to the backend. When the request is denied, the Auth service will respond with a 403 HTTP response.
Since the Authorization header is used in many authentication schemes, this filter can be used to implement Basic, OAuth, JWT, and other common authentication and authorization methods.
Mesh resource
Lead(s): Flynn
GEP-3949: Mesh-wide configuration and supported features
Gateway API v1.4.0 introduces a new experimental Mesh resource, which provides a way to configure mesh-wide settings and discover the features supported by a given mesh implementation. This resource is analogous to the Gateway resource and will initially be mainly used for conformance testing, with plans to extend its use to off-cluster Gateways in the future.
The Mesh resource is cluster-scoped and, as an experimental feature, is named XMesh and resides in the gateway.networking.x-k8s.io API group. A key field is controllerName, which specifies the mesh implementation responsible for the resource. The resource's status stanza indicates whether the mesh implementation has accepted it and lists the features the mesh supports.
One of the goals of this GEP is to avoid making it more difficult for users to adopt a mesh. To simplify adoption, mesh implementations are expected to create a default Mesh resource upon startup if one with a matching controllerName doesn't already exist. This avoids the need for manual creation of the resource to begin using a mesh.
The new XMesh API kind, within the gateway.networking.x-k8s.io/v1alpha1 API group, provides a central point for mesh configuration and feature discovery (source).
A minimal XMesh object specifies the controllerName:
apiVersion: gateway.networking.x-k8s.io/v1alpha1
kind: XMesh
metadata:
  name: one-mesh-to-mesh-them-all
spec:
  controllerName: one-mesh.example.com/one-mesh
The mesh implementation populates the status field to confirm it has accepted the resource and to list its supported features (source):
status:
  conditions:
  - type: Accepted
    status: "True"
    reason: Accepted
  supportedFeatures:
  - name: MeshHTTPRoute
  - name: OffClusterGateway
Introducing default Gateways
Lead(s): Flynn
GEP-3793: Allowing Gateways to program some routes by default.
For application developers, one common piece of feedback has been the need to explicitly name a parent Gateway for every single north-south Route. While this explicitness prevents ambiguity, it adds friction, especially for developers who just want to expose their application to the outside world without worrying about the underlying infrastructure's naming scheme. To address this, we have introduced the concept of Default Gateways.
For application developers: Just "use the default"
As an application developer, you often don't care about the specific Gateway your traffic flows through, you just want it to work. With this enhancement, you can now create a Route and simply ask it to use a default Gateway.
This is done by setting the new useDefaultGateways field in your Route's spec.
Here’s a simple HTTPRoute that uses a default Gateway:
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: my-route
spec:
  useDefaultGateways: All
  rules:
  - backendRefs:
    - name: my-service
      port: 80
That's it! No more need to hunt down the correct Gateway name for your environment. Your Route is now a "defaulted Route."
For cluster operators: You're still in control
This feature doesn't take control away from cluster operators ("Chihiro"). In fact, they have explicit control over which Gateways can act as a default. A Gateway will only accept these defaulted Routes if it is configured to do so.
You can also use a ValidatingAdmissionPolicy to require, or even forbid, that Routes rely on a default Gateway.
As a cluster operator, you can designate a Gateway as a default
by setting the (new) .spec.defaultScope field:
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: my-default-gateway
  namespace: default
spec:
  defaultScope: All
  # ... other gateway configuration
Operators can choose to have no default Gateways, or even multiple.
How it works and key details
- To maintain a clean, GitOps-friendly workflow, a default Gateway does not modify the spec.parentRefs of your Route. Instead, the binding is reflected in the Route's status field. You can always inspect the status.parents stanza of your Route to see exactly which Gateway or Gateways have accepted it. This preserves your original intent and avoids conflicts with CD tools.
- The design explicitly supports having multiple Gateways designated as defaults within a cluster. When this happens, a defaulted Route will bind to all of them. This enables cluster operators to perform zero-downtime migrations and testing of new default Gateways.
- You can create a single Route that handles both north-south traffic (traffic entering or leaving the cluster, via a default Gateway) and east-west/mesh traffic (traffic between services within the cluster), by explicitly referencing a Service in parentRefs.
Default Gateways represent a significant step forward in making the Gateway API simpler and more intuitive for everyday use cases, bridging the gap between the flexibility needed by operators and the simplicity desired by developers.
Configuring client certificate validation
Lead(s): Arko Dasgupta, Katarzyna Łach
GEP-91: Address connection coalescing security issue
This release brings updates for configuring client certificate validation, addressing a critical security vulnerability related to connection reuse. HTTP connection coalescing is a web performance optimization that allows a client to reuse an existing TLS connection for requests to different domains. While this reduces the overhead of establishing new connections, it introduces a security risk in the context of API gateways. The ability to reuse a single TLS connection across multiple Listeners brings the need to introduce shared client certificate configuration in order to avoid unauthorized access.
Why SNI-based mTLS is not the answer
One might think that using Server Name Indication (SNI) to differentiate between Listeners would solve this problem. However, TLS SNI is not a reliable mechanism for enforcing security policies in a connection coalescing scenario. A client could use a single TLS connection for multiple peer connections, as long as they are all covered by the same certificate. This means that a client could establish a connection by indicating one peer identity (using SNI), and then reuse that connection to access a different virtual host that is listening on the same IP address and port. That reuse, which is controlled by client side heuristics, could bypass mutual TLS policies that were specific to the second listener configuration.
Here's an example to help explain it:
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: wildcard-tls-gateway
spec:
  gatewayClassName: example
  listeners:
  - name: foo-https
    protocol: HTTPS
    port: 443
    hostname: foo.example.com
    tls:
      certificateRefs:
      - group: "" # core API group
        kind: Secret
        name: foo-example-com-cert # SAN: foo.example.com
  - name: wildcard-https
    protocol: HTTPS
    port: 443
    hostname: "*.example.com"
    tls:
      certificateRefs:
      - group: "" # core API group
        kind: Secret
        name: wildcard-example-com-cert # SAN: *.example.com
I have configured a Gateway with two listeners, both having overlapping hostnames.
My intention is for the foo-https listener to be accessible only by clients presenting the foo-example-com-cert certificate.
In contrast, the wildcard-https listener should allow access to a broader audience using any certificate valid for the *.example.com domain.
Consider a scenario where a client initially connects to foo.example.com. The server requests and successfully validates the
foo-example-com-cert certificate, establishing the connection. Subsequently, the same client wishes to access other sites within this domain,
such as bar.example.com, which is handled by the wildcard-https listener. Due to connection reuse,
clients can access wildcard-https backends without an additional TLS handshake on the existing connection.
This process functions as expected.
However, a critical security vulnerability arises when the order of access is reversed.
If a client first connects to bar.example.com and presents a valid bar.example.com certificate, the connection is successfully established.
If this client then attempts to access foo.example.com, the existing connection's client certificate will not be re-validated.
This allows the client to bypass the specific certificate requirement for the foo backend, leading to a serious security breach.
The solution: per-port TLS configuration
The updated Gateway API gains a tls field in the .spec of a Gateway, that allows you to define a default client certificate
validation configuration for all Listeners, and then if needed override it on a per-port basis. This provides a flexible and
powerful way to manage your TLS policies.
Here’s a look at the updated API definitions (shown as Go source code):
// GatewaySpec defines the desired state of Gateway.
type GatewaySpec struct {
    ...
    // GatewayTLSConfig specifies frontend tls configuration for gateway.
    TLS *GatewayTLSConfig `json:"tls,omitempty"`
}

// GatewayTLSConfig specifies frontend tls configuration for gateway.
type GatewayTLSConfig struct {
    // Default specifies the default client certificate validation configuration
    Default TLSConfig `json:"default"`

    // PerPort specifies tls configuration assigned per port.
    PerPort []TLSPortConfig `json:"perPort,omitempty"`
}

// TLSPortConfig describes a TLS configuration for a specific port.
type TLSPortConfig struct {
    // The Port indicates the Port Number to which the TLS configuration will be applied.
    Port PortNumber `json:"port"`

    // TLS store the configuration that will be applied to all Listeners handling
    // HTTPS traffic and matching given port.
    TLS TLSConfig `json:"tls"`
}
Breaking changes
Standard GRPCRoute - .spec field required (technicality)
The promotion of GRPCRoute to Standard introduces a minor but technically breaking change regarding the presence of the top-level .spec field.
As part of achieving Standard status, the Gateway API has tightened the OpenAPI schema validation within the GRPCRoute
CustomResourceDefinition (CRD)
to explicitly ensure the spec field is required for all GRPCRoute resources.
This change enforces stricter conformance to Kubernetes object standards and enhances the resource's stability and predictability.
While it is highly unlikely that users were attempting to define a GRPCRoute without any specification, any existing automation
or manifests that might have relied on a relaxed interpretation allowing a completely absent spec field will now fail validation
and must be updated to include the .spec field, even if empty.
Experimental CORS support in HTTPRoute - breaking change for allowCredentials field
The Gateway API subproject has introduced a breaking change to the Experimental CORS support in HTTPRoute, concerning the allowCredentials field
within the CORS policy.
This field's definition has been strictly aligned with the upstream CORS specification, which dictates that the corresponding
Access-Control-Allow-Credentials header must represent a Boolean value.
Previously, the implementation might have been overly permissive, potentially accepting non-standard string representations such as "true" due to relaxed schema validation.
Users who were configuring CORS rules must now review their manifests and ensure the value for allowCredentials
strictly conforms to the new, more restrictive schema.
Any existing HTTPRoute definitions that do not adhere to this stricter validation will now be rejected by the API server,
requiring a configuration update to maintain functionality.
Improving the development and usage experience
As part of this release, we have improved some of the developer experience workflow:
- Added Kube API Linter to the CI/CD pipelines, reducing the burden on API reviewers and the number of common mistakes.
- Improved the execution time of CRD tests by using envtest.
Additionally, as part of the effort to improve Gateway API usage experience, some efforts were made to remove some ambiguities and some old tech-debts from our documentation website:
- The API reference is now explicit when a field is experimental.
- The GEP (Gateway API Enhancement Proposal) navigation bar is automatically generated, reflecting the real status of the enhancements.
Try it out
Unlike other Kubernetes APIs, you don't need to upgrade to the latest version of Kubernetes to get the latest version of Gateway API. As long as you're running Kubernetes 1.26 or later, you'll be able to get up and running with this version of Gateway API.
To try out the API, follow the Getting Started Guide.
As of this writing, seven implementations are already conformant with Gateway API v1.4.0. In alphabetical order:
- Agent Gateway (with kgateway)
- Airlock Microgateway
- Envoy Gateway
- GKE Gateway
- Istio
- kgateway
- Traefik Proxy
Get involved
Wondering when a feature will be added? There are lots of opportunities to get involved and help define the future of Kubernetes routing APIs for both ingress and service mesh.
- Check out the user guides to see what use-cases can be addressed.
- Try out one of the existing Gateway controllers.
- Or join us in the community and help us build the future of Gateway API together!
The maintainers would like to thank everyone who's contributed to Gateway API, whether in the form of commits to the repo, discussion, ideas, or general support. We could never have made this kind of progress without the support of this dedicated and active community.
Related Kubernetes blog articles
- Gateway API v1.3.0: Advancements in Request Mirroring, CORS, Gateway Merging, and Retry Budgets (June 2025)
- Gateway API v1.2: WebSockets, Timeouts, Retries, and More (November 2024)
- Gateway API v1.1: Service mesh, GRPCRoute, and a whole lot more (May 2024)
- New Experimental Features in Gateway API v1.0 (November 2023)
- Gateway API v1.0: GA Release (October 2023)
7 Common Kubernetes Pitfalls (and How I Learned to Avoid Them)
It’s no secret that Kubernetes can be both powerful and frustrating at times. When I first started dabbling with container orchestration, I made more than my fair share of mistakes, enough to compile a whole list of pitfalls. In this post, I want to walk through seven big gotchas I’ve encountered (or seen others run into) and share some tips on how to avoid them. Whether you’re just kicking the tires on Kubernetes or already managing production clusters, I hope these insights save you a little extra stress.
1. Skipping resource requests and limits
The pitfall: Not specifying CPU and memory requirements in Pod specifications. This typically happens because Kubernetes does not require these fields, and workloads can often start and run without them—making the omission easy to overlook in early configurations or during rapid deployment cycles.
Context: In Kubernetes, resource requests and limits are critical for efficient cluster management. Resource requests ensure that the scheduler reserves the appropriate amount of CPU and memory for each pod, guaranteeing that it has the necessary resources to operate. Resource limits cap the amount of CPU and memory a pod can use, preventing any single pod from consuming excessive resources and potentially starving other pods. When resource requests and limits are not set:
- Resource Starvation: Pods may get insufficient resources, leading to degraded performance or failures. This is because Kubernetes schedules pods based on these requests. Without them, the scheduler might place too many pods on a single node, leading to resource contention and performance bottlenecks.
- Resource Hoarding: Conversely, without limits, a pod might consume more than its fair share of resources, impacting the performance and stability of other pods on the same node. This can lead to issues such as other pods getting evicted or killed by the Out-Of-Memory (OOM) killer due to lack of available memory.
How to avoid it:
- Start with modest requests (for example 100m CPU, 128Mi memory) and see how your app behaves; there is a small sketch right after this list.
- Monitor real-world usage and refine your values; the HorizontalPodAutoscaler can help automate scaling based on metrics.
- Keep an eye on kubectl top pods or your logging/monitoring tool to confirm you’re not over- or under-provisioning.
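As a starting point, a container snippet along these lines (values are illustrative and should be tuned from real usage) keeps the scheduler informed without over-committing:
resources:
  requests:
    cpu: "100m"
    memory: "128Mi"
  limits:
    cpu: "500m"
    memory: "256Mi"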
My reality check: Early on, I never thought about memory limits. Things seemed fine on my local cluster. Then, on a larger environment, Pods got OOMKilled left and right. Lesson learned. For detailed instructions on configuring resource requests and limits for your containers, please refer to Assign Memory Resources to Containers and Pods (part of the official Kubernetes documentation).
2. Underestimating liveness and readiness probes
The pitfall: Deploying containers without explicitly defining how Kubernetes should check their health or readiness. This tends to happen because Kubernetes will consider a container “running” as long as the process inside hasn’t exited. Without additional signals, Kubernetes assumes the workload is functioning—even if the application inside is unresponsive, initializing, or stuck.
Context:
Liveness, readiness, and startup probes are mechanisms Kubernetes uses to monitor container health and availability.
- Liveness probes determine if the application is still alive. If a liveness check fails, the container is restarted.
- Readiness probes control whether a container is ready to serve traffic. Until the readiness probe passes, the container is removed from Service endpoints.
- Startup probes help distinguish between long startup times and actual failures.
How to avoid it:
- Add a simple HTTP livenessProbe to check a health endpoint (for example /healthz) so Kubernetes can restart a hung container.
- Use a readinessProbe to ensure traffic doesn’t reach your app until it’s warmed up.
- Keep probes simple. Overly complex checks can create false alarms and unnecessary restarts.
My reality check: I once forgot a readiness probe for a web service that took a while to load. Users hit it prematurely, got weird timeouts, and I spent hours scratching my head. A 3-line readiness probe would have saved the day.
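For reference, a probe configuration along these lines (endpoint paths and port are illustrative) covers both cases:
# Container snippet
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  periodSeconds: 5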
For comprehensive instructions on configuring liveness, readiness, and startup probes for containers, please refer to Configure Liveness, Readiness and Startup Probes in the official Kubernetes documentation.
3. “We’ll just look at container logs” (famous last words)
The pitfall: Relying solely on container logs retrieved via kubectl logs. This often happens because the command is quick and convenient, and in many setups, logs appear accessible during development or early troubleshooting. However, kubectl logs only retrieves logs from currently running or recently terminated containers, and those logs are stored on the node’s local disk. As soon as the container is deleted, evicted, or the node is restarted, the log files may be rotated out or permanently lost.
How to avoid it:
- Centralize logs using CNCF tools like Fluentd or Fluent Bit to aggregate output from all Pods.
- Adopt OpenTelemetry for a unified view of logs, metrics, and (if needed) traces. This lets you spot correlations between infrastructure events and app-level behavior.
- Pair logs with Prometheus metrics to track cluster-level data alongside application logs. If you need distributed tracing, consider CNCF projects like Jaeger.
My reality check: The first time I lost Pod logs to a quick restart, I realized how flimsy “kubectl logs” can be on its own. Since then, I’ve set up a proper pipeline for every cluster to avoid missing vital clues.
4. Treating dev and prod exactly the same
The pitfall: Deploying the same Kubernetes manifests with identical settings across development, staging, and production environments. This often occurs when teams aim for consistency and reuse, but overlook that environment-specific factors—such as traffic patterns, resource availability, scaling needs, or access control—can differ significantly. Without customization, configurations optimized for one environment may cause instability, poor performance, or security gaps in another.
How to avoid it:
- Use environment overlays or kustomize to maintain a shared base while customizing resource requests, replicas, or config for each environment.
- Extract environment-specific configuration into ConfigMaps and/or Secrets. You can use a specialized tool such as Sealed Secrets to manage confidential data.
- Plan for scale in production. Your dev cluster can probably get away with minimal CPU/memory, but prod might need significantly more.
My reality check: One time, I scaled up replicaCount from 2 to 10 in a tiny dev environment just to “test.” I promptly ran out of resources and spent half a day cleaning up the aftermath. Oops.
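If you use kustomize, a per-environment overlay is one way to avoid exactly that: keep a shared base and vary only what differs, such as the replica count (paths and names below are illustrative):
# overlays/dev/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- ../../base
replicas:
- name: webapp   # Deployment defined in the base
  count: 2       # the prod overlay might set a much higher count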
5. Leaving old stuff floating around
The pitfall: Leaving unused or outdated resources—such as Deployments, Services, ConfigMaps, or PersistentVolumeClaims—running in the cluster. This often happens because Kubernetes does not automatically remove resources unless explicitly instructed, and there is no built-in mechanism to track ownership or expiration. Over time, these forgotten objects can accumulate, consuming cluster resources, increasing cloud costs, and creating operational confusion, especially when stale Services or LoadBalancers continue to route traffic.
How to avoid it:
- Label everything with a purpose or owner label. That way, you can easily query resources you no longer need.
- Regularly audit your cluster: run kubectl get all -n <namespace> to see what’s actually running, and confirm it’s all legit.
- Adopt Kubernetes’ Garbage Collection: the K8s docs show how to remove dependent objects automatically.
- Leverage policy automation: Tools like Kyverno can automatically delete or block stale resources after a certain period, or enforce lifecycle policies so you don’t have to remember every single cleanup step.
My reality check: After a hackathon, I forgot to tear down a “test-svc” pinned to an external load balancer. Three weeks later, I realized I’d been paying for that load balancer the entire time. Facepalm.
6. Diving too deep into networking too soon
The pitfall: Introducing advanced networking solutions—such as service meshes, custom CNI plugins, or multi-cluster communication—before fully understanding Kubernetes' native networking primitives. This commonly occurs when teams implement features like traffic routing, observability, or mTLS using external tools without first mastering how core Kubernetes networking works: including Pod-to-Pod communication, ClusterIP Services, DNS resolution, and basic ingress traffic handling. As a result, network-related issues become harder to troubleshoot, especially when overlays introduce additional abstractions and failure points.
How to avoid it:
- Start small: a Deployment, a Service, and a basic ingress controller such as one based on NGINX (e.g., Ingress-NGINX).
- Make sure you understand how traffic flows within the cluster, how service discovery works, and how DNS is configured.
- Only move to a full-blown mesh or advanced CNI features when you actually need them; complex networking adds overhead.
My reality check: I tried Istio on a small internal app once, then spent more time debugging Istio itself than the actual app. Eventually, I stepped back, removed Istio, and everything worked fine.
7. Going too light on security and RBAC
The pitfall: Deploying workloads with insecure configurations, such as running containers as the root user, using the latest image tag, disabling security contexts, or assigning overly broad RBAC roles like cluster-admin. These practices persist because Kubernetes does not enforce strict security defaults out of the box, and the platform is designed to be flexible rather than opinionated. Without explicit security policies in place, clusters can remain exposed to risks like container escape, unauthorized privilege escalation, or accidental production changes due to unpinned images.
How to avoid it:
- Use RBAC to define roles and permissions within Kubernetes. While RBAC is the default and most widely supported authorization mechanism, Kubernetes also allows the use of alternative authorizers. For more advanced or external policy needs, consider solutions like OPA Gatekeeper (based on Rego), Kyverno, or custom webhooks using policy languages such as CEL or Cedar.
- Pin images to specific versions (no more :latest!). This helps you know what’s actually deployed.
- Look into Pod Security Admission (or other solutions like Kyverno) to enforce non-root containers, read-only filesystems, etc.; see the sketch after this list.
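For example, a container securityContext that puts several of these recommendations into practice might look like this (a minimal sketch, values illustrative):
securityContext:
  runAsNonRoot: true
  runAsUser: 10001
  allowPrivilegeEscalation: false
  readOnlyRootFilesystem: true
  capabilities:
    drop: ["ALL"]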
My reality check: I never had a huge security breach, but I’ve heard plenty of cautionary tales. If you don’t tighten things up, it’s only a matter of time before something goes wrong.
Final thoughts
Kubernetes is amazing, but it’s not psychic; it won’t magically do the right thing if you don’t tell it what you need. By keeping these pitfalls in mind, you’ll avoid a lot of headaches and wasted time. Mistakes happen (trust me, I’ve made my share), but each one is a chance to learn more about how Kubernetes truly works under the hood. If you’re curious to dive deeper, the official docs and the community Slack are excellent next steps. And of course, feel free to share your own horror stories or success tips, because at the end of the day, we’re all in this cloud native adventure together.
Happy Shipping!
Spotlight on Policy Working Group
(Note: The Policy Working Group has completed its mission and is no longer active. This article reflects its work, accomplishments, and insights into how a working group operates.)
In the complex world of Kubernetes, policies play a crucial role in managing and securing clusters. But have you ever wondered how these policies are developed, implemented, and standardized across the Kubernetes ecosystem? To answer that, let's take a look back at the work of the Policy Working Group.
The Policy Working Group was dedicated to a critical mission: providing an overall architecture that encompasses both current policy-related implementations and future policy proposals in Kubernetes. Their goal was both ambitious and essential: to develop a universal policy architecture that benefits developers and end-users alike.
Through collaborative methods, this working group strove to bring clarity and consistency to the often complex world of Kubernetes policies. By focusing on both existing implementations and future proposals, they ensured that the policy landscape in Kubernetes remains coherent and accessible as the technology evolves.
This blog post dives deeper into the work of the Policy Working Group, guided by insights from its former co-chairs:
Interviewed by Arujjwal Negi.
These co-chairs explained what the Policy Working Group was all about.
Introduction
Hello, thank you for the time! Let’s start with some introductions, could you tell us a bit about yourself, your role, and how you got involved in Kubernetes?
Jim Bugwadia: My name is Jim Bugwadia, and I am a co-founder and the CEO at Nirmata which provides solutions that automate security and compliance for cloud-native workloads. At Nirmata, we have been working with Kubernetes since it started in 2014. We initially built a Kubernetes policy engine in our commercial platform and later donated it to CNCF as the Kyverno project. I joined the CNCF Kubernetes Policy Working Group to help build and standardize various aspects of policy management for Kubernetes and later became a co-chair.
Andy Suderman: My name is Andy Suderman and I am the CTO of Fairwinds, a managed Kubernetes-as-a-Service provider. I began working with Kubernetes in 2016 building a web conferencing platform. I am an author and/or maintainer of several Kubernetes-related open-source projects such as Goldilocks, Pluto, and Polaris. Polaris is a JSON-schema-based policy engine, which started Fairwinds' journey into the policy space and my involvement in the Policy Working Group.
Poonam Lamba: My name is Poonam Lamba, and I currently work as a Product Manager for Google Kubernetes Engine (GKE) at Google. My journey with Kubernetes began back in 2017 when I was building an SRE platform for a large enterprise, using a private cloud built on Kubernetes. Intrigued by its potential to revolutionize the way we deployed and managed applications at the time, I dove headfirst into learning everything I could about it. Since then, I've had the opportunity to build the policy and compliance products for GKE, and I lead and contribute to the GKE CIS benchmarks. I am involved with the Gatekeeper project, and I contributed to the Policy WG for over two years and served as a co-chair for the group.
Responses to the following questions represent an amalgamation of insights from the former co-chairs.
About Working Groups
One thing even I am not aware of is the difference between a working group and a SIG. Can you help us understand what a working group is and how it is different from a SIG?
Unlike SIGs, working groups are temporary and focused on tackling specific, cross-cutting issues or projects that may involve multiple SIGs. Their lifespan is defined, and they disband once they've achieved their objective. Generally, working groups don't own code or have long-term responsibility for managing a particular area of the Kubernetes project.
(To know more about SIGs, visit the list of Special Interest Groups)
You mentioned that Working Groups involve multiple SIGs. Which SIGs was the Policy WG closely involved with, and how did you coordinate with them?
The group collaborated closely with Kubernetes SIG Auth throughout its existence and, more recently, also worked with SIG Security after that SIG's formation. Our collaboration occurred in a few ways: we provided periodic updates during the SIG meetings to keep them informed of our progress and activities, and we used other community forums to maintain open lines of communication and to ensure our work aligned with the broader Kubernetes ecosystem. This collaborative approach helped the group stay coordinated with related efforts across the Kubernetes community.
Policy WG
Why was the Policy Working Group created?
To enable a broad set of use cases, we recognize that Kubernetes is powered by a highly declarative, fine-grained, and extensible configuration management system. We've observed that a Kubernetes configuration manifest may have different portions that are important to various stakeholders. For example, some parts may be crucial for developers, while others might be of particular interest to security teams or address operational concerns. Given this complexity, we believe that policies governing the usage of these intricate configurations are essential for success with Kubernetes.
Our Policy Working Group was created specifically to research the standardization of policy definitions and related artifacts. We saw a need to bring consistency and clarity to how policies are defined and implemented across the Kubernetes ecosystem, given the diverse requirements and stakeholders involved in Kubernetes deployments.
Can you give me an idea of the work you did in the group?
We worked on several Kubernetes policy-related projects. Our initiatives included:
- We worked on a Kubernetes Enhancement Proposal (KEP) for the Kubernetes Policy Reports API. This aims to standardize how policy reports are generated and consumed within the Kubernetes ecosystem.
- We conducted a CNCF survey to better understand policy usage in the Kubernetes space. This helped gauge the practices and needs across the community at the time.
- We wrote a paper that will guide users in achieving PCI-DSS compliance for containers. This is intended to help organizations meet important security standards in their Kubernetes environments.
- We also worked on a paper highlighting how shifting security down can benefit organizations. This focuses on the advantages of implementing security measures earlier in the development and deployment process.
Can you tell us what were the main objectives of the Policy Working Group and some of your key accomplishments?
The charter of the Policy WG was to help standardize policy management for Kubernetes and educate the community on best practices.
To accomplish this, we updated the Kubernetes documentation (Policies | Kubernetes), produced several whitepapers (Kubernetes Policy Management, Kubernetes GRC), and created the Policy Reports API (API reference), which standardizes reporting across various tools. Several popular tools, such as Falco, Trivy, Kyverno, and kube-bench, support the Policy Reports API. A major milestone for the Policy WG was to promote the Policy Reports API to a SIG-level API or to find it a stable home.
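To give a feel for what that standard looks like, here is a rough, illustrative sketch of a PolicyReport object using the wgpolicyk8s.io/v1alpha2 API. The policy names, rules, and target Deployments are made up for this example; in practice, a tool like Kyverno or Trivy generates these reports for you:

```yaml
apiVersion: wgpolicyk8s.io/v1alpha2
kind: PolicyReport
metadata:
  name: polr-ns-default              # illustrative report name
  namespace: default
# Aggregate counts across all results in this report
summary:
  pass: 1
  fail: 1
  warn: 0
  error: 0
  skip: 0
results:
- policy: require-run-as-non-root    # hypothetical policy name
  rule: check-security-context       # hypothetical rule name
  result: fail
  severity: high
  message: "Container 'web' does not set runAsNonRoot: true."
  resources:
  - apiVersion: apps/v1
    kind: Deployment
    name: web
    namespace: default
- policy: disallow-latest-tag        # hypothetical policy name
  rule: check-image-tag
  result: pass
  message: "All containers use pinned image tags."
  resources:
  - apiVersion: apps/v1
    kind: Deployment
    name: api
    namespace: default
```

Because many engines and scanners emit this same shape, dashboards and compliance tooling can consume results without caring which tool produced them.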
Beyond that, as ValidatingAdmissionPolicy and MutatingAdmissionPolicy approached GA in Kubernetes, a key goal of the WG was to guide and educate the community on the tradeoffs and appropriate usage patterns for these built-in API objects and other CNCF policy management solutions like OPA/Gatekeeper and Kyverno.
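For a flavor of those built-in objects, here is a small, hypothetical ValidatingAdmissionPolicy and its binding that reject Deployments missing a team label; the names and the CEL expression are purely illustrative, not a recommendation from the working group:

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: require-team-label            # hypothetical policy name
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
    - apiGroups:   ["apps"]
      apiVersions: ["v1"]
      operations:  ["CREATE", "UPDATE"]
      resources:   ["deployments"]
  validations:
  # CEL expression evaluated in-process by the API server
  - expression: "has(object.metadata.labels) && 'team' in object.metadata.labels"
    message: "Deployments must carry a 'team' label."
---
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: require-team-label-binding    # hypothetical binding name
spec:
  policyName: require-team-label
  validationActions: [Deny]
```

External engines such as Kyverno or OPA/Gatekeeper express similar rules through their own CRDs, which is exactly where the WG's guidance on tradeoffs and usage patterns comes in.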
Challenges
What were some of the major challenges that the Policy Working Group worked on?
During our work in the Policy Working Group, we encountered several challenges:
- One of the main issues we faced was finding time to consistently contribute. Given that many of us have other professional commitments, it can be difficult to dedicate regular time to the working group's initiatives.
- Another challenge we experienced was related to our consensus-driven model. While this approach ensures that all voices are heard, it can sometimes lead to slower decision-making processes. We valued thorough discussion and agreement, but this can occasionally delay progress on our projects.
- We've also encountered occasional differences of opinion among group members. These situations require careful navigation to ensure that we maintain a collaborative and productive environment while addressing diverse viewpoints.
- Lastly, we've noticed that newcomers to the group may find it difficult to contribute effectively without consistent attendance at our meetings. The complex nature of our work often requires ongoing context, which can be challenging for those who aren't able to participate regularly.
Can you tell me more about those challenges? How did you discover each one? What has the impact been? What were some strategies you used to address them?
There are no easy answers, but having more contributors and maintainers greatly helps! Overall the CNCF community is great to work with and is very welcoming to beginners. So, if folks out there are hesitating to get involved, I highly encourage them to attend a WG or SIG meeting and just listen in.
It often takes a few meetings to fully understand the discussions, so don't feel discouraged if you don't grasp everything right away. We made a point to emphasize this and encouraged new members to review documentation as a starting point for getting involved.
Additionally, differences of opinion were valued and encouraged within the Policy WG. We adhered to the CNCF core values and resolved disagreements by maintaining respect for one another. We also strove to timebox our decisions and assign clear responsibilities to keep things moving forward.
This is where our discussion about the Policy Working Group ends. The working group, and especially the people who took part in this article, hope this gave you some insights into the group's aims and workings. You can get more info about Working Groups here.