
Announcing the AI Gateway Working Group

Kubernetes Blog - Mon, 03/09/2026 - 14:00

The community around Kubernetes includes a number of Special Interest Groups (SIGs) and Working Groups (WGs) facilitating discussions on important topics between interested contributors. Today, we're excited to announce the formation of the AI Gateway Working Group, a new initiative focused on developing standards and best practices for networking infrastructure that supports AI workloads in Kubernetes environments.

What is an AI Gateway?

In a Kubernetes context, an AI Gateway refers to network gateway infrastructure (including proxy servers, load-balancers, etc.) that generally implements the Gateway API specification with enhanced capabilities for AI workloads. Rather than defining a distinct product category, AI Gateways describe infrastructure designed to enforce policy on AI traffic, including:

  • Token-based rate limiting for AI APIs.
  • Fine-grained access controls for inference APIs.
  • Payload inspection enabling intelligent routing, caching, and guardrails.
  • Support for AI-specific protocols and routing patterns.
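Many of these capabilities layer on top of ordinary Gateway API routing. As a rough sketch (the gateway name, hostname, service name, and port below are all hypothetical), an HTTPRoute that steers OpenAI-style inference traffic to an in-cluster model server could look like:

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: inference-route    # hypothetical name
spec:
  parentRefs:
  - name: ai-gateway       # hypothetical Gateway
  hostnames:
  - inference.example.com
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /v1/chat/completions
    backendRefs:
    - name: model-server   # hypothetical inference Service
      port: 8080
```

Token-based rate limiting, payload inspection, and the other capabilities listed above would then attach to routes like this through the policies and extensions the working group is proposing.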

Working group charter and mission

The AI Gateway Working Group operates under a clear charter with the mission to develop proposals for Kubernetes Special Interest Groups (SIGs) and their sub-projects. Its primary goals include:

  • Standards Development: Create declarative APIs, standards, and guidance for AI workload networking in Kubernetes.
  • Community Collaboration: Foster discussions and build consensus around best practices for AI infrastructure.
  • Extensible Architecture: Ensure composability, pluggability, and ordered processing for AI-specific gateway extensions.
  • Standards-Based Approach: Build on established networking foundations, layering AI-specific capabilities on top of proven standards.

Active proposals

WG AI Gateway currently has several active proposals that address key challenges in AI workload networking:

Payload Processing

The payload processing proposal addresses the critical need for AI workloads to inspect and transform full HTTP request and response payloads. This enables:

AI Inference Security

  • Guard against malicious prompts and prompt injection attacks.
  • Content filtering for AI responses.
  • Signature-based detection and anomaly detection for AI traffic.

AI Inference Optimization

  • Semantic routing based on request content.
  • Intelligent caching to reduce inference costs and improve response times.
  • RAG (Retrieval-Augmented Generation) system integration for context enhancement.

The proposal defines standards for declarative payload processor configuration, ordered processing pipelines, and configurable failure modes - all essential for production AI workload deployments.
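No API shape has been finalized, but to give a flavor of what declarative payload processor configuration with ordered processing and failure modes could mean, here is a purely hypothetical sketch; every kind, field, and processor name below is invented for illustration and is not part of any real API:

```yaml
# Hypothetical resource, not a real API: illustrates an ordered
# processor pipeline with per-processor failure modes only.
kind: PayloadProcessingPolicy
metadata:
  name: guarded-inference
spec:
  processors:              # applied in order
  - name: prompt-guard     # e.g. reject prompt-injection attempts
    failureMode: deny      # fail closed if the processor errors
  - name: semantic-cache   # e.g. serve cached responses when possible
    failureMode: allow     # fail open: skip caching on error
```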

Egress gateways

Modern AI applications increasingly depend on external inference services, whether for specialized models, failover scenarios, or cost optimization. The egress gateways proposal aims to define standards for securely routing traffic outside the cluster. Key features include:

External AI Service Integration

  • Secure access to cloud-based AI services (OpenAI, Vertex AI, Bedrock, etc.).
  • Managed authentication and token injection for third-party AI APIs.
  • Regional compliance and failover capabilities.
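The Gateway API pieces for this are still being proposed, but one building block available in core Kubernetes today is an ExternalName Service, which gives an external AI endpoint a stable in-cluster name. A sketch follows; the Service name is hypothetical, gateway support for routing to ExternalName Services varies by implementation, and this alone provides none of the authentication or TLS features listed above:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: openai-egress        # hypothetical name for the external endpoint
spec:
  type: ExternalName         # resolves in-cluster DNS to the external FQDN
  externalName: api.openai.com
  ports:
  - port: 443
```

The egress gateways proposal aims to replace ad-hoc patterns like this with declarative backend resources that carry TLS and authentication policy.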

Advanced Traffic Management

  • Backend resource definitions for external FQDNs and services.
  • TLS policy management and certificate authority control.
  • Cross-cluster routing for centralized AI infrastructure.

User Stories We're Addressing

  • Platform operators providing managed access to external AI services.
  • Developers requiring inference failover across multiple cloud providers.
  • Compliance engineers enforcing regional restrictions on AI traffic.
  • Organizations centralizing AI workloads on dedicated clusters.

Upcoming events

KubeCon + CloudNativeCon Europe 2026, Amsterdam

AI Gateway working group members will be presenting at KubeCon + CloudNativeCon Europe in Amsterdam, discussing the problems at the intersection of AI and networking, including the working group's active proposals, as well as the intersection of AI gateways with Model Context Protocol (MCP) and agent networking patterns.
This session will showcase how AI Gateway working group proposals enable the infrastructure needed for next-generation AI deployments and communication patterns.
The session will also include the initial designs, early prototypes, and emerging directions shaping the WG’s roadmap.
For more details see our session here:

Get involved

The AI Gateway Working Group represents the Kubernetes community's commitment to standardizing AI workload networking. As AI becomes increasingly integral to modern applications, we need robust, standardized infrastructure that can support the unique requirements of inference workloads while maintaining the security, observability, and reliability standards that Kubernetes users expect.
Our proposals are currently in active development, with implementations beginning across various gateway projects. We're working closely with SIG Network on Gateway API enhancements and collaborating with the broader cloud-native community to ensure our standards meet real-world production needs.

Whether you're a gateway implementer, platform operator, AI application developer, or simply interested in the intersection of Kubernetes and AI, we'd love your input. The working group follows an open contribution model - you can review our proposals, join our weekly meetings, or start discussions on our GitHub repository. To learn more:

The future of AI infrastructure in Kubernetes is being built today. Join us to learn how you can contribute and help shape the future of AI-aware gateway capabilities in Kubernetes.

Categories: CNCF Projects, Kubernetes

CoreDNS-1.14.2 Release

CoreDNS Blog - Thu, 03/05/2026 - 19:00
This release adds the new proxyproto plugin to support Proxy Protocol and preserve client IPs behind load balancers. It also includes enhancements such as improved DNS logging metadata and stronger randomness for loop detection (CVE-2026-26018), along with several bug fixes including TLS+IPv6 forwarding, improved CNAME handling and rewriting, allowing jitter disabling, prevention of an ACL bypass (CVE-2026-26017), and a Kubernetes plugin crash fix. In addition, the release updates the build to Go 1.
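CoreDNS plugins are enabled by naming them in a Corefile server block, so a minimal sketch of turning on the new plugin might look like the following; the plugin's options, if any, are omitted here, and the forward target is purely illustrative:

```
. {
    proxyproto
    forward . 9.9.9.9
}
```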
Categories: CNCF Projects

Uncached I/O in Prometheus

Prometheus Blog - Wed, 03/04/2026 - 19:00

Do you find yourself constantly looking up the difference between container_memory_usage_bytes, container_memory_working_set_bytes, and container_memory_rss? Pick the wrong one and your memory limits lie to you, your benchmarks mislead you, and your container gets OOMKilled.

You're not alone. There is even a 9-year-old Kubernetes issue that captures the frustration of users.

The explanation is simple: RAM is not used in just one way. One of the easiest things to miss is the page cache semantics. For some containers, memory taken by page caching can make up most of the reported usage, even though that memory is largely reclaimable, creating surprising differences between those metrics.

NOTE: The feature discussed here currently only supports Linux.

Prometheus writes a lot of data to disk. It is, after all, a database. But not every write benefits from sitting in the page cache. Compaction writes are the clearest example: once a block is written, only a fraction of that data is likely to be queried again soon, and since there is no way to predict which fraction, caching it all offers little return. The use-uncached-io feature flag was built to address exactly this.

Bypassing the cache for those writes reduces Prometheus's page cache footprint, making its memory usage more predictable and easier to reason about. It also relieves pressure on that shared cache, lowering the risk of evicting hot data that queries and other reads actually depend on. A potential bonus is reduced CPU overhead from cache allocations and evictions. The hard constraint throughout was to avoid any measurable regression in CPU or disk I/O.

The flag was introduced in Prometheus v3.5.0 and currently only supports Linux. Under the hood, it uses direct I/O, which requires proper filesystem support and a kernel v2.4.10 or newer, though you should be fine, as that version shipped nearly 25 years ago.

If direct I/O helps here, why was it not done earlier, and why is it not used everywhere it would help? Because direct I/O comes with strict alignment requirements. Unlike buffered I/O, you cannot simply write any chunk of memory to any position in a file. The file offset, the memory buffer address, and the transfer size must all be aligned to the logical sector size of the underlying storage device, typically 512 or 4096 bytes.
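To make these constraints concrete, here is a small shell sketch (illustrative numbers only, assuming a 4096-byte logical sector size) of rounding a transfer size up to an aligned value, the same rounding any direct I/O writer must perform:

```shell
size=5000        # bytes we want to write
align=4096       # logical sector size of the device (assumed)
# Round up to the next multiple of the alignment, as direct I/O requires
# for the file offset, buffer address, and transfer size alike.
aligned=$(( (size + align - 1) / align * align ))
echo "$aligned"  # prints 8192
```

A real writer would also need to track how much of the aligned buffer is padding, so that the padding does not end up persisted as data.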

To satisfy those constraints, a bufio.Writer-like writer, directIOWriter, was implemented. On Linux kernels v6.1 or newer, Prometheus retrieves the exact alignment values via statx; on older kernels, conservative defaults are used.

The directIOWriter currently covers chunk writes during compaction only, but that alone accounts for a substantial portion of Prometheus's I/O. The results are tangible: benchmarks show a 20–50% reduction in page cache usage, as measured by container_memory_cache.


The work is not done yet, and contributions are welcome. Here are a few areas that could help move the feature closer to General Availability:

Covering more write paths

Direct I/O is currently limited to chunk writes during compaction. Index files and WAL writes are natural next candidates, although they would require some additional work.

Building more confidence around directIOWriter

All existing TSDB tests can be run against the directIOWriter using a dedicated build tag: go test --tags=forcedirectio ./tsdb/. More tests covering edge cases for the writer itself would be welcome, and there is even an idea of formally verifying that it never violates alignment requirements.

Experimenting with RWF_DONTCACHE

Introduced in Linux kernel v6.14, RWF_DONTCACHE enables uncached buffered I/O, where data still goes through the page cache but the corresponding pages are dropped afterwards. It would be worth benchmarking whether this delivers similar benefits without direct I/O's alignment constraints.

Support beyond Linux

Support is currently Linux-only. Contributions to extend it to other operating systems are welcome.

For more details, see the proposal and the PR that introduced the feature.

Categories: CNCF Projects

Scaling organizational structure with Meshery’s expanding ecosystem

CNCF Blog Projects Category - Wed, 03/04/2026 - 07:00


As one of the fastest-growing, highest-velocity projects in the CNCF ecosystem, Meshery has reached a scale of complexity and community contribution that calls for a revision of its governance and organizational structure. To best serve its expansive ecosystem, Meshery maintainers have opted to partition the numerous GitHub repositories into two distinct organizations: github.com/meshery for the core platform and github.com/meshery-extensions for extensions and integrations.

This post explains the rationale behind the shift, outlines the proposed governance structure, sets expectations around support, and describes project mechanics, drawing inspiration from other successful CNCF projects.

Rationale for Repository Partitioning

The decision to partition repositories aims to improve project structure, manageability, scalability, and community engagement. 

Project architecture

Meshery is a highly extensible, self-service management platform. Every feature is developed with extensibility in mind, as is evident from the ubiquity of extension points throughout Meshery’s architecture.

Modularity and focus

Separating the core platform from extensions allows the Meshery core team to concentrate on maintaining and enhancing the primary platform, which includes critical components like Meshery Operator and MeshSync. Extensions, such as adapters for specific cloud native technologies, can be developed and maintained independently by community contributors or specialized teams. This modularity ensures that the core platform remains robust and focused.

Project scalability

With support for over 300 integrations and counting, managing everything under one GitHub organization has become impractical. A separate organization for extensions simplifies permission management, contribution processes, and release cycles, making the ecosystem more scalable.

  • Community ownership and maintenance: Projects within meshery-extensions are generally initiated, developed, and maintained by members of the community, rather than the core maintainers. This allows the ecosystem to scale beyond what the core team can directly support.
  • Clearer support expectations: Distinguishing between the core and extensions makes it clear that projects in meshery-extensions have different maintenance levels, release cadences, and support guarantees compared to the core components. This clarifies that users are relying on community support for these specific integrations.

Community engagement

By providing a dedicated space for extensions, Meshery encourages community contributions, as developers can create and maintain extensions without needing deep involvement in the core platform’s development.  With this approach, meshery-extensions fosters a vibrant ecosystem around Meshery by providing a designated, community-centric space for extensions, integrations, and tooling, keeping the core project focused and manageable while enabling broad community participation.

  • Incubation and experimentation: The separate organization acts as an incubator for new ideas, providers, or tooling related to Meshery. Projects can start here and, if they gain significant traction and stability, will be considered for migration or closer integration with the core project.
  • Ecosystem growth: Part of Meshery’s power lies in its ability to manage any infrastructure via Providers, Models, Adapters, and its other extension points. Since there are countless APIs and services, meshery-extensions serves as the place where the community can build and share Providers for less common cloud services, specific SaaS platforms, or even internal company APIs, without needing official endorsement or maintenance from the core maintainers.

Governance Structure

The new structure allows for different governance models and maintainer structures for community projects compared to the core project. Meshery can adopt a governance model that balances control over the core platform with flexibility for extensions, drawing from its existing governance and the Kubernetes’ SIG model.

Core Platform (github.com/meshery)

  • Governance: Governed by the core Meshery maintainers, as outlined in the project’s governance document. Roles include contributors, organization members, and maintainers, with clear processes for becoming a maintainer (e.g., nomination, voting by existing maintainers).
  • Responsibilities: Maintainers review, approve, and merge pull requests, manage releases, and ensure the platform’s stability and alignment with CNCF standards.
  • Decision-making: Decisions are made through consensus among maintainers, with regular meetings and transparent communication via Slack and community forums.

Extensions (github.com/meshery-extensions)

  • Governance: Each extension may have its own maintainers and a lighter governance structure to encourage innovation. A review process by the core team ensures extensions meet quality and compatibility standards.
  • Maintainer selection: Extension maintainers can be nominated by community members or self-nominated, with approval from the core team based on contribution history and technical expertise.
  • Autonomy: Extension teams have autonomy over their development processes, provided they adhere to Meshery’s code of conduct and integration guidelines.

Oversight and Coordination

  • Steering committee: A steering committee, composed of core maintainers and representatives from active extension teams, oversees cross-organization alignment, resolves conflicts, and approves new extensions.
  • Transparency: Both organizations maintain open communication with public meeting minutes, discussion forums, and regular updates to the community.
Aspect | Core Platform | Extensions
Governance | Structured, led by core maintainers | Flexible, per-extension maintainers
Maintainer selection | Nomination, 2/3rds majority vote | Nomination, core team approval
Decision-making | Consensus among maintainers | Extension team consensus, core oversight
Communication | Public meetings, Slack, forums | Public issues, Slack, optional meetings

Delineated support expectations

Support expectations differ between the core platform and extensions to reflect their distinct roles and maintenance models.

Core platform

  • Full support: The core team provides regular updates, bug fixes, and feature enhancements, ensuring stability for critical components like Meshery Operator and MeshSync.
  • Documentation: Comprehensive guides, such as installation instructions and CLI usage, are maintained (Meshery Documentation).
  • Community support: Active engagement through Slack, forums, and weekly newcomer meetings to support users and contributors.

Extensions

  • Variable support: Core team-maintained extensions receive robust support, while community-maintained ones may have limited support.
  • Clear labeling: Documentation should indicate the support level (e.g., “Official” vs. “Community”) for each extension.
  • Integration support: The core platform provides stable APIs and extension points, ensuring compatibility, with guidelines for developers (Meshery Extensions).

Project mechanics

Managing two organizations involves distinct development, testing, and integration processes to ensure a cohesive ecosystem.

Development process

  • Platform: Follows a structured release cycle with stable and edge channels. Changes undergo rigorous review to sustain stability. Platform extenders and system integrators are notified of upcoming changes in the underlying framework so they have time to maintain compatibility.
  • Extensions: Operate on independent release cycles, allowing rapid iteration. Developers use Meshery’s extension points to integrate with the core platform, following contribution guidelines.

Integration testing

  • Compatibility testing: Extensions are tested against multiple core platform versions, following guidance for verifying compatibility between the core platform and extensions.
  • Automated pipelines: GitHub Actions automate testing and snapshot generation, as seen in extensions like Helm Kanvas Snapshot.
  • Performance testing: Meshery’s performance management features can be used to benchmark extensions, ensuring they meet efficiency standards.

Documentation and resources

  • Comprehensive guides: Documentation covers core platform usage, extension development, and integration (Meshery Docs). The Newcomers’ Guide and MeshMates program aid onboarding (Meshery Community).
  • Catalog and templates: Meshery’s catalog of design templates includes extension configurations and promotes best practices (Meshery Catalog).
  • Community resources: Weekly meetings, Slack channels, and the community handbook provide ongoing support.

Reflections on other projects

Meshery’s expansion strategy draws inspiration from successful models within the Cloud Native Computing Foundation (CNCF), like Argo, Crossplane, and Kubernetes. These projects demonstrate effective approaches to decentralized governance and focused development through the separation of core and community-contributed components.

Meshery aims to emulate Crossplane’s model of maintaining a clear distinction between its core platform (github.com/crossplane) and community contributions (github.com/crossplane-contrib). This separation allows third-party developers to extend Crossplane’s capabilities without affecting the core’s stability, a model that supports Meshery’s approach to fostering innovation while maintaining a reliable core.

Similarly, Meshery Extension teams operate with autonomy over their development processes, provided they adhere to Meshery’s core component frameworks and integration guidelines. This mirrors Argo’s model (github.com/argoproj-labs), where projects function independently but align with broader project goals.

Kubernetes provides a robust model for decentralized governance through its use of github.com/kubernetes for core components and github.com/kubernetes-sigs for Special Interest Groups (SIGs). Each SIG acts as a mini-community with its own charter, leadership, and processes, all while aligning with overarching project goals, as outlined in the Kubernetes governance documentation. Meshery’s extension organization can adopt a similar structure, enabling extension teams to operate autonomously within defined guidelines.

Meshery umbrella expands

See the current list of repositories under each organization: meshery org repos and meshery-extensions org repos.

By partitioning repositories into github.com/meshery and github.com/meshery-extensions, Meshery is taking a strategic step towards the overarching goal of improved modularity, scalability, and community engagement.

By adopting a governance structure that balances control and flexibility, delineating clear support expectations, and implementing robust project mechanics, Meshery can effectively manage its growing ecosystem. Drawing inspiration from graduated projects, Meshery is poised to remain a leading CNCF project—empowering collaborative cloud native management.

Categories: CNCF Projects

Introducing the UX Research Working Group

Prometheus Blog - Tue, 03/03/2026 - 19:00

Prometheus has always prioritized solving complex technical challenges to deliver a reliable, performant open-source monitoring system. Over time, however, users have expressed a variety of experience-related pain points. Those pain points range from onboarding and configuration to documentation, mental models, and interoperability across the ecosystem.

At PromCon 2025, a user research study was presented that highlighted several of these issues. Although the central area of investigation involved Prometheus and OpenTelemetry workflows, the broader takeaway was clear: Prometheus would benefit from a dedicated, ongoing effort to understand user needs and improve the overall user experience.

Recognizing this, the Prometheus team established a Working Group focused on improving user experience through design and user research. This group is meant to support all areas of Prometheus by bringing structured research, user insights, and usability perspectives into the community's development and decision-making processes.

How we can help Prometheus maintainers

Building something where the user needs are unclear? Maybe you're looking at two competing solutions and you'd like to understand the user tradeoffs alongside the technical ones.

That's where we can be of help.

The UX Working Group will partner with you to conduct user research or provide feedback on your plans for user outreach. That could include:

  • User research reports and summaries
  • User journeys, personas, wireframes, prototypes, and other UX artifacts
  • Recommendations for improving usability, onboarding, interoperability, and documentation
  • Prioritized lists of user pain points
  • Suggestions for community discussions or decision-making topics

To get started, tell us what you're trying to do, and we'll work with you to determine what type and scope of research is most appropriate.

How we can help Prometheus end users

We want to hear from you! Let us know if you're interested in participating in a research study and we'll contact you when we're working on one that's a good fit. Having an issue with the Prometheus user experience? We can help you open an issue and direct it to the appropriate community members.

Interested in helping?

New contributors to the working group are always welcome! Get in touch and let us know what you'd like to work on.

Where to find us

Drop us a message in Slack, join a meeting, or raise an issue in GitHub.

Categories: CNCF Projects

Before You Migrate: Five Surprising Ingress-NGINX Behaviors You Need to Know

Kubernetes Blog - Fri, 02/27/2026 - 10:30

As announced in November 2025, Kubernetes will retire Ingress-NGINX in March 2026. Despite its widespread usage, Ingress-NGINX is full of surprising defaults and side effects that are probably present in your cluster today. This blog highlights these behaviors so that you can migrate away safely and make a conscious decision about which behaviors to keep. This post also compares Ingress-NGINX with Gateway API and shows you how to preserve Ingress-NGINX behavior in Gateway API. The recurring risk pattern in every section is the same: a seemingly correct translation can still cause outages if it does not consider Ingress-NGINX's quirks.

I'm going to assume that you, the reader, have some familiarity with Ingress-NGINX and the Ingress API. Most examples use httpbin as the backend.

Also, note that Ingress-NGINX and NGINX Ingress are two separate Ingress controllers. Ingress-NGINX is an Ingress controller maintained and governed by the Kubernetes community that is retiring March 2026. NGINX Ingress is an Ingress controller by F5. Both use NGINX as the dataplane, but are otherwise unrelated. From now on, this blog post only discusses Ingress-NGINX.

1. Regex matches are prefix-based and case-insensitive

Suppose that you wanted to route all requests with a path consisting of only three uppercase letters to the httpbin service. You might create the following Ingress with the nginx.ingress.kubernetes.io/use-regex: "true" annotation and the regex pattern of /[A-Z]{3}.

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: regex-match-ingress
  annotations:
    nginx.ingress.kubernetes.io/use-regex: "true"
spec:
  ingressClassName: nginx
  rules:
  - host: regex-match.example.com
    http:
      paths:
      - path: "/[A-Z]{3}"
        pathType: ImplementationSpecific
        backend:
          service:
            name: httpbin
            port:
              number: 8000

However, because regex matches are prefix-based and case-insensitive, Ingress-NGINX routes any request whose path starts with any three letters to httpbin:

curl -sS -H "Host: regex-match.example.com" http://<your-ingress-ip>/uuid

The output is similar to:

{
 "uuid": "e55ef929-25a0-49e9-9175-1b6e87f40af7"
}

Note: The /uuid endpoint of httpbin returns a random UUID. A UUID in the response body means that the request was successfully routed to httpbin.

With Gateway API, you can use an HTTP path match with a type of RegularExpression for regular expression path matching. RegularExpression matches are implementation specific, so check with your Gateway API implementation to verify the semantics of RegularExpression matching. Popular Envoy-based Gateway API implementations such as Istio, Envoy Gateway, and Kgateway do a full case-sensitive match.

Thus, if you are unaware that Ingress-NGINX patterns are prefix-based and case-insensitive, and, unbeknownst to you, clients or applications send traffic to /uuid (or /uuid/some/other/path), you might create the following HTTP route.

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: regex-match-route
spec:
  hostnames:
  - regex-match.example.com
  parentRefs:
  - name: <your gateway>  # Change this depending on your use case
  rules:
  - matches:
    - path:
        type: RegularExpression
        value: "/[A-Z]{3}"
    backendRefs:
    - name: httpbin
      port: 8000

However, if your Gateway API implementation does full case-sensitive matches, the above HTTP route would not match a request with a path of /uuid. The above HTTP route would thus cause an outage because requests that Ingress-NGINX routed to httpbin would fail with a 404 Not Found at the gateway.

To preserve the case-insensitive regex matching, you can use the following HTTP route.

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: regex-match-route
spec:
  hostnames:
  - regex-match.example.com
  parentRefs:
  - name: <your gateway>  # Change this depending on your use case
  rules:
  - matches:
    - path:
        type: RegularExpression
        value: "/[a-zA-Z]{3}.*"
    backendRefs:
    - name: httpbin
      port: 8000

Alternatively, the aforementioned proxies support the (?i) flag to indicate case-insensitive matches. Using the flag, the pattern could be (?i)/[a-z]{3}.*.
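In HTTPRoute form, that flag-based pattern would be written as the following path match (the rest of the route is unchanged):

```yaml
- matches:
  - path:
      type: RegularExpression
      value: "(?i)/[a-z]{3}.*"
```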

2. The nginx.ingress.kubernetes.io/use-regex annotation applies to all paths of a host across all (Ingress-NGINX) Ingresses

Now, suppose that you have an Ingress with the nginx.ingress.kubernetes.io/use-regex: "true" annotation, but you want to route requests with a path of exactly /headers to httpbin. Unfortunately, you made a typo and set the path to /Header instead of /headers.

---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: regex-match-ingress
  annotations:
    nginx.ingress.kubernetes.io/use-regex: "true"
spec:
  ingressClassName: nginx
  rules:
  - host: regex-match.example.com
    http:
      paths:
      - path: "<some regex pattern>"
        pathType: ImplementationSpecific
        backend:
          service:
            name: <your backend>
            port:
              number: 8000
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: regex-match-ingress-other
spec:
  ingressClassName: nginx
  rules:
  - host: regex-match.example.com
    http:
      paths:
      - path: "/Header" # typo here, should be /headers
        pathType: Exact
        backend:
          service:
            name: httpbin
            port:
              number: 8000

Most would expect a request to /headers to respond with a 404 Not Found, since /headers does not match the Exact path of /Header. However, because the regex-match-ingress Ingress has the nginx.ingress.kubernetes.io/use-regex: "true" annotation and the regex-match.example.com host, all paths with the regex-match.example.com host are treated as regular expressions across all (Ingress-NGINX) Ingresses. Since regex patterns are case-insensitive prefix matches, /headers matches the /Header pattern and Ingress-NGINX routes such requests to httpbin. Running the command

curl -sS -H "Host: regex-match.example.com" http://<your-ingress-ip>/headers

the output looks like:

{
 "headers": {
 ...
 }
}

Note: The /headers endpoint of httpbin returns the request headers. The fact that the response contains the request headers in the body means that the request was successfully routed to httpbin.

Gateway API does not silently convert or interpret Exact and Prefix matches as regex patterns. So if you converted the above Ingresses into the following HTTP route and preserved the typo and match types, requests to /headers will respond with a 404 Not Found instead of a 200 OK.

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: regex-match-route
spec:
  hostnames:
  - regex-match.example.com
  rules:
  ...
  - matches:
    - path:
        type: Exact
        value: "/Header"
    backendRefs:
    - name: httpbin
      port: 8000

To keep the case-insensitive prefix matching, you can change

  - matches:
    - path:
        type: Exact
        value: "/Header"

to

  - matches:
    - path:
        type: RegularExpression
        value: "(?i)/Header"

Or even better, you could fix the typo and change the match to

  - matches:
    - path:
        type: Exact
        value: "/headers"

3. Rewrite target implies regex

In this case, suppose you want to rewrite the path of requests with a path of /ip to /uuid before routing them to httpbin, and, as in Section 2, you want to route requests with the path of exactly /headers to httpbin. However, you make typos and set the paths to /IP instead of /ip and /Header instead of /headers.

---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: rewrite-target-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: "/uuid"
spec:
  ingressClassName: nginx
  rules:
  - host: rewrite-target.example.com
    http:
      paths:
      - path: "/IP"
        pathType: Exact
        backend:
          service:
            name: httpbin
            port:
              number: 8000
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: rewrite-target-ingress-other
spec:
  ingressClassName: nginx
  rules:
  - host: rewrite-target.example.com
    http:
      paths:
      - path: "/Header"
        pathType: Exact
        backend:
          service:
            name: httpbin
            port:
              number: 8000

The nginx.ingress.kubernetes.io/rewrite-target: "/uuid" annotation causes requests that match paths in the rewrite-target-ingress Ingress to have their paths rewritten to /uuid before being routed to the backend.

Even though no Ingress has the nginx.ingress.kubernetes.io/use-regex: "true" annotation, the presence of the nginx.ingress.kubernetes.io/rewrite-target annotation in the rewrite-target-ingress Ingress causes all paths with the rewrite-target.example.com host to be treated as regex patterns. In other words, the nginx.ingress.kubernetes.io/rewrite-target annotation silently adds the nginx.ingress.kubernetes.io/use-regex: "true" annotation, along with all the side effects discussed above.

For example, a request to /ip has its path rewritten to /uuid because /ip matches the case-insensitive prefix pattern of /IP in the rewrite-target-ingress Ingress. After running the command

curl -sS -H "Host: rewrite-target.example.com" http://<your-ingress-ip>/ip

the output is similar to:

{
 "uuid": "12a0def9-1adg-2943-adcd-1234aadfgc67"
}

Like in the nginx.ingress.kubernetes.io/use-regex example, Ingress-NGINX treats paths of other ingresses with the rewrite-target.example.com host as case-insensitive prefix patterns. Running the command

curl -sS -H "Host: rewrite-target.example.com" http://<your-ingress-ip>/headers

gives an output that looks like

{
 "headers": {
 ...
 }
}

You can configure path rewrites in Gateway API with the HTTP URL rewrite filter, which does not silently convert your Exact and Prefix matches into regex patterns. However, if you are unaware of the side effects of the nginx.ingress.kubernetes.io/rewrite-target annotation and do not realize that /Header and /IP are both typos, you might create the following HTTP route.

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: rewrite-target-route
spec:
  hostnames:
  - rewrite-target.example.com
  parentRefs:
  - name: <your-gateway>
  rules:
  - matches:
    - path:
        type: Exact
        value: "/IP"
    filters:
    - type: URLRewrite
      urlRewrite:
        path:
          type: ReplaceFullPath
          replaceFullPath: /uuid
    backendRefs:
    - name: httpbin
      port: 8000
  - matches:
    - path:
        # This is an exact match, irrespective of other rules
        type: Exact
        value: "/Header"
    backendRefs:
    - name: httpbin
      port: 8000

As with Section 2, because /IP is now an Exact match type in your HTTP route, requests to /ip will respond with a 404 Not Found instead of a 200 OK. Similarly, requests to /headers will also respond with a 404 Not Found instead of a 200 OK. Thus, this HTTP route will break applications and clients that rely on the /ip and /headers routes.

To fix this, you can change the matches in the HTTP route to be regex matches, and change the path patterns to be case-insensitive prefix matches, as follows.

  - matches:
    - path:
        type: RegularExpression
        value: "(?i)/IP.*"
...
  - matches:
    - path:
        type: RegularExpression
        value: "(?i)/Header.*"

Or, you can keep the Exact match type and fix the typos.
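Note the trailing .* in the regex patterns above: Envoy-based Gateway API implementations typically anchor RegularExpression path matches against the full path, so the .* is what restores prefix semantics. A quick check, assuming full-path anchoring:

```python
import re

# Assumption: the data plane anchors the regex to the whole path,
# so prefix behavior requires an explicit trailing .*
print(bool(re.fullmatch(r"(?i)/IP.*", "/ip")))           # True
print(bool(re.fullmatch(r"(?i)/Header.*", "/headers")))  # True
print(bool(re.fullmatch(r"(?i)/IP.*", "/headers")))      # False
```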

4. Requests missing a trailing slash are redirected to the same path with a trailing slash

Consider the following Ingress:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: trailing-slash-ingress
spec:
  ingressClassName: nginx
  rules:
  - host: trailing-slash.example.com
    http:
      paths:
      - path: "/my-path/"
        pathType: Exact
        backend:
          service:
            name: <your-backend>
            port:
              number: 8000

You might expect Ingress-NGINX to respond to /my-path with a 404 Not Found, since /my-path does not exactly match the Exact path of /my-path/. However, Ingress-NGINX redirects the request to /my-path/ with a 301 Moved Permanently because the only difference between /my-path and /my-path/ is a trailing slash.

curl -isS -H "Host: trailing-slash.example.com" http://<your-ingress-ip>/my-path

The output looks like:

HTTP/1.1 301 Moved Permanently
...
Location: http://trailing-slash.example.com/my-path/
...

The same applies if you change the pathType to Prefix. However, the redirect does not happen if the path is a regex pattern.
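The redirect decision can be modeled roughly as follows (a hypothetical simplification for illustration, not Ingress-NGINX's actual implementation):

```python
# Hypothetical model of Ingress-NGINX trailing-slash handling for an
# Exact (or Prefix) path of "/my-path/".
def respond(request_path: str, configured_path: str = "/my-path/"):
    if request_path == configured_path:
        return 200, None
    if request_path + "/" == configured_path:
        # Only a trailing slash differs: redirect instead of 404.
        return 301, configured_path
    return 404, None

print(respond("/my-path/"))  # (200, None)
print(respond("/my-path"))   # (301, '/my-path/')
print(respond("/other"))     # (404, None)
```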

Conformant Gateway API implementations do not silently configure any kind of redirects. If clients or downstream services depend on this redirect, a migration to Gateway API that does not explicitly configure request redirects will cause an outage because requests to /my-path will now respond with a 404 Not Found instead of a 301 Moved Permanently. You can explicitly configure redirects using the HTTP request redirect filter as follows:

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: trailing-slash-route
spec:
  hostnames:
  - trailing-slash.example.com
  parentRefs:
  - name: <your-gateway>
  rules:
  - matches:
    - path:
        type: Exact
        value: "/my-path"
    filters:
    - type: RequestRedirect
      requestRedirect:
        statusCode: 301
        path:
          type: ReplaceFullPath
          replaceFullPath: /my-path/
  - matches:
    - path:
        type: Exact # or Prefix
        value: "/my-path/"
    backendRefs:
    - name: <your-backend>
      port: 8000

5. Ingress-NGINX normalizes URLs

URL normalization is the process of converting a URL into a canonical form before matching it against Ingress rules and routing it. The specifics of URL normalization are defined in RFC 3986 Section 6.2, but some examples are

  • removing path segments that are just a .: my/./path -> my/path
  • having a .. path segment remove the previous segment: my/../path -> path
  • deduplicating consecutive slashes in a path: my//path -> my/path
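Python's posixpath.normpath approximates these rules and is a convenient way to check how a given path will normalize (a stdlib stand-in for illustration; real proxies implement RFC 3986 remove_dot_segments themselves):

```python
import posixpath

# Approximate RFC 3986 dot-segment removal and slash deduplication.
for raw in ["my/./path", "/ip/abc/../../uuid", "my//path", "////uuid"]:
    print(f"{raw} -> {posixpath.normpath(raw)}")
# my/./path -> my/path
# /ip/abc/../../uuid -> /uuid
# my//path -> my/path
# ////uuid -> /uuid
```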

Ingress-NGINX normalizes URLs before matching them against Ingress rules. For example, consider the following Ingress:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: path-normalization-ingress
spec:
  ingressClassName: nginx
  rules:
  - host: path-normalization.example.com
    http:
      paths:
      - path: "/uuid"
        pathType: Exact
        backend:
          service:
            name: httpbin
            port:
              number: 8000

Ingress-NGINX normalizes the path of the following requests to /uuid. Now that the request matches the Exact path of /uuid, Ingress-NGINX responds with either a 200 OK response or a 301 Moved Permanently to /uuid.

For the following commands

curl -sS -H "Host: path-normalization.example.com" http://<your-ingress-ip>/uuid
curl -sS -H "Host: path-normalization.example.com" http://<your-ingress-ip>/ip/abc/../../uuid
curl -sSi -H "Host: path-normalization.example.com" http://<your-ingress-ip>////uuid

the outputs are similar to

{
 "uuid": "29c77dfe-73ec-4449-b70a-ef328ea9dbce"
}
{
 "uuid": "d20d92e8-af57-4014-80ba-cf21c0c4ffae"
}
HTTP/1.1 301 Moved Permanently
...
Location: /uuid
...

Your backends might rely on the Ingress/Gateway API implementation to normalize URLs. That said, most Gateway API implementations will have some path normalization enabled by default. For example, Istio, Envoy Gateway, and Kgateway all normalize . and .. segments out of the box. For more details, check the documentation for each Gateway API implementation that you use.

Conclusion

As we all race to respond to the Ingress-NGINX retirement, I hope this blog post instills some confidence that you can migrate safely and effectively despite all the intricacies of Ingress-NGINX.

SIG Network has also been working on supporting the most common Ingress-NGINX annotations (and some of these unexpected behaviors) in Ingress2Gateway to help you translate Ingress-NGINX configuration into Gateway API, and offer alternatives to unsupported behavior.

SIG Network released Gateway API 1.5 earlier today (27th February 2026). The release graduates features such as ListenerSet, which lets app developers better manage TLS certificates, and the HTTPRoute CORS filter, which enables CORS configuration.

  1. You can use Istio purely as a Gateway API controller with no other service mesh features. ↩︎

Categories: CNCF Projects, Kubernetes

Kubernetes WG Serving concludes following successful advancement of AI inference support

CNCF Blog Projects Category - Thu, 02/26/2026 - 08:30

The Kubernetes Working Group (WG) Serving was created to support development of the AI inference stack on Kubernetes. The goal of this working group was to ensure that Kubernetes is an orchestration platform of choice for inference workloads. This goal has been accomplished, and the working group is now being disbanded.

WG Serving formed workstreams to collect requirements from various model servers, hardware providers, and inference vendors. This work resulted in a common understanding of inference workload specifics and trends and laid the foundation for improvements across many SIGs in Kubernetes.

The working group oversaw several key evolutions related to load balancing and workloads. The inference gateway was adopted as a request scheduler. Multiple groups have worked to standardize AI gateway functionality, and early inference gateway participants went on to seed agent networking work in SIG Network.

The use cases and problem statements gathered by the working group informed the design of AIBrix.

Many of the unresolved problems in distributed inference — especially benchmarking and recommended best practices — have been picked up by the llm-d project, which hybridizes the infrastructure and ML ecosystems and is better able to steer model server co-evolution.

In particular, llm-d and AIBrix represent more appropriate forums for driving requirements to Kubernetes SIGs than this working group. llm-d’s goal is to provide well-lit paths for achieving state-of-the-art inference and aims to provide recommendations that can compose into existing inference user platforms. AIBrix provides a complete platform solution for cost-efficient LLM inference.

WG Serving helped with Kubernetes AI Conformance requirements. The llm-d project is leveraging multiple components from the profile and making recommendations to end users consistent with Kubernetes direction (including Kueue, inference gateway, LWS, DRA, and related efforts). Widely adopted patterns and solutions are expected to go into the conformance program.

All efforts currently running inside WG Serving can be migrated to other working groups or directly to SIGs. Requirements will be discussed in SIGs and in the llm-d community. Specifically:

  • Autoscaling-related questions — mostly related to fast bootstrap — will be discussed in SIG Node or SIG Scheduling.
  • Multi-host, multi-node work can continue as part of SIG Apps (for example, for the LWS project), and DRA requirements will be discussed in WG Device Management.
  • Orchestration topics will be covered by SIG Scheduling and SIG Node.

The Gateway API Inference Extension project is already sponsored by SIG Network and will remain there. The Serving Catalog work can be moved to the Inference Perf project. Originally the Serving Catalog was designed for a larger scope, but it has been used mostly for inference performance.

The Inference Perf project is sponsored by SIG Scalability, and no change of ownership is needed.

CNCF thanks all contributors who participated in WG Serving and helped advance Kubernetes as a platform for AI inference workloads.

Categories: CNCF Projects

Exposing Spin apps on SpinKube with GatewayAPI

CNCF Blog Projects Category - Thu, 02/26/2026 - 07:00

The Gateway API isn’t just an “Ingress v2”; it’s an entirely revamped approach for exposing services from within Kubernetes that eliminates the need to encode routing capabilities in vendor-specific, unstructured annotations. In this post, we will explore how to expose WebAssembly applications built using the CNCF Spin framework and served by SpinKube using the Gateway API.

What is SpinKube

SpinKube, a CNCF sandbox project, is an open-source stack for running serverless WebAssembly applications (Spin apps) on top of Kubernetes. Although SpinKube leverages Kubernetes primitives like Deployments, Services and Pods, there are no containers involved for running your serverless Spin apps at all. Instead, it leverages a containerd-shim implementation and spawns processes on the underlying Kubernetes worker nodes for running Spin apps.

You can learn more about SpinKube and find detailed instructions on how to deploy SpinKube to your Kubernetes cluster at https://spinkube.dev.

What is Gateway API

The Gateway API is the modern, role-oriented successor to the legacy Ingress resource, designed to provide a more expressive and extensible networking interface for Kubernetes. Unlike Ingress, which often relies on a messy sprawl of vendor-specific annotations to handle complex logic, the Gateway API breaks traffic management into atomic resources —GatewayClass, Gateway, and routes (like HTTPRoute or GRPCRoute).

This separation allows infrastructure admins to manage the entry points while giving developers control over how their specific services are exposed, enabling native support for advanced traffic patterns like canary rollouts, header-based routing, and traffic mirroring without the need for bespoke configurations.

To dive deeper into the technical specifications and resource hierarchy, head over to the official Gateway API documentation.

Provisioning a Kubernetes cluster, installing SpinKube, and implementing Spin apps are beyond the scope of this article. However, you can head over to https://github.com/akamai-developers/exposing-spin-apps-with-gatway-api, a repository containing all source code, along with the necessary instructions for setting up an LKE cluster with SpinKube.

To follow the article’s demo, you’ll deploy the required artifacts to your Kubernetes cluster. Make sure you have the following tools installed:

Build and deploy the Spin apps to Kubernetes

Let’s start by compiling the source code of our sample Spin apps down to WebAssembly. Doing so is as easy as executing the spin build command from within each application folder:

# Build the greeter application
pushd apps/greeter
spin build

 Building component greeter with `cargo build --target wasm32-wasip1 --release`
     Finished `release` profile [optimized] target(s) in 0.21s
 Finished building all Spin components

popd

# Build the prime_numbers application
pushd apps/prime-numbers
spin build

  Building component prime-numbers with `cargo build --target wasm32-wasip1 --release`
    Finished `release` profile [optimized] target(s) in 0.18s
  Finished building all Spin components
popd

Once the applications have been compiled, we use the spin registry push command to distribute them as OCI artifacts. (If your OCI-compliant registry requires authentication, you must log in first; use the spin registry login command to authenticate before trying to push.)

Tip: For testing purposes, we’ll use ttl.sh, an anonymous and ephemeral OCI-compliant registry, which allows us to store our applications for 24 hours by simply specifying the TTL as a tag.

# specify variables
greeter_app_artifact=ttl.sh/spin-greeter:24h
primenumbers_app_artifact=ttl.sh/spin-prime-numbers:24h

# optional: Authenticate against registry
oci_reg_server=
oci_reg_user=
oci_reg_password=
spin registry login $oci_reg_server -u $oci_reg_user -p $oci_reg_password

# distribute the Spin applications
pushd apps/greeter
spin registry push $greeter_app_artifact --build
popd

pushd apps/prime-numbers
spin registry push $primenumbers_app_artifact --build
popd

Finally, we use the spin kube scaffold command for generating the necessary Kubernetes manifests.

Tip: Spin does not have any opinions on how you deploy resources to your Kubernetes cluster. You can either use kubectl, create a Helm chart and deploy it using the helm CLI, or describe the desired state and deploy it with GitOps.

For the sake of this article, we’ll simply pipe the generated manifest to kubectl apply. The actual manifests are shown here for illustration purposes:

# Deploy the Spin applications to Kubernetes
spin kube scaffold --from $greeter_app_artifact | kubectl apply -f -
spin kube scaffold --from $primenumbers_app_artifact | kubectl apply -f -
apiVersion: core.spinkube.dev/v1alpha1
kind: SpinApp
metadata:
  name: spin-greeter
spec:
  image: "ttl.sh/spin-greeter:24h"
  executor: containerd-shim-spin
  replicas: 2
---
apiVersion: core.spinkube.dev/v1alpha1
kind: SpinApp
metadata:
  name: spin-prime-numbers
spec:
  image: "ttl.sh/spin-prime-numbers:24h"
  executor: containerd-shim-spin
  replicas: 2

Obviously, there are additional knobs you can turn when executing spin kube scaffold. I highly encourage you to check out the documentation for the command by providing the --help flag.

Testing the Spin app

We use traditional port-forwarding provided by kubectl to verify that both Spin applications run as expected:

kubectl port-forward svc/spin-greeter 8080:80

Send a GET request to the application using curl:

curl -i localhost:8080/hello/Akamai%20Developers

HTTP/1.1 200 OK
content-type: text/plain
transfer-encoding: chunked
date: Mon, 19 Jan 2026 13:55:34 GMT

Hello, Akamai Developers!

Next, let’s test the second Spin application:

kubectl port-forward svc/spin-prime-numbers 8080:80



Again, use curl to invoke one of the endpoints exposed by the Spin app:

curl -i localhost:8080/above/42

HTTP/1.1 200 OK
transfer-encoding: chunked
date: Mon, 19 Jan 2026 17:05:02 GMT

Next prime number above 42 is 43


Now that both apps are working, you can terminate port-forwarding again (CTRL+C) and dive into exposing both Spin apps.

Installing Gateway API CRDs and Controller

To use the Gateway API, we must install the corresponding Gateway API resources (CRDs) on our cluster along with a Gateway API Controller.

There are several controllers available that implement the Gateway API. You can find a list of available Gateway API controllers at https://gateway-api.sigs.k8s.io/implementations/. We’ll use NGINX Gateway Fabric for now.

To install Gateway API resources run:

kubectl kustomize "https://github.com/nginx/nginx-gateway-fabric/config/crd/gateway-api/standard?ref=v2.3.0" | kubectl apply -f -

To install NGINX Gateway Fabric run:

helm install ngf oci://ghcr.io/nginx/charts/nginx-gateway-fabric --create-namespace -n nginx-gateway


Creating cluster-specific Gateway API resources

With the Gateway API controller installed, we will first deploy a Gateway to our cluster. Think of the Gateway as an entry point into your Kubernetes cluster, which could be shared across multiple applications. We’ll now create the spinkube Gateway, which will front our two Spin applications that are already running in the default namespace.

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: spinkube
  namespace: default
spec:
  gatewayClassName: nginx
  listeners:
  - protocol: HTTP
    port: 8080
    name: http
    allowedRoutes:
      namespaces:
        from: Same

Once you’ve deployed the Gateway, you should find a new service of type LoadBalancer, called spinkube-nginx, provisioned in the default namespace. Once the cloud controller has acquired a public IP address, you should find it as part of the output as well.

kubectl get services

NAME            TYPE         EXTERNAL-IP
spinkube-nginx  LoadBalancer 172.238.61.25


Note down the external IP address of the spinkube-nginx service; we’ll use it in a few minutes to send requests to our Spin applications from outside of the cluster!

Creating application-specific Gateway API Resources

As we have deployed two different Spin applications to our Kubernetes cluster, we’ll also create two instances of HTTPRoute and link them to the Gateway we created in the previous section.

Tip: As managing external DNS is beyond the scope of this article, we’ll use simple PathPrefix based routing in combination with a Rewrite filter to route inbound requests to the desired Spin applications.

Create the following HTTPRoute resources in the default namespace:

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: greeter
  namespace: default
spec:
  parentRefs:
  - name: spinkube
  rules:
  - backendRefs:
    - name: spin-greeter
      port: 80
    filters:
    - type: URLRewrite
      urlRewrite:
        path:
          replacePrefixMatch: /
          type: ReplacePrefixMatch
    matches:
    - path:
        type: PathPrefix
        value: /greeter
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: prime-numbers
  namespace: default
spec:
  parentRefs:
  - name: spinkube
  rules:
  - backendRefs:
    - name: spin-prime-numbers
      port: 80
    filters:
    - type: URLRewrite
      urlRewrite:
        path:
          replacePrefixMatch: /
          type: ReplacePrefixMatch
    matches:
    - path:
        type: PathPrefix
        value: /prime-numbers
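The URLRewrite filters above strip the routing prefix before a request reaches the Spin app. A small sketch of the ReplacePrefixMatch semantics (assuming the PathPrefix has already matched):

```python
# Sketch of Gateway API ReplacePrefixMatch: the matched PathPrefix is
# swapped for the replacement before the backend sees the path.
def replace_prefix_match(path: str, prefix: str, replacement: str = "/") -> str:
    remainder = path[len(prefix):]
    if replacement.endswith("/") and remainder.startswith("/"):
        remainder = remainder[1:]
    return replacement + remainder

print(replace_prefix_match("/greeter/hello/Akamai%20Developers", "/greeter"))
# /hello/Akamai%20Developers
print(replace_prefix_match("/prime-numbers/above/999", "/prime-numbers"))
# /above/999
```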

Accessing the Spin apps

Having all Kubernetes resources in place, it’s time for a final test. We discovered the public IP address associated with our Gateway earlier in this post. Let’s use curl again to send requests to both Spin applications:

# Send request to the greeter app
curl -i http://<your_gateway_ip>:8080/greeter/hello/Akamai%20Developers

HTTP/1.1 200 OK
Server: nginx
Date: Mon, 19 Jan 2026 16:37:22 GMT
Content-Type: text/plain
Transfer-Encoding: chunked
Connection: keep-alive

Hello, Akamai Developers!


# Send request to the prime-numbers app
curl -i http://<your_gateway_ip>:8080/prime-numbers/above/999

HTTP/1.1 200 OK
Server: nginx
Date: Mon, 19 Jan 2026 16:37:50 GMT
Transfer-Encoding: chunked
Connection: keep-alive

Next prime number above 999 is 1009


As you can see, our requests get routed to the desired Spin application because of the path prefix (either greeter or prime-numbers).

Conclusion

The Kubernetes Gateway API streamlines how we expose services from within a Kubernetes cluster and allows precise separation of concerns. Cloud infrastructure and cluster operators create and manage resources that could be shared across multiple applications like the Gateway, while application developers provide application (or service) specific resources such as an HTTPRoute.

Especially when running tens or hundreds of different serverless applications on top of SpinKube, it’s crucial to have robust and reliable routing in place to ensure applications are accessible from outside of the cluster. The Gateway API for Kubernetes makes managing these routes a breeze.

Contributors from Akamai collaborate on SpinKube development to deliver this runtime across its global cloud and edge. Additional information is available at akamai.com.







Categories: CNCF Projects

Deep Dive: How linkerd-destination works in the Linkerd Service Mesh

Linkerd Blog - Wed, 02/25/2026 - 19:00

This blog post was originally published on Bezaleel Silva’s Medium blog.

Recently, in our daily operations, we took a deep dive into the inner workings of linkerd-destination, one of the most critical components of the Linkerd control plane.

The motivation was simple: as our cluster grew and traffic increased, the question shifted from “Does Linkerd work?” to “How exactly does it react when everything changes at once?”. Frequent deployments, production scaling, security policies being applied — and at the center of all this, the destination service.

Categories: CNCF Projects

Making Harbor production-ready: Essential considerations for deployment

CNCF Blog Projects Category - Tue, 02/24/2026 - 07:00

Harbor is an open-source container registry that secures artifacts with policies and role-based access control, ensuring images are scanned for vulnerabilities and signed as trusted. To learn more about Harbor and how to deploy it on a Virtual Machine (VM) and in Kubernetes (K8s), refer to parts 1 and 2 of the series.

Flow chart showcasing the Harbor container registry from development team through to K8s Cluster

While deploying Harbor is straightforward, making it production-ready requires careful consideration of several key aspects. This blog outlines critical factors to ensure your Harbor instance is robust, secure, and scalable for production environments.

For this blog, we will focus on Harbor deployed on Kubernetes via Helm as our base and provide suggestions for this specific deployment.

1. High Availability (HA) and scalability

For a production environment, single points of failure are unacceptable, especially for an image registry that will act as a central repository for storing and pulling images and artifacts for development and production applications. Thus, implementing high availability for Harbor is crucial and involves several key considerations:

  • Deploy with an Ingress: Deploy a Kubernetes Ingress controller (e.g., Traefik) in front of your Harbor instances to distribute incoming traffic efficiently and provide a unified entry point, along with cert-manager for certificate management. You can specify this in your values.yaml file under:
expose:
  type: ingress
  tls:
    enabled: true
    certSource: secret 
  ingress:
    hosts:
      core: harbor.yourdomain.com
    annotations:
      # Specify your ingress class
      kubernetes.io/ingress.class: traefik
      # Reference your ClusterIssuer (e.g., self-signed or internal CA)
      cert-manager.io/cluster-issuer: "harbor-cluster-issuer"

To locate your values.yaml file, refer to the previous blog.

  • Utilize multiple Harbor instances: Increase the replica count for critical Harbor components (e.g., core, jobservice, portal, registry, trivy) in your values.yaml to ensure redundancy.
core:
  replicas: 3
jobservice:
  replicas: 3
portal:
  replicas: 3
registry:
  replicas: 3
trivy:
  replicas: 3

# While not strictly for the HA of the registry itself, consider increasing exporter replicas for robust monitoring availability
exporter:
  replicas: 3

# Optionally, if using Ingress, consider increasing the Nginx replicas for improving Ingress availability
nginx:
  replicas: 3

  • Configure shared storage: For persistent data, configure Kubernetes StorageClasses and PersistentVolumes to use shared storage solutions like vSAN or a distributed file system. Specify these in your values.yaml under:

persistence:
  enabled: true
  resourcePolicy: "keep"
  persistentVolumeClaim:
    registry:
      # If left empty, the kubernetes cluster default storage class will be used
      storageClass: "your-storage-class"
    jobservice:
      storageClass: "your-storage-class"
    database:
      storageClass: "your-storage-class"
    redis:
      storageClass: "your-storage-class"
    trivy:
      storageClass: "your-storage-class"
  • Enable database HA (PostgreSQL): While Harbor comes with a built-in PostgreSQL database, it is not recommended for production use for several reasons:
  1. Lack of high availability (HA): The default internal PostgreSQL setup within the Harbor Helm chart is typically a single instance. This creates a single point of failure: if that database pod goes down, your entire Harbor instance will be unavailable.
  2. Limited scalability: An embedded database is not designed for independent scaling. If your Harbor usage grows, you might hit database performance bottlenecks that are difficult to address without disrupting Harbor itself.
  3. Complex lifecycle management: Managing backups, point-in-time recovery, patching, and upgrades for a stateful database directly within an application’s Helm chart can be significantly more complex and error-prone than with dedicated database solutions.

Thus, it is recommended to deploy a highly available PostgreSQL cluster within Kubernetes (e.g., using a Helm chart for Patroni or CloudNativePG) or leverage a managed database service outside the cluster. Configure Harbor to connect to this HA database by updating the values.yaml:

database:
  type: "external"
  external:
    host: "192.168.0.1"
    port: "5432"
    username: "user"
    password: "password"
    coreDatabase: "registry"
    # If using an existing secret, the key must be "password"
    existingSecret: ""
    # "disable" - No SSL
    # "require" - Always SSL (skip verification)
    # "verify-ca" - Always SSL (verify that the certificate presented by the
    # server was signed by a trusted CA)
    # "verify-full" - Always SSL (verify that the certification presented by the
    # server was signed by a trusted CA and the server host name matches the one
    # in the certificate)
    sslmode: "verify-full"

  • Implement Redis HA: Deploy a highly available Redis cluster in Kubernetes (e.g., using a Helm chart for Redis Sentinel or Redis Cluster) or utilize a managed Redis service. Configure Harbor to connect to this HA Redis instance by updating redis.type and connection details in values.yaml.

redis:
  type: external
  external:
    addr: "192.168.0.2:6379"
    sentinelMasterSet: ""
    tlsOptions:
      enable: true
    username: ""
    password: ""

2. Security best practices

Security is paramount for any production system, especially a container registry.

Enable TLS/SSL: Always enable TLS/SSL for all Harbor components.

expose:
  tls:
    enabled: true
    certSource: auto # change to manual if using cert-manager
    auto:
      commonName: ""
internalTLS:
  enabled: true
  strong_ssl_ciphers: true
  certSource: "auto"
  core:
    secretName: ""
  jobService:
    secretName: ""
  registry:
    secretName: ""
  portal:
    secretName: ""
  trivy:
    secretName: ""


Configure authentication and authorization: Leverage Harbor’s supported Authentication and Authorization mechanisms for managing access to Harbor resources. After Harbor deployment, integrate Harbor with enterprise identity providers like LDAP or OIDC by following the Harbor configuration guides: Configure LDAP/Active Directory Authentication or Configure OIDC Provider Authentication.

A screenshot of the Harbor website showcasing the configuration panel.

Implement vulnerability scanning: Ensure vulnerability scanning is enabled in values.yaml. Harbor uses Trivy by default. Verify its activation and configuration within the Helm chart.

trivy:
  enabled: true

A screenshot of the Harbor website showcasing the Library panel from the system admin view

Activate content trust: Harbor supports multiple content trust mechanisms to ensure the integrity of your artifacts. For modern OCI artifact signing, Cosign and Notation are recommended. Enforce deployment security at the project level within the Harbor UI or via the Harbor API, so that only trusted, cryptographically signed images can be deployed.

A screenshot of the library panel on the Harbor website from the system admin view. This image shows the project registry information, proxy cache and deployment security panel.
  • Maintain regular updates: Regularly update your Harbor Helm chart and underlying Kubernetes components to benefit from the latest security patches and bug fixes. Use helm upgrade for this purpose.
  • Use robot accounts for automation: Use robot accounts (service accounts) in automation such as CI/CD pipelines to avoid using user credentials. This ensures the robot account with the least required privileges is used to perform the specific task it has been created for, ensuring limited scope.
  • Fine-grained audit log: In Harbor v2.13.0, Harbor supports the redirection of specific events in the audit log. For example, an “authentication failure” event can be configured in the audit log and forwarded to a third-party syslog endpoint.

3. Storage considerations

Efficient and reliable storage is critical for Harbor’s performance and stability.

  • Choose appropriate storage type: Define Kubernetes StorageClasses that align with your underlying infrastructure (e.g., nfs-client, aws-ebs, azure-disk, gcp-pd). Specify these settings in your values.yaml:
persistence:
  enabled: true
  resourcePolicy: "keep"
  imageChartStorage:
    # Specify the storage type: "filesystem", "azure", "gcs", "s3", "swift", "oss"
    type: ""
    # Configure the section for the selected storage type accordingly
  • Estimate storage sizing: Carefully calculate your storage needs based on the anticipated number and size of container images, as well as your defined retention policies. Configure the size for your PersistentVolumeClaims in values.yaml.
  • Implement robust backup and recovery: Establish a comprehensive backup strategy for all Harbor data. For Kubernetes-native backups, consider using tools like Velero to back up PersistentVolumes and Kubernetes resources. For object storage, leverage the cloud provider’s backup mechanisms or external backup solutions. Regularly test your recovery procedures.
  • Configure and run garbage collection: Set up and routinely execute Harbor’s garbage collection. This can be configured through the Harbor UI by defining a schedule for automated runs to remove unused blobs and efficiently reclaim storage space.
A screenshot of the Harbor website showcasing the 'Clean Up' panel including garbage collection and log rotation.
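The Velero-based approach mentioned above can be sketched as a Schedule resource. This is a hypothetical example: it assumes Velero is installed in the cluster and Harbor runs in a namespace named harbor, and the schedule and TTL values are illustrative.

```yaml
# Hypothetical Velero Schedule: nightly backup of the Harbor namespace.
# Assumes Velero is installed and Harbor runs in the "harbor" namespace.
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: harbor-nightly-backup
  namespace: velero
spec:
  # Cron expression: every day at 02:00.
  schedule: "0 2 * * *"
  template:
    # Back up all resources and PersistentVolumes in the Harbor namespace.
    includedNamespaces:
      - harbor
    snapshotVolumes: true
    # Keep each backup for 30 days.
    ttl: 720h0m0s
```

Pair a schedule like this with periodic restore drills into a scratch namespace, since an untested backup is not a backup.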

4. Monitoring and alerting

Proactive monitoring and alerting are essential for identifying and addressing issues before they impact users.
  • Collect comprehensive metrics: Deploy Prometheus and configure it to scrape metrics from Harbor components. The Harbor Helm chart can expose Prometheus-compatible metrics endpoints, enabled via values.yaml. Visualize these metrics using Grafana.

metrics:
  enabled: true
  core:
    path: /metrics
    port: 8001
  registry:
    path: /metrics
    port: 8001
  jobservice:
    path: /metrics
    port: 8001
  exporter:
    path: /metrics
    port: 8001
  serviceMonitor:
    enabled: true
    # This label ensures the prometheus operator picks up these monitors
    additionalLabels:
      release: kube-prometheus-stack

# Example Service Monitor objects:

# Harbor Core (API and Auth Performance)
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: harbor-core
  labels:
    app: harbor
    release: kube-prometheus-stack
spec:
  selector:
    matchLabels:
      app: harbor
      component: core
  endpoints:
  - port: metrics # Defaults to 8001
    path: /metrics
    interval: 30s

---
# Harbor Exporter (Business Metrics)
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: harbor-exporter
  labels:
    app: harbor
    release: kube-prometheus-stack
spec:
  selector:
    matchLabels:
      app: harbor
      component: exporter
  endpoints:
  - port: metrics
    path: /metrics
    interval: 60s # Scraped less frequently as these are high-level stats
  • Centralized logging: Implement a centralized logging solution within Kubernetes, such as the ELK stack (Elasticsearch, Logstash, Kibana) or Grafana Loki with Fluentd/Fluent Bit.
  • Configure critical alerts: Set up alerting rules in Prometheus (Alertmanager) or Grafana for critical events, such as component failures, high resource utilization (CPU/memory limits), storage nearing capacity, failed vulnerability scans, or unauthorized access attempts. Define these thresholds based on your production requirements.
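As one concrete illustration of such alerting rules, here is a hypothetical PrometheusRule for the Prometheus Operator (as shipped with kube-prometheus-stack). The alert name, threshold, and job label selector are assumptions you would adapt to your environment and ServiceMonitor configuration.

```yaml
# Hypothetical PrometheusRule: alert when a Harbor component stops responding.
# Assumes the Prometheus Operator and a kube-prometheus-stack release label.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: harbor-alerts
  labels:
    release: kube-prometheus-stack
spec:
  groups:
    - name: harbor.availability
      rules:
        - alert: HarborComponentDown
          # "up" is the standard Prometheus scrape-health metric; the job
          # label values depend on how your ServiceMonitors are configured.
          expr: up{job=~"harbor-.*"} == 0
          for: 5m
          labels:
            severity: critical
          annotations:
            summary: "Harbor component {{ $labels.job }} has been down for 5 minutes."
```

Similar rules can be written for storage capacity and resource saturation using the kubelet and node-exporter metrics already collected by kube-prometheus-stack.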

5. Network configuration

Proper network configuration ensures smooth communication between Harbor components and external clients.

  • Configure ingress or load balancer and DNS resolution: As already mentioned, deploy a Kubernetes Ingress controller or Load Balancer to expose Harbor externally. Ensure proper DNS records are configured to point to your Load Balancer’s IP address.
  • Set up proxy settings (if applicable): If Harbor components need to access external resources through a corporate proxy, configure proxy settings within values.yaml. Note that the proxy.components field explicitly defines which Harbor components (e.g., core, jobservice, trivy) will use these proxy settings for their external communications.
proxy:
  httpProxy:
  httpsProxy:
  noProxy: 127.0.0.1,localhost,.local,.internal
  components:
    - core
    - jobservice
    - trivy
  • Allocate sufficient bandwidth: Ensure your Kubernetes cluster’s underlying network infrastructure and nodes have sufficient bandwidth to handle peak image pushes and pulls. Monitor network I/O on nodes running Harbor pods.

Conclusion

By diligently addressing these considerations, you can transform your basic Harbor deployment into a robust, secure, and highly available production-ready container registry. This approach ensures that Harbor serves as a cornerstone of your cloud-native infrastructure, capable of supporting demanding development and production workflows. From implementing High Availability and stringent security measures to optimizing storage and establishing proactive monitoring, each step contributes to a resilient and efficient artifact management system. 

Continue reading the Harbor Blog Series on cncf.io:

Blog 1 – Harbor: Enterprise-grade container registry for modern private cloud

Blog 2 – Deploying Harbor on Kubernetes using Helm

Categories: CNCF Projects

Announcing Kyverno 1.17!

CNCF Blog Projects Category - Wed, 02/18/2026 - 07:11

Kyverno 1.17 is a landmark release that marks the stabilization of our next-generation Common Expression Language (CEL) policy engine.

While 1.16 introduced the “CEL-first” vision in beta, 1.17 promotes these capabilities to v1, offering a high-performance, future-proof path for policy as code.

This release focuses on “completing the circle” for CEL policies by introducing namespaced mutation and generation, expanding the available function libraries for complex logic, and enhancing supply chain security with upcoming Cosign v3 support.

A new look for kyverno.io

The first thing you’ll notice with 1.17 is our completely redesigned website. We’ve moved beyond a simple documentation site to create a modern, high-performance portal for platform engineers. Let’s be honest: the Kyverno website redesign was long overdue. As the project evolved into the industry standard for unified policy as code, our documentation needed to reflect that maturity. We are proud to finally unveil the new experience at https://kyverno.io.

A screenshot titled 'A new era, a new website' showcasing the changes from the old Kyverno website to the new website shipped with Kyverno 1.17.

  • Modern redesign
    Built on the Starlight framework, the new site is faster, fully responsive, and features a clean, professional aesthetic that makes long-form reading much easier on the eyes.
  • Enhanced documentation structure
    We’ve reorganized our docs from the ground up. Information is now tiered by “User Journey”—from a simplified Quick Start for beginners to deep-dive Reference material for advanced policy authors.
  • Fully redesigned policy catalog
    Our library of 300+ sample policies has a new interface. It features improved filtering and a dedicated search that allows you to find policies by Category (Best Practices, Security, etc.) or Type (CEL vs. JMESPath) instantly.
  • Enhanced search capabilities
    We’ve integrated a more intelligent search engine that indexes both documentation and policy code, ensuring you get the right answer on the first try.
  • Brand new blog
    The Kyverno blog has been refreshed to better showcase technical deep dives, community case studies, and release announcements like this one!

Namespaced mutating and generating policies

In 1.16, we introduced namespaced variants for validation, cleanup, and image verification.

Kyverno 1.17 completes this by adding:

  • NamespacedMutatingPolicy
  • NamespacedGeneratingPolicy

This enables true multi-tenancy. Namespace owners can now define their own mutation and generation logic (e.g., automatically injecting sidecars or creating default ConfigMaps) without requiring cluster-wide permissions or affecting other tenants.
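As a rough sketch of what that could look like, the hypothetical policy below injects a label into ConfigMaps within a single tenant namespace. Kyverno's CEL policy types mirror the upstream MutatingAdmissionPolicy style, but the field names here are assumptions for illustration only; consult the Kyverno documentation for the authoritative schema.

```yaml
# Hypothetical NamespacedMutatingPolicy: tenant-scoped mutation that needs
# no cluster-wide permissions. Field names are illustrative assumptions.
apiVersion: policies.kyverno.io/v1
kind: NamespacedMutatingPolicy
metadata:
  name: add-team-label
  namespace: team-a
spec:
  matchConstraints:
    resourceRules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE"]
        resources: ["configmaps"]
  mutations:
    - patchType: ApplyConfiguration
      applyConfiguration:
        # CEL expression producing a server-side-apply style patch.
        expression: >-
          Object{metadata: Object.metadata{labels: {"team": "team-a"}}}
```

Because the policy lives in the team-a namespace, it can only affect resources there, which is the multi-tenancy guarantee described above.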

CEL policy types reach v1 (GA)

The headline for 1.17 is the promotion of CEL-based policy types to v1. This signifies that the API is now stable and production-ready.

The promotion includes:

  • ValidatingPolicy and NamespacedValidatingPolicy
  • MutatingPolicy and NamespacedMutatingPolicy
  • GeneratingPolicy and NamespacedGeneratingPolicy
  • ImageValidatingPolicy and NamespacedImageValidatingPolicy
  • DeletingPolicy and NamespacedDeletingPolicy
  • PolicyException

With this graduation, platform teams can confidently migrate from JMESPath-based policies to CEL to take advantage of significantly improved evaluation performance and better alignment with upstream Kubernetes ValidatingAdmissionPolicies / MutatingAdmissionPolicies.

New CEL capabilities and functions

To ensure CEL policies are as powerful as the original Kyverno engine, 1.17 introduces several new function libraries:

  • Hash Functions
    Built-in support for md5(value), sha1(value), and sha256(value) hashing.
  • Math Functions
    Use math.round(value, precision) to round numbers to a specific decimal or integer precision.
  • X509 Decoding
    Policies can now inspect and validate the contents of x509 certificates directly within a CEL expression using x509.decode(pem).
  • Random String Generation
    Generate random strings with random() (default pattern) or random(pattern) for custom regex-based patterns.
  • Transform Utilities
    Use listObjToMap(list1, list2, keyField, valueField) to merge two object lists into a map.
  • JSON Parsing
    Parse JSON strings into structured data with json.unmarshal(jsonString).
  • YAML Parsing
    Parse YAML strings into structured data with yaml.parse(yamlString).
  • Time-based Logic
    New time.now(), time.truncate(timestamp, duration), and time.toCron(timestamp) functions allow for time-since or “maintenance window” style policies.
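For example, a validation rule can combine these libraries with ordinary CEL. The sketch below uses json.unmarshal() from the list above inside a hypothetical ValidatingPolicy; the resource selectors and the annotation name are assumptions chosen for illustration.

```yaml
# Hypothetical ValidatingPolicy using the json.unmarshal() CEL function.
# Selectors and the annotation key are illustrative assumptions.
apiVersion: policies.kyverno.io/v1
kind: ValidatingPolicy
metadata:
  name: check-limits-annotation
spec:
  matchConstraints:
    resourceRules:
      - apiGroups: ["apps"]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["deployments"]
  validations:
    # Parse a JSON annotation and enforce a numeric bound on one of its fields.
    - expression: >-
        json.unmarshal(object.metadata.annotations['config.example.com/limits']).maxReplicas <= 10
      message: "maxReplicas in the limits annotation must not exceed 10."
```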

The deprecation of legacy APIs

As Kyverno matures and aligns more closely with upstream Kubernetes standards, we are making the strategic shift to a CEL-first architecture. This means that the legacy Policy and ClusterPolicy types (which served the community for years using JMESPath) are now entering their sunset phase.

The deprecation schedule

Kyverno 1.17 officially marks ClusterPolicy and CleanupPolicy as Deprecated. While they remain functional in this release, the clock has started on their removal to make way for the more performant, standardized CEL-based engines.

Release   Date (estimated)   Status
v1.17     Jan 2026           Marked for deprecation
v1.18     Apr 2026           Critical fixes only
v1.19     Jul 2026           Critical fixes only
v1.20     Oct 2026           Planned for removal

Why the change?

By standardizing on the Common Expression Language (CEL), Kyverno significantly improves its performance and aligns with the native validation logic used by the Kubernetes API server itself.

For platform teams, this means one less language to learn and a more predictable and scalable policy-as-code experience.

Note for authors

From this point forward, we strongly recommend that every new policy you write be based on the new CEL APIs. Choosing the legacy APIs for new work today simply adds to your migration workload later this year.

Migration tips

We understand that many of you have hundreds of existing policies. To ensure a smooth transition, we have provided comprehensive resources:

  • The Migration Guide
    Our new Migration to CEL Guide provides a side-by-side mapping of legacy ClusterPolicy fields to their new equivalents (e.g., mapping validate.pattern to ValidatingPolicy expressions).
  • New Policy Types
    You can now begin moving your rules into specialized types like ValidatingPolicy, MutatingPolicy, and GeneratingPolicy. You can see the full breakdown of these new v1 APIs in the Policy Types Overview.

Enhanced supply chain security

Supply chain security remains a core pillar of Kyverno.

  • Cosign v3 Support
    1.17 adds support for the latest Cosign features, ensuring your image verification remains compatible with the evolving Sigstore ecosystem.
  • Expanded Attestation Parsing
    New capabilities to deserialize YAML and JSON strings within CEL policies make it easier to verify complex metadata and SBOMs.

Observability and reporting upgrades

We have refined how Kyverno communicates policy results:

  • Granular Reporting Control
    A new --allowedResults flag allows you to filter which results (e.g., only “Fail”) are stored in reports, significantly reducing etcd pressure in large clusters.
  • Enhanced Metrics
    More detailed latency and execution metrics for CEL policies are now included by default to help you monitor the “hidden” cost of policy enforcement.

For developers and integrators

To support the broader ecosystem and make it easier to build integrations, we have decoupled our core components:

  • New API Repository
    Our CEL-based APIs now live in a dedicated repository: kyverno/api. This makes it significantly lighter to import Kyverno types into your own Go projects.
  • Kyverno SDK
    For developers building custom controllers or tools that interact with Kyverno, the SDK project is now housed at kyverno/sdk.

Getting started and backward compatibility

Upgrading from 1.16 is straightforward. However, since the CEL policy types have moved to v1, we recommend updating your manifests to the new API version. Kyverno will continue to support v1beta1 for a transition period.

helm repo update
helm upgrade --install kyverno kyverno/kyverno -n kyverno --version 3.7.0

Looking ahead: The Kyverno roadmap

As we move past the 1.17 milestone, our focus shifts toward long-term sustainability and the “Kyverno Platform” experience. Our goal is to ensure that Kyverno remains the most user-friendly and performant governance tool in the cloud-native ecosystem.

  • Growing the community
    We are doubling down on our commitment to the community. Expect more frequent office hours, improved contributor onboarding, and a renewed focus on making the Kyverno community the most welcoming space in CNCF.
  • A unified tooling experience
    Over the years, we’ve built several powerful sub-projects (like the CLI, Policy Reporter, and Kyverno-Authz). A major goal on our roadmap is to unify these tools into a cohesive experience, reducing fragmentation and making it easier to manage the entire policy lifecycle from a single vantage point.
  • Performance and scalability guardrails
    As clusters grow, performance becomes paramount. We are shifting our focus toward rigorous automated performance testing and will be providing more granular metrics regarding throughput and latency. We want to give platform engineers the data they need to understand exactly what Kyverno can handle in high-scale production environments.
  • Continuous UX improvement
    The website redesign was just the first step. We will continue to iterate on our user interfaces, documentation, and error messaging to ensure that Kyverno remains “Simplified” by design, not just in name.

Conclusion

Kyverno 1.17 is the most robust version yet, blending the flexibility of our original engine with the performance and standardization of CEL.

But this release is about more than just code—it’s about the total user experience. Whether you’re browsing the new policy catalog or scaling thousands of CEL-based rules, we hope this release makes your Kubernetes journey smoother.

A massive thank you to our contributors for making this release (and the new website!) a reality.

Categories: CNCF Projects

Modernizing Prometheus: Native Storage for Composite Types

Prometheus Blog - Fri, 02/13/2026 - 19:00

Over the last year, the Prometheus community has been working hard on several interesting and ambitious changes that previously would have been seen as controversial or not feasible. While there might be little visibility into those from the outside (e.g., it's not an OpenClaw Prometheus plugin, sorry!), Prometheus developers are, organically, steering Prometheus into a certain, coherent future. Piece by piece, we unexpectedly get closer to goals we never dreamed we would achieve as an open-source project!

This post starts (hopefully!) a series of blog posts sharing a few ambitious shifts that might be exciting to new and existing Prometheus users and developers. In this post, I'd love to focus on the idea of native storage for composite types, which tidies up a lot of challenges that piled up over time. Make sure to check the inlined links for how you can adopt some of those changes early or contribute!

CAUTION: Disclaimer: This post is intended as a fun overview from my own personal point of view as a Prometheus maintainer. Some of the mentioned changes haven't (yet) been officially approved by the Prometheus Team; some have not been proven in production.

NOTE: This post was written by humans; AI was used only for cosmetic and grammar fixes.

Classic Representation: Primitive Samples

As you might know, the Prometheus data model (so server, PromQL, protocols) supports gauges, counters, histograms and summaries. OpenMetrics 1.0 extended this with gaugehistogram, info and stateset types.

Impressively, for a long time Prometheus' TSDB storage implementation had an explicitly clean and simple data model. The TSDB allowed the storage and retrieval of string-labelled primitive samples containing only float64 values and int64 timestamps. It was completely metric-type-agnostic.

The metric types were implied on top of the TSDB, for humans and best effort tooling for PromQL. For simplicity, let's call this way of storing types a classic model or representation. In this model:

We have primitive types:

  • gauge is a "default" type with no special rules, just a float sample with labels.

  • counter that should have a _total suffix in the name for humans to understand its semantics.

    foo_total 17.0
    
  • info that needs an _info suffix in the metric name and always has a value of 1.

We have composite types. This is where the fun begins. In the classic representation, composite metrics are represented as a set of primitive float samples:

  • histogram is a group of counters with certain mandatory suffixes and le labels:

    foo_bucket{le="0.0"} 0
    foo_bucket{le="1e-05"} 0
    foo_bucket{le="0.0001"} 5
    foo_bucket{le="0.1"} 8
    foo_bucket{le="1.0"} 10
    foo_bucket{le="10.0"} 11
    foo_bucket{le="100000.0"} 11
    foo_bucket{le="1e+06"} 15
    foo_bucket{le="1e+23"} 16
    foo_bucket{le="1.1e+23"} 17
    foo_bucket{le="+Inf"} 17
    foo_count 17
    foo_sum 324789.3
    
  • gaugehistogram, summary, and stateset types follow the same logic – a group of special gauges or counters that compose a single metric.

The classic model served the Prometheus project well. It significantly simplified the storage implementation, enabling Prometheus to be one of the most optimized open-source time-series databases, with distributed versions based on the same data model available in projects like Cortex, Thanos, and Mimir.

Unfortunately, there are always tradeoffs. This classic model has a few limitations:

  • Efficiency: It tends to yield overhead for composite types because every new piece of data (e.g., new bucket) takes precious index space (it's a new unique series), whereas samples are significantly more compressible (rarely change, time-oriented).
  • Functionality: It poses limitations to the shape and flexibility of the data you store (unless we'd go into some JSON-encoded labels, which have massive downsides).
  • Transactionality: Primitive pieces of composite types (separate counters) are processed independently. While we did a lot of work to ensure write isolation and transactionality for scrapes, transactionality completely breaks apart when data is received or sent via remote write or OTLP protocols, or sent to distributed long-term storage solutions. For example, a foo histogram might have been partially sent, but its foo_bucket{le="1.1e+23"} 17 counter series might be delayed or dropped accidentally, which risks triggering false positive alerts or no alerts, depending on the situation.
  • Reliability: Consumers of the TSDB data have to essentially guess the type semantics. There's nothing stopping users from writing a foo_bucket gauge or foo_total histogram.

A Glimpse of Native Storage for Composite Types

The classic model was challenged by the introduction of native histograms. The TSDB was extended to store composite histogram samples in addition to floats. We tend to call this a native histogram, because the TSDB can now "natively" store a full histogram (with sparse and exponential buckets) as an atomic, composite sample.

At that point, the common wisdom was to stop there. The special advanced histogram that's generally meant to replace "classic" histograms uses a composite sample, while the rest of the metrics use the classic model. Making other composite types consistent with the new native model felt extremely disruptive to users, with too much work and risk. A common counter-argument was that users would eventually migrate their classic histograms naturally, and summaries are also less useful, given the more powerful bucketing and lower cost of native histograms.

Unfortunately, the migration to native histograms was known to take time, given the slight PromQL change required to use them and the new bucketing and client changes needed (applications have to define new histograms or edit existing metrics). There will also be old software, used for a long time, that is never migrated. Eventually, this leaves Prometheus with no chance of deprecating classic histograms, so all software solutions are required to support the classic model, likely for decades.

However, native histograms did push TSDB and the ecosystem into that new composite sample pattern. Some of those changes could be easily adapted to all composite types. Native histograms also gave us a glimpse of the many benefits of that native support. It was tempting to ask ourselves: would it be possible to add native counterparts of the existing composite metrics to replace them, ideally transparently?

Organically, in 2024, for transactionality and efficiency, we introduced the native histogram custom buckets (NHCB) concept, which essentially allows storing classic histograms with explicit buckets natively, reusing the native histogram composite sample data structures.

NHCB has proven to be at least 30% more efficient than the classic representation, while offering functional parity with classic histograms. However, two practical challenges emerged that slowed down the adoption:

  1. Expanding, that is, converting from NHCB to a classic histogram, is relatively trivial, but combining, that is, turning a classic histogram into NHCB, is often not feasible. Because we don't want to wait for client ecosystem adoption, and to be mindful of legacy, hard-to-change software, we envisioned NHCB being converted (so combined) on scrape from the classic representation. That has proven to be somewhat expensive on scrape. Additionally, combination logic is practically impossible when receiving "pushes" (e.g., remote write with classic histograms), as you could end up having different parts of the same histogram sample (e.g., buckets and count) sent via different remote write shards or sequential messages. This combination challenge is also why OpenTelemetry Collector users see extra overhead on prometheusreceiver, as the OpenTelemetry model strictly follows the composite sample model.

  2. Consumption is slightly different, especially in the PromQL query syntax. Our initial decision was to surface NHCB histograms using a native-histogram-like PromQL syntax. For example the following classic histogram:

    foo_bucket{le="0.0"} 0
    # ...
    foo_bucket{le="1.1e+23"} 17
    foo_bucket{le="+Inf"} 17
    foo_count 17
    foo_sum 324789.3
    

    When we convert this to NHCB, you can no longer use foo_bucket as your metric name selector. Since NHCB is now stored as a foo metric, you need to use:

    histogram_quantile(0.9, sum(foo{job="a"}))
    
    # Old syntax: histogram_quantile(0.9, sum(foo_bucket{job="a"}) by (le))
    

    This also has another effect: it violates our "what you see is what you query" rule for the text formats, at least until OpenMetrics 2.

    On top of that, similar problems occur on other Prometheus outputs (federation, remote read, and remote write).

NOTE: Fun fact: Prometheus client data model (SDKs) and PrometheusProto scrape protocol use the composite sample model already!

Transparent Native Representation

Let's get straight to the point. Organically, the Prometheus community seems to align with the following two ideas:

  • We want to eventually move to a fully composite sample model on the storage layer, given all the benefits.
  • Users need to be able to switch (e.g., on scrape) from the classic to the native form in storage without breaking the consumption layer. Essentially, to help with non-trivial migration pains (finding who uses what, double-writing, synchronizing), to avoid tricky dual-mode protocol changes, and to deprecate the classic model ASAP for the sustainability of the Prometheus codebase, we need to ensure eventual consumption migration (e.g., PromQL queries) independently of the storage layer.

Let's go through evidence of this direction, which also represents efforts you can contribute to or adopt early!

  1. We are discussing the "native" summary and stateset to fully eliminate the classic model for all composite types. Feel free to join and help on that work!

  2. We are working on the OpenMetrics 2.0 to consolidate and improve the pull protocol scene and apply the new learnings. One of the core changes will be the move to composite values in text, which makes the text format trivial to parse for storages that support composite types natively. This solves the combining challenge. Note that, by default, for now, all composite types will be still "expanded" to classic format on scrape, so there's no breaking change for users. Feel free to join our WG to help or give feedback.

  3. The Prometheus receive and export protocols have been updated. Remote Write 2.0 allows transporting histograms in the "native" form instead of the classic representation (the classic one is still supported). In future versions (e.g., 2.1), we could easily follow a similar pattern and add native summaries and statesets. Contributions are welcome to make Remote Write 2.0 stable!

  4. We are experimenting with consumption compatibility modes that translate composite types stored as composite samples back to the classic representation. This is not trivial; there are edge cases, but it might be more feasible (and needed!) than we might have initially anticipated. See:

    In PromQL it might work as follows, for an NHCB that used to be a classic histogram:

    # New syntax gives our "foo" NHCB:    
    histogram_quantile(0.9, sum(foo{job="a"}))
    # Old syntax still works, expanding "foo" NHCB to classic representation:
    histogram_quantile(0.9, sum(foo_bucket{job="a"}) by (le))
    

    Alternatives, like a special label or annotations, are also discussed.

When implemented, it should be possible to fully switch different parts of your metric collection pipeline to native form transparently.

Summary

Moving Prometheus to a native composite type world is not easy and will take time, especially around coding, testing and optimizing. Notably, it switches the performance characteristics of the metric load from uniform, predictable sample sizes to sample sizes that depend on the type. Another challenge is code architecture: maintaining different sample types has already proven to be very verbose (we need unions, Go!).

However, recent work revealed a very clean and possible path that yields clear benefits around functionality, transactionality, reliability, and efficiency in the relatively near future, which is pretty exciting!

If you have any questions around these changes, feel free to:

  • DM me on Slack.
  • Visit the #prometheus-dev Slack channel and share your questions.
  • Comment on related issues, create PRs, also review PRs (the most impactful work!)

The Prometheus community will also be at KubeCon EU 2026 in Amsterdam, so make sure to find us there!

I'm hoping we can share stories of other important, orthogonal shifts we see in the community in future posts. No promises (and help welcome!), but there's a lot to cover, such as (random order, not a full list):

  1. Our native start timestamp feature journey that cleanly unblocks native delta temporality without "hacks" like reusing gauges, a separate layer of metric types, or label annotations, e.g., __temporality__.
  2. Optional schematization of Prometheus metrics that attempts to solve a ton of stability problems with metric naming and shape, building on top of OpenTelemetry semconv.
  3. Our metadata storage journey that attempts to improve the OpenTelemetry Entities and resource attributes storage and consumption experience.
  4. Our journey to organize and extend Prometheus scrape pull protocols with the recent ownership move of OpenMetrics.
  5. An incredible TSDB Parquet effort, coming from the three LTS project groups (Cortex, Thanos, Mimir) working together, attempting to improve high-cardinality cases.
  6. Fun experiments with PromQL extensions, like PromQL with pipes and variables and some new SQL transpilation ideas.
  7. Governance changes.

See you in open-source!

Categories: CNCF Projects

Spotlight on SIG Architecture: API Governance

Kubernetes Blog - Wed, 02/11/2026 - 19:00

This is the fifth interview in a SIG Architecture Spotlight series covering its different subprojects; this time, we look at SIG Architecture: API Governance.

In this SIG Architecture spotlight we talked with Jordan Liggitt, lead of the API Governance sub-project.

Introduction

FM: Hello Jordan, thank you for your availability. Tell us a bit about yourself, your role and how you got involved in Kubernetes.

JL: My name is Jordan Liggitt. I'm a Christian, husband, father of four, software engineer at Google by day, and amateur musician by stealth. I was born in Texas (and still like to claim it as my point of origin), but I've lived in North Carolina for most of my life.

I've been working on Kubernetes since 2014. At that time, I was working on authentication and authorization at Red Hat, and my very first pull request to Kubernetes attempted to add an OAuth server to the Kubernetes API server. It never exited work-in-progress status. I ended up going with a different approach that layered on top of the core Kubernetes API server in a different project (spoiler alert: this is foreshadowing), and I closed it without merging six months later.

Undeterred by that start, I stayed involved, helped build Kubernetes authentication and authorization capabilities, and got involved in the definition and evolution of the core Kubernetes APIs from early beta APIs, like v1beta3 to v1. I got tagged as an API reviewer in 2016 based on those contributions, and was added as an API approver in 2017.

Today, I help lead the API Governance and code organization subprojects for SIG Architecture, and I am a tech lead for SIG Auth.

FM: And when did you get specifically involved in the API Governance project?

JL: Around 2019.

Goals and scope of API Governance

FM: How would you describe the main goals and areas of intervention of the subproject?

JL: The surface area includes all the various APIs Kubernetes has, and there are APIs that people do not always realize are APIs: command-line flags, configuration files, how binaries are run, how they talk to back-end components like the container runtime, and how they persist data. People often think of "the API" as only the REST API... that is the biggest and most obvious one, and the one with the largest audience, but all of these other surfaces are also APIs. Their audiences are narrower, so there is more flexibility there, but they still require consideration.

The goals are to be stable while still enabling innovation. Stability is easy if you never change anything, but that contradicts the goal of evolution and growth. So we balance "be stable" with "allow change".

FM: Speaking of changes, in terms of ensuring consistency and quality (which is clearly one of the reasons this project exists), what are the specific quality gates in the lifecycle of a Kubernetes change? Does API Governance get involved during the release cycle, prior to it through guidelines, or somewhere in between? At what points do you ensure the intended role is fulfilled?

JL: We have guidelines and conventions, both for APIs in general and for how to change an API. These are living documents that we update as we encounter new scenarios. They are long and dense, so we also support them with involvement at either the design stage or the implementation stage.

Sometimes, due to bandwidth constraints, teams move ahead with design work without feedback from API Review. That’s fine, but it means that when implementation begins, the API review will happen then, and there may be substantial feedback. So we get involved when a new API is created or an existing API is changed, either at design or implementation.

FM: Is this during the Kubernetes Enhancement Proposal (KEP) process? Since KEPs are mandatory for enhancements, I assume part of the work intersects with API Governance?

JL: It can. KEPs vary in how detailed they are. Some include literal API definitions. When they do, we can perform an API review at the design stage. Then implementation becomes a matter of checking fidelity to the design.

Getting involved early is ideal. But some KEPs are conceptual and leave details to the implementation. That’s not wrong; it just means the implementation will be more exploratory. Then API Review gets involved later, possibly recommending structural changes.

There’s a trade-off regardless: detailed design upfront versus iterative discovery during implementation. People and teams work differently, and we’re flexible and happy to consult early or at implementation time.

FM: This reminds me of what Fred Brooks wrote in "The Mythical Man-Month" about conceptual integrity being central to product quality... No matter how you structure the process, there must be a point where someone looks at what is coming and ensures conceptual integrity. Kubernetes uses APIs everywhere -- externally and internally -- so API Governance is critical to maintaining that integrity. How is this captured?

JL: Yes, the conventions document captures patterns we’ve learned over time: what to do in various situations. We also have automated linters and checks to ensure correctness around patterns like spec/status semantics. These automated tools help catch issues even when humans miss them.

As new scenarios arise -- and they do constantly -- we think through how to approach them and fold the results back into our documentation and tools. Sometimes it takes a few attempts before we settle on an approach that works well.

FM: Exactly. Each new interaction improves the guidelines.

JL: Right. And sometimes the first approach turns out to be wrong. It may take two or three iterations before we land on something robust.

The impact of Custom Resource Definitions

FM: Is there any particular change, episode, or domain that stands out as especially noteworthy, complex, or interesting in your experience?

JL: The watershed moment was Custom Resources. Prior to that, every API was handcrafted by us and fully reviewed. There were inconsistencies, but we understood and controlled every type and field.

When Custom Resources arrived, anyone could define anything. The first version did not even require a schema. That made it extremely powerful -- it enabled change immediately -- but it left us playing catch-up on stability and consistency.

When Custom Resources graduated to General Availability (GA), schemas became required, but escape hatches still existed for backward compatibility. Since then, we’ve been working on giving CRD authors validation capabilities comparable to built-ins. Built-in validation rules for CRDs have only just reached GA in the last few releases.

So CRDs opened the "anything is possible" era. Built-in validation rules are the second major milestone: bringing consistency back.

The three major themes have been defining schemas, validating data, and handling pre-existing invalid data. With ratcheting validation (allowing data to improve without breaking existing objects), we can now guide CRD authors toward conventions without breaking the world.
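As a concrete illustration of the validation capabilities CRD authors now have, a CRD schema can embed CEL validation rules directly. The group, kind, and field names below are invented for the example; only the `x-kubernetes-validations` mechanism itself is the real feature being discussed:

```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: widgets.example.com   # hypothetical group and kind
spec:
  group: example.com
  names:
    kind: Widget
    plural: widgets
  scope: Namespaced
  versions:
  - name: v1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            properties:
              replicas:
                type: integer
              maxReplicas:
                type: integer
            # A CEL validation rule: the API server rejects objects
            # where replicas exceeds maxReplicas, just as built-in
            # type validation would.
            x-kubernetes-validations:
            - rule: "self.replicas <= self.maxReplicas"
              message: "replicas may not exceed maxReplicas"
```

With ratcheting validation enabled, tightening a rule like this does not invalidate already-stored objects; they only need to satisfy the rule when the relevant fields change.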

API Governance in context

FM: How does API Governance relate to SIG Architecture and API Machinery?

JL: API Machinery provides the actual code and tools that people build APIs on. They don’t review APIs for storage, networking, scheduling, etc.

SIG Architecture sets the overall system direction and works with API Machinery to ensure the system supports that direction. API Governance works with other SIGs building on that foundation to define conventions and patterns, ensuring consistent use of what API Machinery provides.

FM: Thank you. That clarifies the flow. Going back to release cycles: do release phases -- enhancements freeze, code freeze -- change your workload? Or is API Governance mostly continuous?

JL: We get involved in two places: design and implementation. Design involvement increases before enhancements freeze; implementation involvement increases before code freeze. However, many efforts span multiple releases, so there is always some design and implementation happening, even for work targeting future releases. Between those intense periods, we often have time to work on long-term design work.

An anti-pattern we see is teams thinking about a large feature for months and then presenting it three weeks before enhancements freeze, saying, "Here is the design, please review." For big changes with API impact, it’s much better to involve API Governance early.

And there are good times in the cycle for this -- between freezes -- when people have bandwidth. That’s when long-term review work fits best.

Getting involved

FM: Clearly. Now, regarding team dynamics and new contributors: how can someone get involved in API Governance? What should they focus on?

JL: It’s usually best to follow a specific change rather than trying to learn everything at once. Pick a small API change, perhaps one someone else is making or one you want to make, and observe the full process: design, implementation, review.

High-bandwidth review -- live discussion over video -- is often very effective. If you’re making or following a change, ask whether there’s a time to go over the design or PR together. Observing those discussions is extremely instructive.

Start with a small change. Then move to a bigger one. Then maybe a new API. That builds understanding of conventions as they are applied in practice.

FM: Excellent. Any final comments, or anything we missed?

JL: Yes... the reason we care so much about compatibility and stability is for our users. It’s easy for contributors to see those requirements as painful obstacles preventing cleanup or requiring tedious work... but users integrated with our system, and we made a promise to them: we want them to trust that we won’t break that contract. So even when it requires more work, moves slower, or involves duplication, we choose stability.

We are not trying to be obstructive; we are trying to make life good for users.

A lot of our questions focus on the future: you want to do something now... how will you evolve it later without breaking it? We assume we will know more in the future, and we want the design to leave room for that.

We also assume we will make mistakes. The question then is: how do we leave ourselves avenues to improve while keeping compatibility promises?

FM: Exactly. Jordan, thank you, I think we’ve covered everything. This has been an insightful view into the API Governance project and its role in the wider Kubernetes project.

JL: Thank you.

Categories: CNCF Projects, Kubernetes

Linkerd Protocol Detection

Linkerd Blog - Sun, 02/08/2026 - 19:00

This blog post was originally published on the OneUptime blog. The cover photo is derived from an image by OpenClipart-Vectors.

Linkerd is a lightweight service mesh that provides observability, reliability, and security for Kubernetes applications. One of its powerful features is automatic protocol detection, which allows Linkerd to identify the protocol being used by incoming connections without requiring explicit configuration.

This automatic detection enables Linkerd to apply protocol-specific features like HTTP metrics, retries, and load balancing strategies without manual annotation of every service.
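Automatic detection cannot work for every protocol (for example, server-speaks-first protocols where the client sends no initial bytes to inspect). For those cases, Linkerd lets you mark ports as opaque so the proxy treats them as raw TCP. A sketch, with an illustrative service name and port:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: mysql
  annotations:
    # Tell Linkerd to skip protocol detection on this port and
    # proxy the traffic as an opaque TCP stream.
    config.linkerd.io/opaque-ports: "3306"
spec:
  ports:
  - port: 3306
```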

Categories: CNCF Projects

Dragonfly v2.4.0 is released

CNCF Blog Projects Category - Thu, 02/05/2026 - 19:00

Dragonfly v2.4.0 is released! Thanks to all of the contributors who made this Dragonfly release happen.

New features and enhancements

load-aware scheduling algorithm

A two-stage scheduling algorithm combining central scheduling with node-level secondary scheduling to optimize P2P download performance, based on real-time load awareness.

Diagram: the two-stage scheduling algorithm described above, with candidate parents weighted by real-time load (Parent A 40%, Parent B 35%, ..., Parent N n%).

For more information, please refer to the Scheduling documentation.

Vortex protocol support for P2P file transfer

Dragonfly provides a new Vortex transfer protocol to improve download performance on internal networks. Vortex uses the TLV (Tag-Length-Value) format as a lightweight protocol, replacing gRPC for data transfer between peers. Compared to gRPC, TCP-based Vortex reduces large-file download time by 50% and QUIC-based Vortex by 40%, and both effectively reduce peak memory usage.
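The release notes do not spell out the Vortex wire format, but the general TLV idea can be sketched in a few lines. The tag width, length width, and byte order here are assumptions for illustration, not the actual Vortex layout:

```python
import struct

def encode_frame(tag: int, value: bytes) -> bytes:
    """Pack one TLV frame: 1-byte tag, 4-byte big-endian length, then the value."""
    return struct.pack(">BI", tag, len(value)) + value

def decode_frame(buf: bytes) -> tuple[int, bytes, bytes]:
    """Unpack one TLV frame, returning (tag, value, remaining bytes)."""
    tag, length = struct.unpack(">BI", buf[:5])
    return tag, buf[5:5 + length], buf[5 + length:]
```

The appeal over gRPC for bulk piece transfer is the fixed, tiny framing overhead: no protobuf encoding or HTTP/2 stream machinery sits between the peer and the raw bytes.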

For more information, please refer to the TCP Protocol Support for P2P File Transfer and QUIC Protocol Support for P2P File Transfer.

Request SDK

An SDK for routing user requests to Seed Peers using consistent hashing, replacing the previous Kubernetes Service load-balancing approach.

Flow chart: the Request SDK routes a user's request to a Seed Peer (here, chunks 1-3 map to Seed Peer 2), which in turn fetches from the OCI registry.

Simple multi‑cluster Kubernetes deployment with scheduler cluster ID

Dragonfly supports a simplified feature for deploying and managing multiple Kubernetes clusters by explicitly assigning a schedulerClusterID to each cluster. This approach allows users to directly control cluster affinity without relying on location‑based scheduling metadata such as IDC, hostname, or IP.

Using this feature, each Peer, Seed Peer, and Scheduler determines its target scheduler cluster through a clearly defined scheduler cluster ID. This ensures precise separation between clusters and predictable cross‑cluster behavior.
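A minimal sketch of what such a configuration might look like in a peer's config file. The file name and field placement are assumptions for illustration; consult the Dragonfly documentation for the real schema:

```yaml
# dfdaemon.yaml (illustrative)
host:
  # Pin this peer to scheduler cluster 2 explicitly, instead of
  # relying on IDC/hostname/IP-based affinity.
  schedulerClusterID: 2
```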

Screenshot: a five-line configuration snippet assigning the host's scheduler cluster ID.

For more information, please refer to the Create Dragonfly Cluster Simple.

Performance and resource optimization for Manager and Scheduler components

Enhanced service performance and resource utilization across Manager and Scheduler components while significantly reducing CPU and memory overhead, delivering improved system efficiency and better resource management.

Enhanced preheating

  • Support for IP-based peer selection in preheating jobs with priority-based selection logic where IP specification takes highest priority, followed by count-based and percentage-based selection.
  • Support for preheating multiple URLs in a single request.
  • Support for preheating file and image via Scheduler gRPC interface.

Screenshot: the Dragonfly console's 'Create Preheat' form, with fields for information, clusters, URL, and args.

Calculate task ID based on image blob SHA256 to avoid redundant downloads

The Client now supports calculating task IDs directly from the SHA256 hash of image blobs, instead of using the download URL. This enhancement prevents redundant downloads and data duplication when the same blob is accessed from different registry domains.
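The effect can be sketched as follows (function names and the ID scheme are illustrative, not Dragonfly's actual implementation):

```python
import hashlib

def task_id_from_url(url: str) -> str:
    # Old behavior (sketch): the task ID is derived from the download URL,
    # so the same blob pulled via two registry domains is cached twice.
    return hashlib.sha256(url.encode()).hexdigest()

def task_id_from_digest(digest: str) -> str:
    # New behavior (sketch): the task ID is derived from the blob's own
    # SHA256 digest, so every mirror of the blob shares one cache entry.
    return hashlib.sha256(digest.encode()).hexdigest()

digest = "sha256:8a9e2c..."  # same content served by both registries
mirror_a = task_id_from_url("https://registry-a.example/v2/blobs/" + digest)
mirror_b = task_id_from_url("https://registry-b.example/v2/blobs/" + digest)
```

Under the URL scheme `mirror_a` and `mirror_b` differ, forcing a second download; under the digest scheme they collapse to one task.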

Cache HTTP 307 redirects for split downloads

Support for caching HTTP 307 (Temporary Redirect) responses to optimize Dragonfly’s multi-piece download performance. When a download URL is split into multiple pieces, the redirect target is now cached, eliminating redundant redirect requests and reducing latency.
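Conceptually, the optimization is simple: resolve the redirect once and reuse the target for every piece. A simplified sketch (not Dragonfly's code), with a stub standing in for the HTTP round-trip:

```python
# Sketch of caching a 307 redirect target across piece requests.
redirect_cache: dict[str, str] = {}
resolves = 0

def follow_redirect(url: str) -> str:
    """Stand-in for one HTTP round-trip returning the 307 Location header."""
    global resolves
    resolves += 1
    return url.replace("cdn.example", "edge-3.cdn.example")

def piece_url(url: str) -> str:
    # Each piece reuses the cached redirect target instead of
    # re-following the 307 for every range request.
    if url not in redirect_cache:
        redirect_cache[url] = follow_redirect(url)
    return redirect_cache[url]

targets = [piece_url("https://cdn.example/blob") for _ in range(8)]
```

Eight piece requests trigger only one redirect resolution instead of eight.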

Go Client deprecated and replaced by Rust client

The Go client has been deprecated and replaced by the Rust Client. All future development and maintenance will focus exclusively on the Rust client, which offers improved performance, stability, and reliability.

For more information, please refer to dragonflyoss/client.

Additional enhancements

  • Enable 64K page size support for ARM64 in the Dragonfly Rust client.
  • Fix missing git commit metadata in dfget version output.
  • Support for config_path of io.containerd.cri.v1.images plugin for containerd V3 configuration.
  • Replaces glibc DNS resolver with hickory-dns in reqwest to implement DNS caching and prevent excessive DNS lookups during piece downloads.
  • Support for the --include-files flag to selectively download files from a directory.
  • Add the --no-progress flag to disable the download progress bar output.
  • Support for custom request headers in backend operations, enabling flexible header configuration for HTTP requests.
  • Refactored log output to reduce redundant logging and improve overall logging efficiency.

Significant bug fixes

  • Modified the database field type from text to longtext to support storing preheating job information.
  • Fixed panic on repeated seed peer service stops during Scheduler shutdown.
  • Fixed broker authentication failure when specifying the Redis password without setting a username.

Nydus

New features and enhancements

  • Nydusd: Add CRC32 validation support for both RAFS V5 and V6 formats, enhancing data integrity verification.
  • Nydusd: Support resending FUSE requests during nydusd restoration, improving daemon recovery reliability.
  • Nydusd: Enhance VFS state saving mechanism for daemon hot upgrade and failover.
  • Nydusify: Introduce Nydus-to-OCI reverse conversion capability, enabling seamless migration back to OCI format.
  • Nydusify: Implement zero-disk transfer for image copy, significantly reducing local disk usage during copy operations.
  • Snapshotter: Build blob.meta into the bootstrap to improve blob-fetch reliability for RAFS v6 images.

Significant bug fixes

  • Nydusd: Fix auth token fetching for access_token field in registry authentication.
  • Nydusd: Add recursive inode/dentry invalidation for umount API.
  • Nydus Image: Fix multiple issues in optimize subcommand and add backend configuration support.
  • Snapshotter: Implement lazy parent recovery for proxy mode to handle missing parent snapshots.

We encourage you to visit the d7y.io website to find out more.

Others

You can see CHANGELOG for more details.

Links

Dragonfly Github

Categories: CNCF Projects

Introducing Node Readiness Controller

Kubernetes Blog - Mon, 02/02/2026 - 21:00

In the standard Kubernetes model, a node’s suitability for workloads hinges on a single binary "Ready" condition. However, in modern Kubernetes environments, nodes require complex infrastructure dependencies—such as network agents, storage drivers, GPU firmware, or custom health checks—to be fully operational before they can reliably host pods.

Today, on behalf of the Kubernetes project, I am announcing the Node Readiness Controller. This project introduces a declarative system for managing node taints, extending the readiness guardrails during node bootstrapping beyond standard conditions. By dynamically managing taints based on custom health signals, the controller ensures that workloads are only placed on nodes that meet all infrastructure-specific requirements.

Why the Node Readiness Controller?

Core Kubernetes Node "Ready" status is often insufficient for clusters with sophisticated bootstrapping requirements. Operators frequently struggle to ensure that specific DaemonSets or local services are healthy before a node enters the scheduling pool.

The Node Readiness Controller fills this gap by allowing operators to define custom scheduling gates tailored to specific node groups. This enables you to enforce distinct readiness requirements across heterogeneous clusters, ensuring, for example, that GPU-equipped nodes only accept pods once specialized drivers are verified, while general-purpose nodes follow a standard path.

It provides three primary advantages:

  • Custom Readiness Definitions: Define what ready means for your specific platform.
  • Automated Taint Management: The controller automatically applies or removes node taints based on condition status, preventing pods from landing on unready infrastructure.
  • Declarative Node Bootstrapping: Manage multi-step node initialization reliably, with clear observability into the bootstrapping process.

Core concepts and features

The controller centers around the NodeReadinessRule (NRR) API, which allows you to define declarative gates for your nodes.

Flexible enforcement modes

The controller supports two distinct operational modes:

Continuous enforcement
Actively maintains the readiness guarantee throughout the node’s entire lifecycle. If a critical dependency (like a device driver) fails later, the node is immediately tainted to prevent new scheduling.
Bootstrap-only enforcement
Specifically for one-time initialization steps, such as pre-pulling heavy images or hardware provisioning. Once conditions are met, the controller marks the bootstrap as complete and stops monitoring that specific rule for the node.

Condition reporting

The controller reacts to Node Conditions rather than performing health checks itself. This decoupled design allows it to integrate seamlessly with existing ecosystem tools as well as custom solutions:

  • Node Problem Detector (NPD): Use existing NPD setups and custom scripts to report node health.
  • Readiness Condition Reporter: A lightweight agent provided by the project that can be deployed to periodically check local HTTP endpoints and patch node conditions accordingly.

Operational safety with dry run

Deploying new readiness rules across a fleet carries inherent risk. To mitigate this, dry run mode allows operators to first simulate impact on the cluster. In this mode, the controller logs intended actions and updates the rule's status to show affected nodes without applying actual taints, enabling safe validation before enforcement.

Example: CNI bootstrapping

The following NodeReadinessRule ensures a node remains unschedulable until its CNI agent is functional. The controller monitors a custom cniplugin.example.net/NetworkReady condition and only removes the readiness.k8s.io/acme.com/network-unavailable taint once the status is True.

apiVersion: readiness.node.x-k8s.io/v1alpha1
kind: NodeReadinessRule
metadata:
  name: network-readiness-rule
spec:
  conditions:
  - type: "cniplugin.example.net/NetworkReady"
    requiredStatus: "True"
  taint:
    key: "readiness.k8s.io/acme.com/network-unavailable"
    effect: "NoSchedule"
    value: "pending"
  enforcementMode: "bootstrap-only"
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/worker: ""


Getting involved

The Node Readiness Controller is just getting started, with our initial releases out, and we are seeking community feedback to refine the roadmap. Following our productive Unconference discussions at KubeCon NA 2025, we are excited to continue the conversation in person.

Join us at KubeCon + CloudNativeCon Europe 2026 for our maintainer track session: Addressing Non-Deterministic Scheduling: Introducing the Node Readiness Controller.

In the meantime, you can contribute or track our progress here:

Categories: CNCF Projects, Kubernetes

New Conversion from cgroup v1 CPU Shares to v2 CPU Weight

Kubernetes Blog - Fri, 01/30/2026 - 11:00

I'm excited to announce the implementation of an improved conversion formula from cgroup v1 CPU shares to cgroup v2 CPU weight. This enhancement addresses critical issues with CPU priority allocation for Kubernetes workloads when running on systems with cgroup v2.

Background

Kubernetes was originally designed with cgroup v1 in mind, where CPU shares were defined simply by assigning the container's CPU requests in millicpu form.

For example, a container requesting 1 CPU (1000m) would get cpu.shares = 1024.

After a while, cgroup v1 started being replaced by its successor, cgroup v2. In cgroup v2, the concept of CPU shares (which ranges from 2 to 262144, or from 2¹ to 2¹⁸) was replaced with CPU weight (which ranges from [1, 10000], or 10⁰ to 10⁴).

With the transition to cgroup v2, KEP-2254 introduced a conversion formula to map cgroup v1 CPU shares to cgroup v2 CPU weight. The conversion formula was defined as: cpu.weight = (1 + ((cpu.shares - 2) * 9999) / 262142)

This formula linearly maps values from [2¹, 2¹⁸] to [10⁰, 10⁴].
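A direct transcription of the formula using integer arithmetic (a sketch; actual runtimes may differ in rounding details):

```python
def shares_to_weight_linear(shares: int) -> int:
    # KEP-2254 linear mapping from cgroup v1 shares [2, 262144]
    # to cgroup v2 weight [1, 10000].
    return 1 + ((shares - 2) * 9999) // 262142
```

The endpoints map exactly (2 -> 1, 262144 -> 10000), while 1024 shares lands at a weight of 39 and 102 shares at 4, the two problem cases discussed below.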

Linear conversion formula

While this approach is simple, the linear mapping imposes a few significant problems and impacts both performance and configuration granularity.

Problems with previous conversion formula

The current conversion formula creates two major issues:

1. Reduced priority against non-Kubernetes workloads

In cgroup v1, the default value for CPU shares is 1024, meaning a container requesting 1 CPU has equal priority with system processes that live outside of Kubernetes' scope. However, in cgroup v2, the default CPU weight is 100, and the current formula converts 1 CPU (1000m) to a weight of only ≈39, less than 40% of the default.

Example:

  • Container requesting 1 CPU (1000m)
  • cgroup v1: cpu.shares = 1024 (equal to default)
  • cgroup v2 (current): cpu.weight = 39 (much lower than default 100)

This means that after moving to cgroup v2, Kubernetes (or OCI) workloads would de-facto reduce their CPU priority against non-Kubernetes processes. The problem can be severe for setups with many system daemons that run outside of Kubernetes' scope and expect Kubernetes workloads to have priority, especially in situations of resource starvation.

2. Unmanageable granularity

The current formula produces very low values for small CPU requests, limiting the ability to create sub-cgroups within containers for fine-grained resource distribution (which will possibly be much easier moving forward, see KEP #5474 for more info).

Example:

  • Container requesting 100m CPU
  • cgroup v1: cpu.shares = 102
  • cgroup v2 (current): cpu.weight = 4 (too low for sub-cgroup configuration)

With cgroup v1, requesting 100m CPU, which led to 102 CPU shares, was manageable in the sense that sub-cgroups could be created inside the main container, assigning fine-grained CPU priorities to different groups of processes. With cgroup v2, however, a weight of 4 is very hard to distribute between sub-cgroups since it's not granular enough.

With plans to allow writable cgroups for unprivileged containers, this becomes even more relevant.

New conversion formula

Description

The new formula is more complicated, but does a much better job mapping between cgroup v1 CPU shares and cgroup v2 CPU weight:

$$cpu.weight = \lceil 10^{(L^{2}/612 + 125L/612 - 7/34)} \rceil, \text{ where: } L = \log_2(cpu.shares)$$

The idea is that the exponent is a quadratic function of L, chosen so that the curve passes through the following points:

  • (2, 1): The minimum values for both ranges.
  • (1024, 100): The default values for both ranges.
  • (262144, 10000): The maximum values for both ranges.

Visually, the new function looks as follows:

[Figure: the new conversion function plotted over the full cpu.shares range]

And if you zoom in to the important part:

[Figure: the same plot, zoomed in on the low end of the range]

The new formula is "close to linear", yet it is carefully designed so that it passes through the three important points above while mapping the full ranges onto each other.

How it solves the problems

  1. Better priority alignment:

    • A container requesting 1 CPU will now get cpu.weight = 102, close to cgroup v2's default of 100. This restores the intended priority relationship between Kubernetes workloads and system processes.
  2. Improved granularity:

    • A container requesting 100m CPU will now get cpu.weight = 17, enabling better fine-grained resource distribution within containers.

Adoption and integration

This change was implemented at the OCI layer. In other words, it is not implemented in Kubernetes itself; therefore, adoption of the new conversion formula depends solely on OCI runtime support.

For example:

  • runc: The new formula is enabled from version 1.3.2.
  • crun: The new formula is enabled from version 1.23.

Impact on existing deployments

Important: Some consumers may be affected if they assume the older linear conversion formula. Applications or monitoring tools that directly calculate expected CPU weight values based on the previous formula may need updates to account for the new quadratic conversion. This is particularly relevant for:

  • Custom resource management tools that predict CPU weight values.
  • Monitoring systems that validate or expect specific weight values.
  • Applications that programmatically set or verify CPU weight values.

The Kubernetes project recommends testing the new conversion formula in non-production environments before upgrading OCI runtimes to ensure compatibility with existing tooling.

Where can I learn more?

For those interested in this enhancement:

How do I get involved?

For those interested in getting involved with Kubernetes node-level features, join the Kubernetes Node Special Interest Group. We always welcome new contributors and diverse perspectives on resource management challenges.

Categories: CNCF Projects, Kubernetes

Ingress NGINX: Statement from the Kubernetes Steering and Security Response Committees

Kubernetes Blog - Wed, 01/28/2026 - 19:00

In March 2026, Kubernetes will retire Ingress NGINX, a piece of critical infrastructure for about half of cloud native environments. The retirement was announced after years of public warnings that the project was in dire need of contributors and maintainers. There will be no more releases for bug fixes, security patches, or any updates of any kind after the project is retired. This cannot be ignored, brushed off, or left until the last minute to address. We cannot overstate the severity of this situation or the importance of beginning migration to alternatives like Gateway API or one of the many third-party Ingress controllers immediately.

To be abundantly clear: choosing to remain with Ingress NGINX after its retirement leaves you and your users vulnerable to attack. None of the available alternatives are direct drop-in replacements. This will require planning and engineering time. Half of you will be affected. You have two months left to prepare.

Existing deployments will continue to work, so unless you proactively check, you may not know you are affected until you are compromised. In most cases, you can check to find out whether or not you rely on Ingress NGINX by running kubectl get pods --all-namespaces --selector app.kubernetes.io/name=ingress-nginx with cluster administrator permissions.

Despite its broad appeal and widespread use by companies of all sizes, and repeated calls for help from the maintainers, the Ingress NGINX project never received the contributors it so desperately needed. According to internal Datadog research, about 50% of cloud native environments currently rely on this tool, and yet for the last several years, it has been maintained solely by one or two people working in their free time. Without sufficient staffing to maintain the tool to a standard both ourselves and our users would consider secure, the responsible choice is to wind it down and refocus efforts on modern alternatives like Gateway API.

We did not make this decision lightly; as inconvenient as it is now, doing so is necessary for the safety of all users and the ecosystem as a whole. Unfortunately, the flexibility Ingress NGINX was designed with, that was once a boon, has become a burden that cannot be resolved. With the technical debt that has piled up, and fundamental design decisions that exacerbate security flaws, it is no longer reasonable or even possible to continue maintaining the tool even if resources did materialize.

We issue this statement together to reinforce the scale of this change and the potential for serious risk to a significant percentage of Kubernetes users if this issue is ignored. It is imperative that you check your clusters now. If you are reliant on Ingress NGINX, you must begin planning for migration.

Thank you,

Kubernetes Steering Committee

Kubernetes Security Response Committee

Categories: CNCF Projects, Kubernetes

Experimenting with Gateway API using kind

Kubernetes Blog - Tue, 01/27/2026 - 19:00

This document will guide you through setting up a local experimental environment with Gateway API on kind. This setup is designed for learning and testing. It helps you understand Gateway API concepts without production complexity.

Caution:

This is an experimental learning setup and should not be used for production. The components used in this document are not suited for production use. Once you're ready to deploy Gateway API in a production environment, select an implementation that suits your needs.

Overview

In this guide, you will:

  • Set up a local Kubernetes cluster using kind (Kubernetes in Docker)
  • Deploy cloud-provider-kind, which provides both LoadBalancer Services and a Gateway API controller
  • Create a Gateway and HTTPRoute to route traffic to a demo application
  • Test your Gateway API configuration locally

This setup is ideal for learning, development, and experimentation with Gateway API concepts.

Prerequisites

Before you begin, ensure you have the following installed on your local machine:

  • Docker - Required to run kind and cloud-provider-kind
  • kubectl - The Kubernetes command-line tool
  • kind - Kubernetes in Docker
  • curl - Required to test the routes

Create a kind cluster

Create a new kind cluster by running:

kind create cluster

This will create a single-node Kubernetes cluster running in a Docker container.

Install cloud-provider-kind

Next, you need cloud-provider-kind, which provides two key components for this setup:

  • A LoadBalancer controller that assigns addresses to LoadBalancer-type Services
  • A Gateway API controller that implements the Gateway API specification

It also automatically installs the Gateway API Custom Resource Definitions (CRDs) in your cluster.

Run cloud-provider-kind as a Docker container on the same host where you created the kind cluster:

VERSION="$(basename $(curl -s -L -o /dev/null -w '%{url_effective}' https://github.com/kubernetes-sigs/cloud-provider-kind/releases/latest))"
docker run -d --name cloud-provider-kind --rm --network host -v /var/run/docker.sock:/var/run/docker.sock registry.k8s.io/cloud-provider-kind/cloud-controller-manager:${VERSION}

Note: On some systems, you may need elevated privileges to access the Docker socket.

Verify that cloud-provider-kind is running:

docker ps --filter name=cloud-provider-kind

You should see the container listed and in a running state. You can also check the logs:

docker logs cloud-provider-kind

Experimenting with Gateway API

Now that your cluster is set up, you can start experimenting with Gateway API resources.

cloud-provider-kind automatically provisions a GatewayClass called cloud-provider-kind. You'll use this class to create your Gateway.

Note that although kind itself is not a cloud provider, the project is named cloud-provider-kind because it simulates the features of a cloud-enabled environment.
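You can confirm the GatewayClass exists before creating a Gateway. This is a quick optional check; the exact columns shown depend on your kubectl version:

```shell
# List the GatewayClasses in the cluster; you should see one named
# "cloud-provider-kind" with ACCEPTED set to True.
kubectl get gatewayclass
```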

Deploy a Gateway

The following manifest will:

  • Create a new namespace called gateway-infra
  • Deploy a Gateway that listens on port 80
  • Accept HTTPRoutes with hostnames matching the *.exampledomain.example pattern
  • Allow routes from any namespace to attach to the Gateway. Note: in real clusters, prefer Same or a Selector for the allowedRoutes.namespaces.from field to limit which namespaces can attach routes.

Apply the following manifest:

---
apiVersion: v1
kind: Namespace
metadata:
  name: gateway-infra
---
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: gateway
  namespace: gateway-infra
spec:
  gatewayClassName: cloud-provider-kind
  listeners:
  - name: default
    hostname: "*.exampledomain.example"
    port: 80
    protocol: HTTP
    allowedRoutes:
      namespaces:
        from: All

Then verify that your Gateway is properly programmed and has an address assigned:

kubectl get gateway -n gateway-infra gateway

Expected output:

NAME      CLASS                 ADDRESS      PROGRAMMED   AGE
gateway   cloud-provider-kind   172.18.0.3   True         5m6s

The PROGRAMMED column should show True, and the ADDRESS field should contain an IP address.
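Rather than polling manually, you can block until the Gateway is ready. This is a sketch using kubectl wait on the standard Programmed condition; the timeout value is arbitrary:

```shell
# Wait up to two minutes for the Gateway's Programmed condition to become True.
kubectl wait --for=condition=Programmed --timeout=120s \
  -n gateway-infra gateway/gateway
```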

Deploy a demo application

Next, deploy a simple echo application that will help you test your Gateway configuration. This application:

  • Listens on port 3000
  • Echoes back request details including path, headers, and environment variables
  • Runs in a namespace called demo

Apply the following manifest:

apiVersion: v1
kind: Namespace
metadata:
  name: demo
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app.kubernetes.io/name: echo
  name: echo
  namespace: demo
spec:
  ports:
  - name: http
    port: 3000
    protocol: TCP
    targetPort: 3000
  selector:
    app.kubernetes.io/name: echo
  type: ClusterIP
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/name: echo
  name: echo
  namespace: demo
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: echo
  template:
    metadata:
      labels:
        app.kubernetes.io/name: echo
    spec:
      containers:
      - env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        - name: NAMESPACE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
        image: registry.k8s.io/gateway-api/echo-basic:v20251204-v1.4.1
        name: echo-basic
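Before wiring up a route, it is worth confirming the echo application is actually ready. This optional check waits for the Deployment rollout and then verifies the Service has backing endpoints:

```shell
# Block until the echo Deployment has rolled out successfully.
kubectl -n demo rollout status deployment/echo

# The Service should have at least one ready endpoint behind it.
kubectl -n demo get endpointslices -l kubernetes.io/service-name=echo
```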

Create an HTTPRoute

Now create an HTTPRoute to route traffic from your Gateway to the echo application. This HTTPRoute will:

  • Respond to requests for the hostname some.exampledomain.example
  • Route traffic to the echo application
  • Attach to the Gateway in the gateway-infra namespace

Apply the following manifest:

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: echo
  namespace: demo
spec:
  parentRefs:
  - name: gateway
    namespace: gateway-infra
  hostnames: ["some.exampledomain.example"]
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /
    backendRefs:
    - name: echo
      port: 3000

Test your route

The final step is to test your route using curl. You'll make a request to the Gateway's IP address with the hostname some.exampledomain.example. The command below is for POSIX shell only, and may need to be adjusted for your environment:

GW_ADDR=$(kubectl get gateway -n gateway-infra gateway -o jsonpath='{.status.addresses[0].value}')
curl --resolve some.exampledomain.example:80:${GW_ADDR} http://some.exampledomain.example

You should receive a JSON response similar to this:

{
  "path": "/",
  "host": "some.exampledomain.example",
  "method": "GET",
  "proto": "HTTP/1.1",
  "headers": {
    "Accept": [
      "*/*"
    ],
    "User-Agent": [
      "curl/8.15.0"
    ]
  },
  "namespace": "demo",
  "ingress": "",
  "service": "",
  "pod": "echo-dc48d7cf8-vs2df"
}

If you see this response, congratulations! Your Gateway API setup is working correctly.

Troubleshooting

If something isn't working as expected, you can troubleshoot by checking the status of your resources.

Check the Gateway status

First, inspect your Gateway resource:

kubectl get gateway -n gateway-infra gateway -o yaml

Look at the status section for conditions. Your Gateway should have:

  • Accepted: True - The Gateway was accepted by the controller
  • Programmed: True - The Gateway was successfully configured
  • .status.addresses populated with an IP address
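If you only want the condition values rather than the full YAML, a jsonpath query keeps the output compact. This is a sketch; adjust the path to taste:

```shell
# Print each condition's type and status on its own line,
# e.g. "Accepted=True" and "Programmed=True".
kubectl get gateway -n gateway-infra gateway \
  -o jsonpath='{range .status.conditions[*]}{.type}={.status}{"\n"}{end}'
```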

Check the HTTPRoute status

Next, inspect your HTTPRoute:

kubectl get httproute -n demo echo -o yaml

Check the status.parents section for conditions. Common issues include:

  • ResolvedRefs set to False with reason BackendNotFound; this means that the backend Service doesn't exist or has the wrong name
  • Accepted set to False; this means that the route couldn't attach to the Gateway (check namespace permissions or hostname matching)

Example error when a backend is not found:

status:
  parents:
  - conditions:
    - lastTransitionTime: "2026-01-19T17:13:35Z"
      message: backend not found
      observedGeneration: 2
      reason: BackendNotFound
      status: "False"
      type: ResolvedRefs
    controllerName: kind.sigs.k8s.io/gateway-controller

Check controller logs

If the resource statuses don't reveal the issue, check the cloud-provider-kind logs:

docker logs -f cloud-provider-kind

This will show detailed logs from both the LoadBalancer and Gateway API controllers.

Cleanup

When you're finished with your experiments, you can clean up the resources:

Remove Kubernetes resources

Delete the namespaces (this will remove all resources within them):

kubectl delete namespace gateway-infra
kubectl delete namespace demo

Stop cloud-provider-kind

Stop and remove the cloud-provider-kind container:

docker stop cloud-provider-kind

Because the container was started with the --rm flag, it will be automatically removed when stopped.

Delete the kind cluster

Finally, delete the kind cluster:

kind delete cluster

Next steps

Now that you've experimented with Gateway API locally, you're ready to explore production-ready implementations:

  • Production Deployments: Review the Gateway API implementations to find a controller that matches your production requirements
  • Learn More: Explore the Gateway API documentation to learn about advanced features like TLS, traffic splitting, and header manipulation
  • Advanced Routing: Experiment with path-based routing, header matching, request mirroring and other features following Gateway API user guides

A final word of caution

This kind setup is for development and learning only. Always use a production-grade Gateway API implementation for real workloads.

Categories: CNCF Projects, Kubernetes

Cluster API v1.12: Introducing In-place Updates and Chained Upgrades

Kubernetes Blog - Tue, 01/27/2026 - 11:00

Cluster API brings declarative management to Kubernetes cluster lifecycle, allowing users and platform teams to define the desired state of clusters and rely on controllers to continuously reconcile toward it.

Similar to how you can use StatefulSets or Deployments in Kubernetes to manage a group of Pods, in Cluster API you can use KubeadmControlPlane to manage a set of control plane Machines, or you can use MachineDeployments to manage a group of worker Nodes.

The Cluster API v1.12.0 release expands what is possible in Cluster API, reducing friction in common lifecycle operations by introducing in-place updates and chained upgrades.

Emphasis on simplicity and usability

With v1.12.0, the Cluster API project once again demonstrates that this community can deliver a great deal of innovation while minimizing the impact on Cluster API users.

What does this mean in practice?

Users simply change the Cluster or Machine spec (just as in previous Cluster API releases), and Cluster API automatically triggers in-place updates or chained upgrades when possible and advisable.

In-place Updates

Just as Kubernetes rolls out Pods in Deployments, when a Machine spec changes, Cluster API traditionally performs a rollout by creating a new Machine and deleting the old one.

This approach, inspired by the principle of immutable infrastructure, has a set of considerable advantages:

  • It is simple to explain, predictable, consistent, and easy for users and engineers to reason about.
  • It is simple to implement, because it relies on only two core primitives: create and delete.
  • The implementation does not depend on Machine-specific choices, such as the OS, the bootstrap mechanism, and so on.

As a result, Machine rollouts drastically reduce the number of variables to be considered when managing the lifecycle of a host server that is hosting Nodes.

However, while the advantages of immutability are not in question, both Kubernetes and Cluster API are on a similar journey, introducing changes that let users minimize workload disruption whenever possible.

Over time, Cluster API has also introduced several improvements to immutable rollouts.

The new in-place update feature in Cluster API is the next step in this journey.

With the v1.12.0 release, Cluster API introduces support for update extensions allowing users to make changes on existing machines in-place, without deleting and re-creating the Machines.

Both KubeadmControlPlane and MachineDeployments support in-place updates based on the new update extension, significantly expanding the boundary of what is possible in Cluster API.

How do in-place updates work?

In short: once a user triggers an update by changing the desired state of Machines, Cluster API chooses the best tool to achieve that state.

What's new is that Cluster API can now choose between immutable rollouts and in-place update extensions to perform the required changes.

In-place updates in Cluster API

Importantly, this is not immutable rollouts vs in-place updates; Cluster API considers both valid options and selects the most appropriate mechanism for a given change.

From the perspective of the Cluster API maintainers, in-place updates are most useful for changes that don't otherwise require a node drain or pod restart; for example, rotating user credentials on a Machine. When the workload will be disrupted anyway, a regular rollout is the better choice.

Nevertheless, Cluster API remains true to its extensible nature: anyone can create their own update extension and decide when and how to use in-place updates, trading away some of the benefits of immutable rollouts.

For a deep dive into this feature, make sure to attend the session In-place Updates with Cluster API: The Sweet Spot Between Immutable and Mutable Infrastructure at KubeCon EU in Amsterdam!

Chained Upgrades

ClusterClass and managed topologies in Cluster API together provide a powerful and effective framework that acts as a building block for many platforms offering Kubernetes-as-a-Service.

With v1.12.0, this feature takes another important step forward by allowing users to upgrade by more than one Kubernetes minor version in a single operation, commonly referred to as a chained upgrade.

This allows users to declare a target Kubernetes version and let Cluster API safely orchestrate the required intermediate steps, rather than manually managing each minor upgrade.

The simplest way to explain chained upgrades is this: once the user triggers an update by changing the desired version for a Cluster, Cluster API computes an upgrade plan and then starts executing it. Rather than, for example, updating the Cluster to v1.33.0, then v1.34.0, then v1.35.0, checking on progress at each step, a chained upgrade lets you go directly to v1.35.0.
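For a cluster using ClusterClass, triggering a chained upgrade is a one-line change to the topology version. The fragment below is purely illustrative: the cluster name, class name, and versions are hypothetical, and the exact API version of the Cluster resource may differ depending on your Cluster API release.

```yaml
apiVersion: cluster.x-k8s.io/v1beta1   # may differ by Cluster API release
kind: Cluster
metadata:
  name: my-cluster            # hypothetical cluster name
spec:
  topology:
    class: my-cluster-class   # hypothetical ClusterClass name
    version: v1.35.0          # previously v1.32.x; Cluster API plans and
                              # executes the intermediate minor upgrades
```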

Executing an upgrade plan means upgrading control plane and worker machines in a strictly controlled order, repeating the process as many times as needed to reach the desired state. Cluster API is now capable of managing this for you.

Cluster API optimizes and minimizes the upgrade steps for worker machines: workers skip upgrades to intermediate Kubernetes minor releases whenever the Kubernetes version skew policy allows it.

Chained upgrades in Cluster API

Extensibility is at the core of this feature as well: upgrade plan runtime extensions can influence how the upgrade plan is computed, and lifecycle hooks can automate other tasks that must be performed during an upgrade, such as upgrading an add-on after the control plane update completes.

From our perspective, chained upgrades are most useful for users who struggle to keep up with Kubernetes minor releases, for example those who upgrade only once per year and then jump three versions (n-3 → n). But be warned: being able to upgrade by more than one minor version easily is not an excuse to skip frequent patching of your cluster!

Release team

I would like to thank all the contributors, the maintainers, and all the engineers that volunteered for the release team.

The reliability and predictability of Cluster API releases, one of the qualities our users appreciate most, is only possible with the support, commitment, and hard work of its community.

Kudos to the entire Cluster API community for the v1.12.0 release and all the great releases delivered in 2025! If you are interested in getting involved, learn about the Cluster API contributing guidelines.

What’s next?

If you read the Cluster API manifesto, you can see how the Cluster API subproject claims the right to remain unfinished, recognizing the need to continuously evolve, improve, and adapt to the changing needs of Cluster API’s users and the broader Cloud Native ecosystem.

As Kubernetes itself continues to evolve, the Cluster API subproject will keep advancing alongside it, focusing on safer upgrades, reduced disruption, and stronger building blocks for platforms managing Kubernetes at scale.

Innovation remains at the heart of Cluster API, stay tuned for an exciting 2026!


Categories: CNCF Projects, Kubernetes
