Introduction
Last month, I had the opportunity to attend KubeCon North America 2025. The conference was packed with insightful sessions, technical deep-dives, and industry trends that are shaping the future of Kubernetes and cloud-native infrastructure. I wanted to share my personal highlights and key takeaways for anyone interested in the evolving landscape of cloud-native technologies.
Key Takeaways
Kubernetes Upgrade Rollbacks
Upgrades remain one of the most critical and challenging aspects of Kubernetes operations. A recurring theme was the push for safer, more automated upgrade processes. As of Kubernetes 1.32, upgrade rollbacks are possible thanks to the introduction of the “compatibility version” concept.
- What it does: Allows clusters to upgrade binaries while temporarily emulating the previous version, enabling thorough soak testing before fully switching to the new version.
- Why it matters: This two-step upgrade process dramatically reduces the risk of data corruption and makes rollbacks safer and more predictable.
Reference: https://kubernetes.io/docs/concepts/cluster-administration/compatibility-version/
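To make the two-step flow concrete, here is a minimal toy model in Python. It does not talk to a real cluster. The `ControlPlane` class, the helper functions, and the version numbers are all hypothetical names chosen for illustration, and the rollback rule is a simplification of what the talks described, not the actual Kubernetes implementation.

```python
from dataclasses import dataclass
from typing import Tuple

Version = Tuple[int, int]  # (major, minor), e.g. (1, 34)


@dataclass(frozen=True)
class ControlPlane:
    binary_version: Version    # version of the installed binaries
    emulated_version: Version  # version whose behavior the binaries present


def upgrade_binaries(cp: ControlPlane, new_version: Version) -> ControlPlane:
    """Step 1: install new binaries but keep emulating the old version."""
    return ControlPlane(new_version, cp.emulated_version)


def finalize(cp: ControlPlane) -> ControlPlane:
    """Step 2: stop emulating, so the cluster behaves as the new version."""
    return ControlPlane(cp.binary_version, cp.binary_version)


def can_roll_back(cp: ControlPlane, previous: Version) -> bool:
    """Simplified rule (an assumption of this sketch): rolling back is safe
    only while the new binaries are still emulating the previous version."""
    return cp.binary_version != previous and cp.emulated_version == previous


# Hypothetical version numbers, used only to walk through the flow.
old = (1, 33)
soaking = upgrade_binaries(ControlPlane(old, old), (1, 34))
print(can_roll_back(soaking, old))            # soak-testing phase
print(can_roll_back(finalize(soaking), old))  # after fully switching over
```

The point of the model is the window between step 1 and step 2: the new binaries are running, but nothing that depends on new-version behavior has been enabled yet, so going back is still a low-risk operation.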
Declarative Infrastructure and Abstractions
Kubernetes has a steep learning curve that engineers must overcome before they can develop components or applications on their platforms. Many companies are investing heavily in tooling that lets engineers focus on “what” they need instead of “how” to get there. Tools like kro can greatly simplify platform component development and deployment, and are seeing wider adoption across the community.
The move towards declarative infrastructure and GitOps also continues to accelerate. ClusterAPI (an existing project) uses CRDs to define Kubernetes clusters declaratively, and is built as a pluggable system that can integrate with any infrastructure provider. ArgoCD adoption is growing, and I saw many vendors and tools built specifically around managing and working with ArgoCD, showcasing the strength of the ecosystem.
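The “what” versus “how” distinction can be illustrated with a small reconciliation sketch in Python. This is not how ClusterAPI, kro, or ArgoCD are implemented. The `plan` function and the resource names are hypothetical, and real controllers handle far more than this. It only shows the core idea: the user declares the desired state, and tooling works out the steps.

```python
def plan(desired, actual):
    """Return the actions needed to make `actual` match `desired`.

    Both arguments map a resource name to its spec (any comparable value).
    """
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            actions.append(("update", name))
    for name in sorted(actual):
        if name not in desired:
            actions.append(("delete", name))
    return actions


# Desired state, as it might be declared in a Git repository.
desired = {"web": {"replicas": 3}, "db": {"replicas": 1}}
# State currently observed on the platform.
actual = {"web": {"replicas": 2}, "cache": {"replicas": 1}}

for action, name in plan(desired, actual):
    print(action, name)
```

The engineer only edits `desired`. Working out that `db` must be created, `web` updated, and `cache` removed is left to the tooling, which is what makes the declarative approach easier to adopt.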
AI in the Cloud-Native World
As expected, AI was a huge focus of the conference. Many vendors and talks covered how AI can integrate into the Kubernetes ecosystem. I noticed two major themes:
- Multi-agent Orchestration: AI workloads are moving toward an agentic model, which means solutions are needed to orchestrate agent-to-agent communication. The goal is to make AI agents interoperable, context-aware, and easier to manage within existing cloud-native infrastructure.
- Resource Allocation: This is a new challenge for platforms, and the tooling around it still needs to mature. Previously, the main resources an application had to contend with were CPU and memory, but AI workloads add the need to manage GPU resources. Dynamic Resource Allocation (DRA), which is generally available as of Kubernetes v1.34, is a huge step towards managing GPU resource allocation and will be key to the future of AI on Kubernetes.
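To show why GPUs are harder to allocate than CPU and memory, here is a toy first-fit allocator in Python. It is only a sketch of the general matching problem, not the DRA API. The `allocate` function, the claim names, and the device attributes are hypothetical, and the real feature works with Kubernetes API objects and the scheduler rather than plain dictionaries.

```python
def allocate(claims, devices):
    """First-fit: give each claim the first free device that has at least
    the requested amount of memory. Unsatisfied claims map to None."""
    free = list(devices)
    result = {}
    for claim_name, min_memory_gib in claims:
        match = next((d for d in free if d["memory_gib"] >= min_memory_gib), None)
        if match is not None:
            free.remove(match)  # a device is handed to at most one claim
        result[claim_name] = match["name"] if match else None
    return result


# Hypothetical devices advertised by a node.
devices = [
    {"name": "gpu-0", "memory_gib": 16},
    {"name": "gpu-1", "memory_gib": 80},
]
# Hypothetical workload requests: (claim name, minimum GPU memory in GiB).
claims = [("training", 40), ("inference", 8), ("batch", 24)]

print(allocate(claims, devices))
```

Unlike CPU and memory, which are requested as simple quantities, each device here is a distinct item with its own attributes, so a request has to be matched against specific hardware. In this example the `batch` claim goes unsatisfied even though a 16 GiB device is nominally available.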
Final Thoughts
KubeCon NA 2025 reinforced that the Kubernetes ecosystem is rapidly evolving, with a strong focus on safety, automation, and developer experience. The themes of safer upgrades, AI orchestration, declarative management, automated security, and resilient storage are shaping the future of cloud-native platforms. This was my first KubeCon, so there was a bit of stumbling around at first while I figured out which talks to go to (or even which rooms they were in). Once I hit my stride, though, it was really fun. Getting the opportunity to attend all those sessions, talk to vendors, and network with the community was an incredibly valuable experience.