Facilitate and Automatize Kubernetes Operations

Project Goal


This project aims to develop a tool for statically validating Kubernetes workloads across different Kubernetes versions. This will simplify and automate cluster upgrades by predicting whether a workload is compatible with a new Kubernetes version, mitigating the risks of in-place upgrades and the costs of parallel infrastructure.

Overview


The main issue when upgrading a Kubernetes cluster to a newer version is that we cannot statically determine whether our current workloads will break in the newer version because of API changes or deprecations. The only way to find out is to run them against a cluster running the new version. With thousands of pods and resources, this approach does not scale. Testing should not rely on an empirical strategy but should take advantage of static analysis. This is the first step of a wider idea for static analysis of service mesh dependencies.
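To make the idea of a static check concrete, the following is a minimal sketch in Python and not the kubernetes-diff implementation. It assumes manifests have already been parsed into dictionaries, and it uses a small, deliberately incomplete table of API removals taken from the public Kubernetes deprecation guide. The names `REMOVED_APIS`, `Finding`, `parse_version` and `find_removed_apis` are hypothetical and exist only for this illustration. A check of this kind only detects removed `apiVersion`/`kind` pairs; it says nothing about field-level schema changes.

```python
from typing import NamedTuple

# Illustrative, incomplete subset of API removals (assumption: a real tool
# would derive this from the published API schemas, not a hand-written table).
# Value: the first Kubernetes (major, minor) version that no longer serves it.
REMOVED_APIS = {
    ("extensions/v1beta1", "Deployment"): (1, 16),
    ("extensions/v1beta1", "Ingress"): (1, 22),
    ("networking.k8s.io/v1beta1", "Ingress"): (1, 22),
    ("batch/v1beta1", "CronJob"): (1, 25),
    ("policy/v1beta1", "PodSecurityPolicy"): (1, 25),
}


class Finding(NamedTuple):
    """One workload that would stop working on the target version."""
    name: str
    api_version: str
    kind: str
    removed_in: str


def parse_version(text):
    """Turn '1.25', 'v1.25' or '1.25.3' into the tuple (1, 25)."""
    major, minor = text.lstrip("v").split(".")[:2]
    return int(major), int(minor)


def find_removed_apis(manifests, target_version):
    """Return a Finding for every manifest whose API is gone in target_version."""
    target = parse_version(target_version)
    findings = []
    for manifest in manifests:
        key = (manifest.get("apiVersion", ""), manifest.get("kind", ""))
        removed = REMOVED_APIS.get(key)
        if removed is not None and target >= removed:
            name = manifest.get("metadata", {}).get("name", "<unnamed>")
            findings.append(
                Finding(name, key[0], key[1], f"{removed[0]}.{removed[1]}")
            )
    return findings


if __name__ == "__main__":
    # Hypothetical workloads, already parsed from YAML into dictionaries.
    workloads = [
        {"apiVersion": "batch/v1beta1", "kind": "CronJob",
         "metadata": {"name": "nightly"}},
        {"apiVersion": "apps/v1", "kind": "Deployment",
         "metadata": {"name": "web"}},
    ]
    for finding in find_removed_apis(workloads, "1.25"):
        print(finding)
```

Because the check is a table lookup per manifest, it scales to thousands of resources without a test cluster, which is the motivation described above.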

To support our infrastructure engineers, as well as engineers worldwide working with Kubernetes, we focused on making workload migration between cluster versions faster, more efficient, less error-prone, and less susceptible to human oversight.

Highlights in 2024


The project related to static analysis of context (kubernetes-diff) is considered completed. A proof of concept was produced, but it could not cover the entire domain of possible cases because Kubernetes schemas are often incomplete. Without full case coverage, the project's objectives cannot be fully realized.

However, we worked to raise awareness within the community by contributing updates to the Kubernetes project documentation. These changes aim to improve community understanding, assist future researchers, and explicitly document this limitation in the appropriate context. This is particularly important because such knowledge is usually familiar only to those working directly on the Kubernetes codebase, while average users may lack it.

We collaborated with the Kubernetes API Machinery Special Interest Group (sig-api-machinery) on our contribution [1] to the official Kubernetes documentation, consulting them on optimal dissemination strategies to prevent others from encountering the same issues.

Since active development of the Kubernetes tool had stopped, we decided, in agreement with Oracle, to work in parallel on other projects of interest to both parties.

We spent the first half of the year testing and evaluating the Oracle Database Multilingual Engine (Oracle MLE). Our results showed significant performance improvements for specific use cases, in some cases several times faster. This suggests developers can benefit from using modern languages like JavaScript for server-side procedural logic.

During the second half of the year, we collaborated with Oracle to enhance Oracle REST Data Services (ORDS) authentication, enabling OpenID Connect support for CERN’s ORDS service and eliminating the need for custom authentication components. This project remains in progress, with further development planned for the near future.


[1] https://github.com/kubernetes/website/pull/49025

Next Steps


Next year, our focus will be on Phase VIII of openlab. This phase will involve further modernization of the CERN ORDS service, leveraging Oracle’s new Kubernetes support. We will also utilize the Oracle Kubernetes Operator to modernize Oracle Database management, simplifying database instance provisioning for developers.

More Information


Project Coordinator: Antonio Nappi

Technical Team: Antonio Nappi, Adrian Karasiński, Artur Wiecek

Collaboration Liaisons: Eric Grancher, Cristobal Pedregal-Martin, Miroslav Potocky, Garret Swart, Aleksandra Wardzinska, John Lathouwers

In partnership with: Oracle