Kubernetes: All about the container orchestration platform

Next Technologies NEWS
7 min read · May 31, 2021

Kubernetes is an open source container orchestration platform created by Google. Find out how it is used, how it works, and how it differs from Docker.

Containers are an operating system virtualization method for launching an application and its dependencies through a set of processes isolated from the rest of the system. This method helps ensure the rapid and stable deployment of applications in any IT environment.

In full swing for several years, containers have changed the way we develop, deploy, and maintain software. Thanks to their lightness and flexibility, they have enabled new application architectures: applications are built as sets of separate containers, which are then deployed on a cluster of virtual or physical machines. This new approach, however, created the need for "container orchestration" tools to automate the deployment, management, networking, scaling, and availability of container-based applications. This is the role of Kubernetes.

Kubernetes is an open source project that Google released in 2014 and donated to the Cloud Native Computing Foundation in 2015. It automates the deployment and management of multi-container applications at scale. It is a system for running and coordinating containerized applications across a cluster of machines, designed to manage the full lifecycle of containerized applications and services with methods that provide predictability, scalability, and high availability.

Mainly compatible with Docker, Kubernetes can work with any container system that conforms to the Open Container Initiative standard in terms of image formats and runtime environments. Due to its open source nature, it can also be used freely by anyone, anywhere.

How does it work?

Kubernetes architectures are based on several concepts and abstractions: some already existed before, others are specific to Kubernetes. The main abstraction Kubernetes relies on is the cluster: the group of machines running Kubernetes and the containers it manages.

A Kubernetes cluster must have a master: the system that commands and controls all the other machines in the cluster. A highly available cluster replicates the master's functions across several machines, but only one master at a time runs the controller manager and the scheduler.

Each cluster contains Kubernetes nodes, which can be physical or virtual machines. Nodes run pods: the most basic Kubernetes objects that can be created or managed. Each pod represents a single instance of an application or process running on the cluster, and consists of one or more containers. All containers in a pod are launched and replicated as a group, so the user can focus on the application rather than on individual containers.
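As a minimal sketch, a pod is described in a YAML manifest; the names and image below are purely illustrative:

```yaml
# pod.yaml — a hypothetical single-container pod (names and image are examples)
apiVersion: v1
kind: Pod
metadata:
  name: web-frontend
  labels:
    app: web
spec:
  containers:
    - name: nginx
      image: nginx:1.21        # the container image the pod runs
      ports:
        - containerPort: 80    # port the containerized process listens on
```

Applying such a file (for example with `kubectl apply -f pod.yaml`) asks the cluster to run the pod on one of its nodes.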

The controller is another abstraction, which manages how pods are deployed, created, or destroyed; different types of controllers exist for different kinds of workloads. A further abstraction is the service, which ensures that applications remain reachable even as individual pods are destroyed. The service describes how a group of pods can be accessed through the network.
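To illustrate both ideas together (names and image are hypothetical), a Deployment controller keeps a set of pod replicas running, while a Service gives them a stable network endpoint:

```yaml
# A hypothetical Deployment: the controller keeps 3 replicas of the pod running,
# recreating any pod that is destroyed
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.21
          ports:
            - containerPort: 80
---
# A Service exposing the pods: traffic reaches whichever pods currently
# carry the "app: web" label, even as individual pods come and go
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web
  ports:
    - port: 80
      targetPort: 80
```

The Service decouples the application's network identity from the pods that implement it, which is exactly the persistence described above.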

There are other key components in Kubernetes. The scheduler distributes workloads among the nodes, balancing resource usage and ensuring that deployments match the needs of the applications. The controller manager ensures that the actual state of the system (applications, workloads, etc.) matches the desired state defined in the configuration stored in etcd.

What is it for?

The main benefit of Kubernetes is to allow businesses to focus on how they want applications to work, rather than specific implementation details. Thanks to the abstractions used to manage groups of containers, the behaviors they need are dissociated from the components that provide them.

Kubernetes thus makes it possible to automate and simplify several tasks. First of all, we can cite the deployment of multi-container applications. Many applications span several containers (database, web front end, cache server, etc.), and microservices are developed on the same model. Usually the different services are linked by APIs and web protocols.

This approach has long-term benefits, but requires a lot of work in the short term. Kubernetes helps reduce the effort required: the user tells Kubernetes how to compose an application from a set of containers, and the platform takes care of deployment and keeps the components synchronized.

This tool also simplifies the scaling of containerized applications. Applications need to be scaled up and down to keep up with demand and to optimize resource usage, and Kubernetes can automate this scaling. The platform also allows the continuous deployment of new application versions without downtime: its mechanisms let you update container images and roll back if a problem occurs.
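As one sketch of automated scaling, a HorizontalPodAutoscaler can be attached to a workload; the target name and thresholds here are assumptions, not values from the article:

```yaml
# Hypothetical autoscaler: grows or shrinks the "web" Deployment
# between 2 and 10 replicas based on average CPU utilization
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: web
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70   # add pods when average CPU exceeds 70%
```

The declarative form is the point: you state the desired scaling behavior once, and the platform continuously enforces it.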

Kubernetes and its ecosystem also provide container networking, service discovery, and storage. Finally, since it is not tied to a specific environment or cloud technology, Kubernetes can run in any environment: public cloud, private stacks, physical or virtual hardware… it is even possible to mix environments.

Kubernetes alternatives and competitors

There are several alternatives to Kubernetes. We can cite Docker Compose, well suited for staging but not really for production.

Another well-known tool is Nomad, which handles cluster management and scheduling, but not configuration management or monitoring.

For its part, Titus is the open source orchestration platform developed by Netflix. At the moment, few people use it in production.

Another project often cited as a competitor to Kubernetes is Mesos, an Apache project originally developed at the University of California, Berkeley and popularized by Twitter. This tool offers container orchestration services, but goes further.

It is designed as a cloud operating system that coordinates containerized and non-containerized components. Many platforms can run on top of it, starting with Kubernetes.

Kubernetes vs Docker Swarm: what’s the difference?

Kubernetes is very often compared with the Docker container platform, and more precisely with Docker Swarm, the native clustering solution for Docker containers. The two tools indeed offer functionality for creating and managing containers, but they differ in many ways.

First of all, Docker proves easier to use than Kubernetes. One of the criticisms often leveled at Kubernetes is indeed its complexity: it takes a long time to install and configure, and requires some planning, because nodes must be defined before you begin. The procedure also differs for each operating system.

For its part, Docker Swarm uses the Docker CLI to run all parts of its program. You only need to master this one tool set to create environments and configurations, and it is not necessary to map out the cluster before starting.

In addition, Kubernetes can run on top of Docker, but this requires knowing both CLIs to access data through the API: the Docker CLI to navigate within the framework, and Kubernetes' kubectl to run programs.

In comparison, the use of Docker Swarm is similar to that of other Docker tools like Compose. We use the same Docker CLI, and it is even possible to launch new containers with a simple command. Due to its speed, versatility and ease of use, Docker therefore has a certain advantage over Kubernetes in terms of usability.

The two platforms were also distinguished in the past by the number of containers that can be launched, as well as by their size. In this area, Kubernetes had the advantage. However, recent Docker updates have helped close the gap.

Today, the two systems can each support clusters of up to 1,000 nodes and 30,000 containers. A test conducted by Docker in March 2016 found that Swarm could launch the same number of containers as Kubernetes five times faster; once the containers are running, however, Kubernetes retains an advantage in responsiveness and flexibility.

Either way, nothing prevents you from using Kubernetes and Docker together. For example, they can be combined to coordinate the scheduling and execution of Docker containers on the cluster's nodes: the Docker engine runs the container images, while service discovery, load balancing, and networking are handled by Kubernetes. Despite their differences, the two tools are therefore well suited to building modern cloud architectures.

Kubernetes for Data Science

Kubernetes has become an essential tool for software developers and system operators, allowing them to deploy and manage various applications in Linux containers. However, this solution is also very useful for Data Science.

Many Data Scientists indeed face the same issues as software engineers: they need portable, repeatable environments; they must repeat experiments and monitor metrics in production; and they need elasticity.

Machine Learning pipelines are otherwise similar to continuous integration pipelines: several coordinated steps must run simultaneously and reproducibly in order to extract features, process data, and train models.
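A pipeline step such as model training can be sketched as a Kubernetes Job, which runs a containerized task to completion; the image name and command below are hypothetical:

```yaml
# Hypothetical training step: a Job runs the container to completion,
# retrying a failed run a bounded number of times
apiVersion: batch/v1
kind: Job
metadata:
  name: train-model
spec:
  backoffLimit: 2              # retry a failed training run up to twice
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: trainer
          image: registry.example.com/ml/trainer:latest   # hypothetical image
          command: ["python", "train.py", "--epochs", "10"]
          resources:
            requests:
              cpu: "2"          # reserve CPU and memory so the scheduler
              memory: 4Gi       # places the run on a suitable node
```

Because each step is declared this way, the same training run can be reproduced on any cluster, which addresses the repeatability requirement above.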

Additionally, microservice architectures simplify debugging machine learning models within the pipeline and facilitate cooperation between data scientists and other team members. Finally, declarative configurations describing the connections between services simplify model building and make learning pipelines repeatable.

Data Scientists therefore face the same challenges as application developers, plus others related to the way they work and test ML models. Many use the interactive notebooks developed by Project Jupyter.

Kubernetes offers many advantages here and makes it possible to build higher-level tools on top of it. Beyond supporting the development of machine-learning-based techniques to solve business problems, Kubernetes also provides a path to putting those techniques into production.

Kubeflow: Kubernetes for Machine Learning

The open source Kubeflow project simplifies the deployment of machine learning workflows on Kubernetes by making them portable and extensible. It is a comprehensive toolkit for Machine Learning on Kubernetes.

Kubeflow builds on Kubernetes' ability to run independent, configurable steps, extended with machine-learning-specific frameworks and libraries. It can run just as easily on a workstation, an on-premises training cluster, or in the cloud.

It brings together the necessary open source tools and frameworks: Jupyter Notebooks; training frameworks such as PyTorch, TensorFlow, Chainer, MPI, and MXNet; Katib for hyperparameter tuning; an identity and access management system; and various serving tools such as KFServing, Seldon Core, BentoML, NVIDIA Triton, and TensorFlow Serving.

Originally published at https://newshubtunisia.com.
