Kubernetes - introduction

1 Working progress

Try to explain clearly what the Kubernetes elements translate to and repeat it often to clarify this vocabulary as quickly as possible. Explain everything: so start from a simple (and stupid) solution and add to it. Do not drown: keep things structured so things are not confusing.

2 Our showcase application: [FAIRDOM Seek]

Seek is an extensive Data Management platform mostly aimed at Life Sciences but that can be used in any domain. It is a good example for going from bar metal to full cloud setup as (1) it is already available as container and bare-metal installation, with a complete Docker compose which is still easy to understand,(2) it generally relies on several containers (4 with the default docker compose), several volumes, workers (i.e. side process that run regularly for doing batch works), a comprehensive search using Solr, and object storage and (3) it can be run with the unique Seek container (using a sqlite database within the container) so can go from really simple (for discovery or tutorials) to rather large (with load balancing, external DB, backups) which will allow us to go from a simple and stupid (in a Kubernetes context) case to full setups with most of the cases suitable for production.

2.1 Seek and his components

2.2 The default Docker Compose

2.3 What could benefit from a cloud setup

3 A quick overview of Kubernetes

3.1 Good free instructory resources

Kubernetes Crash Course for Absolute Beginners

3.2 Terminology & main elements

A first glance on Kubernetes can be very confusing. A big part of this confusion is due to a lot of “moving” parts with confusing names.

A full list of terms can be found on the official documentation.

The maim terms/elements you should know are:

  • Cluster:
  • Node: one worker machine in Kubernetes, like a server. Can be a physical machine or a virtual machine.
  • Pod: the smallest object in Kubernetes, that will run one or several containers. The containers can be run by docker or other container engines.
  • Service: a fixed network endpoint to a container in Kubernetes. It allows to connect to an application in a Pod, without having this connection defined into the pod. Kubernetes often has one extra layer of access (compared to Docker Compose for instance) allowing a full decoupling of all elements. That makes it easier to change the configuration later on (automatically due to the need or from an user update).
  • Label: labels are used for Kubernetes to find the different elements.
  • Kubectl: the command line tool to communicate with the Kubernetes control center. With it you can create the object, update them, list them…
  • Manifest: the JSON or YAML file that define a Kubernetes object. Objects are all elements of Kubernetes: Deployments, Services, ReplicaSet, DaemonSet, …

And generally, the applications are deployed as:

  • ReplicatSet: ask for a set numbers of Pods. Kubernetes will try to have the set number of pods running, and will restore this which might become unhealthy. The pods can be on any node and it is not guaranteed that Kubernetes will be able to get the right number of pods (if there isn’t enough resources typically). It is possible to influence on which node the pods will run using affinities, see Section 3.12

  • Deployment:

  • DaemonSet:

  • StatefulSet:

  • Job:

  • ConfigMap:

  • Secret:

3.3 Everything decoupled

How Kubernetes makes everything through API and labels Why there is so many extra layers compared to docker compose -> full decoupling.

3.4 The various objects for your application

Deployment, Job, ReplicaSet, DaemonSet, StatefulSet

While the Pod is the smallest element in Kubernetes, it is directly started only for testing. If you define a pod or create it using kubectl, it will only run once and will not be restarted if something fails. As such it is like docker run, if started using kubectl, or docker compose, if started using a manifest, and offer no benefice, just the added complexity of the setup.

A ReplicaSet ask for a set number of pods and Kubernetes will ensure that there are running. If one pod is not healthy anymore, Kubernetes will remove it and create a new one.

3.5 Updating an application

Applications are still running in container, so updating consists in updating the image. If deployment are used, the update is managed by kubernetes: it will follow an update strategy to run the new containers while stopping and removing the old ones.

3.6 Where do you store the data

Volumes, …

3.7 Configuration values

ConfigMap and Secrets

Regular Values can be stored in a ConfigMap, while values that should be hidden are store in Secrets. The main idea of Secrets is to keep your confidential values out of the regular configuration, in a secure place. Secret values are encoded in Base64 for preventing issues with special characters but it does not offer any added security (it can be decoded with any Base64 decoding tool).

3.8 Namespace

3.9 Network access

3.10 LoadBalancing

3.11 Several Deployments

kustomize

helm

3.12 Affinities