Summary of Kubernetes features and terminology


Introduction

Kubernetes has won the container orchestration war. Here is a summary of its features. It consists of an API, a command-line tool (kubectl) and a web UI. It uses etcd to keep its state. Everything is defined in YAML or JSON (your choice).

General

  • Node: a machine or instance (used to be called "minion")
  • Namespace: a grouping of resources
  • Label: a tag applied to a resource ex. role=frontend
  • Annotation: another form of metadata attached to a resource; unlike labels, annotations are not used to select resources
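
As a rough illustration of how labels are used for selection, here is a minimal Python sketch of an equality-based selector. The pod names and labels are made up for the example, and the function only mimics the matching idea; it is not how Kubernetes implements it.

```python
def matches(labels, selector):
    """Equality-based selection: every key=value in the selector must appear in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())

# Hypothetical pods and their labels (illustrative only).
pods = {
    "web-1": {"role": "frontend", "env": "prod"},
    "web-2": {"role": "frontend", "env": "staging"},
    "db-1": {"role": "database", "env": "prod"},
}

selected = sorted(name for name, labels in pods.items()
                  if matches(labels, {"role": "frontend"}))
print(selected)
```

An empty selector matches everything in this sketch, and adding more key=value pairs narrows the selection.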

Workloads

  • Container (spec): the building block of a deployable service or runnable task, based on Linux containers, ex. a Docker image plus the parameters to pass at runtime. Typically created as part of a Pod (see below)
  • Pod (spec): one or more containers scheduled onto a node together, which can therefore share volumes. Most pods have a single container, but there are use cases for more (ex. nginx and php-fpm). If you replicate such a pod to 3 replicas you get 3 nginx and 3 php-fpm containers. If php-fpm creates a file in the shared volume, nginx can see it. A pod is created either on its own or by a controller such as a ReplicationController, Deployment, DaemonSet, StatefulSet (formerly PetSet), Job ...
  • Init container: a container that must run to successful completion before the pod's workload containers start, ex. to generate assets, download something, etc.

  • ReplicationController (spec): RC for short. The way to say "I want x copies of this pod". This is the old way; see Deployment.
  • ReplicaSet (spec): like RC but supports set-based label selectors (ex. control the number of replicas of all frontend pods that carry a specific label). Usually managed through a Deployment.
  • Deployment (spec): manages the current state and desired state of a ReplicaSet and its pods in a declarative way. ex. instead of removing a replicated pod and recreating it with the new number of replicas, I can say "change this from 3 replicas to 5" and ask how many are currently running (4, ok!) while it converges.
  • CronJob (spec): just like UNIX crontab. Runs something periodically
    • kubectl create cronjob hello --schedule="*/1 * * * *" --restart=OnFailure --image=busybox -- /bin/sh -c "date; echo Hello from the Kubernetes cluster" (older kubectl releases used kubectl run hello --schedule=... for this)
  • Job (spec): run something once and track its success
  • DaemonSet (spec): like a UNIX daemon; runs a single copy of a pod on all or some nodes (selected using nodeSelector or affinity). Typical use cases are daemons that monitor or manage the cluster itself, ex. logstash, ceph, gluster, nagios, etc.
  • StatefulSet (spec): formerly PetSet. Like a Deployment but more suitable for stateful workloads such as databases. Each pod has an ordinal index, and its identity sticks to the pod regardless of which node it is (re)scheduled on.
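
To make the declarative idea behind a Deployment concrete, here is a minimal Python sketch that builds a Deployment manifest as a plain dict and prints it as JSON. The name "frontend", the nginx image and the role=frontend label are illustrative assumptions; the field layout follows the apps/v1 Deployment schema as I understand it.

```python
import json

def deployment(name, image, replicas, labels):
    """Build a minimal apps/v1 Deployment manifest as a plain dict."""
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            # The selector must match the labels in the pod template.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }

manifest = deployment("frontend", "nginx", replicas=3, labels={"role": "frontend"})

# Scaling is just a change to the desired state: 3 replicas become 5.
manifest["spec"]["replicas"] = 5

print(json.dumps(manifest, indent=2))
```

The printed JSON can be piped into kubectl apply -f - and Kubernetes then works out how to get from the current number of pods to the desired one.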

Networking, Discovery and Load Balancers

  • NetworkPolicy (spec): something like a firewall; limits who can connect, not just by IP but also, for example, by namespace or pod labels
  • Service (spec): a way to access, discover and load-balance a set of pods (layer 4)
    • Env variables: {SVCNAME}_SERVICE_HOST and {SVCNAME}_SERVICE_PORT
    • Service types:
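      • Type=NodePort: connecting to any node on that port reaches the service, even from outside the cluster
      • Type=ClusterIP: a virtual IP only accessible from inside the cluster
      • Type=LoadBalancer: an external load balancer provisioned by the cloud provider, like AWS ELB
      • Type=ExternalName: maps the service to an external DNS name (a CNAME record)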
    • externalIPs
    • Headless Service: Type=ClusterIP and clusterIP=None
    • sessionAffinity: None (connections are distributed with no stickiness) or ClientIP (sticky based on the client IP)
  • Ingress (spec): like a service but at layer 7, for HTTP or HTTPS
  • Endpoints (spec): the IPs behind a service
  • DNS:
    • service A record: my-svc.my-namespace.svc.cluster.local, which resolves either to the virtual IP or, for a headless service, to the list of all pod IPs
    • service SRV record: _my-port-name._my-port-protocol.my-svc.my-namespace.svc.cluster.local
    • pod A record: pod-ip-address.my-namespace.pod.cluster.local (the pod IP with dots replaced by dashes)
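
The naming schemes above are easy to get wrong by hand, so here is a small Python sketch that derives them. The helper names are mine, and cluster.local is assumed as the cluster domain (it is configurable per cluster).

```python
def service_env_vars(service):
    """Env variable names injected for a service: upper-cased, dashes become underscores."""
    prefix = service.upper().replace("-", "_")
    return prefix + "_SERVICE_HOST", prefix + "_SERVICE_PORT"

def service_a_record(service, namespace, domain="cluster.local"):
    return f"{service}.{namespace}.svc.{domain}"

def service_srv_record(port_name, protocol, service, namespace, domain="cluster.local"):
    return f"_{port_name}._{protocol}.{service_a_record(service, namespace, domain)}"

def pod_a_record(pod_ip, namespace, domain="cluster.local"):
    # Dots in the pod IP are replaced by dashes.
    return f"{pod_ip.replace('.', '-')}.{namespace}.pod.{domain}"

print(service_env_vars("my-svc"))
print(service_a_record("my-svc", "my-namespace"))
print(service_srv_record("http", "tcp", "my-svc", "my-namespace"))
print(pod_a_record("10.1.2.3", "my-namespace"))
```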

Monitoring

  • Probe (spec): health checks using one of
    • exec (spec): execute a command inside the container to check health (ex. "killall -0 myname")
    • httpGet (spec): send an HTTP GET request
    • tcpSocket (spec): try to open a TCP connection to a port
  • examples of where these handlers are used:
    • livenessProbe: the container is running and not stuck (it is restarted if the probe fails)
    • readinessProbe: the pod is ready to accept connections (it receives no service traffic until then)
    • postStart and preStop (spec): lifecycle hooks; similar in purpose to a pod's init container, but defined as part of the container itself
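  • RAM and CPU resources (concept and spec):
    • request: used at scheduling time (the pod is sent to a node with at least this much available capacity)
    • limit: enforced at runtime; CPU is throttled if exceeded, while a container exceeding its memory limit may be killed
  • Autoscaling (spec and command): increase or decrease the number of pods based on some metric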
  • Security contexts:
  • logging:
  • Heapster, InfluxDB, etc.
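
Here is a minimal Python sketch of how probes, a lifecycle hook and resource requests/limits sit together on one container. The /healthz path, the timings and the resource numbers are illustrative assumptions, not recommendations.

```python
import json

container = {
    "name": "web",
    "image": "nginx",
    # Restart the container if this HTTP check keeps failing (path is hypothetical).
    "livenessProbe": {
        "httpGet": {"path": "/healthz", "port": 80},
        "initialDelaySeconds": 5,
        "periodSeconds": 10,
    },
    # Only route service traffic to the pod once the TCP port accepts connections.
    "readinessProbe": {"tcpSocket": {"port": 80}, "periodSeconds": 5},
    # Hook that runs inside the container before it is stopped.
    "lifecycle": {"preStop": {"exec": {"command": ["nginx", "-s", "quit"]}}},
    "resources": {
        "requests": {"cpu": "250m", "memory": "64Mi"},  # used by the scheduler
        "limits": {"cpu": "500m", "memory": "128Mi"},   # enforced at runtime
    },
}

pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "web"},
    "spec": {"containers": [container]},
}

print(json.dumps(pod, indent=2))
```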

Storage

  • Volumes (spec): one of the following volume types
    • emptyDir: attach an initially empty directory from the host to the pod; it can use different mediums such as RAM or disk
    • hostPath: attach a path from the host node to the pod (does not support many features)
    • cloud-provider specific like GCE or EBS or Azure
    • network volumes and filesystems: ceph rbd, cephfs, glusterfs, iscsi, nfs, fc (fibre channel), flocker
    • gitRepo: clone a git repo
    • secret: created using kubernetes api
    • persistentVolumeClaim
    • downwardAPI
    • projected
  • PersistentVolume PV (spec) and PersistentVolumeClaim PVC (spec): a PV is a piece of storage made available to the cluster; a PVC is a request for such storage that a pod can then mount
  • Access modes:
    • RWO - ReadWriteOnce (can be mounted read-write by a single node)
    • ROX - ReadOnlyMany (can be mounted by many nodes but only for reading)
    • RWX - ReadWriteMany (can be mounted read-write by many nodes, as in nfs)
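
As a sketch of how a claim and a pod fit together, the Python below builds a PersistentVolumeClaim and a pod that mounts it, wrapped in a List so both can be applied at once. The names, the 1Gi size, the postgres image and its data path are illustrative assumptions.

```python
import json

claim = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {"name": "data-claim"},
    "spec": {
        "accessModes": ["ReadWriteOnce"],
        "resources": {"requests": {"storage": "1Gi"}},
    },
}

pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "db"},
    "spec": {
        "containers": [{
            "name": "db",
            "image": "postgres",
            # The mount refers to the volume by name.
            "volumeMounts": [{"name": "data", "mountPath": "/var/lib/postgresql/data"}],
        }],
        # The volume refers to the claim by name.
        "volumes": [{"name": "data",
                     "persistentVolumeClaim": {"claimName": claim["metadata"]["name"]}}],
    },
}

bundle = {"apiVersion": "v1", "kind": "List", "items": [claim, pod]}
print(json.dumps(bundle, indent=2))
```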

Operations

  • kubectl get nodes|pods|services|...
    • -o to customize output ex. -o json or -o yaml
  • kubectl describe pod my-pod
  • kubectl create
  • kubectl exec
  • kubectl edit: open interactive editor (like vim or nano)
  • kubectl patch: apply non-interactive changes to a resource
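  • kubectl rolling-update: gradually replace the pods of a ReplicationController (Deployments use kubectl rollout instead)
  • kubectl drain: evict all pods from a node before maintenance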
  • dashboard
  • namespaces, contexts, kubeconfig, user, ...
  • ansible
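
Since -o json gives machine-readable output, it is easy to script around kubectl. The Python sketch below summarizes the result of kubectl get pods -o json; the helper names are mine and the sample data is hand-written, standing in for real cluster output.

```python
import json
import subprocess

def kubectl_json(*args):
    """Run kubectl with -o json and parse the result (needs kubectl and a reachable cluster)."""
    result = subprocess.run(["kubectl", *args, "-o", "json"],
                            capture_output=True, text=True, check=True)
    return json.loads(result.stdout)

def pod_phases(pod_list):
    """Map pod name -> phase from the List object that 'kubectl get pods -o json' returns."""
    return {item["metadata"]["name"]: item["status"]["phase"]
            for item in pod_list["items"]}

# Hand-written sample, trimmed to the fields used here.
sample = {"kind": "List", "items": [
    {"metadata": {"name": "web-1"}, "status": {"phase": "Running"}},
    {"metadata": {"name": "web-2"}, "status": {"phase": "Pending"}},
]}

print(pod_phases(sample))
```

Against a real cluster the call would be pod_phases(kubectl_json("get", "pods")).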

FAQ

How to deploy a database?

You need to ask yourself a handful of questions like
  • can you assign databases to nodes statically, or are they created dynamically? ex. if you host WordPress blogs, then when someone orders a blog you need to create a database dynamically.
  • do you want to map a host directory to the data directory, or should the data come from some highly available "claim" (ex. a specific rbd or ebs volume) that can be migrated to another host?
  • How many database daemons do you want per node, and do you want to assign and manage ports yourself? (a port on a node can't be shared by several services)
  • Do you have a small number of nodes dedicated to databases? Can you manage assigning special labels like db=mydb1master to your nodes, or would that be too much effort?
If you have a small number of database nodes and a small number of database daemons, hostPath is good enough for you, and eviction (drain) is not something you plan to do (at least not through the Kubernetes APIs), then use a DaemonSet.
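
A minimal Python sketch of the DaemonSet option, assuming a node labeled db=mydb1master, a postgres image, port 5432 and a host directory /srv/mydb1 (all illustrative choices):

```python
import json

labels = {"app": "mydb1"}

daemonset = {
    "apiVersion": "apps/v1",
    "kind": "DaemonSet",
    "metadata": {"name": "mydb1"},
    "spec": {
        "selector": {"matchLabels": labels},
        "template": {
            "metadata": {"labels": labels},
            "spec": {
                # Only nodes carrying this label run the database.
                "nodeSelector": {"db": "mydb1master"},
                "containers": [{
                    "name": "db",
                    "image": "postgres",
                    # Statically assigned port on the node.
                    "ports": [{"containerPort": 5432, "hostPort": 5432}],
                    "volumeMounts": [{"name": "data",
                                      "mountPath": "/var/lib/postgresql/data"}],
                }],
                # Data lives in a directory on the node itself.
                "volumes": [{"name": "data", "hostPath": {"path": "/srv/mydb1"}}],
            },
        },
    },
}

print(json.dumps(daemonset, indent=2))
```

The node label itself would be set beforehand with kubectl label nodes, which is the manual bookkeeping this option implies.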

If you want to deploy many databases on demand (so that you can't statically or manually assign ports, directories, etc.) then use StatefulSets (formerly PetSets).
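
For the on-demand case, here is a minimal Python sketch of a StatefulSet whose pods each get their own claim. The name "blogdb", the postgres image, its data path and the 1Gi size are illustrative assumptions, and a headless Service with the same name is assumed to exist.

```python
import json

def statefulset(name, image, replicas, storage):
    """Build a minimal apps/v1 StatefulSet with one volume claim per pod."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            "serviceName": name,  # assumed headless Service with this name
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{
                    "name": "db",
                    "image": image,
                    "volumeMounts": [{"name": "data",
                                      "mountPath": "/var/lib/postgresql/data"}],
                }]},
            },
            # Each pod gets its own claim created from this template.
            "volumeClaimTemplates": [{
                "metadata": {"name": "data"},
                "spec": {"accessModes": ["ReadWriteOnce"],
                         "resources": {"requests": {"storage": storage}}},
            }],
        },
    }

def pod_names(manifest):
    """Pods are named <statefulset-name>-<ordinal>, starting at 0."""
    name = manifest["metadata"]["name"]
    return [f"{name}-{i}" for i in range(manifest["spec"]["replicas"])]

sts = statefulset("blogdb", "postgres", replicas=2, storage="1Gi")
print(pod_names(sts))
print(json.dumps(sts, indent=2))
```

No ports or directories are assigned by hand here: the stable pod names and the per-pod claims carry the identity and the data.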

Where to learn kubectl commands?


