- Discuss advanced Kubernetes concepts: DaemonSets, StatefulSets, Helm, etc.
With Annotations, we can attach arbitrary non-identifying metadata to an object in a key-value format. Unlike Labels, annotations are not used to identify and select objects. They are displayed when describing an object and are used to:
- Store build/release IDs, PR numbers, git branch, etc.
- Record phone/pager numbers of the people responsible, or directory entries specifying where such information can be found.
- Point to logging, monitoring, analytics, audit repositories, debugging tools, etc.
- Hold Ingress controller information.
- Track Deployment state and revision information.
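As a minimal sketch of how this looks in practice, the Python snippet below builds a Pod manifest as a plain dictionary and prints it as JSON, a format kubectl accepts alongside YAML. The Pod name `webserver`, the image tag, the helper function and the `example.com/...` annotation keys are all assumptions made up for illustration.

```python
import json


def annotated_pod(name, image, annotations):
    """Return a Pod manifest (as a dict) carrying the given annotations."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": name,
            # Labels identify and select objects ...
            "labels": {"app": name},
            # ... while annotations only carry descriptive metadata.
            "annotations": dict(annotations),
        },
        "spec": {"containers": [{"name": name, "image": image}]},
    }


# Hypothetical values; annotation values must always be strings.
pod = annotated_pod(
    "webserver",
    "nginx:1.25",
    {
        "example.com/git-branch": "main",
        "example.com/build-id": "1234",
    },
)
print(json.dumps(pod, indent=2))
```

Saved to a file, the output could be applied with `kubectl apply -f`, and the annotations would then show up in `kubectl describe pod webserver`.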
Quota and Limits Management
To ensure fair usage where many users share a given Kubernetes cluster, administrators can use the ResourceQuota API resource, which provides constraints that limit aggregate resource consumption per Namespace.
Compute Resource Quota enables limiting the total sum of compute resources (CPU, memory, etc.) that can be requested in a given namespace.
Storage Resource Quota lets us limit the total sum of storage resources that can be requested.
Object Count Quota allows us to restrict the number of objects of a given type (Pods, ConfigMaps, PersistentVolumeClaims, ReplicationControllers, Services, Secrets, etc.).
An additional resource that helps limit resource allocation to Pods and Containers in a Namespace is the LimitRange, which is used in conjunction with the ResourceQuota API resource.
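The three quota types and a LimitRange can be sketched together. The snippet below is only an illustration: the Namespace `team-a`, the object names and every number are assumptions, not recommendations.

```python
import json

NAMESPACE = "team-a"  # hypothetical Namespace

resource_quota = {
    "apiVersion": "v1",
    "kind": "ResourceQuota",
    "metadata": {"name": "team-a-quota", "namespace": NAMESPACE},
    "spec": {
        "hard": {
            # Compute Resource Quota
            "requests.cpu": "4",
            "requests.memory": "8Gi",
            # Storage Resource Quota
            "requests.storage": "100Gi",
            # Object Count Quota
            "pods": "20",
            "configmaps": "10",
        }
    },
}

limit_range = {
    "apiVersion": "v1",
    "kind": "LimitRange",
    "metadata": {"name": "team-a-defaults", "namespace": NAMESPACE},
    "spec": {
        "limits": [
            {
                "type": "Container",
                # Applied to Containers that declare no requests/limits.
                "defaultRequest": {"cpu": "100m", "memory": "128Mi"},
                "default": {"cpu": "500m", "memory": "256Mi"},
            }
        ]
    },
}

# A v1 List lets us hand both objects to kubectl in one document.
bundle = {"apiVersion": "v1", "kind": "List", "items": [resource_quota, limit_range]}
print(json.dumps(bundle, indent=2))
```

The LimitRange matters here because once a compute quota exists in a Namespace, Pods without requests for the constrained resources are rejected unless defaults like these are filled in.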
While it is fairly easy to manually scale a few Kubernetes objects, this may not be a practical solution for a production-ready cluster where hundreds or thousands of objects are deployed. Autoscaling can be implemented in a Kubernetes cluster via controllers which periodically adjust the number of running objects based on single, multiple or custom metrics. The following autoscalers can be implemented individually or combined for a more robust autoscaling solution:
- Horizontal Pod Autoscaler (HPA)
HPA is an algorithm-based controller API resource which automatically adjusts the number of replicas in a ReplicaSet, Deployment or Replication Controller based on CPU utilization.
- Vertical Pod Autoscaler (VPA)
VPA automatically sets Container resource requirements (CPU and memory) in a Pod and dynamically adjusts them at runtime, based on historical utilization data, current resource availability and real-time events.
- Cluster Autoscaler
Cluster Autoscaler automatically re-sizes the Kubernetes cluster when there are insufficient resources available for new Pods expecting to be scheduled or when there are underutilized nodes in the cluster.
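To make the HPA idea more concrete: the Kubernetes documentation describes its core calculation as desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue), with no scaling when the ratio is close enough to 1. The sketch below implements just that formula. The function name, the replica bounds and the sample numbers are assumptions, and the real controller does considerably more (readiness handling, stabilization windows, multiple metrics).

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Simplified HPA calculation for a single metric."""
    ratio = current_metric / target_metric
    # Skip scaling while the ratio stays within the tolerance band.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Never leave the configured replica bounds.
    return max(min_replicas, min(max_replicas, desired))


# 4 replicas at 90% average CPU with a 50% target: scale out.
print(desired_replicas(4, 90, 50))
# 4 replicas at 25% average CPU with a 50% target: scale in.
print(desired_replicas(4, 25, 50))
```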
Jobs and CronJobs
A Job creates one or more Pods to perform a given task. The Job makes sure that the given task is completed successfully. Job configuration options include:
- parallelism - to set the number of pods allowed to run in parallel;
- completions - to set the number of expected completions;
- activeDeadlineSeconds - to set the maximum duration of the Job;
- backoffLimit - to set the number of retries before Job is marked as failed;
- ttlSecondsAfterFinished - to delay the clean up of the finished Jobs.
We can also perform Jobs at scheduled times/dates with CronJobs, where a new Job object is created about once per execution cycle. The CronJob configuration options include:
- startingDeadlineSeconds - to set the deadline to start a Job if scheduled time was missed;
- concurrencyPolicy - to allow or forbid concurrent Jobs or to replace old Jobs with new ones.
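The Job and CronJob options listed above fit together in one manifest, since a CronJob embeds a Job spec under `jobTemplate`. The following sketch builds such a manifest in Python and prints it as JSON. The name `nightly-report`, the schedule, the image and all numeric values are assumptions chosen only for illustration.

```python
import json


def nightly_report_cronjob():
    """Build a CronJob manifest exercising the options listed above."""
    job_spec = {
        "parallelism": 2,                 # Pods allowed to run in parallel
        "completions": 4,                 # expected successful completions
        "activeDeadlineSeconds": 600,     # upper bound on the Job's runtime
        "backoffLimit": 3,                # retries before marking it failed
        "ttlSecondsAfterFinished": 3600,  # delay before cleanup
        "template": {
            "spec": {
                "restartPolicy": "Never",
                "containers": [
                    {
                        "name": "report",
                        "image": "busybox:1.36",
                        "command": ["sh", "-c", "echo generating report"],
                    }
                ],
            }
        },
    }
    return {
        "apiVersion": "batch/v1",
        "kind": "CronJob",
        "metadata": {"name": "nightly-report"},
        "spec": {
            "schedule": "0 2 * * *",          # every day at 02:00
            "startingDeadlineSeconds": 300,   # give up if start is missed by 5 min
            "concurrencyPolicy": "Forbid",    # or "Allow" / "Replace"
            "jobTemplate": {"spec": job_spec},
        },
    }


cronjob = nightly_report_cronjob()
print(json.dumps(cronjob, indent=2))
```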
A DaemonSet is the object we use when we need a specific type of Pod running on all nodes at all times, for example to collect monitoring data from every node or to run a storage daemon on every node. It is a critical controller API resource for multi-node Kubernetes clusters. The kube-proxy, for example, is an agent running as a Pod on every single node in the cluster, managed by a DaemonSet.
Whenever a node is added to the cluster, a Pod from the DaemonSet is automatically created on it. Although this is an automated process, the DaemonSet's Pods are placed on nodes by the cluster's default Scheduler.
When a node dies or is removed from the cluster, the respective Pods are garbage collected. If a DaemonSet is deleted, all Pods it created are deleted as well.
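The "one Pod per node" behaviour can be sketched as a tiny reconciliation loop. This is a toy model under stated assumptions, not the real controller: the DaemonSet name `log-agent` and the node names are made up, real Pod names carry a generated suffix, and real placement also takes node selectors and taints into account.

```python
def reconcile(daemonset, nodes, running):
    """Return (to_create, to_delete) for one reconciliation pass.

    `nodes` is the set of node names currently in the cluster and
    `running` maps node name -> name of the DaemonSet Pod on that node.
    """
    # Every node without a Pod gets one ...
    to_create = {n: f"{daemonset}-{n}" for n in nodes if n not in running}
    # ... and Pods whose node is gone are garbage collected.
    to_delete = {n: p for n, p in running.items() if n not in nodes}
    return to_create, to_delete


running = {}
first_create, first_delete = reconcile("log-agent", {"node-a", "node-b"}, running)
running.update(first_create)

# node-c joins the cluster and node-a is removed.
second_create, second_delete = reconcile("log-agent", {"node-b", "node-c"}, running)
running.update(second_create)
for node in second_delete:
    del running[node]

print(sorted(running.items()))
```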
The StatefulSet controller is used for stateful applications which require a unique identity, such as a name, a network identity, or strict ordering. Examples include a MySQL cluster or an etcd cluster.
The StatefulSet controller provides identity and guaranteed ordering of deployment and scaling to Pods. Unlike Deployments, StatefulSets manage their Pods directly rather than through intermediary ReplicaSets. Like Deployments, they support rolling updates and rollbacks.
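The identity and ordering guarantees follow a simple naming scheme: each Pod is named `<statefulset-name>-<ordinal>`, and together with a headless Service it gets a stable DNS name of the form `<pod>.<service>.<namespace>.svc.<cluster-domain>`. The sketch below only reproduces that scheme. The names `mysql` and `mysql-headless` are assumptions, and `cluster.local` is the common default cluster domain, which a given cluster may override.

```python
def pod_names(statefulset, replicas):
    """Pods are created in this order: ordinal 0 first."""
    return [f"{statefulset}-{i}" for i in range(replicas)]


def pod_dns(pod, service, namespace="default", cluster_domain="cluster.local"):
    """Stable DNS name of a StatefulSet Pod behind a headless Service."""
    return f"{pod}.{service}.{namespace}.svc.{cluster_domain}"


def scale_down_order(statefulset, replicas):
    """Pods are terminated in reverse ordinal order."""
    return list(reversed(pod_names(statefulset, replicas)))


for name in pod_names("mysql", 3):
    print(name, "->", pod_dns(name, "mysql-headless"))
print("scale-down order:", scale_down_order("mysql", 3))
```

Because the names are derived from the ordinal rather than generated randomly, a replacement Pod for `mysql-1` comes back as `mysql-1`, which is what lets it find its previous storage and peers again.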
In Kubernetes, a resource is an API endpoint which stores a collection of API objects. For example, a Pod resource contains all the Pod objects.
Custom resources are dynamic in nature: they can appear and disappear in an already running cluster at any time. There are two ways to add custom resources:
- Custom Resource Definitions (CRDs)
This is the easiest way to add custom resources and it does not require any programming knowledge. However, building the custom controller would require some programming.
- API Aggregation
For more fine-grained control, we can write API Aggregators. They are subordinate API servers which sit behind the primary API server. The primary API server acts as a proxy for all incoming API requests - it handles the ones based on its capabilities and proxies over the other requests meant for the subordinate API servers.
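For the CRD route, no programming is needed beyond declaring the new type. As a rough sketch, the snippet below defines a hypothetical `Backup` resource in the made-up API group `example.com`, plus one object of that type. The group, kind, field names and values are all assumptions for illustration.

```python
import json

GROUP = "example.com"   # hypothetical API group
PLURAL = "backups"      # hypothetical plural resource name

crd = {
    "apiVersion": "apiextensions.k8s.io/v1",
    "kind": "CustomResourceDefinition",
    # The CRD name must be <plural>.<group>.
    "metadata": {"name": f"{PLURAL}.{GROUP}"},
    "spec": {
        "group": GROUP,
        "scope": "Namespaced",
        "names": {"plural": PLURAL, "singular": "backup", "kind": "Backup"},
        "versions": [
            {
                "name": "v1",
                "served": True,
                "storage": True,
                "schema": {
                    "openAPIV3Schema": {
                        "type": "object",
                        "properties": {
                            "spec": {
                                "type": "object",
                                "properties": {
                                    "schedule": {"type": "string"},
                                    "retentionDays": {"type": "integer"},
                                },
                            }
                        },
                    }
                },
            }
        ],
    },
}

# Once the CRD is registered, objects of the new kind can be created.
backup = {
    "apiVersion": f"{GROUP}/v1",
    "kind": "Backup",
    "metadata": {"name": "nightly"},
    "spec": {"schedule": "0 3 * * *", "retentionDays": 7},
}

print(json.dumps(crd, indent=2))
print(json.dumps(backup, indent=2))
```

On its own this only stores `Backup` objects in the API server. Acting on them is the job of the custom controller mentioned above.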
With Kubernetes Cluster Federation we can manage multiple Kubernetes clusters from a single control plane. We can sync resources across the federated clusters and have cross-cluster discovery. This allows us to perform Deployments across regions, access them using a global DNS record, and achieve High Availability.
Although still an Alpha feature, the Federation is very useful when we want to build a hybrid solution, in which we can have one cluster running inside our private datacenter and another one in the public cloud, allowing us to avoid provider lock-in. We can also assign weights for each cluster in the Federation, to distribute the load based on custom rules.
Security Contexts and Pod Security Policies
At times we need to define specific privileges and access control settings for Pods and Containers. Security Contexts allow us to set Discretionary Access Control for object access permissions, privileged running, capabilities, security labels, etc. However, their effect is limited to the individual Pods and Containers where such context configuration settings are incorporated in the spec section.
In order to apply security settings to multiple Pods and Containers cluster-wide, we can define Pod Security Policies. They allow more fine-grained security settings to control the usage of the host namespace, host networking and ports, file system groups, usage of volume types, enforce Container user and group ID, root privilege escalation, etc. Note that PodSecurityPolicy was deprecated in Kubernetes v1.21 and removed in v1.25 in favor of Pod Security Admission.
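A Security Context, as described above, lives in the spec of an individual Pod or Container. The sketch below shows both levels in one Pod manifest. The Pod name, image, command and the user and group IDs are assumptions picked for illustration only.

```python
import json

pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "secured-app"},
    "spec": {
        # Pod-level context: applies to all Containers in the Pod.
        "securityContext": {
            "runAsUser": 1000,
            "runAsGroup": 3000,
            "fsGroup": 2000,
        },
        "containers": [
            {
                "name": "app",
                "image": "busybox:1.36",
                "command": ["sh", "-c", "sleep 3600"],
                # Container-level context: settings for this Container only.
                "securityContext": {
                    "allowPrivilegeEscalation": False,
                    "readOnlyRootFilesystem": True,
                    "capabilities": {"drop": ["ALL"]},
                },
            }
        ],
    },
}

print(json.dumps(pod, indent=2))
```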
Kubernetes was designed to allow all Pods to communicate freely, without restrictions, with all other Pods in cluster Namespaces. In time it became clear that it was not an ideal design, and mechanisms needed to be put in place in order to restrict communication between certain Pods and applications in the cluster Namespace. Network Policies are sets of rules which define how Pods are allowed to talk to other Pods and resources inside and outside the cluster. Pods not covered by any Network Policy will continue to receive unrestricted traffic from any endpoint.
Network Policies are very similar to typical firewalls. They are designed mostly to protect assets located inside the firewall, but they can restrict outgoing traffic as well, based on sets of rules and policies.
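As an illustration of such a rule set, the following sketch builds a NetworkPolicy that only lets Pods labeled `app: frontend` reach Pods labeled `app: db` on TCP port 3306. The labels, the policy name and the port are assumptions for the example.

```python
import json

network_policy = {
    "apiVersion": "networking.k8s.io/v1",
    "kind": "NetworkPolicy",
    "metadata": {"name": "db-allow-frontend"},
    "spec": {
        # The Pods this policy protects.
        "podSelector": {"matchLabels": {"app": "db"}},
        "policyTypes": ["Ingress"],
        "ingress": [
            {
                # Only traffic from these Pods is allowed in ...
                "from": [{"podSelector": {"matchLabels": {"app": "frontend"}}}],
                # ... and only on this port.
                "ports": [{"protocol": "TCP", "port": 3306}],
            }
        ],
    },
}

print(json.dumps(network_policy, indent=2))
```

Once this policy selects the `app: db` Pods, all other incoming traffic to them is dropped, while Pods that no policy selects stay open as before.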
Monitoring and Logging
In Kubernetes, we have to collect resource usage data by Pods, Services, nodes, etc., to understand the overall resource consumption and to make decisions for scaling a given application. Two popular Kubernetes monitoring solutions are the Kubernetes Metrics Server and Prometheus.
- Metrics Server
Metrics Server is a cluster-wide aggregator of resource usage data - a relatively new feature in Kubernetes.
- Prometheus
Prometheus, now part of CNCF (Cloud Native Computing Foundation), can also be used to scrape the resource usage from different Kubernetes components and objects. Using its client libraries, we can also instrument the code of our application.
Another important aspect for troubleshooting and debugging is Logging, in which we collect the logs from different components of a given system. In Kubernetes, we can collect logs from different cluster components, objects, nodes, etc. Unfortunately, Kubernetes does not provide cluster-wide logging by default, so third-party tools are required to centralize and aggregate cluster logs. A popular method to collect logs is using Elasticsearch together with Fluentd, which runs with a custom configuration as an agent on the nodes. Fluentd is an open source data collector, which is also part of CNCF.
To deploy a complex application, we use a large number of Kubernetes manifests to define API resources such as Deployments, Services, PersistentVolumes, PersistentVolumeClaims, Ingress, or ServiceAccounts. It can become counterproductive to deploy them one by one. We can bundle all those manifests after templatizing them into a well-defined format, along with other metadata. Such a bundle is referred to as a Chart. These Charts can then be served via repositories, such as those that we have for rpm and deb packages.
Helm is a package manager (analogous to yum and apt for Linux) for Kubernetes, which can install/update/delete those Charts in the Kubernetes cluster.
Helm is a CLI client that may run side-by-side with kubectl on our workstation and that also uses kubeconfig to securely communicate with the Kubernetes API server.
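The "templatized manifests" idea behind a Chart can be illustrated with a small analogy. Helm itself renders Go templates with values from the Chart's `values.yaml`. The sketch below is not Helm, it only mimics the principle with Python's `string.Template`: default values, per-release overrides, and a rendered manifest as the result. The template, the value names and the release names are all assumptions.

```python
from string import Template

# Stand-in for a file in a Chart's templates/ directory.
SERVICE_TEMPLATE = Template(
    "apiVersion: v1\n"
    "kind: Service\n"
    "metadata:\n"
    "  name: ${release}-web\n"
    "spec:\n"
    "  type: ${service_type}\n"
    "  ports:\n"
    "    - port: ${port}\n"
)

# Stand-in for the Chart's default values.
DEFAULT_VALUES = {"service_type": "ClusterIP", "port": 80}


def render(release, overrides=None):
    """Merge defaults with overrides and fill in the template."""
    values = {**DEFAULT_VALUES, **(overrides or {}), "release": release}
    return SERVICE_TEMPLATE.substitute(values)


print(render("demo"))
print(render("prod", {"service_type": "LoadBalancer", "port": 8080}))
```

The same bundle thus yields different manifests per installation, which is what makes a Chart reusable across environments.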
Service Mesh is a third-party alternative to the Kubernetes-native application connectivity and exposure achieved with Services paired with Ingress Controllers. Service Mesh tools are gaining popularity, especially with larger organizations managing larger, dynamic Kubernetes clusters. These third-party solutions introduce features such as service discovery, multi-cloud routing, and traffic telemetry.
A Service Mesh is an implementation that relies on a proxy component that is part of the Data Plane, which is then managed through a Control Plane. The Control Plane runs agents responsible for service discovery, telemetry, load balancing, network policy, and gateway. The Data Plane proxy component is typically injected into Pods, and it is responsible for handling all Pod-to-Pod communication while maintaining constant communication with the Control Plane of the Service Mesh.
Wow... There are still many, many topics to learn about the CNCF and Kubernetes in general. Lots of work has been done in this area, and many different use cases are covered. There is still a big question mark in my head about who should take on this complexity and where to start if you want to move from a legacy setup into the cloud and Kubernetes world.
I don't think small or even medium businesses should use Kubernetes. The complexity of getting it running even a 'little bit right' seems overwhelming before you get any benefit out of it. It might just as well be a burden along the way.
On the other hand, the features this technology gives us show me why Google was so successful in this area with its initial Borg setup.
For big companies and in the globalized world, Kubernetes will enable wide ranges of interconnected applications which are easy to use for the end-user without any downtime and at low cost.
I hope to continue this series with the advanced topics and take a deeper dive into them. Kubernetes is thrown around everywhere at the moment, and I am curious how the actual usage will play out.
Will we have many small Raspberry Pi Kubernetes clusters running at home (with the help of amazing package distribution with Helm), or will it only be available to big companies who can pay for expensive Kubernetes architects and infrastructure?
I recommend doing this tutorial at edx.org. I would have liked to do the Knowledge Checks at the end of each chapter, but since I am not a fan of this certificate business, I did not want to spend money on it.
That is it for now, friends. Next will be a small chapter about the Community. For now we are done with the Introduction to Kubernetes. Thanks for reading.