diff --git a/knowledge base/kubernetes/README.md b/knowledge base/kubernetes/README.md
index 615167d..7ca9835 100644
--- a/knowledge base/kubernetes/README.md
+++ b/knowledge base/kubernetes/README.md
@@ -28,6 +28,7 @@ Hosted by the [Cloud Native Computing Foundation][cncf].
1. [Managed Kubernetes Services](#managed-kubernetes-services)
1. [Best practices in cloud environments](#best-practices-in-cloud-environments)
1. [Edge computing](#edge-computing)
+1. [Best practices](#best-practices)
1. [Troubleshooting](#troubleshooting)
1. [Dedicate Nodes to specific workloads](#dedicate-nodes-to-specific-workloads)
1. [Recreate Pods upon ConfigMap's or Secret's content change](#recreate-pods-upon-configmaps-or-secrets-content-change)
@@ -329,6 +330,45 @@ Each node pool should:
If planning to run Kubernetes on a Raspberry Pi, see [k3s] and the [Build your very own self-hosting platform with Raspberry Pi and Kubernetes] series of articles.
+## Best practices
+
+Also see [configuration best practices] and the [production best practices checklist].
+
+- Prefer an **up-to-date** version of Kubernetes.
+ The upstream project maintains release branches for the most recent three minor releases.
+ Kubernetes 1.19 and newer receive approximately 1 year of patch support. Kubernetes 1.18 and older received approximately 9 months of patch support.
+- Prefer **stable** versions of Kubernetes and **multiple nodes** for production clusters.
+- Prefer **consistent** versions of Kubernetes components throughout **all** nodes.
+ Components support [version skew][version skew policy] up to a point, with specific tools placing additional restrictions.
+- Consider keeping **separation of ownership and control**, and/or grouping related resources.
+ Use [Namespaces] for this.
+- Consider **organizing** cluster and workload resources.
+ Use [Labels][labels and selectors], starting from the [recommended Labels].
+- Avoid sending traffic to pods which are not ready to handle it.
+ [Readiness probes][Configure Liveness, Readiness and Startup Probes] keep services from forwarding requests to a pod until the probe reports its containers as ready. [Liveness probes][configure liveness, readiness and startup probes] periodically check a container's health; if the check fails, the kubelet restarts the container according to the pod's restart policy.
+- Avoid workloads and nodes failing due to limited resources being available.
+ Set [resource requests and limits][resource management for pods and containers] to reserve a minimum amount of resources for pods and cap how much they can consume.
+- Prefer smaller container images.
+- Prioritize critical workloads.
+ Use [Quality of Service][configure quality of service for pods] classes.
+- Instrument applications to detect and respond to the SIGTERM signal, so that they can shut down gracefully.
+- Avoid using bare pods.
+ Prefer defining them as part of a workload resource, like Deployments, StatefulSets, ReplicaSets or DaemonSets.
+- Restrict traffic between objects in the cluster.
+ Use Network Policies.
+- Reduce container privileges.
+- Leverage autoscalers.
+- Protect workloads from voluntary disruptions with Pod Disruption Budgets.
+- Try to use all available nodes.
+ Use affinities, taints and tolerations.
+- Push for automation.
+ Consider GitOps.
+- Apply the principle of least privilege.
+ Use role-based access control (RBAC).
+- Regularly audit events and logs, including those of control plane components.
+- Protect the cluster's ingress points.
+ Use firewalls, web application firewalls and application gateways.
+
## Troubleshooting
### Dedicate Nodes to specific workloads
@@ -477,6 +517,7 @@ Tools:
- [`helm`][helm]
- [`helmfile`][helmfile]
- [`kubeval`][kubeval]
+- `kube-score`
- [`kubectx`+`kubens`][kubectx+kubens] (alternative to [`kubie`][kubie])
- [`kube-ps1`][kube-ps1]
- [`kubie`][kubie] (alternative to [`kubectx`+`kubens`][kubectx+kubens] and [`kube-ps1`][kube-ps1])
@@ -497,24 +538,37 @@ All the references in the [further readings] section, plus the following:
- [Read-only filesystem error]
- [preStop hook doesn't work with env variables]
- [Configure Quality of Service for Pods]
+- [Version skew policy]
+- [Labels and Selectors]
+- [Recommended Labels]
+- [Configure Liveness, Readiness and Startup Probes]
+- [Configuration best practices]
+- [Cloudzero Kubernetes best practices]
[addons]: https://kubernetes.io/docs/concepts/cluster-administration/addons/
[api deprecation policy]: https://kubernetes.io/docs/reference/using-api/deprecation-policy/
[common labels]: https://kubernetes.io/docs/concepts/overview/working-with-objects/common-labels/
[concepts]: https://kubernetes.io/docs/concepts/
+[configuration best practices]: https://kubernetes.io/docs/concepts/configuration/overview/
[configure a pod to use a configmap]: https://kubernetes.io/docs/tasks/configure-pod-container/configure-pod-configmap/
[configure a security context for a pod or a container]: https://kubernetes.io/docs/tasks/configure-pod-container/security-context/
+[configure liveness, readiness and startup probes]: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/
[configure quality of service for pods]: https://kubernetes.io/docs/tasks/configure-pod-container/quality-service-pod/
[container hooks]: https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/#container-hooks
[distribute credentials securely using secrets]: https://kubernetes.io/docs/tasks/inject-data-application/distribute-credentials-secure/
[documentation]: https://kubernetes.io/docs/home/
+[labels and selectors]: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/
[namespaces]: https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces/
[no new privileges design proposal]: https://github.com/kubernetes/design-proposals-archive/blob/main/auth/no-new-privs.md
+[production best practices checklist]: https://learnk8s.io/production-best-practices
+[recommended labels]: https://kubernetes.io/docs/concepts/overview/working-with-objects/common-labels/
+[resource management for pods and containers]: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
[security context design proposal]: https://github.com/kubernetes/design-proposals-archive/blob/main/auth/security_context.md
[security design proposal]: https://github.com/kubernetes/design-proposals-archive/blob/main/auth/security.md
[set capabilities for a container]: https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-capabilities-for-a-container
[using sysctls in a kubernetes cluster]: https://kubernetes.io/docs/tasks/administer-cluster/sysctl-cluster/
+[version skew policy]: https://kubernetes.io/releases/version-skew-policy/
[further readings]: #further-readings
@@ -534,6 +588,7 @@ All the references in the [further readings] section, plus the following:
[best practices for pod security in azure kubernetes service (aks)]: https://learn.microsoft.com/en-us/azure/aks/developer-best-practices-pod-security
[build your very own self-hosting platform with raspberry pi and kubernetes]: https://kauri.io/build-your-very-own-self-hosting-platform-with-raspberry-pi-and-kubernetes/5e1c3fdc1add0d0001dff534/c
+[cloudzero kubernetes best practices]: https://www.cloudzero.com/blog/kubernetes-best-practices
[cncf]: https://www.cncf.io/
[container capabilities in kubernetes]: https://unofficial-kubernetes.readthedocs.io/en/latest/concepts/policy/container-capabilities/
[elasticsearch]: https://github.com/elastic/helm-charts/issues/689