Best practices and lessons learnt from running Kyverno in production Kubernetes

amitagarwal
Feb 12, 2022

If you have decided to implement Kyverno in your Kubernetes cluster, or are still evaluating it, you have come to the right place.

What is Kyverno?

In short, it is an admission controller that applies policies to Kubernetes resources. A policy can validate (check a resource against best practices without changing anything), mutate (change a resource as it is admitted), or generate (create new resources).

More Info here: https://kyverno.io/docs/introduction/
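
To make the difference between validating and mutating concrete, here is a minimal sketch of a single policy with one rule of each kind. The policy name, rule names and label keys are made up for illustration; only the overall ClusterPolicy structure is Kyverno's.

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: example-validate-and-mutate        # hypothetical name
spec:
  validationFailureAction: audit           # report violations, do not block requests
  rules:
    # Validating rule: checks the Pod, never modifies it.
    - name: require-app-name-label         # hypothetical rule name
      match:
        any:
          - resources:
              kinds:
                - Pod
      validate:
        message: "The label app.kubernetes.io/name is required."
        pattern:
          metadata:
            labels:
              app.kubernetes.io/name: "?*" # any non-empty value
    # Mutating rule: changes the Pod before it is stored.
    - name: add-default-environment-label  # hypothetical rule name
      match:
        any:
          - resources:
              kinds:
                - Pod
      mutate:
        patchStrategicMerge:
          metadata:
            labels:
              +(environment): dev          # added only if the label is not already set
```

With audit as the failure action, a Pod without the required label is still admitted and the violation is only reported, while the mutating rule quietly adds the default label.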

Why Kyverno?

In other words, what are its use cases and what problems does it solve? It covers many use cases, such as enforcing (or just checking for) best practices and generating resources based on specific criteria. The sample policies published by the Kyverno project give a good idea of what it can do.

Implementation details

Coming to the actual purpose of this article: Kyverno can be a very powerful tool to have in your Kubernetes cluster, but it can be quite risky as well. As you might be aware, it is an admission controller, so essentially all API requests can pass through Kyverno, and if it is not configured correctly, it can bring down the entire cluster.

Best Practices to start with:

  1. Set failurePolicy to Ignore. By default, every policy is created with failurePolicy set to Fail, meaning that if the Kyverno webhook server is down, all requests forwarded to it will fail. This setting is per policy.
  2. Set validationFailureAction to audit. This is the default, but if you actually want to enforce some best practices and reject requests that do not follow them, you may want to set it to enforce. Remember that enforcing can result in your API requests being rejected, so be careful with it.
  3. With Kyverno 1.6, one good thing is that webhook configurations are controlled by the policies themselves: the configurations are dynamic and are created by the Kyverno pods from the policy definitions. Whatever a policy matches is reflected in the webhook configurations, so keep each policy limited to the minimum set of resources and operations it needs.
  4. Run Kyverno in high availability mode. Set the replica count to at least 3.
  5. When running Kyverno in HA, it is good to keep at least 2 pods running at all times. Set the PodDisruptionBudget minAvailable to 2.
  6. It is good to spread Kyverno pods across nodes in different availability zones. In the pod anti-affinity setting, replace the topology key kubernetes.io/hostname with topology.kubernetes.io/zone and set the weight to 100. Because this is a scheduling preference with the maximum weight, the scheduler will strongly prefer not to place Kyverno pods in the same AZ, although it does not strictly forbid it.
  7. And, obviously, monitor Kyverno. Even with all the configuration above, it is still worth monitoring its performance and the number of rejected API requests.
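
As a rough sketch of items 1 to 3, this is what a narrowly scoped policy could look like. The policy and rule names are hypothetical, and the check itself (rejecting the latest image tag) is only an example; the parts that matter here are failurePolicy, validationFailureAction and the short list of matched kinds.

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: disallow-latest-tag              # hypothetical example policy
spec:
  failurePolicy: Ignore                  # item 1: admit requests if Kyverno cannot be reached
  validationFailureAction: audit         # item 2: report only; switch to enforce to reject requests
  rules:
    - name: require-image-tag-not-latest # hypothetical rule name
      match:
        any:
          - resources:
              kinds:
                - Pod                    # item 3: match only the kinds this policy needs
      validate:
        message: "Using the latest image tag is not allowed."
        pattern:
          spec:
            containers:
              - image: "!*:latest"
```

For items 4 to 6, assuming Kyverno is installed with its Helm chart, the overrides might look something like the following. The key names are my assumption of how the chart exposes these settings and they can differ between chart versions, so check them against the values.yaml of the version you install. The zone key is the well-known Kubernetes node label topology.kubernetes.io/zone.

```yaml
# Helm values overrides for Kyverno (key names are assumptions, verify per chart version)
replicaCount: 3                          # item 4: run at least 3 replicas
podDisruptionBudget:
  minAvailable: 2                        # item 5: keep at least 2 pods during voluntary disruptions
podAntiAffinity:                         # item 6: prefer spreading pods across zones
  preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100                        # highest weight Kubernetes allows for a preference
      podAffinityTerm:
        labelSelector:
          matchExpressions:
            - key: app.kubernetes.io/name
              operator: In
              values:
                - kyverno                # assumed pod label, check your deployment
        topologyKey: topology.kubernetes.io/zone
```

Note that a preferred anti-affinity rule is a soft constraint: if only one zone has capacity, the pods will still be scheduled there.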

That is all. Do let me know in the comments below if you follow any other best practices. Thanks to the awesome team at Nirmata for creating such a wonderful product.

amitagarwal

Site Reliability and Devops. Kubernetes and Cloud Native technologies.