One common question I see on Mastodon and Reddit is "I've inherited a cluster, how do I safely upgrade it?" It's surprising that this still isn't a better-understood process given the widespread adoption of k8s, but I've had to take over legacy clusters a few times and figured I would write up some of the tips and tricks I've found over the years to make the process easier.
A very common theme in these questions is "the version of Kubernetes is very old, what do I do?" Often this question is asked with shame, but don't feel bad. K8s handles the long-term maintenance story better than it did a few years ago, but it is still a massive amount of work to keep a cluster upgraded and patched. Organizations start to fall behind almost immediately, and teams are hesitant to touch a working cluster to run the upgrades.
NOTE: A lot of this doesn't apply if you are using hosted Kubernetes. In that case, the upgrade process is documented through the provider and is quite a bit less complicated.
How often do I need to upgrade Kubernetes?
This is something people new to Kubernetes seem to miss a lot, so I figured I would touch on it. Unlike a lot of legacy infrastructure projects, k8s moves very quickly in terms of versions. Upgrading can't be treated like switching to a new Linux distro LTS release; you need to plan to do it all the time.
To be fair to the Kubernetes team, they've done a lot to make this process less horrible. They have a support policy of N-2, meaning that the 3 most recent minor versions receive security and bug fixes. So you have time to get a cluster stood up and start the process of planning upgrades, but it needs to be in your initial cluster design document. You cannot wait until you are almost EOL to start thinking "how are we going to upgrade?" Every release gets patched for 14 months, which seems like a lot, but chances are you aren't going to be installing the absolute latest release.
So the answer to "how often do you need to be rolling out upgrades to Kubernetes" is: often. They are targeting 3 releases a year, down from the previous 4 releases a year. You can read the project's release goals here. However, in order to vet k8s releases for your org, you'll likely need to manage several different versions at the same time in different environments. I typically try to let a minor version "bake" for at least 2 weeks in a dev environment, and the same for staging/sandbox/whatever you call the next step. Prod version upgrades should ideally have a month of good data behind them suggesting the org won't run into problems.
My staggered layout
- Dev cluster should be as close to bleeding edge as possible. A lot of this has to do with establishing SLAs for the dev environment, but the internal communication should look something like "we upgrade dev often during such and such a time and rely on it to surface early problems". My experience is you'll often hit some sort of serious issue almost immediately when you try to do this, which is good. You have time to fix it and know the maximum version you can safely upgrade to as of the day of testing.
- Staging is typically a minor release behind dev. "Doesn't this mean you can get into a situation where you have incompatible YAMLs?" It can, but it is common practice at this point to use per-environment YAMLs. Typically folks are much more cost-aware in dev environments, so some of the resource requests/limits are going to change. If you are looking to implement per-environment configuration, check out Kustomize (a minimal sketch follows below).
- Production I try to keep as close to staging as possible. I want to keep my developers' lives as easy as possible, so I don't want to split the versions endlessly. My experience with Kubernetes patch releases has been that they're pretty conservative with changes and I rarely encounter problems. My release cadence for patches on the same minor version is two weeks in staging and then out to production.
- IMPORTANT. Don't upgrade the minor version until it hits patch .2 AT LEAST. What does this mean?
Right now the latest version of Kubernetes is 1.26.0. I don't consider this release ready for the dev environment until it hits 1.26.2. Then I start the timer on rolling from dev -> stage -> production. By the time I get the dev upgrade done and roll to staging, we're likely at the .3 release (depending on the time of year).
That's too slow. Maybe, but I've been burned quite a few times in the past by jumping too early. It's nearly impossible for the k8s team to account for every use-case and guard against every regression, and by the time we hit .2 there tends to be wide enough testing that most issues have been discovered. A lot of people wait until .5, which is very slow (but also the safest path).
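For the per-environment YAML setup mentioned in the staging bullet above, a minimal Kustomize layout is usually enough. The directory and file names here are just illustrative:

```bash
# One shared base plus a small overlay per environment, e.g.:
#   base/deployment.yaml
#   base/kustomization.yaml
#   overlays/dev/kustomization.yaml      # lower replica counts / resource requests
#   overlays/staging/kustomization.yaml
#   overlays/prod/kustomization.yaml

# Render an overlay locally to see what it produces, without applying it:
kubectl kustomize overlays/dev

# Apply the overlay for a given environment:
kubectl apply -k overlays/prod
```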
In practice this workflow looks like this:
- Put the dates when releases reach EOL in your calendar; they can be found here.
- Keep track of the upcoming releases and put them in the calendar as well. You can see that whole list in their repo here.
- You also need to do this with patch releases, which typically come out monthly.
- If you prefer to keep track of this in RSS, good news! If you add .atom to the end of the release URL, you can add it to a reader. Example: https://github.com/kubernetes/kubernetes/releases.atom. This makes it pretty easy to keep a list of all releases. You can also just subscribe in GitHub, but I find the RSS method to be a bit easier (plus super simple to script, which I'll publish later; a rough illustration follows this list).
- As new releases come out, roll latest to dev once it hits .2. I typically do this as a new cluster, leaving the old cluster there in case of serious problems. Then I'll cut over deployments to the new cluster and monitor for issues. In case of massive problems, switch back to the old cluster and start the documentation process for what went wrong.
- When I bump the dev environment, I then circle around and bump the stage environment to one minor release below that. I don't typically do a new cluster for stage (although you certainly can). There's a lot of debate in the k8s community over "should you upgrade existing vs make new". I do it for dev because I would rather upgrade often with fewer checks and have the option to fall back.
- Finally we bump prod. Here I rarely make a new cluster. This is a matter of personal choice and there are good arguments for starting fresh often, but I like to maintain the history in etcd and I find that with proper planning a rolling upgrade is safe.
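As a rough illustration of scripting against that Atom feed (this is a naive sketch, not the script mentioned above, and it assumes each release title sits on one line of the feed):

```bash
# List the most recent Kubernetes release tags from the GitHub Atom feed.
curl -s https://github.com/kubernetes/kubernetes/releases.atom \
  | grep '<title>' \
  | grep -o 'v1\.[0-9][0-9.]*[^<]*' \
  | head -n 5
```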
This feels like a giant pain in the ass.
I know. Thankfully cloud providers tend to maintain their own versions, which buy you a lot more time, and that's how most people are going to be using Kubernetes anyway. But I know a lot of people like to run their own clusters end to end, or just need to for various reasons. It is, however, a pain to do this all the time.
Is there an LTS version?
So there was a Kubernetes working group set up to discuss this, and their conclusion was that it didn't make sense to do. I don't agree with this assessment, but it has been discussed.
My dream for Kubernetes would be to add a 2-year LTS version and say "at the end of two years there isn't a path to upgrade". I make a new cluster with the LTS version, apply patches as they come out, and then at the end of two years know I need to make a new cluster with the next LTS version. Maybe the community comes up with some happy path to upgrade, but logistically it would be easier to plan a new cluster every 2 years vs. a somewhat constant pace of pushing out and testing upgrades.
How do I upgrade Kubernetes?
1. See if you can upgrade safely against API paths. I use Pluto. This will check whether you are calling deprecated or removed API paths in your configuration or Helm charts. Run Pluto against local files with `pluto detect-files -d <directory>`. You can also check Helm with `pluto detect-helm -owide`. Adding all of this to CI is also pretty trivial and something I recommend for people managing many clusters.
2. Check your Helm releases for upgrades. Since typically things like the CNI and other dependencies like CoreDNS are installed with Helm, this is often the fastest way to make sure you are running the latest version (check patch notes to ensure they support the version you are targeting). I use Nova for this.
3. Get a snapshot of etcd. You'll want to make sure you have a copy of the data in your production cluster in case you lose all of the control plane nodes. You should be doing this anyway.
4. Start the upgrade process. The steps to do this are outlined here.
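Putting those steps together, here is a rough sketch of what the pre-flight checks and a kubeadm control-plane upgrade look like. Versions, paths, and node names are placeholders, the package commands assume a Debian-style node and the old apt.kubernetes.io versioning, and the certificate paths assume a default kubeadm layout; adjust for your cluster:

```bash
# 1. Check local manifests and Helm releases for deprecated/removed API versions.
pluto detect-files -d ./manifests
pluto detect-helm -owide

# 2. Check installed Helm releases (CNI, CoreDNS, ingress, etc.) for newer charts.
nova find

# 3. Snapshot etcd before touching anything.
ETCDCTL_API=3 etcdctl snapshot save /backup/etcd-$(date +%F).db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key

# 4. On the first control plane node: upgrade kubeadm, review the plan, apply it.
apt-get update && apt-get install -y kubeadm=1.26.2-00
kubeadm upgrade plan
kubeadm upgrade apply v1.26.2

# 5. For each remaining node: upgrade kubeadm, run the node upgrade, drain it,
#    upgrade kubelet/kubectl, restart kubelet, and put it back in service.
kubeadm upgrade node
kubectl drain worker-1 --ignore-daemonsets --delete-emptydir-data
apt-get install -y kubelet=1.26.2-00 kubectl=1.26.2-00
systemctl daemon-reload && systemctl restart kubelet
kubectl uncordon worker-1
```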
If you are using managed Kubernetes
This process is much easier. Follow steps 1 and 2, set a pod disruption budget so that node upgrades can proceed without taking down your workloads, and then follow the upgrade steps of your managed provider.
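For the pod disruption budget, a minimal sketch might look like this (the app label and minAvailable value are placeholders; policy/v1 is available from Kubernetes 1.21 onward):

```bash
# Keep at least one replica of "my-app" running while nodes are drained.
cat <<'EOF' | kubectl apply -f -
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: my-app
EOF
```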
I messed up and waited too long, what do I do?
Don't feel bad, it happens ALL the time. Kubernetes is often set up by a team that is passionate about it, then that team is disbanded and maintenance becomes a secondary concern. Folks who inherit working clusters are (understandably) hesitant to break something that is working.
With k8s you need to go from minor -> minor in order, not jumping releases. So you basically need to (slowly) bump versions as you go. If you don't want to do that, your other option is to make a new cluster and migrate to it. I find that for solo operators or small teams the upgrade path is typically easier but more time consuming.
The big things you need to anticipate are as follows:
- Ingress. You need to really understand how traffic is coming into the cluster and through what systems.
- Service mesh. Are you using one, what does it do, and what version is it set at? Istio can be a BEAR to upgrade, so if you can switch to Linkerd you'll likely be much happier in the long term. However, understanding what controls access to which namespaces and pods is critical to a happy upgrade.
- CSI drivers. Do you have them, do they need to be upgraded, what are they doing?
- CNI. Which one are you using, is it still supported, what is involved in upgrading it.
- Certificates. By default they expire after a year. You get fresh ones with every upgrade, but you can also trigger a manual refresh whenever you like with `kubeadm certs renew all`. If you are running an old cluster, PLEASE check the expiration dates of your client certificates with `kubeadm certs check-expiration` now.
- Do you have stateful deployments? Are they storing something, where are they storing it, and how do you manage them? This would be databases, Redis, message queues, applications that hold state. These are often the hardest to move or interact with during an upgrade. You can review the options for moving those here. The biggest thing is to set the pod disruption budget so that there is some minimum available during the upgrade process, as shown here.
- Are you upgrading etcd? Etcd supports restoring from snapshots that are taken from an etcd process of the same major.minor version, so be aware if you are going to be jumping more than a patch: restoring might not be an option (a sketch of checking and restoring follows this list).
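A quick sketch of what that check and a restore look like, assuming a default kubeadm certificate layout and placeholder paths:

```bash
# Check the version each etcd member is running; a snapshot should be restored
# by an etcd of the same major.minor version.
ETCDCTL_API=3 etcdctl endpoint status -w table \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key

# Restore a snapshot into a fresh data directory, then point the etcd static pod
# (or systemd unit) at that directory.
ETCDCTL_API=3 etcdctl snapshot restore /backup/etcd-snapshot.db \
  --data-dir /var/lib/etcd-restore
```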
Otherwise follow the steps above along with the official guide and you should be ok. The good news is once you bite the bullet and do it once up to a current version, maintenance is easier. The bad news is the initial EOL -> Supported path is soul-sucking and incredibly nerve-racking. I'm sorry.
I'm running a version older than 1.21 (January 2023)
So you need to do all the steps shown above to check that you can upgrade, but my guiding rule is if the version is more than 2 EOL versions ago, it's often easier to make a new cluster. You CAN still upgrade, but typically this means nodes have been running for a long time and are likely due for OS upgrades anyway. You'll likely have a more positive experience standing up a new cluster and slowly migrating over.
You'll start with fresh certificates, helm charts, node OS versions and everything else. Switching over at the load balancer level shouldn't be too bad and it can be a good opportunity to review permissions and access controls to ensure you are following the best procedures.
I hate that advice
I know. It's not my favorite thing to tell people. I'm sorry. I don't make the rules.
Note on Node OS choices
A common trend I see in organizations is to select whatever Linux distro they use for VMs as their node OS: Debian, Ubuntu, Rocky, etc. I don't recommend this. You shouldn't think of nodes as VMs that you SSH into on a regular basis and do things in. They're just platforms to run k8s on. I've had a lot of success with Flatcar Linux here. Upgrading the nodes is as easy as rebooting, and you can easily define things like SSH with a nice configuration system, shown here.
With the node OS I would much rather get security updates more quickly and know that I have to reboot the node on a regular basis, as opposed to keeping track of traditional package upgrades and the EOL dates for different Linux distros and then tracking whether reboots are required. Often folks will combine Flatcar Linux with Rancher Kubernetes Engine for a super simple and reliable k8s standup process. You can see more about that here. This is a GREAT option if you are making a new cluster and want to make your life as easy as possible in the future. Check out those docs here.
If you are going to use a traditional OS, check out kured. This allows you to monitor for the reboot-required flag at `/var/run/reboot-required` and schedule an automatic cordon, drain, and uncordon of the node. It also ensures only one node is touched at a time. This is something almost everyone forgets to do with Kubernetes: maintain the nodes.
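For reference, what kured automates is roughly this per-node sequence (the node name is a placeholder):

```bash
# The OS signals a pending reboot with this file (Debian/Ubuntu convention):
test -f /var/run/reboot-required && echo "node needs a reboot"

# Move workloads off the node before rebooting it:
kubectl cordon worker-1
kubectl drain worker-1 --ignore-daemonsets --delete-emptydir-data

# (reboot the node and wait for it to come back and report Ready)

# Let workloads schedule onto it again:
kubectl uncordon worker-1
```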
Conclusion
I hope this was helpful. The process of keeping Kubernetes upgraded is less terrible the more often you do it, but the key thing is to get as much bake time as you can on each minor release in your lower environments. If you stay on a regular schedule, the process of upgrading clusters is pretty painless and idiot-proof as long as you do some checking.
If you are reading this and think "I really want to run my own cluster but this seems like a giant nightmare", I strongly recommend checking out Rancher Kubernetes Engine with Flatcar Linux. It's tooling designed to be idiot-proof and can be easily run by a single operator or a pair. If you want to stick with `kubeadm` it is doable, but requires more work.
Stuck? Think I missed something obvious? Hit me up here: https://c.im/@matdevdug