I’ve been looking into upgrading Istio using canary upgrades. Canary upgrades let me test a new version of Istio by migrating part of the workloads to the new version and observing the impact of the change. If anything goes wrong, I can roll back to the old version. Old and new Istio versions run side by side until I’ve verified that the pods work fine. When everything is okay, I can safely remove the old version. All the examples I looked at were somewhat confusing. I wanted a down-to-earth, step-by-step list of things one has to do to execute a canary upgrade properly.
Here’s my take on it.
§istioctl manifest generate
The official Istio documentation describes a few possible Istio installation methods. These are:
- istioctl install
- using Helm charts
- kubectl apply with manifests generated with istioctl manifest generate
- the deprecated istioctl operator method
I prefer the istioctl manifest generate method because I can feed it an IstioOperator manifest, so that the generated manifests reflect the settings of the IstioOperator resource.
§first installation, the right way
Okay, say that I have a cluster where Istio was installed like this:
- IstioOperator for Istio 1.13.5:
|
|
The revision property is the important one. It uses dashes instead of dots as separators because the revision property doesn’t support dots.
- Manifests generated with istioctl manifest generate:
|
|
The trick is to use the istioctl version that matches the Istio version for which I generate manifests. That isn’t a problem, because the relevant istioctl version can be downloaded from Istio GitHub releases[1], and wrapping istioctl in a thin Docker image is easy.
- Generated manifests contain CRDs and other Istio resources. To install Istio:
|
|
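As a rough sketch of the three steps above, something like the following could work. The helper names, the file names, the default profile and the release asset naming are my assumptions for illustration, not the exact setup of the cluster described here. The KUBECTL and ISTIOCTL variables only exist so that the binaries can be swapped, for example for a version-pinned istioctl.

```shell
#!/bin/sh
# Sketch only: helper names, file names and the "default" profile are assumptions.
KUBECTL="${KUBECTL:-kubectl}"
ISTIOCTL="${ISTIOCTL:-istioctl}"

# 1.13.5 -> 1-13-5 (revision names cannot contain dots).
revision_from_version() {
  printf '%s\n' "$1" | tr '.' '-'
}

# Download URL of the istioctl build matching an Istio version.
# Assumption: the asset naming follows the pattern used on the GitHub releases page.
istioctl_url() {
  printf 'https://github.com/istio/istio/releases/download/%s/istioctl-%s-linux-amd64.tar.gz\n' "$1" "$1"
}

# Write a minimal IstioOperator manifest for the given Istio version.
write_operator() {
  version="$1"; out="$2"
  rev="$(revision_from_version "$version")"
  cat > "$out" <<EOF
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  name: istio-${rev}
  namespace: istio-system
spec:
  profile: default
  revision: ${rev}
EOF
}

# Generate the manifests with the matching istioctl and apply them.
install_revision() {
  version="$1"
  rev="$(revision_from_version "$version")"
  write_operator "$version" "operator-${rev}.yaml"
  "$ISTIOCTL" manifest generate -f "operator-${rev}.yaml" > "istio-${rev}.yaml"
  # manifest generate does not create the namespace, so create it idempotently.
  "$KUBECTL" create namespace istio-system --dry-run=client -o yaml | "$KUBECTL" apply -f -
  "$KUBECTL" apply -f "istio-${rev}.yaml"
}
```

With the matching istioctl on the PATH, calling install_revision 1.13.5 would write operator-1-13-5.yaml and istio-1-13-5.yaml into the current directory and apply the latter.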
§default Istio revision
When explicit revisions aren’t used, Istio treats whatever version is installed as the so-called default revision. The default revision has a specific semantic meaning. According to the Istio documentation[2], it:
- Injects sidecars for the istio-injection=enabled namespace selector, the sidecar.istio.io/inject=true object selector, and the istio.io/rev=default selectors.
- Validates Istio resources.
- Steals the leader lock from non-default revisions and performs singleton mesh responsibilities (such as updating resource statuses).
To turn a revision into the default revision, I use the following command:
|
|
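A minimal sketch of that step, wrapped in a small helper so the revision is a parameter. The function name is hypothetical; the --overwrite flag is there so the same call also works when a default tag already exists and only needs to point at a different revision.

```shell
#!/bin/sh
# Sketch only: the helper name is an assumption.
ISTIOCTL="${ISTIOCTL:-istioctl}"

# Point the "default" tag at the given revision, e.g. 1-13-5.
set_default_revision() {
  "$ISTIOCTL" tag set default --revision "$1" --overwrite
}
```

Usage would be set_default_revision 1-13-5.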
If I don’t want a default revision, that is, if I prefer to always use explicit revisions, I label namespaces with istio.io/rev=1-13-5 instead of istio-injection=enabled. Don’t use both labels; use istio.io/rev=1-13-5 only. With both labels set, no sidecar is injected.
§the cluster: a recap
Okay, so here’s the original cluster:
|
|
§run some workloads
|
|
§Istio canary upgrade
The fun part, it’s very easy:
|
|
At this point I have istiod-1-13-5 and istiod-1-14-1 running.
|
|
NAME                   READY   UP-TO-DATE   AVAILABLE   AGE
istio-egressgateway    1/1     1            1           2m11s
istio-ingressgateway   1/1     1            1           2m11s
istiod-1-13-5          1/1     1            1           2m11s
istiod-1-14-1          1/1     1            1           75s
Move the bookinfo workload to the new version:
|
|
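A sketch of what moving a workload can look like, assuming the workload lives in a namespace called bookinfo and consists of Deployments (both are my assumptions). Sidecars are injected when pods are created, so after switching the revision label the pods have to be restarted to pick up the new proxy.

```shell
#!/bin/sh
# Sketch only: helper name and namespace are assumptions.
KUBECTL="${KUBECTL:-kubectl}"

# Point a namespace at a revision and restart its Deployments so that
# new pods are injected by that revision's istiod.
migrate_namespace() {
  ns="$1"; rev="$2"
  "$KUBECTL" label namespace "$ns" istio-injection- "istio.io/rev=${rev}" --overwrite
  "$KUBECTL" rollout restart deployment -n "$ns"
}
```

Moving forward would be migrate_namespace bookinfo 1-14-1, and rolling back is the same call with 1-13-5.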
And that’s it. Job done. If everything works, I can uninstall Istio 1.13.5 from the cluster. If things don’t work, I can move the workload back to the previous version.
Finally, I can set the default version to the new one:
|
|
so that I can use the istio-injection=enabled namespace label, or the sidecar.istio.io/inject=true selector.
§what happens when tag set default is executed
When a tag is created with the istioctl tag set default --revision command, istioctl creates an additional MutatingWebhookConfiguration. This is a cluster-scoped resource; its webhooks point at istiod in the istio-system namespace. Without the default tag, we have:
|
|
NAME                            WEBHOOKS   AGE
istio-sidecar-injector-1-13-5   2          94s
istio-sidecar-injector-1-14-1   2          38s
With the default tag, we have:
|
|
NAME                            WEBHOOKS   AGE
istio-revision-tag-default      4          2m9s
istio-sidecar-injector-1-13-5   2          6m1s
istio-sidecar-injector-1-14-1   2          5m5s
The tag is set from this place[3] in Istio code, and generated here[4].
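To see which revision a tag webhook belongs to, the webhook configurations can be listed together with their istio.io/rev and istio.io/tag labels. This is a sketch with a hypothetical helper name, and it assumes the tag webhook carries those two labels.

```shell
#!/bin/sh
# Sketch only: the helper name is an assumption.
KUBECTL="${KUBECTL:-kubectl}"
ISTIOCTL="${ISTIOCTL:-istioctl}"

# List injection webhooks with their revision and tag labels,
# then ask istioctl for its own view of the existing tags.
show_injection_webhooks() {
  "$KUBECTL" get mutatingwebhookconfigurations -L istio.io/rev -L istio.io/tag
  "$ISTIOCTL" tag list
}
```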
§notes on CRDs
Istio CRDs are backward compatible across at least one minor version, so they are supposed to work fine when upgrading from 1.x to 1.x+1. When upgrading across multiple versions, the documentation suggests doing it version by version. For example, upgrading from 1.11.x to 1.14.x should be done as:
1.11.x -> 1.12.x -> 1.13.x -> 1.14.x
Skipping versions can be done but is not recommended.
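For planning purposes, the version-by-version path can be computed with a few lines of shell. This is just an illustrative helper of my own; it assumes both arguments are given as major.minor and share the same major version.

```shell
#!/bin/sh
# Sketch only: prints the minor-by-minor upgrade path between two versions.
# Assumes "major.minor" inputs with the same major version.
upgrade_path() {
  from="$1"; to="$2"
  major="${from%%.*}"
  minor="${from#*.}"
  last="${to#*.}"
  path="${major}.${minor}.x"
  while [ "$minor" -lt "$last" ]; do
    minor=$((minor + 1))
    path="${path} -> ${major}.${minor}.x"
  done
  printf '%s\n' "$path"
}

upgrade_path 1.11 1.14   # prints: 1.11.x -> 1.12.x -> 1.13.x -> 1.14.x
```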