Istio canary upgrades


I’ve been looking into upgrading Istio using canary upgrades. A canary upgrade lets me test a new version of Istio by migrating part of the workloads to the new version and observing the impact of the change. If anything goes wrong, I can roll back to the old version. The old and new Istio versions run side by side until I’ve verified that the pods work fine. When everything is okay, I can safely remove the old version. The examples I found were somewhat confusing, so I wanted a complete, step-by-step list of what one has to do to execute a canary upgrade properly.

Here’s my take on it.

§istioctl manifest generate

The official Istio documentation describes a few possible Istio installation methods. These are:

  • istioctl install
  • using Helm charts
  • kubectl apply with manifests generated with istioctl manifest generate
  • the deprecated istioctl operator method

I prefer the istioctl manifest generate method because I can feed it an IstioOperator manifest, so the generated manifests reflect the settings from the IstioOperator resource.

§first installation, the right way

Okay, say that I have a cluster where Istio was installed like this:

  1. IstioOperator for Istio 1.13.5:
cat > /tmp/1-13-5.yaml <<EOF
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  name: 1-13-5
  namespace: istio-system
spec:
  profile: "demo"
  revision: 1-13-5
  tag: 1.13.5
EOF

The revision property is the important one. It uses dashes instead of dots as separators because the revision property doesn’t support dots.
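Since the revision name is just the version with dots swapped for dashes, it can be derived instead of typed by hand. A minimal sketch; the to_revision helper name is made up for this example:

```shell
# to_revision is a hypothetical helper: it turns an Istio version such as
# 1.13.5 into a revision name such as 1-13-5 by replacing dots with dashes.
to_revision() {
  printf '%s\n' "$1" | tr '.' '-'
}
```

For example, to_revision 1.14.1 prints 1-14-1.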

  2. Manifests generated with istioctl manifest generate:
istioctl manifest generate --filename /tmp/1-13-5.yaml > /tmp/istio-1-13-5.yaml

The trick is to use the istioctl version that matches the Istio version I generate manifests for. That isn’t a problem, because the relevant istioctl version can be downloaded from Istio GitHub releases[1], and it is easy to wrap istioctl in a thin Docker image.
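To keep several istioctl versions side by side, something like the sketch below could work. Both helper names are made up, and the asset naming pattern (istioctl-<version>-<os>-<arch>.tar.gz) is my assumption about how the release assets are named, so check the releases page before relying on it:

```shell
# istioctl_url is a hypothetical helper that builds a download URL for a
# given istioctl version. The asset name pattern is an assumption.
istioctl_url() {
  local version="$1"
  local os="${2:-linux}"
  local arch="${3:-amd64}"
  printf 'https://github.com/istio/istio/releases/download/%s/istioctl-%s-%s-%s.tar.gz\n' \
    "$version" "$version" "$os" "$arch"
}

# fetch_istioctl (also hypothetical) downloads and unpacks the archive into
# a per-version directory, so multiple versions can coexist.
fetch_istioctl() {
  local version="$1"
  local dest="${2:-/tmp/istioctl-$version}"
  mkdir -p "$dest"
  curl -fsSL "$(istioctl_url "$version")" | tar -xz -C "$dest"
}
```

After fetch_istioctl 1.14.1, the binary should end up under /tmp/istioctl-1.14.1/, assuming the archive contains the istioctl binary at its top level.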

  3. The generated manifests contain CRDs and other Istio resources. To install Istio:
kubectl apply -f /tmp/istio-1-13-5.yaml

§default Istio revision

A caveat with revisions: when no explicit revision is used, Istio treats whatever version is installed as the so-called default revision. The default revision has a specific meaning. According to the Istio documentation[2], it:

  • Injects sidecars for the istio-injection=enabled namespace selector, the sidecar.istio.io/inject=true object selector, and the istio.io/rev=default selectors.
  • Validates Istio resources.
  • Steals the leader lock from non-default revisions and performs singleton mesh responsibilities (such as updating resource statuses).

To turn a revision into a default revision, I use the following command:

istioctl tag set default --revision 1-13-5

If I don’t want a default revision, that is, if I prefer to always use explicit revisions, I label namespaces with istio.io/rev=1-13-5 instead of istio-injection=enabled. Don’t use both labels; use istio.io/rev=1-13-5 only. With both labels present, no sidecar is injected.
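Switching a namespace from the istio-injection label to an explicit revision can be done in one command. A sketch, where use_revision is a made-up helper and KUBECTL is an override I added so the function can be dry-run with a stub:

```shell
# KUBECTL can be overridden, for example with "echo kubectl" for a dry run.
# This override is an assumption of the sketch, not a kubectl feature.
KUBECTL="${KUBECTL:-kubectl}"

# use_revision (hypothetical helper) removes the istio-injection label and
# sets the explicit revision label on a namespace.
use_revision() {
  local ns="$1"
  local rev="$2"
  # The trailing dash in "istio-injection-" removes that label.
  $KUBECTL label namespace "$ns" istio-injection- "istio.io/rev=${rev}" --overwrite
}
```

For example: use_revision bookinfo 1-13-5.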

§the cluster: a recap

Okay, so here’s the original cluster:

cat > /tmp/cluster.yaml <<EOF
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
name: istio-cluster
nodes:
- role: control-plane
  kubeadmConfigPatches:
  - |
    kind: InitConfiguration
    nodeRegistration:
      kubeletExtraArgs:
        node-labels: "ingress-ready=true"
  extraPortMappings:
  - containerPort: 80
    hostPort: 80
    protocol: TCP
  - containerPort: 443
    hostPort: 443
    protocol: TCP
- role: worker
- role: worker
EOF

kind create cluster --config=/tmp/cluster.yaml

cat > /tmp/1-13-5.yaml <<EOF
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  name: 1-13-5
  namespace: istio-system
spec:
  profile: "demo"
  revision: 1-13-5
  tag: 1.13.5
EOF

kubectl create ns istio-system
# using istioctl 1.13.5:
istioctl manifest generate --filename /tmp/1-13-5.yaml > /tmp/istio-1-13-5.yaml
kubectl apply -f /tmp/istio-1-13-5.yaml
kubectl wait --for=condition=available \
    -n istio-system \
    --timeout="180s" \
    deployment/istiod-1-13-5

§run some workloads

kubectl create ns bookinfo
kubectl label ns bookinfo istio.io/rev=1-13-5
kubectl apply -n bookinfo -f https://raw.githubusercontent.com/istio/istio/1.13.5/samples/bookinfo/platform/kube/bookinfo.yaml
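To check that the pods actually got a sidecar, I can list each pod’s containers and look for istio-proxy. A rough sketch; both helper names are made up, and KUBECTL is an override I added for dry runs:

```shell
KUBECTL="${KUBECTL:-kubectl}"

# pod_containers (hypothetical helper) prints one line per pod:
# "<pod name><TAB><container names separated by spaces>".
pod_containers() {
  local ns="$1"
  $KUBECTL get pods -n "$ns" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[*].name}{"\n"}{end}'
}

# without_sidecar (hypothetical helper) reads pod_containers output on stdin
# and prints the names of pods that have no istio-proxy container.
without_sidecar() {
  awk -F'\t' '
    NF == 0 { next }
    {
      n = split($2, names, " "); found = 0
      for (i = 1; i <= n; i++) if (names[i] == "istio-proxy") found = 1
      if (!found) print $1
    }'
}
```

Running pod_containers bookinfo | without_sidecar should print nothing once all pods are injected.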

§Istio canary upgrade

Now for the fun part. It’s very easy:

cat > /tmp/1-14-1.yaml <<EOF
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  name: 1-14-1
  namespace: istio-system
spec:
  profile: "demo"
  revision: 1-14-1
  tag: 1.14.1
EOF

# using istioctl 1.14.1:
istioctl manifest generate --filename /tmp/1-14-1.yaml > /tmp/istio-1-14-1.yaml
kubectl apply -f /tmp/istio-1-14-1.yaml
kubectl wait --for=condition=available \
    -n istio-system \
    --timeout="180s" \
    deployment/istiod-1-14-1

At this point I have istiod-1-13-5 and istiod-1-14-1 running.

kubectl get deployments -n istio-system
NAME                   READY   UP-TO-DATE   AVAILABLE   AGE
istio-egressgateway    1/1     1            1           2m11s
istio-ingressgateway   1/1     1            1           2m11s
istiod-1-13-5          1/1     1            1           2m11s
istiod-1-14-1          1/1     1            1           75s
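The two IstioOperator manifests differ only in the version number, so they could be generated from a small template. A sketch with a made-up operator_manifest helper that prints the same manifest as used in this post:

```shell
# operator_manifest (hypothetical helper) prints the IstioOperator from this
# post for the given Istio version, deriving the revision from the version.
operator_manifest() {
  local version="$1"
  local revision
  revision="$(printf '%s' "$version" | tr '.' '-')"
  cat <<EOF
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  name: ${revision}
  namespace: istio-system
spec:
  profile: "demo"
  revision: ${revision}
  tag: ${version}
EOF
}
```

For example, operator_manifest 1.14.1 > /tmp/1-14-1.yaml produces the input for istioctl manifest generate.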

Move the bookinfo workload to the new version:

kubectl label ns bookinfo istio.io/rev=1-14-1 --overwrite
kubectl rollout restart deployment -n bookinfo details-v1
sleep 20
kubectl rollout restart deployment -n bookinfo ratings-v1
sleep 20
kubectl rollout restart deployment -n bookinfo productpage-v1
sleep 20
kubectl rollout restart deployment -n bookinfo reviews-v1
sleep 20
kubectl rollout restart deployment -n bookinfo reviews-v2
sleep 20
kubectl rollout restart deployment -n bookinfo reviews-v3
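Instead of fixed sleeps, the restarts could wait for each rollout to finish before moving on. A sketch; restart_in_order is a made-up helper, and KUBECTL is an override I added for dry runs:

```shell
KUBECTL="${KUBECTL:-kubectl}"

# restart_in_order (hypothetical helper) restarts the given deployments one
# at a time and waits for each rollout to complete. It stops at the first
# failure so that a broken rollout does not spread to the other workloads.
restart_in_order() {
  local ns="$1"
  shift
  local name
  for name in "$@"; do
    $KUBECTL rollout restart -n "$ns" "deployment/$name" || return 1
    $KUBECTL rollout status -n "$ns" "deployment/$name" --timeout=180s || return 1
  done
}
```

For example: restart_in_order bookinfo details-v1 ratings-v1 productpage-v1 reviews-v1 reviews-v2 reviews-v3.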

And that’s it. Job done. If everything works, I can uninstall Istio 1.13.5 from the cluster. If things don’t work, I can move the workload back to the previous version.
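Moving the workload back is the same relabel-and-restart procedure with the old revision. A sketch, assuming it is fine to restart every deployment in the namespace at once; move_namespace is a made-up helper and KUBECTL is an override I added for dry runs:

```shell
KUBECTL="${KUBECTL:-kubectl}"

# move_namespace (hypothetical helper) points a namespace at a revision and
# restarts all of its deployments so the pods are re-injected.
move_namespace() {
  local ns="$1"
  local rev="$2"
  $KUBECTL label namespace "$ns" "istio.io/rev=${rev}" --overwrite || return 1
  $KUBECTL rollout restart deployment -n "$ns"
}
```

A rollback of the bookinfo namespace would then be move_namespace bookinfo 1-13-5.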

Finally, I can point the default tag at the new revision:

istioctl tag set default --revision 1-14-1

so that I can use the istio-injection=enabled namespace label, or the sidecar.istio.io/inject=true selector.

§what happens when tag set default is executed

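When a tag is created with the istioctl tag set default --revision command, istioctl creates an additional MutatingWebhookConfiguration named istio-revision-tag-default. MutatingWebhookConfiguration is a cluster-scoped resource, so it doesn’t actually live in the istio-system namespace, and the -n flag in the listings below has no effect. Without the default tag, we have: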

kubectl get mutatingwebhookconfiguration -n istio-system
NAME                            WEBHOOKS   AGE
istio-sidecar-injector-1-13-5   2          94s
istio-sidecar-injector-1-14-1   2          38s

With the default tag, we have:

kubectl get mutatingwebhookconfiguration -n istio-system
NAME                            WEBHOOKS   AGE
istio-revision-tag-default      4          2m9s
istio-sidecar-injector-1-13-5   2          6m1s
istio-sidecar-injector-1-14-1   2          5m5s

The tag is set from this place[3] in Istio code, and generated here[4].
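To see which revision a tag currently points to, the labels on the tag’s webhook configuration can be read. This sketch assumes the webhook carries the istio.io/tag and istio.io/rev labels, which is my reading of the Istio code rather than something documented in this post; tag_revision is a made-up helper and KUBECTL is an override I added for dry runs:

```shell
KUBECTL="${KUBECTL:-kubectl}"

# tag_revision (hypothetical helper) prints the revision a tag points to.
# It assumes the tag webhook is labeled with istio.io/tag and istio.io/rev.
tag_revision() {
  local tag="${1:-default}"
  $KUBECTL get mutatingwebhookconfiguration \
    -l "istio.io/tag=${tag}" \
    -o jsonpath='{.items[*].metadata.labels.istio\.io/rev}'
}
```

After the tag set command above, tag_revision default should print 1-14-1 if that assumption holds. The istioctl tag list command should show the same mapping.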

§notes on CRDs

Istio CRDs are backward compatible across at least one minor version, so they are supposed to work fine when upgrading from 1.x to 1.x+1. When upgrading across multiple versions, the documentation suggests doing it version by version. For example, upgrading from 1.11.x to 1.14.x should be done as:

1.11.x -> 1.12.x -> 1.13.x -> 1.14.x

Skipping versions can be done but is not recommended.
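The list of hops can be computed for any pair of versions. A small sketch, assuming both versions share the same major version; upgrade_path is a made-up helper:

```shell
# upgrade_path (hypothetical helper) prints the minor-version hops between
# two releases. It only looks at the major version of the first argument,
# so both versions are assumed to share the same major version.
upgrade_path() {
  local from="$1"
  local to="$2"
  local major="${from%%.*}"
  local minor="${from#*.}"
  minor="${minor%%.*}"
  local last="${to#*.}"
  last="${last%%.*}"
  local path="${major}.${minor}.x"
  while [ "$minor" -lt "$last" ]; do
    minor=$((minor + 1))
    path="${path} -> ${major}.${minor}.x"
  done
  printf '%s\n' "$path"
}
```

For example, upgrade_path 1.11 1.14 prints 1.11.x -> 1.12.x -> 1.13.x -> 1.14.x.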