Update 2023.11.10: The original post was long and contained a lot of code, right in your face.
Since I tidied up the accompanying repository, I am making the following changes:
- The default ingress is turned on.
- Removing large code blocks, replacing them with invocation and functional description.
- Splitting the VM installation process into two separate steps: VM create and bootstrap.
Update 2023.11.13: I have updated the Istio installation:
- Use an explicit revision during installation, adapt components to use the revision.
- Work documented in the Revisioned Istio pull request.
I am going to connect a VM to the Istio mesh running in the Kubernetes cluster.
I will need:
- A VM.
- A Kubernetes cluster.
- Containers.
This is not a job for KinD because I need a VM. Since I'm on macOS, it's Multipass for me today.
I am going to run:
- A k3s cluster on Multipass.
- A VM on Multipass, same network as the k3s cluster.
After setting up the k3s cluster, I follow the steps from the Istio documentation: Virtual Machine Installation[1].
The outcome is a working Istio mesh with a VM in the mesh supporting bidirectional traffic, and a short dive into network policies.
§tools
Besides the standard kubectl:
- multipass: on macOS brew install multipass, on Linux the official instructions[2] or the instructions for your distribution,
- git: to fetch the additional data,
- yq: follow the official guide[3],
- docker or podman: if you are on an arm64-based host,
- istioctl: instructions further in the article.
§working directory
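Fetch the repository and step into it; the URL and directory name below are placeholders:

git clone https://github.com/<user>/<repository>.git workdir
cd workdir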
This repository contains all the tools used further in this post: shell programs and additional artifacts used during the rollout of the VM.
All commands further in this article assume that you remain in the newly created directory.
You can place it wherever you like, name it however you like, but that’s the working directory.
§environment configuration
A few settings. Mainly:
- where the data is stored,
- the temp directory: some artifacts will be downloaded and extracted to disk,
- where the kubeconfig lives,
- settings:
  - operating system to use,
  - Istio version to use,
  - VM resource settings,
  - VM-related configuration.
Settings exist in the run.env file:
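A minimal sketch of the file's shape; ISTIO_VERSION, ISTIO_ARCH, CONTAINER_TOOL, and the DEFAULT_PORT_*/EWG_PORT_* pairs appear verbatim later in the article, the remaining names are illustrative:

# Istio settings, defaults used throughout this article
ISTIO_VERSION="1.19.3"
ISTIO_ARCH="osx-arm64"
# docker or podman, used later to source arm64 binaries
CONTAINER_TOOL="docker"
# illustrative names: data and temp locations, kubeconfig
DATA_DIR=".data"
TMP_DIR=".tmp"
KUBECONFIG="${DATA_DIR}/.kubeconfig"
# VM resource settings, illustrative names
VM_CPUS="2"
VM_MEMORY="2G"
VM_DISK="10G"
# the default ingress gateway and the eastwest gateway must not share ports;
# the *_STATUS suffix is illustrative for one such pair
DEFAULT_PORT_STATUS="15021"
EWG_PORT_STATUS="15022"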
§setting up the k3s cluster
I am starting a k3s cluster with one control plane and two workers. Traefik is disabled because Istio is used instead; the default k3s load balancer, Klipper, stays on. This program requires multipass.
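The invocation is a single program from the repository; the name below is hypothetical:

./k3s.sh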
Once the cluster is running, the kubeconfig is written to .data/.kubeconfig.
Please be mindful of the rather high resource requirement. Adjust as you see fit.
The cluster can be deleted, or recreated (removed and created again), with flags of the same program, as sketched below.
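Assuming the hypothetical name from above:

./k3s.sh --delete      # remove the cluster
./k3s.sh --recreate    # remove and create again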
§setting up the client
To have your kubectl and other tools use the correct kubeconfig:
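A sketch, assuming the settings file is meant to be sourced into the current shell:

source run.env
# at the very least, kubectl must point at the generated kubeconfig:
export KUBECONFIG="$(pwd)/.data/.kubeconfig"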
This will set your KUBECONFIG environment variable and a few other settings.
§verify the cluster
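List the nodes:

kubectl get nodes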
Something along the lines of:
NAME STATUS ROLES AGE VERSION
k3s-master Ready control-plane,master 10m v1.27.7+k3s1
k3s-worker-1 Ready <none> 9m26s v1.27.7+k3s1
k3s-worker-2 Ready <none> 9m18s v1.27.7+k3s1
§install istioctl
This program downloads istioctl for ISTIO_VERSION and ISTIO_ARCH, and places it in the .bin/ directory.
ISTIO_VERSION: Istio version, default 1.19.3
ISTIO_ARCH: one of < osx-arm64, osx-amd64, linux-armv7, linux-arm64, linux-amd64 >, default osx-arm64
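A sketch of what the program does; the istioctl archives follow Istio's release naming scheme:

mkdir -p .bin .tmp
curl -sSL -o .tmp/istioctl.tar.gz \
  "https://github.com/istio/istio/releases/download/${ISTIO_VERSION}/istioctl-${ISTIO_VERSION}-${ISTIO_ARCH}.tar.gz"
tar -xzf .tmp/istioctl.tar.gz -C .bin/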
Verify:

.bin//istioctl version

no ready Istio pods in "istio-system"
1.19.3

The double / in .bin//istioctl is not a mistake.
§istio installation
This is where I start to follow the Istio documentation[1].
This program installs and configures Istio from the IstioOperator resource. Install Istio:
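A sketch of the underlying call; the operator file name is hypothetical, the revision matches what the cluster shows below:

.bin//istioctl install -y --set revision=1-19-3 -f istio-operator.yaml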
Verify:
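The istiod service should be up:

kubectl --namespace istio-system get services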
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
istiod-1-19-3 ClusterIP 10.43.255.139 <none> 15010/TCP,15012/TCP,443/TCP,15014/TCP 39s
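The revision tag:

.bin//istioctl tag list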
TAG REVISION NAMESPACES
default 1-19-3
§eastwest gateway
This program installs the eastwest gateway. It depends on the DEFAULT_PORT_* and EWG_PORT_* environment variables.
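The invocation (hypothetical name); underneath it follows the eastwest gateway steps from the Istio VM installation docs, with the EWG_PORT_* values applied on top:

./install.eastwest-gateway.sh
# roughly what happens underneath; samples/ comes from the full Istio
# release archive, not the istioctl-only download:
# samples/multicluster/gen-eastwest-gateway.sh --single-cluster | .bin//istioctl install -y --set revision=1-19-3 -f -
# kubectl --namespace istio-system apply -f samples/multicluster/expose-istiod.yaml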
The outcome should be similar to this:
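Check the services again:

kubectl --namespace istio-system get services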
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
istiod-1-19-3 ClusterIP 10.43.255.139 <none> 15010/TCP,15012/TCP,443/TCP,15014/TCP 17m
istio-ingressgateway LoadBalancer 10.43.217.75 192.168.64.60,192.168.64.61,192.168.64.62 15021:31941/TCP,80:30729/TCP,443:32187/TCP 11m
istio-eastwestgateway LoadBalancer 10.43.106.169 192.168.64.60,192.168.64.61,192.168.64.62 15022:31036/TCP,15443:32297/TCP,15013:31263/TCP,15018:32660/TCP 15s
If your eastwest gateway remains in the pending state, ensure that each DEFAULT_PORT is not equal to the corresponding EWG_PORT. A conflicted state looks like this:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
istiod-1-19-3 ClusterIP 10.43.255.139 <none> 15010/TCP,15012/TCP,443/TCP,15014/TCP 17m
istio-ingressgateway LoadBalancer 10.43.217.75 192.168.64.60,192.168.64.61,192.168.64.62 15021:31941/TCP,80:30729/TCP,443:32187/TCP 11m
istio-eastwestgateway LoadBalancer 10.43.106.169 <pending> 15021:31036/TCP,15443:32297/TCP,15013:31263/TCP,15018:32660/TCP 15s
In the example above, the problem is 15021:31036/TCP: 15021 is already used by the default ingress gateway.
§create the vm
Before the workload group can be created, the VM needs to exist because, in this case, its IP address is required by the workload group. To create the VM:
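The invocation (hypothetical name); underneath it boils down to a multipass launch with the resources from run.env:

./vm.sh
# roughly equivalent to:
# multipass launch --name vm-istio-external-workload --cpus 2 --memory 2G --disk 10G 22.04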
This command supports --delete and --recreate flags. Only the multipass VM is manipulated; other files remain on disk.
§the workload group
This program finds the VM IP address and uses it to create a WorkloadGroup Kubernetes resource. When ready, the program downloads all files required by the VM to join the mesh (istioctl x workload entry configure). Files are stored in .data/workload-files.
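The program is install.workload.sh, which appears again later in this article:

./install.workload.sh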
Verify:
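The hosts file generated by istioctl points the istiod host at the eastwest gateway IP:

cat .data/workload-files/hosts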
output similar to:
192.168.64.60 istiod-1-19-3.istio-system.svc
If there are no hosts here, your eastwest gateway is most likely not working correctly.
§workload group ca_addr
The CA_ADDR environment variable exported in the cluster.env file points by default to the Istio TLS port, 15012.
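Peeking at the generated value:

grep CA_ADDR .data/workload-files/cluster.env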
CA_ADDR='istiod-1-19-3.istio-system.svc:15012'
The service name is correct but the port isn’t. I hoped that the tool would pick up the port from the ingress gateway service but the help for istioctl x workload entry configure says:
--ingressService string Name of the Service to be used as the ingress gateway,
in the format <service>.<namespace>. If no namespace
is provided, the default istio-system namespace will
be used. (default "istio-eastwestgateway")
That is exactly our gateway's name, so the tool finds the service but evidently does not detect the port. The install.workload.sh program fixes that by default, simply appending the CA_ADDR to use to the end of the cluster.env file.
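The same grep after the fix shows both lines; the appended one wins:

grep CA_ADDR .data/workload-files/cluster.env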
CA_ADDR='istiod-1-19-3.istio-system.svc:15012'
CA_ADDR='istiod-1-19-3.istio-system.svc:15013'
§the vm: arm64 bootstrap caveat
For example, if you are on an M2 Mac, like me… The Istio documentation instructs installing the Istio sidecar on the VM using a downloaded deb package. The problem is that Multipass runs an arm64 build of Ubuntu, and the deb package is available only for the amd64 architecture. Eventually, trying to start Istio on the VM, you'd end up with:
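The failing installation attempt on the VM:

sudo dpkg -i istio-sidecar.deb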
dpkg: error processing archive istio-sidecar.deb (--install):
package architecture (amd64) does not match system (arm64)
Errors were encountered while processing:
istio-sidecar.deb
As a workaround, I have to replicate the work done by the deb package, but source arm64 binaries.
§the deb package
I decompressed it:
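A sketch, assuming a Debian-family environment is at hand (dpkg-deb can also run inside a container):

dpkg-deb -x istio-sidecar.deb .data/istio-sidecar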
and kept the following:
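The kept files, as listed by git (assuming they are committed under .data/istio-sidecar):

git ls-tree -r HEAD -- .data/istio-sidecar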
100644 blob c18ec3ce73f52fafe05585de91cd4cda2cdf3951 .data/istio-sidecar/lib/systemd/system/istio.service
100755 blob e022fbb08d4375a66276263b70380230e4702dbe .data/istio-sidecar/usr/local/bin/istio-start.sh
100644 blob ab4bbffd39a7462db68312b7049828c7b4c1d673 .data/istio-sidecar/var/lib/istio/envoy/envoy_bootstrap_tmpl.json
100644 blob fc42e5483094378ca0f0b00cd52f81d1827531cb .data/istio-sidecar/var/lib/istio/envoy/sidecar.env
§arm64 binaries
Skip if not on an arm64 host.
Two binaries have to be replaced with their arm64 versions:
/usr/local/bin/envoy
/usr/local/bin/pilot-agent
The easiest way I could come up with was to:
- Download the linux/arm64 Istio proxyv2 Docker image.
- Create a container, don’t start it.
- Copy the files out of the file system.
- Remove the container.
- Reference the exported files for the required arm64 binaries, as sketched below.
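A minimal sketch with docker; the image tag follows Istio's proxyv2 naming, and the destination paths mirror where the deb package installs the binaries:

# pull the arm64 build of the proxy image
docker pull --platform linux/arm64 istio/proxyv2:1.19.3
# create (but do not start) a container and copy the binaries out
cid="$(docker create --platform linux/arm64 istio/proxyv2:1.19.3)"
docker cp "${cid}:/usr/local/bin/envoy" .data/istio-sidecar/usr/local/bin/envoy
docker cp "${cid}:/usr/local/bin/pilot-agent" .data/istio-sidecar/usr/local/bin/pilot-agent
docker rm "${cid}"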
The default CONTAINER_TOOL depends on your run.env and equals docker when not set. To use podman:
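Set the variable for the run; the program name is hypothetical:

CONTAINER_TOOL=podman ./extract.istio-sidecar.sh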
§bootstrap the vm
This program deploys all the files required by the VM to join the mesh, moves them to the right places, and configures the VM to run the Istio sidecar.
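The invocation (hypothetical name, following the repository's naming scheme):

./bootstrap.vm.sh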
§validate the vm
Get the shell on the VM:
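Multipass gives a shell directly; the VM name matches the prompt below:

multipass shell vm-istio-external-workload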
On the VM (ubuntu@vm-istio-external-workload:~$), regardless of the fact that we set the CA_ADDR, the correct value is also required for the PILOT_ADDRESS. However, this is already dealt with by install.workload.sh. We can start the program with:
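The start script comes from the deb package unpacked earlier; the iptables manipulation requires root:

sudo /usr/local/bin/istio-start.sh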
2023-11-01T01:27:02.836001Z info Running command: iptables -t nat -D PREROUTING -p tcp -j ISTIO_INBOUND
2023-11-01T01:27:02.837879Z info Running command: iptables -t mangle -D PREROUTING -p tcp -j ISTIO_INBOUND
2023-11-01T01:27:02.839355Z info Running command: iptables -t nat -D OUTPUT -p tcp -j ISTIO_OUTPUT
...
2023-11-01T01:27:04.000569Z error citadelclient Failed to load key pair open etc/certs/cert-chain.pem: no such file or directory
2023-11-01T01:27:04.004712Z info cache generated new workload certificate latency=118.57166ms ttl=23h59m58.995289161s
2023-11-01T01:27:04.004744Z info cache Root cert has changed, start rotating root cert
2023-11-01T01:27:04.004759Z info ads XDS: Incremental Pushing ConnectedEndpoints:2 Version:
2023-11-01T01:27:04.004885Z info cache returned workload certificate from cache ttl=23h59m58.995116686s
2023-11-01T01:27:04.004954Z info cache returned workload trust anchor from cache ttl=23h59m58.995045987s
2023-11-01T01:27:04.005150Z info cache returned workload trust anchor from cache ttl=23h59m58.994850182s
2023-11-01T01:27:04.006124Z info ads SDS: PUSH request for node:vm-istio-external-workload.vmns resources:1 size:4.0kB resource:default
2023-11-01T01:27:04.006176Z info ads SDS: PUSH request for node:vm-istio-external-workload.vmns resources:1 size:1.1kB resource:ROOTCA
2023-11-01T01:27:04.006204Z info cache returned workload trust anchor from cache ttl=23h59m58.99379629s
If your istio-start.sh command doesn't produce any further output after the iptables lines:
-A OUTPUT -p udp --dport 53 -d 127.0.0.53/32 -j REDIRECT --to-port 15053
COMMIT
2023-11-09T11:21:47.136622Z info Running command: iptables-restore --noflush
(hangs here)
press CTRL+C, get back on the VM, and rerun the last command.
§validate DNS resolution
Open another terminal:
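Another shell on the VM:

multipass shell vm-istio-external-workload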
On the VM (ubuntu@vm-istio-external-workload:~$):
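Query the istiod host; this matches the answer below:

dig istiod-1-19-3.istio-system.svc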
; <<>> DiG 9.18.12-0ubuntu0.22.04.3-Ubuntu <<>> istiod-1-19-3.istio-system.svc
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 27953
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; WARNING: recursion requested but not available
;; QUESTION SECTION:
;istiod-1-19-3.istio-system.svc. IN A
;; ANSWER SECTION:
istiod-1-19-3.istio-system.svc. 30 IN A 10.43.26.124
;; Query time: 0 msec
;; SERVER: 127.0.0.53#53(127.0.0.53) (UDP)
;; WHEN: Wed Nov 01 02:31:59 CET 2023
;; MSG SIZE rcvd: 80
The IP address in the answer section should be equal to the cluster IP of the service:
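kubectl --namespace istio-system get service istiod-1-19-3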
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
istiod-1-19-3 ClusterIP 10.43.26.124 <none> 15010/TCP,15012/TCP,443/TCP,15014/TCP 19m
It's 10.43.26.124 in both cases; the DNS resolution is working.
§validating communication
Deploy a sample application allowing us to validate the connection from the VM to the mesh.
§create and configure the namespace
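A sketch; with the default tag in place (see istioctl tag list above), the plain injection label resolves to our revisioned control plane:

kubectl create namespace sample
kubectl label namespace sample istio-injection=enabled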
§on amd64 host
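The helloworld sample from the Istio repository, pinned to the matching release branch:

kubectl --namespace sample apply -f https://raw.githubusercontent.com/istio/istio/release-1.19/samples/helloworld/helloworld.yaml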
§on arm64 host
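A hypothetical equivalent; the actual manifests with arm64 images live in the accompanying repository:

kubectl --namespace sample apply -f ./helloworld-arm64.yaml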
Again: both example images published by Istio do not exist for the linux/arm64 architecture, so I build them from my own Dockerfile for linux/arm64. The source code is here[4].
§hello world pods are running
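Watch the rollout:

kubectl --namespace sample get pods --watch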
NAME READY STATUS RESTARTS AGE
helloworld-v1-cff64bf8c-z5nq5 0/2 PodInitializing 0 8s
helloworld-v2-9fdc9f56f-tbmk8 0/2 PodInitializing 0 8s
helloworld-v1-cff64bf8c-z5nq5 1/2 Running 0 20s
helloworld-v2-9fdc9f56f-tbmk8 1/2 Running 0 21s
helloworld-v1-cff64bf8c-z5nq5 2/2 Running 0 21s
helloworld-v2-9fdc9f56f-tbmk8 2/2 Running 0 22s
§checking vm to mesh connectivity
In a shell on the VM (ubuntu@vm-istio-external-workload:~$):
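curl -v helloworld.sample.svc:5000/hello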
* Trying 10.43.109.101:5000...
* Connected to helloworld.sample.svc (10.43.109.101) port 5000 (#0)
> GET /hello HTTP/1.1
> Host: helloworld.sample.svc:5000
> User-Agent: curl/7.81.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< server: envoy
< date: Wed, 01 Nov 2023 01:59:55 GMT
< content-type: text/html; charset=utf-8
< content-length: 59
< x-envoy-upstream-service-time: 88
<
Hello version: v2, instance: helloworld-v2-9fdc9f56f-tbmk8
* Connection #0 to host helloworld.sample.svc left intact
The service responded; the VM can reach services in the mesh.
§routing mesh traffic to the vm
Create a service pointing at the workload group. This sets up the route to the VM service for a port. The example uses hardcoded values, but it would be a simple job to make those configurable.
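A sketch of such a service; the selector label is an assumption and must match the labels on the WorkloadGroup:

kubectl apply -f - <<EOF
apiVersion: v1
kind: Service
metadata:
  name: external-app
  namespace: vmns
spec:
  ports:
  - name: http
    port: 8000
    targetPort: 8000
  selector:
    app: external-app   # assumed label; must match the WorkloadGroup labels
EOF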
Find the workload entry; it exists only when the Istio sidecar is running on the VM.
§workload entry is unhealthy
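kubectl --namespace vmns get workloadentries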
NAME AGE ADDRESS
external-app-192.168.64.64-vm-network 2m33s 192.168.64.64
Check its status; it will be unhealthy:
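The condition can be read straight from the resource; look for the Healthy condition (yq is among the tools):

kubectl --namespace vmns get workloadentry external-app-192.168.64.64-vm-network -o yaml | yq '.status'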
§fix it by starting the workload
It is unhealthy because the service on the VM isn't running. Start a simple HTTP server to fix this, on the VM (ubuntu@vm-istio-external-workload:~$):
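Python's built-in server matches the output below:

python3 -m http.server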
Serving HTTP on 0.0.0.0 port 8000 (http://0.0.0.0:8000/) ...
127.0.0.6 - - [01/Nov/2023 22:44:25] "GET / HTTP/1.1" 200 -
127.0.0.6 - - [01/Nov/2023 22:44:30] "GET / HTTP/1.1" 200 -
127.0.0.6 - - [01/Nov/2023 22:44:35] "GET / HTTP/1.1" 200 -
...
Almost immediately we see requests arriving: these are the health checks. The Istio sidecar on the VM logged:
2023-11-01T21:04:50.337302Z info healthcheck failure threshold hit, marking as unhealthy: Get "http://localhost:8000/": dial tcp 127.0.0.6:0->127.0.0.1:8000: connect: connection refused
2023-11-01T21:32:12.943221Z info xdsproxy connected to upstream XDS server: istiod-1-19-3.istio-system.svc:15012
2023-11-01T21:44:25.343463Z info healthcheck success threshold hit, marking as healthy
The status of the workload entry has changed:
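Reading the status again shows the Healthy condition flipped:

kubectl --namespace vmns get workloadentry external-app-192.168.64.64-vm-network -o yaml | yq '.status'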
§verify connectivity with curl
Finally, run an actual command to verify:
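A throwaway interactive pod; the image is an assumption, the pod name matches the session seen later:

kubectl run vm-response-test --rm -ti --namespace sample --image=curlimages/curl -- sh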
If you don't see a command prompt, try pressing enter.
~ $
Pay attention to the namespace used in the last command above. The curl pod must be launched in an Istio-enabled namespace, and sample already exists.
Execute the following command in that terminal:
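curl -v external-app.vmns.svc:8000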
* Trying 10.43.122.27:8000...
* Connected to external-app.vmns.svc (10.43.122.27) port 8000
> GET / HTTP/1.1
> Host: external-app.vmns.svc:8000
> User-Agent: curl/8.4.0
> Accept: */*
>
< HTTP/1.1 200 OK
< server: envoy
< date: Wed, 01 Nov 2023 22:10:20 GMT
< content-type: text/html; charset=utf-8
< content-length: 768
< x-envoy-upstream-service-time: 7
<
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>Directory listing for /</title>
</head>
<body>
<h1>Directory listing for /</h1>
<hr>
<ul>
<li><a href=".bash_history">.bash_history</a></li>
<li><a href=".bash_logout">.bash_logout</a></li>
<li><a href=".bashrc">.bashrc</a></li>
<li><a href=".cache/">.cache/</a></li>
<li><a href=".profile">.profile</a></li>
<li><a href=".ssh/">.ssh/</a></li>
<li><a href=".sudo_as_admin_successful">.sudo_as_admin_successful</a></li>
<li><a href="lib/">lib/</a></li>
<li><a href="usr/">usr/</a></li>
<li><a href="var/">var/</a></li>
<li><a href="workload/">workload/</a></li>
</ul>
<hr>
</body>
</html>
* Connection #0 to host external-app.vmns.svc left intact
§enable strict tls
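A sketch of mesh-wide strict mTLS via PeerAuthentication, the canonical resource from the Istio documentation:

kubectl apply -f - <<EOF
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system
spec:
  mtls:
    mode: STRICT
EOF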
There are no observable changes.
§network policies
§caveat: cannot select a workload entry in a network policy
Because a network policy selects pods using .spec.podSelector, and we have no pods (we have a workload entry instead), we are not able to attach network policies to the VM. The following has no effect:
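An illustrative policy of that kind: denying all ingress in the VM namespace, which cannot match the workload entry:

kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-all-ingress
  namespace: vmns
spec:
  podSelector: {}   # selects pods only; the workload entry is never matched
  policyTypes:
  - Ingress
EOF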
in that shell:
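curl -v external-app.vmns.svc:8000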
* Trying 10.43.122.27:8000...
* Connected to external-app.vmns.svc (10.43.122.27) port 8000
> GET / HTTP/1.1
> Host: external-app.vmns.svc:8000
> User-Agent: curl/8.4.0
> Accept: */*
>
< HTTP/1.1 200 OK
< server: envoy
...
Consider for future work: does this work when using the Istio CNI?
§guarding the vm from the source of traffic namespace
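The exact policy lives in the repository; an illustrative equivalent in the source namespace, allowing in-cluster egress while leaving out the VM:

kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: guard-vm-egress
  namespace: sample
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  # allow DNS and the control plane
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: istio-system
  # allow traffic within the namespace itself; the VM's address matches
  # no selector, so egress to it is denied
  - to:
    - podSelector: {}
EOF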
in that shell:
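curl external-app.vmns.svc:8000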
upstream connect error or disconnect/reset before headers. retried and the latest reset reason: remote connection failure, transport failure reason: delayed connect error: 111~ $ ^C
~ $ exit
Session ended, resume using 'kubectl attach vm-response-test -c vm-response-test -i -t' command when the pod is running
pod "vm-response-test" deleted
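Remove the policy and start the test pod again (names as above):

kubectl --namespace sample delete networkpolicy guard-vm-egress
kubectl run vm-response-test --rm -ti --namespace sample --image=curlimages/curl -- sh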
in that shell:
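curl external-app.vmns.svc:8000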
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>Directory listing for /</title>
</head>
<body>
<h1>Directory listing for /</h1>
<hr>
<ul>
<li><a href=".bash_history">.bash_history</a></li>
<li><a href=".bash_logout">.bash_logout</a></li>
<li><a href=".bashrc">.bashrc</a></li>
<li><a href=".cache/">.cache/</a></li>
<li><a href=".profile">.profile</a></li>
<li><a href=".ssh/">.ssh/</a></li>
<li><a href=".sudo_as_admin_successful">.sudo_as_admin_successful</a></li>
<li><a href="lib/">lib/</a></li>
<li><a href="usr/">usr/</a></li>
<li><a href="var/">var/</a></li>
<li><a href="workload/">workload/</a></li>
</ul>
<hr>
</body>
</html>
§network boundary for network policies
Current situation:
- Egress from source to the VM can be blocked only on a namespace level.
- On the VM side, network policies aren't capable of selecting workload entries; they only select pods using .spec.podSelector.
The natural network boundary is the namespace, plus an explicit deny of egress to the namespace holding the VM.
§cleaning up
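Everything lives in Multipass, so deleting the instances is enough; the node and VM names match what we saw earlier:

multipass delete --purge k3s-master k3s-worker-1 k3s-worker-2 vm-istio-external-workload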
§summary
Success: a pod in the mesh can communicate with the VM via the service, and the VM is in the mesh and can communicate back to it. Istio VM workloads are an easy way to automate VM-to-mesh onboarding.