Bridging the Firecracker network gap

Because docker0 is not the right choice

Today I looked at creating my own bridge networks for Firecracker VMMs. I had already used CNI setups when evaluating the HashiCorp Nomad firecracker task driver[1]. Back then I incorrectly stated that Firecracker depends on certain CNI plugins. It doesn’t: it can take advantage of any CNI setup as long as the tc-redirect-tap plugin is in the plugin chain.

The Nomad task driver had some issues, briefly:

  • every now and then the task would never shut the VMM down, and the only way to make the VMM go down was to sudo kill nomad
  • I tried updating the task driver to the latest SDK version, but I was not able to upgrade the Firecracker dependency past a specific commit. With any version after that commit the VMM comes up and the network setup is there, but the VMM is not reachable. A really, really weird issue; I reported it here

So today, I took a different route.

§firectl

firectl[2] is a command line utility for launching VMMs and a reference implementation of an application built on top of the Firecracker Golang SDK[3]. I have some exposure to firectl from when I was trying to upgrade the Nomad task driver, which uses parts of the firectl code internally.

One thing missing from firectl is an option to define the CNI network name to use. Something similar to this snippet from the SDK readme:

{
  NetworkInterfaces: []firecracker.NetworkInterface{{
    CNIConfiguration: &firecracker.CNIConfiguration{
      NetworkName: "fcnet",
      IfName: "veth0",
    },
  }},
}

The first thing to do was to add support for that. I’ve created a firectl fork and pushed my changes to GitHub, here[4].

My changes:

  1. added a new --cni-network argument to declare the CNI network to use
  2. added the network interface based on the new argument
  3. added a --netns argument, required when using CNI networks

Having built my version of firectl, I’ve declared this CNI conflist (in /firecracker/cni/conf.d/alpine.conflist):

{
    "name": "alpine",
    "cniVersion": "0.4.0",
    "plugins": [
        {
            "type": "bridge",
            "name": "alpine-bridge",
            "bridge": "alpinebridge0",
            "isDefaultGateway": true,
            "ipMasq": true,
            "hairpinMode": true,
            "ipam": {
                "type": "host-local",
                "subnet": "192.168.127.0/24",
                "resolvConf": "/etc/resolv.conf"
            }
        },
        {
            "type": "firewall"
        },
        {
            "type": "tc-redirect-tap"
        }
    ]
}
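Since the only hard requirement is that tc-redirect-tap appears in the plugin chain, a quick sanity check on a conflist can save a failed launch. The following is a minimal sketch, not part of firectl or the SDK: the confList type and the hasPlugin helper are hypothetical names, and the struct mirrors only the fields this check needs.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// confList mirrors only the fields of a CNI network configuration list
// that this check needs (hypothetical helper type, not the CNI library's).
type confList struct {
	Name    string `json:"name"`
	Plugins []struct {
		Type string `json:"type"`
	} `json:"plugins"`
}

// hasPlugin reports whether the conflist JSON chains a plugin of the given type.
func hasPlugin(raw []byte, pluginType string) (bool, error) {
	var cl confList
	if err := json.Unmarshal(raw, &cl); err != nil {
		return false, fmt.Errorf("parsing conflist: %w", err)
	}
	for _, p := range cl.Plugins {
		if p.Type == pluginType {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	// A trimmed-down version of the alpine conflist from this post.
	raw := []byte(`{
		"name": "alpine",
		"cniVersion": "0.4.0",
		"plugins": [
			{"type": "bridge"},
			{"type": "firewall"},
			{"type": "tc-redirect-tap"}
		]
	}`)
	ok, err := hasPlugin(raw, "tc-redirect-tap")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("tc-redirect-tap chained:", ok)
}
```

In practice the raw bytes would come from reading the conflist file rather than from an inline string.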

and started the VMM via firectl, like this:

sudo ./firectl \
    --firecracker-binary=/usr/bin/firecracker \
    --kernel=/firecracker/kernels/vmlinux-v5.8 \
    --root-drive=/firecracker/filesystems/alpine-base-root.ext4 \
    --cni-network=alpine \
    --socket-path=/tmp/alpine.sock \
    --ncpus=1 \
    --memory=128

This works like a charm. I can SSH to the VMM via the IP address, which can be found with:

cat /var/lib/cni/networks/alpine/last_reserved_ip.0

The VMM can reach the outside world, no degradation here.

§host-local IPAM

Reading more about the host-local IPAM, I figured out that IP addresses changing on every start of the VMM is the normal behaviour. Basically, the plugin stores its allocations on disk, under the /var/lib/cni/networks/<network-name> directory, and an allocation that is not released when the VMM stops stays reserved. This is, I think, referred to as IP address leakage.

One way to take care of that is to have a custom operator listen for the VMM stopping and remove that IP allocation manually. How exactly is undefined, but it should be rather straightforward with a combination of MMDS and some custom agent.

§The bridge

The other thing to note is that I’m no longer using the ptp plugin. I’m using the bridge plugin instead, and no longer the docker0 bridge. This was my little attempt at launching two VMMs on the same bridge. Well, this didn’t work…

The Firecracker Golang SDK tries to remove the existing network configuration when launching another VMM with the same network name. Here’s the error I saw:

sudo ./firectl \
    --firecracker-binary=/usr/bin/firecracker \
    --kernel=/firecracker/kernels/vmlinux-v5.8 \
    --root-drive=/firecracker/filesystems/alpine-base-root.ext4-2 \
    --cni-network=alpine \
    --socket-path=/tmp/alpine2.sock \
    --ncpus=1 \
    --memory=128
WARN[0000] Failed handler "fcinit.SetupNetwork": failure when invoking CNI:
failed to delete pre-existing CNI network {NetworkName:alpine NetworkConfig:<nil>
IfName:veth0 VMIfName: Args:[] BinPath:[/opt/cni/bin]
ConfDir:/etc/cni/conf.d CacheDir:/var/lib/cni/32171e6d-2b1b-4060-9555-e17314972ace
containerID:32171e6d-2b1b-4060-9555-e17314972ace
netNSPath:/var/run/netns Force:false}: failed to delete CNI network list "alpine":
running [/sbin/iptables -t nat -D POSTROUTING -s 192.168.127.2
-j CNI-166dba6e0b91a8f3d41c9a89 -m comment --comment name: "alpine"
id: "32171e6d-2b1b-4060-9555-e17314972ace" --wait]:
exit status 2: iptables v1.6.1: Couldn't load target
`CNI-166dba6e0b91a8f3d41c9a89':No such file or directory

Try `iptables -h' or 'iptables --help' for more information.
FATA[0000] Failed to start machine: failure when invoking CNI: failed to delete
pre-existing CNI network {NetworkName:alpine NetworkConfig:<nil> IfName:veth0
VMIfName: Args:[] BinPath:[/opt/cni/bin] ConfDir:/etc/cni/conf.d
CacheDir:/var/lib/cni/32171e6d-2b1b-4060-9555-e17314972ace
containerID:32171e6d-2b1b-4060-9555-e17314972ace
netNSPath:/var/run/netns Force:false}: failed to delete CNI network list
"alpine": running [/sbin/iptables -t nat -D POSTROUTING -s 192.168.127.2
-j CNI-166dba6e0b91a8f3d41c9a89 -m comment --comment name:
"alpine" id: "32171e6d-2b1b-4060-9555-e17314972ace" --wait]:
exit status 2: iptables v1.6.1: Couldn't load target
`CNI-166dba6e0b91a8f3d41c9a89':No such file or directory

In the process, some of the underlying network configuration for the running VMM was wiped, so my SSH connection was hanging, similar to what happens when one disconnects the network cable or disables WiFi.

Maybe it’s related to the fact that my veth0 interface name is hard coded in my firectl. Something to look at.

What would be nice to have is to decouple the network setup into two steps:

  1. create the bridge before launching VMMs, like what docker network create does
  2. set up the tap device at VMM launch time

Maybe the CNI implementation from Weaveworks Ignite[5] can serve as an inspiration (thanks, Michał…)

Food for thought.