Vault on Firecracker with CNI plugins and Nomad

Setting up Vault on Firecracker with CNI network on HashiCorp Nomad

It’s good to know how to set up a Firecracker VM by hand, but that’s definitely suboptimal in the long term. So today I am looking at setting up Firecracker with CNI plugins. Firecracker needs four CNI plugins to operate: ptp, firewall, host-local and tc-redirect-tap. The first three come from the CNI plugins[1] repository; the last one comes from the AWS Labs tc-redirect-tap[2] repository.

§Golang

Both the CNI plugins and tc-redirect-tap require Go to build. I’m using 1.15.8.
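
If Go is not installed yet, a tarball install along these lines should work; the download URL below is an assumption based on the standard Go release naming, so verify it against the official downloads page:

# assumed release tarball for Go 1.15.8 on linux/amd64
cd /tmp
wget https://golang.org/dl/go1.15.8.linux-amd64.tar.gz
sudo tar -C /usr/local -xzf go1.15.8.linux-amd64.tar.gz
export PATH=$PATH:/usr/local/go/bin
go version   # should report go1.15.8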

§CNI plugins

mkdir ~/cni-plugins
cd ~/cni-plugins
git clone https://github.com/containernetworking/plugins.git .
./build_linux.sh

After about 30 seconds, the binaries are under the bin/ directory:

$ tree bin
bin
├── bandwidth
├── bridge
├── dhcp
├── firewall
├── flannel
├── host-device
├── host-local
├── ipvlan
├── loopback
├── macvlan
├── portmap
├── ptp
├── sbr
├── static
├── tuning
├── vlan
└── vrf

§tc-redirect-tap

mkdir ~/tc-redirect-tap
cd ~/tc-redirect-tap
git clone https://github.com/awslabs/tc-redirect-tap.git .
make all

The binary can be found in the root of the sources directory.
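
To make sure the build actually produced something usable (the binary name and location are the ones described above):

ls -l ~/tc-redirect-tap/tc-redirect-tap
file ~/tc-redirect-tap/tc-redirect-tap   # expect an ELF executable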

§Installing the plugins

CNI plugins are looked up in the /opt/cni/bin directory. Some tools allow overriding that path, but there is no consistency, so the default directory is the safest choice. However, to keep everything tidy, I will place my plugins in the /firecracker/cni/bin directory, per the structure from Taking Firecracker for a spin[3]:

mkdir -p /firecracker/cni/bin
cp ~/cni-plugins/bin/* /firecracker/cni/bin/
cp ~/tc-redirect-tap/tc-redirect-tap /firecracker/cni/bin/tc-redirect-tap

then link them to where they are expected to be:

sudo mkdir -p /opt/cni
sudo ln -sfn /firecracker/cni/bin /opt/cni/bin
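
A quick check that every plugin Firecracker needs resolves through the symlinked /opt/cni/bin:

for plugin in ptp firewall host-local tc-redirect-tap; do
    test -x /opt/cni/bin/$plugin && echo "ok: $plugin" || echo "missing: $plugin"
done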

§Prepare Nomad and task driver

I’ll use HashiCorp Nomad with the firecracker-task-driver. First, get Nomad:

cd /tmp
wget https://releases.hashicorp.com/nomad/1.0.3/nomad_1.0.3_linux_amd64.zip
unzip nomad_1.0.3_linux_amd64.zip
sudo mv nomad /usr/bin/nomad
rm nomad_1.0.3_linux_amd64.zip
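
A quick check that the binary is on the PATH and is the expected release:

nomad version   # should report Nomad v1.0.3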

Second, get the task driver sources and build them:

mkdir ~/firecracker-task-driver
cd ~/firecracker-task-driver
git clone https://github.com/cneira/firecracker-task-driver.git .
go build -mod=mod -o ./firecracker-task-driver ./main.go

The default Nomad plugins directory is /opt/nomad/plugins:

sudo mkdir -p /opt/nomad/plugins
sudo mv firecracker-task-driver /opt/nomad/plugins/firecracker-task-driver

§Create the network definition

Network definitions are to be placed under /etc/cni/conf.d. Again, to keep everything tidy and in one place:

mkdir /firecracker/cni/conf.d
sudo mkdir -p /etc/cni
sudo ln -sfn /firecracker/cni/conf.d /etc/cni/conf.d

Create the network definition file:

cat <<EOF > /firecracker/cni/conf.d/vault.conflist
{
    "name": "vault",
    "cniVersion": "0.4.0",
    "plugins": [
        {
            "type": "ptp",
            "ipMasq": true,
            "ipam": {
                "type": "host-local",
                "subnet": "192.168.127.0/24",
                "resolvConf": "/etc/resolv.conf"
            }
        },
        {
            "type": "firewall"
        },
        {
            "type": "tc-redirect-tap"
        }
    ]
}
EOF
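
A malformed conflist tends to surface only later as a vague CNI error, so it’s worth making sure the file at least parses as JSON; python3 is just one convenient way to check:

python3 -m json.tool < /firecracker/cni/conf.d/vault.conflist > /dev/null && echo "vault.conflist parses"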

§Start Nomad in dev mode

Create the Nomad configuration directory:

sudo mkdir /etc/nomad

We have to create a minimal server configuration to tell Nomad where our plugins are (the plugins directory under data_dir):

cat <<EOF | sudo tee -a /etc/nomad/server.conf
data_dir  = "/opt/nomad"
bind_addr = "0.0.0.0" # the default
EOF
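
With this layout Nomad derives the plugin location from data_dir, i.e. /opt/nomad/plugins. If the driver ever lives somewhere else, Nomad also accepts an explicit plugin_dir setting; a minimal sketch, appending to the same file:

# optional: pin the plugin directory explicitly (defaults to <data_dir>/plugins)
echo 'plugin_dir = "/opt/nomad/plugins"' | sudo tee -a /etc/nomad/server.conf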

And we can start Nomad development agent:

sudo nomad agent -dev -config=/etc/nomad/server.conf
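
The dev agent stays in the foreground, so from a second terminal it is worth confirming that the client node is up and that the firecracker-task-driver plugin was detected; the driver should show up in the node’s driver status if Nomad loaded it:

# run in another terminal once the agent is up
nomad node status -self -verbose | grep -i firecracker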

§Create the Nomad task

Create the directory for Nomad jobs:

sudo mkdir -p /etc/nomad/jobs

And write the job definition. The process of getting the kernel and the root image is described in the previous post[3].

cat <<EOF | sudo tee -a /etc/nomad/jobs/vault.nomad
job "vault-with-cni" {
    datacenters = ["dc1"]
    type        = "service"

    group "vault-test" {
        restart {
            attempts = 0
            mode     = "fail"
        }
        task "vault1" {
            driver = "firecracker-task-driver"
            config {
                BootDisk    = "/firecracker/filesystems/vault-root.ext4"
                Firecracker = "/usr/bin/firecracker"
                KernelImage = "/firecracker/kernels/vmlinux-v5.8"
                Mem         = 128
                Network     = "vault"
                Vcpus       = 1
            }
        }
    }
}
EOF

Test the job configuration:

$ sudo nomad job plan /etc/nomad/jobs/vault.nomad
+ Job: "vault-with-cni"
+ Task Group: "vault-test" (1 create)
  + Task: "vault1" (forces create)

Scheduler dry-run:
- All tasks successfully allocated.

Job Modify Index: 0
To submit the job with version verification run:

nomad job run -check-index 0 /etc/nomad/jobs/vault.nomad

When running the job with the check-index flag, the job will only be run if the
job modify index given matches the server-side version. If the index has
changed, another user has modified the job and the plan's results are
potentially invalid.

Okay, looks good, let’s run it:

$ sudo nomad job run /etc/nomad/jobs/vault.nomad
==> Monitoring evaluation "2e42b090"
    Evaluation triggered by job "vault-with-cni"
==> Monitoring evaluation "2e42b090"
    Evaluation within deployment: "d12624cb"
    Allocation "a57d68ec" created: node "10f89343", group "vault-test"
    Evaluation status changed: "pending" -> "complete"
==> Evaluation "2e42b090" finished with status "complete"

Awesome, what does the status say?

$ sudo nomad status vault
ID            = vault-with-cni
Name          = vault-with-cni
Submit Date   = 2021-02-07T13:49:20Z
Type          = service
Priority      = 50
Datacenters   = dc1
Namespace     = default
Status        = running
Periodic      = false
Parameterized = false

Summary
Task Group  Queued  Starting  Running  Failed  Complete  Lost
vault-test  0       0         1        0       0         0

Latest Deployment
ID          = d12624cb
Status      = running
Description = Deployment is running

Deployed
Task Group  Desired  Placed  Healthy  Unhealthy  Progress Deadline
vault-test  1        1       0        0          2021-02-07T13:59:20Z

Allocations
ID        Node ID   Task Group  Version  Desired  Status   Created  Modified
a57d68ec  10f89343  vault-test  0        run      running  7s ago   6s ago

Sweet. Let’s have a look at the veth device:

$ ip -c link show type veth
7: veth200fa5e4@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default
    link/ether 42:ee:02:f4:98:3a brd ff:ff:ff:ff:ff:ff link-netnsid 0

§Can we talk to it?

curl http://192.168.127.2:8200/sys/health
{"errors":[]}

Yep, it’s running!
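
The empty errors array is enough to show the listener is reachable. Vault’s HTTP API proper lives under the /v1 prefix, so the versioned path should return the full health document (initialized, sealed, and so on):

curl http://192.168.127.2:8200/v1/sys/health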

§Caveats

  1. Stopping the job does not remove the veth interface, so a manual cleanup of the unused interfaces is needed (see the sketch after this list).
  2. Subsequent runs give the task the next IP address. If 192.168.127.2 does not work for you, try .1, .3, .4 and so on… Something to look into in detail a little bit later.
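
For the first caveat, deleting a leftover interface is a one-liner; the name below is the one from this particular run, substitute whatever ip link shows on your host. For the second, the host-local IPAM plugin keeps its leases as files named after the allocated addresses, by default under /var/lib/cni/networks/<network-name>, so peeking there shows which address the latest run received:

# remove a stale veth left behind by a stopped job (name taken from the ip link output above)
sudo ip link delete veth200fa5e4

# list the addresses host-local has handed out for the "vault" network
ls /var/lib/cni/networks/vault/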