The "perfect" virtual Tinkerbell environment on Equinix Metal
Prerequisites
This is a rough shopping list of skills/accounts that will be a benefit for this guide:
- Equinix Metal portal account
- Go experience (basic)
- iptables usage (basic)
- qemu usage (basic)
Our Tinkerbell server considerations
Some “finger in the air” mathematics is generally required when selecting an appropriately sized physical host on Equinix Metal, but if we take a quick look at the expected requirements:
```
CONTAINER ID   NAME   CPU %   MEM USAGE / LIMIT   MEM %   NET I/O   BLOCK I/O   PIDS
```
We can see that the components of the Tinkerbell stack are particularly light. With this in mind, we can be confident that all of our userland components (tinkerbell/docker/bash etc.) will fit within 1GB of RAM, leaving all remaining memory for the virtual machines.
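If you want to reproduce these numbers on your own host, a quick check looks like the following (a minimal sketch; run it once the Tinkerbell stack is up):

```
# snapshot per-container memory usage for the running stack
docker stats --no-stream

# and the overall memory picture of the host
free -h
```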
That brings us onto the next part, which is how big should the virtual machines be?
In-memory OS (OSIE)
Every machine that is booted by Tinkerbell will be passed the in-memory Operating System called OSIE, which is an Alpine-based Linux OS that ultimately will run the workflows. As this is in-memory, we will need to account for a few things (before we even install our Operating System through a workflow):
- OSIE kernel
- OSIE RAM Disk (Includes Alpine userland and the docker engine)
- Action images (at rest)
- Action containers (running)
The OSIE RAM disk, whilst it looks like a normal filesystem, is actually held in the memory of the host itself, so it will immediately withhold that memory from other usage.
The Action images will be pulled locally from a repository and again written to disk; however, the disk that these images are written to is a RAM disk, so these images will also withhold available memory.
Finally, when these images are run (as Action containers), the binaries within them will require available memory in order to run.
The majority of this memory usage, as seen above, is for the in-memory filesystem that hosts the userland tools and the images listed in the workflow. From testing we’ve normally seen that at least 2GB is required; however, if your workflow consists of large action images then this will need adjusting accordingly.
With all this in consideration, it is quite possible to run Tinkerbell on Equinix Metal’s smallest offering, the t1.small.x86. However, if you’re looking at deploying multiple machines with Tinkerbell, then a machine with 32GB of RAM will comfortably provide enough headroom.
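As a quick sanity check, a rough capacity estimate for a 32GB host might look like the following (the per-VM figure is an assumption: the 2GB OSIE minimum above plus headroom for action images):

```
# back-of-the-envelope VM capacity for a 32GB host
HOST_RAM_GB=32
USERLAND_GB=1   # tinkerbell/docker/bash etc.
PER_VM_GB=3     # 2GB OSIE minimum plus headroom
echo "$(( (HOST_RAM_GB - USERLAND_GB) / PER_VM_GB )) concurrent installs"
# => 10 concurrent installs
```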
Recommended instances/OS
Check the inventory of your desired facility, but the recommended instances are below:
- c1.small.x86
- c3.small.x86
- x1.small.x86
For speed of deployment and modernity of the Operating System, either Ubuntu 18.04 or Ubuntu 20.04 is recommended.
Deploying Tinkerbell on Equinix Metal
In this example I’ll be deploying a c3.small.x86 in the Amsterdam facility ams6 with Ubuntu 20.04. Once our machine is up and running, we’ll need to install the required packages for running Tinkerbell and our virtual machines.
Update the packages
```
apt-get update -y
```
Install git (to clone the sandbox)
```
apt-get install -y git
```
Install required dependencies
```
# the original package list is truncated here; the usual apt/https tooling
# plus the qemu packages that shack drives are assumed
apt-get install -y apt-transport-https \
  ca-certificates curl gnupg-agent software-properties-common \
  qemu-kvm qemu-utils
```
Grab shack (qemu wrapper)
```
wget https://github.com/plunder-app/shack/releases/download/v0.0.0/shack-0.0.0.tar.gz
# the remainder of this one-liner was truncated; unpacking the archive and
# moving the binary onto the path is assumed
tar -xvzf shack-0.0.0.tar.gz
mv shack /usr/local/bin/
```
Create our internal tinkerbell network
```
sudo ip link add tinkerbell type bridge
# bring the bridge up so interfaces can attach and pass traffic
sudo ip link set tinkerbell up
```
Create shack configuration
```
shack example > shack.yaml
```
Edit and apply configuration
Change the `bridgeName:` from `plunder` to `tinkerbell`, then run `shack network create`. This will create a new interface on our tinkerbell bridge.
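If you would rather script the edit, a one-liner like this should work (assuming the key appears in shack.yaml exactly as `bridgeName: plunder`):

```
# swap the bridge name in the shack configuration, then apply it
sed -i 's/bridgeName: plunder/bridgeName: tinkerbell/' shack.yaml
shack network create
```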
Test virtual machine creation
```
shack vm start --id f0cb3c -v
```
We can verify that this has worked by examining `ip addr`:
```
11: plunder: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
```
Connect to the VNC port with a client (the random port generated in this example is `6671`); it will be exposed on the public address of our Equinix Metal host.
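For example, from a workstation (a sketch; `<public-ip>` stands in for your host’s public address, and vncviewer is just one of many clients):

```
# connect directly to the exposed VNC port
vncviewer <public-ip>::6671

# or tunnel it over SSH rather than exposing VNC traffic publicly,
# then point the client at localhost:5901
ssh -L 5901:localhost:6671 root@<public-ip>
```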
Kill the VM:
```
shack vm stop --id f0cb3c -d
```
Install sandbox dependencies
Docker
```
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository \
  "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
# the original snippet is truncated; installing the engine itself is assumed
sudo apt-get update -y
sudo apt-get install -y docker-ce docker-ce-cli containerd.io
```
Docker compose
```
# the original command is truncated; the standard install from the Compose
# releases page is assumed (pin whichever version you prefer)
sudo curl -L \
  "https://github.com/docker/compose/releases/download/1.29.2/docker-compose-$(uname -s)-$(uname -m)" \
  -o /usr/local/bin/docker-compose
sudo chmod +x /usr/local/bin/docker-compose
```
Clone the sandbox
```
git clone https://github.com/tinkerbell/sandbox.git
cd sandbox   # the following steps run from inside the repository
```
Configure the sandbox
```
./generate-envrc.sh plunder > .env
```
Start Tinkerbell
```
# Add Nginx address to Tinkerbell
# (the rest of this step was truncated; the usual sandbox bring-up
# sequence is assumed below)
source .env
cd deploy
docker-compose up -d
```
At this point we have a server with available resources, we can create virtual machines, and Tinkerbell is listening on the correct internal network!
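To double-check that the stack is healthy before moving on, listing the compose services should show everything up (a sketch; paths assume the clone location above):

```
# confirm all of the sandbox services are running
cd $HOME/sandbox/deploy
docker-compose ps
```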
Create a workflow (debian example)
Clone the debian repository
```
cd $HOME
# the clone command was truncated in the original; the scripts used below
# (verify_json_tweaks.sh / create_tink_workflow.sh) ship with the community
# debian workflow repository, so something like the following is assumed
git clone https://github.com/fransvanberckel/debian-workflow.git
cd debian-workflow
```
Build the debian content
```
./verify_json_tweaks.sh
```
Edit configuration
Modify the `create_tink_workflow.sh` so that the MAC address is `c0:ff:ee:f0:cb:3c`; this is the MAC address we will be using as part of our demonstration.
For using VNC, modify the `facility.facility_code` from `"onprem"` to `"onprem console=ttyS0 vga=normal"`. This will ensure all output is printed to the VNC window that we connect to.
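Both edits can be scripted if preferred (a sketch: the grep just locates the MAC line, since the placeholder value inside the script isn’t reproduced here):

```
# find where the MAC address is defined before editing it
grep -n 'mac' create_tink_workflow.sh

# swap the facility code for the VNC-friendly variant described above
sed -i 's/"onprem"/"onprem console=ttyS0 vga=normal"/' create_tink_workflow.sh
```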
Create the workflow
Here we will be asked for some password credentials for our new machine:
```
./create_tink_workflow.sh
```
Start our virtual host to install on!
```
shack vm start --id f0cb3c -v
```
We can now watch the install on the VNC port `6671`.
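We can also follow progress from the Tinkerbell side; a sketch, assuming the sandbox’s default tink-cli container name and the workflow ID that `create_tink_workflow.sh` printed:

```
# stream workflow events as the actions execute
# <workflow-id> is the ID reported when the workflow was created
docker exec -i deploy_tink-cli_1 tink workflow events <workflow-id>
```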
Troubleshooting
```
http://192.168.1.1/undionly.pxe could not be found
```
If a machine boots and shows this error, it means that its workflow has been completed; in order to boot this server again, a new workflow will need to be created.
```
could not configure /dev/net/tun (plndrVM-f0cb3c): Device or resource busy
```
This means that an old qemu session left an old adapter behind; we can remove it with the command below:
```
ip link delete plndrVM-f0cb3c
```
```
Is another process using the image [f0cb3c.qcow2]?
```
We’ve left an old disk image lying around; we can remove it with `rm f0cb3c.qcow2`.