A peek inside Docker for Mac (Hyperkit, wait xhyve, no bhyve …)

It’s no secret that the hypervisor code behind Docker for Mac (hyperkit) ultimately comes from bhyve, the FreeBSD hypervisor, by way of xhyve. The people who took the time to bring it to the Darwin (Mac) platform have done a great job of adapting the code to the design decisions that underpin Apple’s operating system.

Recently I noticed that the bhyve project had released code for an emulated E1000 network card, so I decided to take the hyperkit code and see what was required to add that PCI code. What follows is a (rambling and somewhat incoherent) overview of what was changed to move from bhyve to hyperkit, plus some observations to be aware of when porting further PCI devices to hyperkit. Again, please be aware that I’m not an OS developer or a hardware designer, so some of this is based upon a possibly flawed understanding… feel free to correct or teach me 🙂

Update: I’ve already heard from @justincormack that Docker for Mac uses vpnkit, not vmnet.

VMM Differences

One of the key factors that made bhyve portable to OSX is that the Darwin kernel is loosely based upon the original kernel that powers FreeBSD (see the Unix family tree on Wikipedia), which means that a lot of the kernel structure and API calls aren’t too different. However, OSX is aimed at the consumer market rather than the server market, and as it has matured Apple has stripped away some of the kernel functionality that ships by default. The obvious example is the removal of TUN/TAP devices from the kernel (they can still be exposed by loading a kext, or kernel extension). That is problematic, but hyperkit has a solution for it.

VM structure with bhyve

When bhyve starts a virtual machine it creates the structure of the VM as requested (allocating vCPUs, allocating memory, constructing PCI devices, etc.). These are then attached to device nodes under /dev/vmm, and the bhyve kernel module handles the VM execution. Being able to examine /dev/vmm/ also gives administrators a place to see which virtual machines are currently running, and allows those machines to continue running unattended.

Internally the bhyve userland tools make use of virtual machine contexts that link the VM name to the internal kernel structures running that VM instance. This allows a single tool to manage multiple virtual machines, as you would typically see with VMMs such as Xen, KVM or ESXi.

Finally, the networking configuration that takes place inside bhyve… Unlike the OSX kernel, FreeBSD typically comes prebuilt with support for TAP devices (if not, the command kldload if_tap is needed). Simply put, a TAP device greatly simplifies the use of guest network interfaces. When an interface is created with bhyve, a PCI network device is created inside the VM and a TAP device is created on the physical host. When network frames are written to the PCI device inside the VM, bhyve actually writes those frames onto the TAP device on the physical host (using the standard write() and read() functions on file descriptors), and they are then sent out on the physical interface of the host. If you are familiar with VMware ESXi, the concept is almost identical to the way a vSwitch functions.
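
As a rough illustration of the read()/write() approach described above, here is a minimal C sketch. The helper names backend_send_frame and backend_recv_frame are made up for this post and are not taken from the bhyve source. The only assumption is that the backend is a file descriptor on which one write() or read() carries one frame, which is how a TAP device behaves.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: hand one guest frame to the host-side backend.
 * With a TAP device, a single write() carries exactly one Ethernet frame. */
ssize_t backend_send_frame(int fd, const void *frame, size_t len)
{
    assert(frame != NULL && len > 0);
    return write(fd, frame, len);
}

/* Hypothetical helper: pull one frame from the backend into a buffer
 * that the emulated device would then pass on to the guest. */
ssize_t backend_recv_frame(int fd, void *buf, size_t cap)
{
    assert(buf != NULL && cap > 0);
    return read(fd, buf, cap);
}
```

Because the helpers only need a file descriptor, they can be tried against a pipe or a socket pair before being pointed at a real TAP device.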

bhyve Network

VM Structure with Docker for Mac (hyperkit)

The first observation about the hyperkit architecture is that all of the /dev/vmm/ device node code has been removed, which has the effect of making virtual machines process based. This means that when hyperkit starts a VM it will malloc() all of the requested memory etc. and become the sole owner of the virtual machine, so killing the hyperkit process will kill the VM. Internally, all of the virtual machine context code has been removed because the hyperkit process to VM relationship is now a 1:1 association.

The design decision to remove all of the context code (instead of, say, always tying it to a single VM context) requires noticeable changes to every PCI module that is added or ported from bhyve, as those modules are all written around creating emulated devices and applying them to a particular VM context.

To manage VM execution hyperkit makes use of Apple’s Hypervisor.framework, which is a simplified framework for creating vCPUs, passing in mapped memory and creating an execution loop.

Finally, there are the changes around network interfaces. Inside the virtual machine the same virtio devices are created as would be created on bhyve. The difference is in linking these virtual interfaces to a physical interface, as on OSX there is no TAP device that can be created to link virtual and physical. There are currently two methods to pass traffic between the virtual machine and the physical host: the virtio to vmnet (virtio-vmnet) and the virtio to vpnkit (virtio-vpnkit) PCI devices. Both use the virtio drivers (specifically the network driver) that are part of any modern Linux kernel and then hand over to the backend of your choice on the physical system.

It’s worth pointing out here that the vmnet backend was the default networking method for xhyve and that it makes use of the vmnet.framework, which, as other people have mentioned, is rather poorly documented. Its design also complicates things slightly: it doesn’t create a file descriptor that the existing simple code could read() from and write() to, and it requires elevated privileges to use.

Thanks to the work done by the developers at Docker, there is now an alternative method for communicating from virtual network interfaces to the outside world. The solution from Docker comes in two parts:

  • The virtio-vpnkit device inside hyperkit that handles the reading and writing of network data from the virtual machine
  • The vpnkit component that has a full TCP/IP stack for communication with the outside world.

(I will add more details around vpnkit when I’ve learnt more … or learnt OCaml, whichever comes first)

Networking overviews

bhyve overview (TAP devices)

bhyve_traffic

xhyve/hyperkit overview (VMNet devices)

hyperkit_traffic

 

Docker for Mac / hyperkit overview (vpnkit)

docker_traffic

 

Porting (PCI devices) from bhyve to hyperkit

All of the emulated PCI devices adhere to a defined set of function calls, along with a structure that holds pointers to those functions and a string that identifies the name of the PCI device (memory dump below)

pci_functions

The pci_emul_finddev(emul) call looks up a PCI device by name (e.g. E1000, virtio-blk, virtio-net). Its pe_init function is then called to initialise the PCI device and add it to the virtual machine’s PCI bus as a device that the guest operating system can use.
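
To make the lookup idea a bit more concrete, here is a heavily simplified C sketch. It is not the real bhyve or hyperkit code: the struct devemu type, the finddev() and attach_device() functions and the one-argument init hook are all invented for illustration, and the real structure carries more callbacks with different signatures. It only shows the pattern of a name plus function pointers, found by string and then initialised.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for a PCI emulation descriptor:
 * a name used for lookup plus an init hook. */
struct devemu {
    const char *name;       /* string that identifies the device */
    int (*init)(int slot);  /* returns 0 on success */
};

/* Placeholder init hooks; real ones would set up config space, BARs, etc. */
int e1000_init(int slot)      { return slot >= 0 ? 0 : -1; }
int virtio_blk_init(int slot) { return slot >= 0 ? 0 : -1; }

/* The table of known device emulations. */
const struct devemu devices[] = {
    { "e1000",      e1000_init },
    { "virtio-blk", virtio_blk_init },
};

/* Look a device emulation up by name; NULL if it is unknown. */
const struct devemu *finddev(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof devices / sizeof devices[0]; i++) {
        if (strcmp(devices[i].name, name) == 0)
            return &devices[i];
    }
    return NULL;
}

/* Find the device and run its init hook for the given slot. */
int attach_device(const char *name, int slot)
{
    const struct devemu *dev = finddev(name);
    if (dev == NULL)
        return -1;
    return dev->init(slot);
}
```

In this sketch, adding a new device means writing its init hook and adding one row to the table, which is roughly the shape of the work when porting a PCI module.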

Things to be aware of when porting PCI devices are:

  • Removing VM context aware code, as mentioned it is a 1:1 between hyperkit and VM.
    • This also includes tying up paddr_guest2host(), which maps guest physical addresses to addresses the host process can use, etc.
  • Moving networking code from using TAP devices with read(), write() to making use of the vmnet framework
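
The first point in the list above is easier to see side by side. The C sketch below contrasts a context-passing address lookup with a process-global one. All of the names here (struct vm_context, ctx_guest2host, guest_mem_setup, guest2host) are hypothetical and exist only for this example; they are not the bhyve or hyperkit functions, just the same idea in miniature.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Context style: every call names the VM it operates on. */
struct vm_context {
    uint8_t *mem;     /* host mapping of guest memory */
    size_t   mem_len; /* size of that mapping in bytes */
};

void *ctx_guest2host(struct vm_context *ctx, uintptr_t gaddr, size_t len)
{
    assert(ctx != NULL);
    if (len > ctx->mem_len || gaddr > ctx->mem_len - len)
        return NULL;  /* range falls outside guest memory */
    return ctx->mem + gaddr;
}

/* Process style: one process owns one VM, so guest memory is global. */
static uint8_t *guest_mem;
static size_t guest_mem_len;

int guest_mem_setup(size_t len)
{
    guest_mem = calloc(1, len);
    guest_mem_len = (guest_mem != NULL) ? len : 0;
    return (guest_mem != NULL) ? 0 : -1;
}

void *guest2host(uintptr_t gaddr, size_t len)
{
    if (len > guest_mem_len || gaddr > guest_mem_len - len)
        return NULL;  /* range falls outside guest memory */
    return guest_mem + gaddr;
}
```

Porting a device in this toy model means deleting the context argument from every call site, which is tedious rather than difficult.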

With regards to the E1000 PCI code, I’ve now managed to tie up the code so that the PCI device is created correctly and added to the PCI bus. I’m just struggling to fix the vmnet code (so feel free to take my poor attempt and fix it successfully 🙂 https://github.com/thebsdbox/hyperkit)

Further reading

http://bhyve.org/bhyve-fosdem2013.pdf

https://wiki.freebsd.org/bhyve

https://github.com/docker/hyperkit

Update to the sshwrapper

Had quite a few emails recently about using the ssh wrapping class I wrote aaaages ago. I’ve traded a couple of emails back and forth.. and decided that it would be easier for everyone if I just updated these old classes.

So the changes:

  • Added DFSSHConnectionType, this class is used to define how ssh will attempt to connect (password/key/keyboard)
  • Moved everything to a namespace (DF)
  • ARC
  • Tidied up the code, and sorted an issue with CStrings making a mess when converting to an NSString
  • Other things I did ages ago.. (no idea)

It’s uploaded to GitHub.. let me know if there are any problems..

https://github.com/thebsdbox/DFSSHWrapper

 

[UPDATE]: Added the ability to place a timeout on a command sent over ssh…

Objective-C graphing and plotting with little-plot

As development has continued on a personal project it became obvious that I would need to implement UI elements that simply weren’t part of the Cocoa UI-kit. Essentially the main goal is presenting the user with a graph interface allowing them to quickly see a data set without having to read through line after line of figures. I looked at Core Plot (http://code.google.com/p/core-plot/), which whilst providing some great functionality looks like a HUGE amount of overkill when wanting a simple UI element.

So after a few days of tinkering I’ve created a couple of NSView subclasses. The views can be created manually, handed arrays of data, and will display that data accordingly.
I present Little-plot:

The above screenshot consists of three NSViews (LineView, PieView and LabelView), which display a line graph, a pie chart and graph labels (or legends) respectively.

The project is available on GitHub here.

Updates will appear soon, along with some real documentation.

Objective-C modal Window using sheets and Panels

Adding a modal sheet to a window in Objective-C isn’t highly complicated, however there are a number of issues to watch for that can leave you scratching your head. Most of the examples I’ve found on the internet point to an older useModal: (*window) function which is deprecated. From what I’ve read, the correct manner for using a modal dialog is to display a sheet that scrolls down from the window’s title bar and takes modal control. There are numerous examples of this in System Preferences:

Implementing this in an application coded with Objective-C isn’t particularly complicated, however missing a particular setting can leave you with numerous errors or cause the application to fall back to the debugger.

Cocoa libssh2 wrapper

I’ve modified a simple wrapper for the libssh2 library that now has the following functionality:

  • Code moved to separate classes to allow reusability
  • Multiple sessions to different servers can be achieved with a few lines of code
  • A Session can be passed to the operator class allowing operations (commands sent to it), more will be added
At the current time it connects fine to OSX and Linux sshd, however I can’t connect to ESXi: even with the correct password it reports the password as incorrect. I think I can resolve this shortly.
Original wrapper (designed for iOS) can be found from http://lukehagan.com/ in his Git Repo.
Download here: SSH Wrapper

SSH with Cocoa (Xcode and libssh2)

I fought with this about a year ago, and for some strange reason never managed to get things to compile or link. I now chalk this up to my lack of understanding of Objective-C and linking concepts. However it turns out that it is relatively simple (ensure you have Xcode 4 installed before trying).

  1. Point browser to http://www.libssh2.org/ and download the latest snapshot to a temporary location.
  2. Open a terminal window, navigate to the directory containing the source files and run the following:
    dan$ ./configure
  3. This will print a lot of output to the terminal window, present a summary of the configuration options and create a header file needed for compilation. (Running make / make install is NOT required).
  4. Open Xcode and create a new Xcode project, which should be a (Mac OS X -> Framework & Library -> C/C++ Library) and give it a Product Name (e.g. libssh2) and ensure that the type is Static then click create.
  5. Xcode will open with an empty project displaying the Build Settings. At this point we can start adding the files that are part of the libssh2 source tree.

Getting files through a terminal window

This is a technique I had to use numerous times for a previous job where I would need to transfer files from servers that could not be connected to directly. In the majority of companies there will be numerous networks, where “jump boxes” are required to get across various networks and get to the server in question.

The usual approach to moving files to servers would be through a variety of means (FTP/SCP/NFS/CIFS etc.), however with numerous jump boxes and networks that would mean transferring the file across each jump/network in turn. It is possible to use copy and paste, but this only works for text files; trying to display or copy binary data can cause all manner of issues and really mess up your terminal window. So for me the best solution is to UUEncode the binary data into ASCII text, which can be safely copied out of the terminal window, pasted into a file on your local machine and UUDecoded back to binary data again.

To do this simply UUEncode a file, and the output will be presented to STDOUT i.e. the terminal window.

$ uuencode test.rpm test.rpm

Note: the double typing of the name is required as the first argument is the file to uuencode, whilst the second argument is the name that will be given to the file when it is decoded. The output will be presented to STDOUT, as shown in this (truncated) example:

begin 644 test.rpm
M"GP]+2TM+2TM+2T]6R!796(@=G5L;F5R86)I;&ET:65S('1O(&=A:6X@86-C
M97-S('1O('1H92!S>7-T96T@73TM+2TM+2TM+2T]?`I\/2TM+2TM+2TM+2TM

The next step is simply to copy everything from the word “begin” to the word “end” out of the terminal window and paste it into a text editor of your choice on your local machine, saving it under a temporary name. This file then needs to be run through uudecode, which will process the text and write out the file under the filename specified to the encoder.

$ uudecode temporary.uua

In the same location will be the decoded file.
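
For the curious, the encoding itself is simple enough to sketch in a few lines of C. This is a minimal illustration of how a single line of output is built, not the source of the uuencode tool, and the function names uu_char and uu_encode_line are my own. Each line starts with a character giving the number of bytes it holds (the leading “M” in the example above means 45 bytes), then every group of three bytes becomes four printable characters.

```c
#include <assert.h>
#include <stddef.h>

/* Map a 6-bit value to a printable character; zero becomes a backtick. */
static char uu_char(unsigned v)
{
    v &= 0x3f;
    return v ? (char)(v + 32) : '`';
}

/* Encode up to 45 input bytes as one uuencoded line (no trailing newline).
 * out must have room for 1 + 4 * ((len + 2) / 3) + 1 bytes.
 * Returns the number of characters written, excluding the terminator. */
size_t uu_encode_line(const unsigned char *in, size_t len, char *out)
{
    assert(in != NULL && out != NULL && len <= 45);
    size_t o = 0;
    out[o++] = uu_char((unsigned)len);  /* length prefix */
    for (size_t i = 0; i < len; i += 3) {
        /* Missing bytes in the final group are padded with zeros. */
        unsigned b0 = in[i];
        unsigned b1 = (i + 1 < len) ? in[i + 1] : 0;
        unsigned b2 = (i + 2 < len) ? in[i + 2] : 0;
        out[o++] = uu_char(b0 >> 2);
        out[o++] = uu_char(((b0 & 0x03) << 4) | (b1 >> 4));
        out[o++] = uu_char(((b1 & 0x0f) << 2) | (b2 >> 6));
        out[o++] = uu_char(b2);
    }
    out[o] = '\0';
    return o;
}
```

Encoding the three bytes of “Cat” with this function gives #0V%T, and since every character produced is plain printable ASCII it survives a copy and paste through any number of terminals.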

Further Xcode – HelloWorld

The inevitable HelloWorld application is a staple in learning a programming language, and provides the learner with the feeling of accomplishment as their first program speaks back to them… or something. Either way, this example will present us with a basic framework which we can use to build upon.

To break it down this example consists of,

– Creating a blank project in Xcode

– Using the default Delegate class and adding our own method (interface)

– Linking the GUI to our class

– Adding code to our method (implementation)

– Drinking tea

Xcode Primer/Tutorial/How-to

A Google search for Xcode examples, how-tos etc. returns a lot of results, however after following a few steps it becomes clear that the older tutorials simply can’t be followed. The newer Xcode (3.2.1 is current) has had its UI changed so much, especially Interface Builder, that the majority of instructions no longer apply.

I’m writing up the steps I’m following to learn, so that I can follow them when I have forgotten something (which I frequently do) and so that anyone else can follow them if they wish. I don’t intend to ever go too far with Xcode development, so don’t expect to find a how-to on developing a game or a Photoshop alternative.