What is Zeroconf?
Zeroconf was first proposed in November 1999 and finalised in 2003 and has found the largest adoption in Mac OS products, nearly all networked printers and other network device vendors. The most obvious and recognisable implementation of zeroconf is bonjour, which has been part of Mac OS since version 9 and is used to provide a number of shared network services. The basics of Zeroconf are explained quite simply on zeroconf.org with the following (abbreviated statement) “making it possible to take two laptop computers, and connect them … without needing a man in a white lab coat to set it all up for you”.
Basically zeroconf allows a server/appliance or client device to discover one another without any networking configuration. It is comparable to DHCP in some regards in that a computer with no network configuration can send out a DHCP request (essentially asking to be configured by the DHCP server), the response will be an assigned address and further configuration allowing communication on the network. Where it differs is that zeroconf also allows for advertisement of services (time capsule, printer services, iTunes shared libraries etc.), it also can advertise small amounts of data to identify itself as a type of device.
A Time machine advertisement over zeroconf: (MAC address removed)
[dan@mgmt ~]$ avahi-browse -r -a -p -t | grep TimeMachine
=;eth0;IPv4;WDMyCloud;Apple TimeMachine;local;WDMyCloud.local;192.168.0.249;9;"dk0=adVN=TimeMachineBackup,adVF=0x83" "sys=waMA=00:xx:xx:xx:xx:xx,adVF=0x100"
How the EVO:RAIL team are using Zeroconf
From recollection of the deep-dive sessions, I may have mistaken the point (corrections welcome).
Zeroconf has found the largest adoption in networked printers and apple bonjour services, however in the server deployment area a combination of DHCP and MAC address matching is more commonly used (Auto deploy or kickstart from PXE boot).
The EVO:RAIL team have implemented a Zeroconf daemon that lives inside every vSphere instance and inside the VCSA instance. The daemon inside the VCSA wasn’t really explained however the vSphere daemon instances allow the EVO:RAIL engine to discover them and take the necessary steps to automate their configuration.
Implementing Zeroconf inside vSphere(esxi)
The EVO:RAIL team had to develop their own zeroconf daemon named loudmouth that is coded entirely in python. The reason behind this was explained in one of the technical deep dives, the problem being that the majority of pre-existing zeroconf implementations have dependancies on various linux shared libraries.
/lib # ls *so | wc -l
/lib # uname -a
VMkernel esxi02.fnnrn.me 5.5.0 #1 SMP Release build-1331820 Sep 18 2013 23:08:31 x86_64 GNU/Linux
[dan@mgmt lib]$ ls *so | wc -l
[dan@mgmt lib]$ uname -a
Linux mgmt.fnnrn.me 3.8.7-1-ARCH #1 SMP PREEMPT Sat Apr 13 09:01:47 CEST 2013 x86_64 GNU/Linux
As the quick example above shows (32bit libs) a vSphere instance contains only a few elf based libraries providing a limited subset of shared functionality. This means that whilst elf based binaries can be moved from a linux distribution over to a vSphere instance, the chance is that a requirement on a shared library won’t be met. Further more building a static binary possibly won’t help as the VMKernel (VMwares kernel implementation)doesn’t implement the full set of linux syscalls, which makes sense as it’s not an OS implementation the userland area of the vSphere is purely for management of the hypervisor. The biggest issue that an implementation of zeroconf which relies on UDP and datagrams is the lack of implementaion of IP_PKTINFO.
This rules out avahi, Zero Conf IP (zcif), and linux implementations of mDnsResponder.
What about loudmouth?
Unfortunately it is yet to be said if any components of EVO:RAIL will be open sourced or back ported to vSphere, so whilst VMware have a zeroconf implementation for vSphere it is likely it will remain proprietary.
I’ve improved on where I’ve been with my daemon, however i’m hoping to upload it to github sooner rather than later. Unfortunately work has occupied most of the weekend and most evenings so far .. that tied with catching up on episodes of elementary and dealing with endless segfaults as I add any simple functionality have slowed the progress more than I was expecting.
Also I decided to finish writing up this post, which took most of this evening 😐
Debugging on vSphere
A summary of what to expect inside vSphere can be read here and there is no point duplicating existing information (http://www.v-front.de/2013/08/a-myth-busted-and-faq-esxi-is-not-based.html). More importantly when dealing with the vSphere userland libraries or more accurately lack of, then the use of strace is hugely valuable. More details on strace can be found here (http://dansco.de/doku.php?id=technical_documentation:system_debugging).