Recently the Xen server that hosts all my domains has been randomly crashing and completely powering itself off. This has usually happened when I’ve been nowhere near it, so by the time I’ve gotten back there has been no obvious reason for the crash. This weekend I managed to bring it up straight after a crash, plugged a VGA cable into the box and went into the hardware monitoring page in the BIOS.
CPU TEMP: 110c/203F
So I immediately powered off the box and went and picked up some thermal paste. I cleaned the heat sink, cleaned off the old thermal paste and replaced it with new paste. The system has been stable so far; however, on reboot this time I’ve been confronted with a strange new error that appears only whilst booting HVM domains: (12, ‘Cannot allocate memory’). The paravirtualized Xen domains booted fine, though, which led me to believe that this had something to do with the ‘balloon driver’.
spike / # xm info | grep memory
total_memory : 4093
free_memory : 4
This output is from straight after a reboot; it appears that the Xen dom0 has taken all of the memory. There are two fixes for this: either set dom0_mem=mem_size in GRUB, or use xm mem-set.
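As a rough sketch of how the two fixes fit together: the small shell helper below reads `xm info` output and reports whether the hypervisor still has enough free memory, and the comments show where `dom0_mem` and `xm mem-set` would go. The helper names (`xm_field`, `has_free_mem`), the 512 MiB threshold, the 1024 MiB dom0 size and the `/boot/xen.gz` path are all placeholders picked for illustration, not values taken from this box.

```shell
#!/bin/sh
# Sketch only: function names, thresholds and paths are assumptions.

# Print the value of one field (e.g. free_memory) from `xm info`
# style output supplied on stdin.
xm_field() {
    awk -F: -v key="$1" '{
        k = $1; gsub(/[ \t]+/, "", k)
        if (k == key) { v = $2; gsub(/[ \t]+/, "", v); print v }
    }'
}

# Succeed when the free_memory value read from stdin is at least $1 MiB.
has_free_mem() {
    free=$(xm_field free_memory)
    [ "${free:-0}" -ge "$1" ]
}

# Fix 1 (permanent): cap dom0 on the hypervisor line of the GRUB entry,
# for example (hypothetical path and size):
#   kernel /boot/xen.gz dom0_mem=1024M
#
# Fix 2 (at runtime, on the Xen host): shrink dom0 if too little is free:
#   xm info | has_free_mem 512 || xm mem-set Domain-0 1024
```

The GRUB option only takes effect on the next boot, whereas `xm mem-set` relies on the balloon driver to hand memory back from the running dom0, so it is probably worth re-running `xm info | grep memory` afterwards to confirm that free_memory has actually gone up.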