Introduction
This page describes how Alexis Huxley installed and configured a virtualisation environment providing KVM-based VMs, LXC-based containers and replicated VM/container images. This procedure is intended for Debian 11.
Note that NAS services are provided by a KVM-based VM and are documented elsewhere.
Hardware
- two systems are required
- RAID1 is not needed (redundancy will be provided by DRBD)
- optionally RAID0 over multiple disks in each server (RAID0 will help improve IO speeds; a sketch follows after this list)
- two NICs are required in each host
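If RAID0 is wanted, one way to provide it is software RAID used as the LVM physical volume. This is only an illustration of the idea, not part of the original procedure; the device names and partition layout are assumptions, and PCMS or the installer may set this up differently:
apt-get -y install mdadm
mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/sda2 /dev/sdb2   # stripe two data partitions
pvcreate /dev/md0
vgcreate vg0 /dev/md0                                                    # vg0 is the VG used in the rest of this page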
Local storage
Virtualisation servers will use DRBD-replicated storage for most VMs. However, occasionally, local space is useful (e.g. for a test VM, for snapshotting DRBD devices).
- If the system was installed with PCMS then skip this section as it will already have been done by PCMS.
- Create LVs:
lvcreate --name=local --size=200g vg0
- Format for XFS, which offers online size changing:
mkfs -t xfs -f /dev/vg0/local
- Add an fstab entry as below, create the mountpoint and mount it:
/dev/mapper/vg0-local /vol/local xfs auto,noatime,nodiratime 0 2
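Since the filesystem supports growing while mounted, the local volume can later be enlarged without downtime. A minimal sketch, assuming the vg0/local LV created above and that vg0 still has free extents (the +50g figure is only illustrative):
lvextend --resizefs --size=+50g /dev/vg0/local   # grows the LV and the mounted filesystem in one step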
Replicated storage
- Run:
apt-get -y install drbd-utils
- As DRBD devices are created, the various LVM commands will start reporting spurious physical volumes, e.g.:
fiori# pvs
  ...
  /dev/vg0/fettuce_pub    vg2  lvm2 a--    1.95t      0
  /dev/vg0/fettuce_small  vg1  lvm2 a--  149.99g      0
  /dev/vg0/gigli_p2p      vg1  lvm2 a--  149.99g      0
  ...
fiori#
This happens because LVM examines all block devices, and when it looks inside the DRBD devices it sees LVM signatures; but those signatures belong to the VMs, not to the VM server. To fix this add the following to /etc/lvm/lvm.conf:
devices {
    ...
    filter = [ "r|/dev/drbd.*|", "r|/dev/vg.*|" ]
    ...
}
(It is not necessary to reboot after that edit.)
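The per-VM DRBD resources themselves are defined later, but for orientation a resource definition looks roughly like the following sketch; the resource name, backing LV, port and the peer's address are assumptions here, and the 192.168.3.x addresses are on the dedicated cluster network configured below:
# /etc/drbd.d/example.res  (illustrative only)
resource example {
    on fiori {
        device    /dev/drbd0;
        disk      /dev/vg0/example;
        address   192.168.3.6:7789;
        meta-disk internal;
    }
    on torchio {
        device    /dev/drbd0;
        disk      /dev/vg0/example;
        address   192.168.3.7:7789;
        meta-disk internal;
    }
}
With such a file in place on both nodes, 'drbdadm create-md example' followed by 'drbdadm up example' (run on both) would bring the resource up.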
Shared public network interface
The VMs that run on each node will need access to the public network interface.
- If the system was installed with PCMS then skip this section as it will already have been done by PCMS.
- Reconfigure the stanza for the first NIC in /etc/network/interfaces accordingly. E.g.:
iface eth0 inet manual

auto br0
iface br0 inet static
    address 192.168.1.6
    netmask 255.255.255.0
    gateway 192.168.1.1
    bridge_ports eth0
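Once the bridge is up (after 'ifup br0' or a reboot), it can be checked with standard iproute2 tools; br0 and eth0 are the names from the example above:
ip -br addr show br0   # should show the public address on the bridge
bridge link            # eth0 should appear as a port of br0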
Dedicated network interface for cluster communications
It is essential to use a dedicated network card for cluster communications in order to ensure that public traffic does not impact replication.
- If the system was installed with PCMS then skip this section as it will already have been done by PCMS.
- Add a suitable entry to /etc/network/interfaces for the NIC you will use for the cluster communications and add an entry for it to /etc/hosts (an example hosts entry is sketched after this list). E.g.:
auto eth1
iface eth1 inet static
    address 192.168.3.6
    netmask 255.255.255.0
- Reboot.
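For the /etc/hosts entry mentioned above, something like the following sketch is enough; only the 192.168.3.6 address comes from the example, while the peer's address and the -drbd naming convention are assumptions:
192.168.3.6   fiori-drbd
192.168.3.7   torchio-drbd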
Hypervisors
This procedure is to be run on both nodes, regardless of whether they are both being configured at the same time or not, unless explicitly stated otherwise.
- If the system was installed with PCMS then skip this section as it will already have been done by PCMS.
- Run:
apt-get install libvirt-clients libvirt-daemon qemu-utils lxc lxcfs libpam-cgfs lxc-templates libfile-lchown-perl libvirt-daemon-driver-lxc xmlstarlet
systemctl restart libvirtd
and reboot.
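After the reboot, a quick sanity check (my own, not part of the original procedure) confirms that libvirt and LXC are usable:
virsh list --all    # libvirtd should answer (qemu:///system by default when run as root)
lxc-checkconfig     # the kernel namespace and cgroup features should all be reported as enabled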
Workaround for load average leaking from host to container
I’m not yet certain this works, but it’s what I read here.
- Run:
apt -y install lxcfs libpam-cgfs
cp /lib/systemd/system/lxcfs.service /etc/systemd/system/
(libpam-cgfs is mentioned in /usr/share/doc/lxcfs/README.Debian.)
- Edit /etc/systemd/system/lxcfs.service and change the ExecStart line to:
ExecStart=/usr/bin/lxcfs --enable-loadavg /var/lib/lxcfs
- Run:
systemctl daemon-reload
systemctl restart lxcfs
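To check that the option took effect (this check is mine, not from the referenced article), the running daemon should show the flag and, assuming lxcfs's default /var/lib/lxcfs mountpoint, a loadavg entry should now be exported:
ps -C lxcfs -o args=      # should include --enable-loadavg
ls /var/lib/lxcfs/proc/   # should now list 'loadavg' alongside cpuinfo, meminfo, etc.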
Hypervisor plugins
In order to support mounting disk images on unprivileged containers, a plugin is needed.
- Run:
apt-get -y install miniade
mkdir -p ~/opt/
svn co -q https://svn.pasta.freemyip.com/main/virttools/trunk ~/opt/virttools
mkdir -p /etc/libvirt/hooks/lxc.d
ln ~/opt/virttools/bin/etc-libvirt-hooks-lxc.d-mounter /etc/libvirt/hooks/lxc.d/mounter
(Notice that the ln command does not use the -s option.)
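A hard link means both paths refer to the same inode, which can be confirmed as follows (the first column of the output should show the same inode number for both paths):
ls -li ~/opt/virttools/bin/etc-libvirt-hooks-lxc.d-mounter /etc/libvirt/hooks/lxc.d/mounter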
Virtual resources
- Create and define the local storage pool:
mkdir /vol/local/images
virsh pool-define-as --name=local --type=dir \
    --target=/vol/local/images
virsh pool-start local
- Copy any ISO images you might need into the default pool (/var/lib/libvirt/images/) and then run:
virsh pool-refresh default
(I found that this failed until I launched virt-manager; perhaps virt-manager created the default pool?)
- Remove pre-defined but unwanted networks:
virsh net-destroy default
virsh net-undefine default
- Since we plumb the VMs' NICs into the shared br0, and br0 is not managed by libvirt, there is nothing to do at this time to configure access to the public network.
- Define a network to allow co-hosted VMs to communicate with each other directly, e.g.:
virsh net-define <(cat <<EOF
<network>
  <name>192.168.10.0</name>
  <uuid>$(uuidgen)</uuid>
  <bridge name='virbr0' stp='on' delay='0'/>
  <mac address='52:54:00:81:cd:08'/>
  <ip address='192.168.10.1' netmask='255.255.255.0'>
  </ip>
</network>
EOF
)
virsh net-autostart 192.168.10.0
virsh net-start 192.168.10.0
- Define a network to allow co-hosted VMs that are themselves VM servers to replicate across a dedicated network, e.g.:
virsh net-define <(cat <<EOF
<network>
  <name>192.168.55.0</name>
  <uuid>$(uuidgen)</uuid>
  <bridge name='virbr1' stp='on' delay='0'/>
  <mac address='52:54:00:81:cd:12'/>
  <ip address='192.168.55.1' netmask='255.255.255.0'>
  </ip>
</network>
EOF
)
virsh net-autostart 192.168.55.0
virsh net-start 192.168.55.0
Note that this will only work if the two virtual VM servers are running on the same real VM server.
- Set up SSH keys to allow virt-manager to be run from a remote system, e.g. as sketched below.
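A minimal sketch of that key setup, assuming the remote workstation is where virt-manager runs and torchio is one of the VM servers:
# on the workstation, as the user who will run virt-manager
ssh-keygen -t ed25519                              # or reuse an existing key
ssh-copy-id root@torchio                           # repeat for each VM server
virt-manager -c qemu+ssh://root@torchio/system     # should now connect without prompting for a password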
Tips & tricks
- To connect to a VM's graphical console it is possible to use something like this:
virt-viewer --connect=qemu+ssh://root@torchio/system lagane # or set LIBVIRT_DEFAULT_URI and drop the '--connect=...'
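The LIBVIRT_DEFAULT_URI variant mentioned in the comment looks like this:
export LIBVIRT_DEFAULT_URI=qemu+ssh://root@torchio/system
virt-viewer lagane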