Configuring virtualisation services (revision 6)

Introduction

This page describes how Alexis Huxley installed and configured a virtualisation environment providing KVM-based VMs, LXC-based containers and replicated VM/container images. This procedure is intended for Debian 13.

Note that NAS services are provided by a KVM-based VM and are documented elsewhere.

Hardware

  • two systems are required
  • optionally, RAID0 over multiple disks in each server to improve disk access speed
  • two NICs are required in each host (one for host access and one for replication)

Local storage

Virtualisation servers will use DRBD-replicated storage for most VMs. However, local space is occasionally useful (e.g. for a test VM or for snapshotting DRBD devices).

  1. If the system was installed with PCMS then skip this section as it will already have been done by PCMS.
  2. For test virtualisation machines (e.g. pestaroli, testatoli, trofie), local storage should be set up manually.
  3. Create the LV:
    lvcreate --name=local --size=200g vg0
    
  4. Format:
    mkfs -t ext4 /dev/vg0/local
  5. Add an fstab entry for it as below, create the mountpoint and mount it (a short sketch follows):
    /dev/mapper/vg0-local /vol/local ext4 auto,noatime,nodiratime 0 2
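
    For example, a minimal sketch of the remaining steps for the single LV above (the mountpoint comes from the fstab entry):

    mkdir -p /vol/local
    mount /vol/local
    df -h /vol/local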

Replicated storage

  1. Run:
    apt -y install drbd-utils
  2. Even when using ZFS, it’s likely that LVM is still installed. By default, LVM finds PVs inside VM images. To prevent this, edit /etc/lvm/lvm.conf and set:
    devices {
        ...
        filter = [ "r|/dev/drbd.*|", "r|/dev/vg.*|", "r|/dev/zd.*|" ]
        ...
    }

    (It is not necessary to reboot after that edit.)
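
    To confirm that the new filter is the one in effect, something like the following can be run:

    lvmconfig devices/filter    #  should show the filter set above
    pvs                         #  should no longer list PVs found inside VM images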

Shared public network interface

The VMs that run on each node will need access to the public network interface.

  1. If the system was installed with PCMS then skip this section as it will already have been done by PCMS.
  2. Reconfigure the stanza for the first NIC in /etc/network/interfaces accordingly. E.g.:
    iface <public-interface> inet manual
    
    auto br0
    iface br0 inet static
        address 192.168.1.10
        netmask 255.255.255.0
        gateway 192.168.1.1
        bridge_ports <public-interface>
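
    Once the interfaces have been brought up (e.g. after a reboot), the bridge can be sanity-checked with something like:

    ip -br addr show br0    #  br0 should be UP with the address set above
    bridge link show        #  the public NIC should be listed as a port of br0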

Dedicated network interface for cluster communications

It is essential to use a dedicated network card for cluster communications in order to ensure that public traffic does not impact replication.

  1. If the system was installed with PCMS then skip this section as it will already have been done by PCMS.
  2. If the system is not part of a cluster then skip this section.
  3. Add a suitable entry to /etc/network/interfaces for the NIC you will use for cluster communications, and add an entry for it to /etc/hosts (an illustrative sketch follows this list). E.g.:
    auto <replication-interface>
    iface <replication-interface> inet static
        address 192.168.3.10
        netmask 255.255.255.0
  4. Reboot.
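
    As an illustration only (the peer address and the host naming convention are hypothetical), the /etc/hosts entries and a post-reboot connectivity check might look like:

    #  in /etc/hosts (names and peer address are hypothetical)
    192.168.3.10    <this-host>-drbd
    192.168.3.11    <peer-host>-drbd

    #  after the reboot
    ping -c 3 <peer-host>-drbd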

Hypervisors

This procedure is to be run on both nodes, regardless of whether they are both being configured at the same time or not, unless explicitly stated otherwise.

  1. If the system was installed with PCMS then skip this section as it will already have been done by PCMS.
  2. Run:
    #  basic libvirt infrastructure
    apt -y install uuid-runtime netcat-openbsd libvirt-daemon \
                   libvirt-daemon-system libvirt-clients
    #  KVM/QEMU support
    apt -y install qemu-system-x86 qemu-utils ovmf
    #  LXC support
    apt -y install lxc lxcfs libpam-cgfs lxc-templates libfile-lchown-perl \
                   libvirt-daemon-driver-lxc xmlstarlet debootstrap \
                   distro-info fuse3
    

    and reboot.
    (Previously dnsmasq had appeared in this list but I now believe that it’s better not to install it, as it interferes with lxc-net.)

  3. By default, the lxc-net service attempts to start an IPv6 virtual switch, which, in an IPv4-only environment, will fail. Fix this as follows:
    1. Add the following to /etc/default/lxc-net:
      LXC_IPV6_ADDR=
      LXC_IPV6_MASK=
      LXC_IPV6_NETWORK=
      LXC_IPV6_NAT=
    2. Add the following to /etc/default/lxc-net:
      USE_LXC_BRIDGE=false
    3. Run:
      systemctl restart lxc-net
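
    If all went well, libvirtd should now be running and lxc-net should no longer report errors; a quick check along these lines can confirm that:

    systemctl is-active libvirtd       #  should print "active"
    systemctl --no-pager status lxc-net
    virsh version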

Workaround for load average leaking from host to container

I’m not yet certain this works, but it’s what I read here.

  1. Run:
    cp /lib/systemd/system/lxcfs.service /etc/systemd/system/

    (libpam-cgfs is mentioned in /usr/share/doc/lxcfs/README.Debian.)

  2. Edit /etc/systemd/system/lxcfs.service and change the ExecStart line to:
    ExecStart=/usr/bin/lxcfs --enable-loadavg --enable-cfs --enable-pidfd --enable-cgroup /var/lib/lxcfs
  3. Run:
    systemctl daemon-reload
    systemctl restart lxcfs
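
    To check whether the change took effect, inspect the running service’s command line; if the loadavg option is active, lxcfs should also expose a virtualised loadavg file under its default mountpoint:

    systemctl show --property=ExecStart lxcfs   #  should show --enable-loadavg
    cat /var/lib/lxcfs/proc/loadavg             #  should exist if loadavg support is active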
    

Hypervisor plugins

In order to support mounting disk images on unprivileged containers, a plugin is needed.

  1. Run:
    mkdir -p ~/opt/
    svn co -q https://svn.pasta.freemyip.com/main/virttools/trunk ~/opt/virttools
    mkdir -p /etc/libvirt/hooks/lxc.d
    ln ~/opt/virttools/bin/etc-libvirt-hooks-lxc.d-mounter /etc/libvirt/hooks/lxc.d/mounter

    (Notice that the ln command does not use the -s option: it creates a hard link.)
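
    Since it is a hard link, both paths should report the same inode number, which can be checked with:

    ls -li ~/opt/virttools/bin/etc-libvirt-hooks-lxc.d-mounter \
           /etc/libvirt/hooks/lxc.d/mounter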

Virtual resources

  1. Create and define the local storage pool:
    mkdir /vol/local/images
    virsh pool-define-as --name=local   --type=dir \
        --target=/vol/local/images
    virsh pool-start local
    virsh pool-autostart local
    
  2. Copy any ISO images you might need into that pool, e.g.:
    scp /pub/computing/software/iso-images/os/debian-trixie-DI-alpha1-amd64-netinst.iso trofie:/vol/local/images/

    and then run:

    virsh pool-refresh local
    virsh vol-list local
    
  3. Remove pre-defined but unwanted networks:
    virsh net-destroy default
    virsh net-undefine default
  4. Remove pre-defined but unwanted storage pools:
    virsh pool-destroy default
    virsh pool-undefine default
  5. Since we plumb VMs’ NICs directly into the sharable br0 bridge, and br0 is not managed by libvirt, there is nothing to do at this time to configure access to the public network.
  6. Regardless of whether the machine is part of a cluster or not, define a network to allow co-hosted VMs to communicate with each other directly, e.g.:
    virsh net-define <(cat <<EOF
    <network>
      <name>192.168.10.0</name>
      <uuid>$(uuidgen)</uuid>
      <bridge name='virbr0' stp='on' delay='0'/>
      <mac address='52:54:00:81:cd:08'/>
      <ip address='192.168.10.1' netmask='255.255.255.0'>
      </ip>
    </network>
    EOF
    )
    virsh net-autostart 192.168.10.0
    virsh net-start 192.168.10.0

    (This is mainly so that NFS traffic does not need to leave the machine when the client and server are co-hosted.)

  7. Set up SSH keys to allow the running of virt-manager from a remote system.
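
    As a sketch of that last step (the server name is a placeholder), the usual public-key setup is sufficient for virt-manager or virsh to connect over SSH:

    #  on the remote system, as the user who will run virt-manager
    ssh-keygen -t ed25519
    ssh-copy-id root@<virtualisation-server>
    virt-manager --connect qemu+ssh://root@<virtualisation-server>/system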

Tips & tricks

  1. To connect to a VM’s graphical console, it is possible to use something like this:
    virt-viewer --connect=qemu+ssh://root@torchio/system lagane
     # or set LIBVIRT_DEFAULT_URI and drop the '--connect=...'
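
    For example, the environment-variable form mentioned in the comment would look something like this:

    export LIBVIRT_DEFAULT_URI=qemu+ssh://root@torchio/system
    virt-viewer lagane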

See also