Configuring virtualisation services (revision 6)

Introduction

This page describes how Alexis Huxley installed and configured a virtualisation environment providing KVM-based VMs, LXC-based containers and replicated VM/container images. This procedure is intended for Debian 13.

Note that NAS services are provided by a KVM-based VM and are documented elsewhere.

Hardware

  • two systems are required
  • optionally, RAID0 over multiple disks in each server to improve disk access speed
  • two NICs are required in each host (one for host access and one for replication)

Local storage

Virtualisation servers will use DRBD-replicated storage for most VMs. However, local space is occasionally useful (e.g. for a test VM or for snapshotting DRBD devices).

  1. If the system was installed with PCMS then skip this section as it will already have been done by PCMS.
  2. For test virtualisation machines (e.g. pestaroli, testatoli, trofie), local storage should be set up manually.
  3. Create the LV:
    lvcreate --name=local --size=200g vg0
    
  4. Format:
    mkfs -t ext4 /dev/vg0/local
  5. Add an fstab entry for it as below, create the mountpoint and mount it (a short sketch follows):
    /dev/mapper/vg0-local /vol/local ext4 auto,noatime,nodiratime 0 2
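
    For example, a minimal sketch of the remaining steps for the single LV above (the mountpoint comes from the fstab entry):

    mkdir -p /vol/local
    mount /vol/local
    df -h /vol/local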

Replicated storage

  1. Run:
    apt -y install drbd-utils
  2. Even when using ZFS, it’s likely that LVM is still installed. By default, LVM finds PVs inside VM images. To prevent this, edit /etc/lvm/lvm.conf and set:
    devices {
        ...
        filter = [ "r|/dev/drbd.*|", "r|/dev/vg.*|", "r|/dev/zd.*|" ]
        ...
    }

    (It is not necessary to reboot after that edit.)
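
    To confirm that the new filter is the one in effect, something like the following can be run:

    lvmconfig devices/filter    #  should show the filter set above
    pvs                         #  should no longer list PVs found inside VM images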

Shared public network interface

The VMs that run on each node will need access to the public network interface.

  1. If the system was installed with PCMS then skip this section as it will already have been done by PCMS.
  2. Reconfigure the stanza for the first NIC in /etc/network/interfaces accordingly. E.g.:
    iface <public-interface> inet manual
    
    auto br0
    iface br0 inet static
        address 192.168.1.10
        netmask 255.255.255.0
        gateway 192.168.1.1
        bridge_ports <public-interface>
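
    Once the interfaces have been brought up (e.g. after a reboot), the bridge can be sanity-checked with something like:

    ip -br addr show br0    #  br0 should be UP with the address set above
    bridge link show        #  the public NIC should be listed as a port of br0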

Dedicated network interface for cluster communications

It is essential to use a dedicated network card for cluster communications in order to ensure that public traffic does not impact replication.

  1. If the system was installed with PCMS then skip this section as it will already have been done by PCMS.
  2. If the system is not part of a cluster then skip this section.
  3. Add a suitable entry to /etc/network/interfaces for the NIC you will use for cluster communications, and add an entry for it to /etc/hosts (an illustrative sketch follows this list). E.g.:
    auto <replication-interface>
    iface <replication-interface> inet static
        address 192.168.3.10
        netmask 255.255.255.0
  4. Reboot.
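
    As an illustration only (the peer address and the host naming convention are hypothetical), the /etc/hosts entries and a post-reboot connectivity check might look like:

    #  in /etc/hosts (names and peer address are hypothetical)
    192.168.3.10    <this-host>-drbd
    192.168.3.11    <peer-host>-drbd

    #  after the reboot
    ping -c 3 <peer-host>-drbd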

Hypervisors

This procedure is to be run on both nodes, regardless of whether they are both being configured at the same time or not, unless explicitly stated otherwise.

  1. If the system was installed with PCMS then skip this section as it will already have been done by PCMS.
  2. Run:
    #  basic libvirt infrastructure
    apt -y install uuid-runtime netcat-openbsd libvirt-daemon \
                   libvirt-daemon-system libvirt-clients
    #  KVM/QEMU support
    apt -y install qemu-system-x86 qemu-utils ovmf
    #  LXC support
    apt -y install lxc lxcfs libpam-cgfs lxc-templates libfile-lchown-perl \
                   libvirt-daemon-driver-lxc xmlstarlet debootstrap \
                   distro-info fuse3
    

    and reboot.
    (Previously dnsmasq had appeared in this list but I now believe that it’s better not to install it, as it interferes with lxc-net.)

  3. By default, the lxc-net service attempts to start an IPv6 virtual switch, which, in an IPv4-only environment, will fail. Fix this as follows:
    1. Add the following to /etc/default/lxc-net:
      LXC_IPV6_ADDR=
      LXC_IPV6_MASK=
      LXC_IPV6_NETWORK=
      LXC_IPV6_NAT=
    2. Add the following to /etc/default/lxc-net:
      USE_LXC_BRIDGE=false
    3. Run:
      systemctl restart lxc-net
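
    If all went well, libvirtd should now be running and lxc-net should no longer report errors; a quick check along these lines can confirm that:

    systemctl is-active libvirtd       #  should print "active"
    systemctl --no-pager status lxc-net
    virsh version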

Workaround for load average leaking from host to container

I’m not yet certain this works, but it’s what I read here.

  1. Run:
    cp /lib/systemd/system/lxcfs.service /etc/systemd/system/

    (libpam-cgfs is mentioned in /usr/share/doc/lxcfs/README.Debian.)

  2. Edit /etc/systemd/system/lxcfs.service and change the ExecStart line to:
    ExecStart=/usr/bin/lxcfs --enable-loadavg --enable-cfs --enable-pidfd --enable-cgroup /var/lib/lxcfs
  3. Run:
    systemctl daemon-reload
    systemctl restart lxcfs
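
    To check whether the change took effect, inspect the running service’s command line; if the loadavg option is active, lxcfs should also expose a virtualised loadavg file under its default mountpoint:

    systemctl show --property=ExecStart lxcfs   #  should show --enable-loadavg
    cat /var/lib/lxcfs/proc/loadavg             #  should exist if loadavg support is active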
    

Hypervisor plugins

In order to support mounting disk images on unprivileged containers, a plugin is needed.

  1. Run:
    mkdir -p ~/opt/
    svn co -q https://svn.pasta.freemyip.com/main/virttools/trunk ~/opt/virttools
    mkdir -p /etc/libvirt/hooks/lxc.d
    ln ~/opt/virttools/bin/etc-libvirt-hooks-lxc.d-mounter /etc/libvirt/hooks/lxc.d/mounter

    (Notice that the ln command does not use the -s option: it creates a hard link.)
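
    Since it is a hard link, both paths should report the same inode number, which can be checked with:

    ls -li ~/opt/virttools/bin/etc-libvirt-hooks-lxc.d-mounter \
           /etc/libvirt/hooks/lxc.d/mounter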

Virtual resources

  1. Create and define the local storage pool:
    mkdir /vol/local/images
    virsh pool-define-as --name=local   --type=dir \
        --target=/vol/local/images
    virsh pool-start local
    virsh pool-autostart local
    
  2. Copy any ISO images you might need into that pool, e.g.:
    scp /pub/computing/software/iso-images/os/debian-trixie-DI-alpha1-amd64-netinst.iso trofie:/vol/local/images/

    and then run:

    virsh pool-refresh local
    virsh vol-list local
    
  3. Remove pre-defined but unwanted networks:
    virsh net-destroy default
    virsh net-undefine default
  4. Remove pre-defined but unwanted storage pools:
    virsh pool-destroy default
    virsh pool-undefine default
  5. Since we plumb VMs’ NICs directly into the sharable br0 bridge, and br0 is not managed by libvirt, there is nothing to do at this time to configure access to the public network.
  6. Regardless of whether the machine is part of a cluster or not, define a network to allow co-hosted VMs to communicate with each other directly, e.g.:
    virsh net-define <(cat <<EOF
    <network>
      <name>192.168.10.0</name>
      <uuid>$(uuidgen)</uuid>
      <bridge name='virbr0' stp='on' delay='0'/>
      <mac address='52:54:00:81:cd:08'/>
      <ip address='192.168.10.1' netmask='255.255.255.0'>
      </ip>
    </network>
    EOF
    )
    virsh net-autostart 192.168.10.0
    virsh net-start 192.168.10.0

    (This is mainly so that NFS traffic does not need to leave the machine when the client and server are co-hosted.)

  7. Set up SSH keys to allow the running of virt-manager from a remote system.
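
    As a sketch of that last step (the server name is a placeholder), the usual public-key setup is sufficient for virt-manager or virsh to connect over SSH:

    #  on the remote system, as the user who will run virt-manager
    ssh-keygen -t ed25519
    ssh-copy-id root@<virtualisation-server>
    virt-manager --connect qemu+ssh://root@<virtualisation-server>/system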

Tips & tricks

  1. To connect to a VM’s graphical console, it is possible to use something like this:
    virt-viewer --connect=qemu+ssh://root@torchio/system lagane
     # or set LIBVIRT_DEFAULT_URI and drop the '--connect=...'
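
    For example, the environment-variable form mentioned in the comment would look something like this:

    export LIBVIRT_DEFAULT_URI=qemu+ssh://root@torchio/system
    virt-viewer lagane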

See also