Last updated: 03/11/2021
Introduction
This page is an activity plan/log for migrating my systems from Debian 10 to Debian 11.
Project log
- prologue:
- order more memory? (skipped; too expensive)
- order a second 8TB disk? (done)
- await delivery of disks (done)
- download Debian 11.1 netinst image from here and archive it (done)
- proof of concept (done)
- generate a new keypair for pestaroli & testaroli and archive somewhere accessible (done)
- migrate mafalde from pestaroli to testaroli (done)
- on testaroli disconnect (not down!) DRBD devices from network (so that when I come to upgrade pestaroli then testaroli doesn’t immediately try to connect/sync) (done)
- changes to disk arrangement (done)
- on torchio and fiori add 4 new DRBD devices called pestaroli-new-disk1, pestaroli-new-disk2, testaroli-new-disk1, testaroli-new-disk2, all of size 60GB (these will become the OS+containers space for the reinstalled versions of pestaroli and testaroli) and wait for the synchronisation to complete (done)
root@fiori# create-perfect-drbd-vol -v testaroli_new_disk1 20 fiori 192.168.3.6 torchio 192.168.3.7
root@fiori# create-perfect-drbd-vol -v testaroli_new_disk2 20 fiori 192.168.3.6 torchio 192.168.3.7
root@fiori# create-perfect-drbd-vol -v pestaroli_new_disk1 20 fiori 192.168.3.6 torchio 192.168.3.7
root@fiori# create-perfect-drbd-vol -v pestaroli_new_disk2 20 fiori 192.168.3.6 torchio 192.168.3.7
- shutdown pestaroli (done)
- on fiori and torchio down DRBD devices drbd_pestaroli and drbd_pestaroli_containers (done)
- on fiori and torchio rename the drbd_pestaroli and drbd_pestaroli_containers LVs to add ‘_old’ suffix (done)
- on fiori and torchio down DRBD devices drbd_pestaroli_new_disk1 and drbd_pestaroli_new_disk2 (done)
- on fiori and torchio rename the drbd_pestaroli_new_disk1 and drbd_pestaroli_new_disk2 LVs to remove the ‘_new’ suffix (done)
- on fiori and torchio accordingly adjust the DRBD config files names and contents for DRBD devices drbd_pestaroli_new_disk1 and drbd_pestaroli_new_disk2 (done)
- on fiori and torchio up the DRBD devices drbd_pestaroli_new_disk1 and drbd_pestaroli_new_disk2 (done)
- on fiori or torchio promote to primary the DRBD devices drbd_pestaroli_new_disk1 and drbd_pestaroli_new_disk2 (done)
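The down/rename/up sequence above can be sketched as a dry run that only prints the commands. This is a rough sketch, not the procedure as actually typed: the volume group name `vg_example` is a placeholder (the log does not name the VG), and it assumes each LV carries the same name as its DRBD resource. The config file edits are left as a manual step, and promotion (`drbdadm primary`) happens on one node only, so it is not included.

```shell
#!/bin/sh
# Dry-run sketch: every step is printed, nothing is executed.
# VG is a placeholder; the real volume group name is not given in the log.
VG="${VG:-vg_example}"

run() { echo "+ $*"; }

swap_in_new_disks() {
    host="$1"    # e.g. pestaroli

    # Take the old devices down, then park their LVs under an '_old' suffix.
    for lv in "drbd_${host}" "drbd_${host}_containers"; do
        run drbdadm down "$lv"
    done
    for lv in "drbd_${host}" "drbd_${host}_containers"; do
        run lvrename "$VG" "$lv" "${lv}_old"
    done

    # Take the new devices down, then drop '_new' from their LV names.
    for n in 1 2; do
        run drbdadm down "drbd_${host}_new_disk${n}"
    done
    for n in 1 2; do
        run lvrename "$VG" "drbd_${host}_new_disk${n}" "drbd_${host}_disk${n}"
    done

    # (Adjust the DRBD config file names and contents by hand here.)

    # Bring the renamed devices back up.
    for n in 1 2; do
        run drbdadm up "drbd_${host}_disk${n}"
    done
}
```

Running `swap_in_new_disks pestaroli` on each of fiori and torchio prints the commands in the order listed above, which makes it easy to review them before running anything for real.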
- notes about VMs meant to support nested VMs (done)
- to run nested VMs it is essential that the VM’s CPU is declared correctly, but virt-manager’s CPU configuration menu does not include the right options, so it can only be done by editing the XML with virsh edit <domain>. The required CPU stanza is this: (done)
<cpu mode='host-passthrough' check='partial'/>
(There are plenty of web pages saying to set ‘Copy host CPU configuration’ option, but these did not work for me.)
- to check support for nested VMs verify the output of the following commands: (done)
root# egrep -q "vmx|svm" "/proc/cpuinfo" && echo OK
OK
root# cat /sys/module/kvm_amd/parameters/nested
1
root# virt-host-validate qemu
QEMU: Checking for hardware virtualization : PASS
...
root#
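The first two checks above could be wrapped in a small script along these lines. It is only a sketch: the `kvm_intel` path and the `Y` value are additions for Intel hosts (the log only shows the `kvm_amd` case), and the function names are made up for illustration.

```shell
#!/bin/sh
# Succeeds if the given KVM 'nested' parameter file says nesting is on.
# kvm_amd reports 1/0 (as in the log); kvm_intel typically reports Y/N,
# which is an assumption added here, not something taken from the log.
nested_enabled() {
    [ -r "$1" ] || return 1
    case "$(cat "$1")" in
        1|Y|y) return 0 ;;
        *)     return 1 ;;
    esac
}

# Checks the CPU flags first, then either vendor's module parameter.
check_nested() {
    if ! grep -Eq 'vmx|svm' /proc/cpuinfo; then
        echo "no vmx/svm CPU flag"
        return 1
    fi
    for f in /sys/module/kvm_amd/parameters/nested \
             /sys/module/kvm_intel/parameters/nested; do
        if nested_enabled "$f"; then
            echo "nested OK ($f)"
            return 0
        fi
    done
    echo "nested virtualisation not enabled"
    return 1
}
```

`check_nested` does not replace `virt-host-validate qemu`; it only covers the CPU flag and the module parameter.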
- undefine the pestaroli VM (done)
- redefine the pestaroli VM by running:
create-perfect-kvm-vm -v -n --name=pestaroli --mem=4 --nics=2 --disk=block:/dev/drbd_pestaroli_disk1 --disk=block:/dev/drbd_pestaroli_disk2 --remote=<other-vm-server>
- note partitioning within the pestaroli VM with its two disks: we choose to use LVM-over-MD-RAID0 because: (done)
- although the Debian installer supports creating multi-disk VGs, it does not support creating striped LVs
- according to this article MD-RAID0 provides slightly better performance than striped LVs
so we go for:
- /dev/vda1: 550MB for EFI
- /dev/vda2: 1024MB for /boot (general advice is keep /boot out of LVM)
- /dev/vda3: remainder for MD-RAID0
- /dev/vdb1: 550MB for nothing except symmetry
- /dev/vdb2: 1024MB for nothing except symmetry
- /dev/vdb3: remainder for MD-RAID0
- /dev/md0: remainder for LVM
- create VG vg0 on /dev/md0
- create LV root
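In the log this layout was set up through the Debian installer, but for reference the LVM-over-MD-RAID0 part corresponds roughly to the commands below. This is a dry-run sketch that only prints them; the size of the root LV (`ROOT_SIZE`) is a placeholder, as the log does not state it.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of running them.
run() { echo "+ $*"; }

# Placeholder; the log does not give the size of LV root.
ROOT_SIZE="${ROOT_SIZE:-10G}"

build_storage() {
    # Stripe the third partition of each disk into one RAID0 device.
    run mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/vda3 /dev/vdb3
    # Put LVM on top of the RAID0 device.
    run pvcreate /dev/md0
    run vgcreate vg0 /dev/md0
    run lvcreate --name root --size "$ROOT_SIZE" vg0
}
```

Because the striping is done by MD, the LV itself is a plain linear LV, which sidesteps the installer's lack of support for striped LVs.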
- reinstall according to this procedure (installing and running PCMS is included in that procedure) and update it based on the partition notes above (done)
- running PCMS should have been enough to set up ~root/.ssh/authorized_keys but ~/.ssh/id_rsa* are still missing; restore them (done)
- Set testaroli up as a VM server by completing Configuring storage and virtualisation services generation three point four (but ignore certain bits which are only relevant in the production environment on fiori and torchio). (done)
- for mafalde: (done)
- create one-legged DRBD devices on pestaroli using a higher port number so synchronisation cannot start: (done)
mkdir -p ~/opt
svn -q co https://svn.pasta.freemyip.com/main/virttools/trunk ~/opt/virttools
create-perfect-drbd-vol -n mafalde 5 pestaroli 192.168.3.32 testaroli 192.168.3.31
and archive the script file that this produces, so that we can run it later on the other DRBD node.
- make the device primary on pestaroli (done)
- shutdown the VM on testaroli (done)
- copy over ssh keys to allow two way communication between pestaroli and testaroli
- dd the DRBD device from testaroli to pestaroli: (done)
# on pestaroli
ssh -n testaroli dd if=/dev/drbd_mafalde bs=1M | dd of=/dev/drbd_mafalde bs=1M
- copy over the VM configuration: (done)
# on pestaroli
virsh --connect=lxc:/// define <(ssh -n testaroli virsh --connect=lxc:/// dumpxml mafalde)
- start the VM on pestaroli (done)
- confirm: mafalde is now running on pestaroli, right? (done)
- reinstall testaroli: (done)
- undefine the pestaroli VM (done)
- redefine the pestaroli VM by running:
create-perfect-kvm-vm -v -n --name=pestaroli --mem=4 --nics=2 --disk=block:/dev/drbd_pestaroli_disk1 --disk=block:/dev/drbd_pestaroli_disk2 --remote=<other-vm-server>
- reinstall according to this procedure (installing and running PCMS is included in that procedure) (done)
- running PCMS should have been enough to set up ~root/.ssh/authorized_keys but ~/.ssh/id_rsa* are still missing; restore them (done)
- Set testaroli up as a VM server (but ignore certain bits which are only relevant in the production environment on fiori and torchio).
- for mafalde: (done)
- create second leg of DRBD device on testaroli by running the script saved earlier; synchronisation should start automatically. (done)
- wait for synchronisation to finish (done)
- shutdown the VM on pestaroli (done)
- make the device secondary on pestaroli (done)
- make the device primary on testaroli (done)
- verify two way ssh between pestaroli and testaroli (done)
- copy over the VM configuration: (done)
# on testaroli
virsh --connect=lxc:/// define <(ssh -n pestaroli virsh --connect=lxc:/// dumpxml mafalde)
- I needed to reboot at this point but I think a systemctl restart libvirt probably would have been enough (done)
- start the VM on testaroli (done)
- migrate it back again using my scripts (done)
- run the uefi sync script to make sure VM definitions are aligned (done)
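The "wait for synchronisation to finish" step above can be automated with a small polling loop. This is a sketch under the assumption that `drbdadm dstate <resource>` prints the local and peer disk states as `local/peer` (for example `UpToDate/UpToDate`); the function names and the `INTERVAL` variable are made up for illustration.

```shell
#!/bin/sh
# True when a "local/peer" disk-state string says both sides are UpToDate.
dstate_is_synced() {
    [ "$1" = "UpToDate/UpToDate" ]
}

# Poll the named DRBD resource until both legs are in sync.
# INTERVAL (seconds between polls) is a made-up knob, defaulting to 30.
# Note: this loops forever if drbdadm keeps failing, so watch it.
wait_for_sync() {
    res="$1"
    while ! dstate_is_synced "$(drbdadm dstate "$res")"; do
        sleep "${INTERVAL:-30}"
    done
    echo "$res is in sync"
}
```

For example, `wait_for_sync drbd_mafalde` would return once the second leg has caught up, at which point it is safe to swap the primary and secondary roles.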
- rebalance VMs (not necessary; listed here only for symmetry) (skipped)
- cleanup (done)
- shut down mafalde (done)
- undefine the mafalde LXC container on pestaroli and testaroli (done)
- down the DRBD device on pestaroli and testaroli (done)
- undefine the DRBD device on pestaroli and testaroli (done)
- remove the LV on pestaroli and testaroli (done)
- PCMS modifications (done)
- modify PCMS to make congane the same as mafalde currently is (i.e. both should be LXC) (done)
- modify PCMS to make mafalde a KVM VM (done)
- for KVM-based VM mafalde on Debian 11 KVM-based VMs on Debian 10 PMs (in progress)
- create new 10GB DRBD volume on pestaroli and testaroli (20GB will not be possible due to size of RAID0 device on pestaroli and testaroli) (done)
create-perfect-drbd-vol -v --remote=testaroli mafalde 10 pestaroli 192.168.3.32 testaroli 192.168.3.31
- create the mafalde VM (done):
create-perfect-kvm-vm --remote=testaroli --name=mafalde --disk=block:/dev/drbd_mafalde
- install the VM as follows (failed)
- update the create-perfect-kvm-vm script for Debian 11 (done, but then reverted in case caused problem below)
- change video to virtio (done, but then reverted in case caused problem below)
- change OS to Debian 11 (done, but then reverted in case caused problem below, see also BTS#997928)
- complete this procedure (failed, after grub menu, mafalde VM uses 100% of CPU of pestaroli; several bug reports suggest this is due to different kernels at L0, L1, L2 level)
- for KVM-based VM mafalde on Debian 11 PM as possible fix for above problem (done)
- commit all working copies in ~root/opt/ on all recently used hosts (done)
- modify PCMS to make halusky a VM server (done)
- reinstall halusky according to this procedure (installing and running PCMS is included in that procedure) (done)
- configure halusky as a VM server (but ignore certain bits which are only relevant in the production environment on fiori and torchio) (done)
- in order to better understand how to use virsh vol-create-as, I installed virt-manager (done)
- on halusky create new 20GB volume in the default pool for mafalde (done)
virsh vol-create-as --pool local --name mafalde.img --capacity 20G --format raw
- on halusky create the mafalde VM (done)
IMG_PATH=$(virsh -q vol-path --pool local --vol mafalde.img)
./create-perfect-kvm-vm -v --name=mafalde --disk=file:$IMG_PATH
- install mafalde according to this procedure (installing and running PCMS is included in that procedure) (done)
- note that at this point we are confident that when we come to reinstall fiori and torchio we will be able to run VMs on them (done)
- if okay then modify create-perfect-kvm-vm to use virtio video hardware (done)
- if okay then modify create-perfect-kvm-vm to define the VM as Debian 11 (done)
- configure mafalde as a VM server (but ignore certain bits which are only relevant in the production environment on fiori and torchio) (done)
- install virt-manager (done)
- on mafalde create new 5GB volume in the default pool for mafalde2 (done)
virsh vol-create-as --pool local --name mafalde2.img --capacity 5G --format raw
- on mafalde create the mafalde2 VM
IMG_PATH=$(virsh -q vol-path --pool local --vol mafalde2.img)
./create-perfect-kvm-vm -v --name=mafalde2 --disk=file:$IMG_PATH
- start (but don’t bother completing) this procedure to install mafalde2 (done)
- note – solely for completeness – that at this point we are confident that when L0 and L1 and L2 are all running the same kernel then nested VMs work (done)
- put halusky away for the time being (done)
- for LXC-based VM congane (done)
- create new 5GB DRBD volume on pestaroli and testaroli for congane (done)
create-perfect-drbd-vol -n --remote=testaroli --name=congane --size=5 --name1=pestaroli --ip1=192.168.3.32 --name2=testaroli --ip2=192.168.3.31
(Note that I modified create-perfect-drbd-vol to take all arguments as options.)
- install new LXC container with the same name according to this procedure (which should include installing and running PCMS) (done)
- epilogue (done)
- remove obsolete DRBDs and LVs (done)
- commit everything (done)
- for halusky (done)
- modify PCMS to make it a laptop (done)
- reinstall halusky according to its own page (done)
- commit edits to pcms (done)
- restore non-dot files from backup of home (done)
- configure backups (done)
- do a backup (done)
- review this section and apply anything necessary to the fiori/torchio procedures below (done)
- prologue (done)
- fix get_iplayer location issue (done)
- modify PCMS to be able to list (skipped; difficult to integrate and there is an existing solution)
- make a new release of PCMS (done)
- remove the workaround for pcms version 2 in installing and running pcms page (done)
- decide when to do fiori (done)
- reinstall fiori (done)
- migrate all VMs from fiori to torchio (done)
- on fiori commit anything I need to commit (done)
- on fiori run vm-server down (done)
- on torchio disconnect (not down!) DRBD devices from network (so that when I come to upgrade fiori then torchio doesn’t immediately try to connect/sync) (done)
- on fiori power off (done)
- on fiori remove the 4TB disk and insert one of the new 8TB disks (done)
- on sugo configure fiori in PCMS as KVM+LXC server (but not torchio else it will be applied before torchio is reinstalled) (done)
- edit fiori’s wordpress page and install it according to its own page (making edits along the way) (done)
- retrieve the standard working copies for ~root/opt (done)
- copy ssh keys from torchio to fiori (done)
- adjust the port range that create-perfect-drbd-vol uses so that new devices on this node can’t talk to old devices on the other node but do not commit yet (done)
- The following procedure will be used for each VM to migrate (not applicable)
- set up some environment variables:
VM=<vm-name>
DRBD_DEVICES=( $(ssh -q torchio virsh domblklist $VM | sed -nr 's@.*/dev/drbd_(.*)@\1@p') )
- create one-legged DRBD devices on fiori:
# on fiori
for DRBD_DEVICE in "${DRBD_DEVICES[@]}"; do
    create-perfect-drbd-vol --name=$DRBD_DEVICE --size=20 --name1=fiori --ip1=192.168.3.6 --name2=torchio --ip2=192.168.3.7
    drbdadm disconnect drbd_$DRBD_DEVICE
done
(There is no point in it attempting to connect at the moment.)
- Archive the script that create-perfect-drbd-vol created, making sure to include the VM name in the name of the file (this file will be needed later on torchio).
- shutdown the VM on torchio
- test the ssh connection between fiori and torchio
- copy over the VM configuration to fiori from torchio:
# on fiori
virsh define <(ssh -n torchio virsh dumpxml $VM)
scp torchio:/var/lib/libvirt/qemu/nvram/${VM}_VARS.fd /var/lib/libvirt/qemu/nvram/
- dd the DRBD devices to fiori from torchio:
# on fiori
for DRBD_DEVICE in "${DRBD_DEVICES[@]}"; do
    ssh -n torchio dd if=/dev/drbd_$DRBD_DEVICE bs=1M | dd of=/dev/drbd_$DRBD_DEVICE bs=1M
done
- start the VM on fiori:
virsh start $VM
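The sed expression used to fill DRBD_DEVICES in the procedure above keeps only the part of each disk source path that follows /dev/drbd_. The sketch below shows its effect on a made-up `virsh domblklist` listing; the sample table and the function names are illustrative only, and `-E` is used in place of `-r` purely for portability.

```shell
#!/bin/sh
# Reads `virsh domblklist` output on stdin and prints whatever follows
# /dev/drbd_ in each source path, one name per line. Lines without a
# DRBD-backed source (header, separator, other disks) are dropped.
drbd_devices_from_domblklist() {
    sed -nE 's@.*/dev/drbd_(.*)@\1@p'
}

# Illustrative input only; this listing was not captured from a real VM.
sample_domblklist() {
    cat <<'EOF'
 Target   Source
------------------------------
 vda      /dev/drbd_penne
 vdb      /dev/drbd_penne_data
 sda      -
EOF
}

sample_domblklist | drbd_devices_from_domblklist
```

For the sample listing this prints `penne` and `penne_data` on separate lines, which is why the later loops prepend `drbd_` again when addressing the devices.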
- Do it for:
- penne (done)
- lagane (done)
- vermicelli (done)
- rombi (done)
- pestaroli (done)
- testaroli (done)
- anelli (done)
- ziti (done)
- barbine (done)
- nuvole (done)
- marille (done)
- stortini (done)
- on fiori copy over the OS disk only and do not turn on! (done)
- copy over the VM configuration (done)
- change boot order so that it boots from an installation/rescue medium (done)
- change the NICs’ MAC addresses (done)
- chroot into disk (done)
- on new stortini disable pcms (done)
- on new stortini comment out the large devices and swap from fstab (done)
- on new stortini change the IPs in /etc/network/interfaces (done)
- on new stortini shutdown (done)
- change boot order so it boots from hard disk (done)
- log into the new stortini (done)
- format its large devices (done)
- uncomment the large devices in fstab (done)
- mount the large devices (done)
- mount swap (done)
- reboot the new stortini (done)
- create temporary ssh keys (done)
- start rsyncs in series over the backlan (done)
- critical period starts here (done)
- save all edits in web browser and exit as much as possible (done)
- shutdown the mail server (done)
- repeat rsyncs (done)
- on torchio shutdown the old stortini (done)
- on new stortini fix the IPs in /etc/network/interfaces (27 -> 26) (done)
- on new stortini reboot (done)
- ssh into stortini (which is the new one) (done)
- on new stortini remove the temporary ssh keys (done)
- on new stortini enable PCMS (done)
- to work around the modified /etc/network/interfaces run pcms -- -t (done)
- on torchio drbdadm down the old stortini’s DRBD devices (done)
- start up the mail server (done)
- critical period ends here (done)
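The two rsync passes in the stortini steps above (a bulk pass while the old system is still live, then a repeat inside the critical period to pick up what changed) might look something like the dry-run sketch below. The source host name, the directory list and the exact rsync options are all assumptions; the log does not record them.

```shell
#!/bin/sh
# Dry-run sketch: prints one rsync command per directory, in series.
run() { echo "+ $*"; }

# $1 is the old system's address on the backlan, remaining arguments are
# the directories to pull. All of these are placeholders, not log values.
sync_in_series() {
    src_host="$1"
    shift
    for dir in "$@"; do
        # -a archive, -H hard links, -A ACLs, -X xattrs; --delete makes
        # the repeat pass also remove files that vanished on the source.
        run rsync -aHAX --delete "root@${src_host}:${dir}/" "${dir}/"
    done
}
```

A call such as `sync_in_series old-stortini-backlan /home /var/mail` would be run once before the critical period and again after the mail server has been shut down; because rsync only transfers differences, the second pass is usually much shorter.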
- confirm: all VMs are now running on fiori, right? (done)
- commit any changes made to ~root/opt (done)
- fix the file/block problem in the VMs’ XMLs (done)
- make a new PCMS (done)
- decide a weekend to do torchio (done: immediately)
- reinstall torchio: (done)
- on torchio commit anything I need to commit (done)
- remove all, LVs, VGs, PVs (done)
- power off (done)
- on torchio remove the 4TB disk and insert one of the new 8TB disks (done)
- on sugo configure torchio in PCMS as KVM+LXC server (done)
- edit torchio’s wordpress page and install it according to its own page (making edits along the way) (done)
- retrieve the standard working copies for ~root/opt (done)
- generate a new keypair for torchio and fiori, verify two way ssh between fiori and torchio and update pcms-config accordingly (done)
- The following procedure will be used for each VM (not applicable)
- create second leg of DRBD device on torchio by running the script saved earlier and on fiori running
drbdadm connect <drbd-device>
- copy over the VM configuration to torchio from fiori by running this on torchio:
VM=<vm-name>
virsh define <(ssh -n fiori virsh dumpxml $VM)
scp fiori:/var/lib/libvirt/qemu/nvram/${VM}_VARS.fd /var/lib/libvirt/qemu/nvram/
- wait for synchronisation to finish
- migrate the VM by running:
vm-migrate fiori torchio $VM
- Do it for:
- penne (done)
- lagane (done)
- vermicelli (done)
- rombi (done)
- pestaroli (done)
- testaroli (done)
- anelli (done)
- ziti (done)
- barbine (done)
- nuvole (done)
- marille (done)
- (decide when to do it for stortini and wait until that time) (done)
- stortini (done)
- rebalance VMs (pending)
- erase the two 4TB disks I removed (done)
- jump a little below to do one KVM VM replacement
- jump a little below to do one LXC VM replacement (done)
- do work machine
- do delguine
- for cercis:
- review my upgrade instructions for Judith (in openproject, I think) and copy some of it to this sub-procedure
- prepare USB stick (done)
- print relevant pages from here, including this list
- add that USB stick and printouts to my packing list
- wait till 21/12/2021
- do cercis
- new key pair (done)
- generate a new keypair for fiori & torchio (done)
- archive somewhere accessible (skipped)
- copy it to both machines (done)
- on sugo update PCMS to use that public key for torchio/fiori access (PCMS doesn’t distribute the key pair so I have to do that later, but it does manage the ~/.ssh/authorized_keys) (done)
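Since PCMS manages ~/.ssh/authorized_keys but does not distribute the key pair itself, the generate-and-copy steps above have to be done by hand. A dry-run sketch of that follows; the RSA key type is inferred from the id_rsa* file names mentioned earlier in this log, while the 4096-bit size, the use of root and the function name are assumptions.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of running them.
run() { echo "+ $*"; }

# $1 is the path for the new private key; remaining arguments are the
# hosts that should receive both halves of the key pair.
make_and_copy_keypair() {
    key="$1"
    shift
    # ssh-keygen prompts for the passphrase interactively.
    run ssh-keygen -t rsa -b 4096 -f "$key"
    for host in "$@"; do
        run scp "$key" "$key.pub" "root@${host}:.ssh/"
    done
}
```

For example, `make_and_copy_keypair ./id_rsa fiori torchio` prints one ssh-keygen command followed by one scp per host; the public half is what then gets registered in PCMS on sugo.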
- The following procedure will be used for each VM:
- decide if KVM or LXC or decommission
- decide on a new name
- create DRBD volume(s)
- create its wordpress page
- create entry on the Computing page but don’t move the service description tag from next to the old system’s name to next to the new system’s name
- install new VM according to this procedure
- shutdown services on old VM?
- configure and start services on new VM and update the service page with any changes
- on the Computing page move the service description tag from next to the old system’s name to next to the new system’s name
- make sure that the new VM is being backed up
- power off the old VM
- Do it for:
- ziti
- lagane
- vermicelli
- rombi
- pestaroli
- testaroli
- anelli
- penne
- barbine
- nuvole
- marille (done)
- (decide when to do it for stortini and wait until that time)
- stortini
- wait a month
- on the Installing Debian 11 on a PM or KVM VM page, there is some stuff I have noted can be deleted
- for each VM:
- remove backups of old VM
- remove disk images of old VM
- virsh undefine old VM
- close ticket #159