Introduction
This page describes how Alexis Huxley installed and configured a replicated storage and virtualisation environment providing VM services and NFS services. This is pretty much the same as Configuring storage and virtualisation services (generation 3.2), but I finally gave up on OCFS2 and dual-primary DRBD and their glitching problem.
Hardware
- two systems are required
- RAID1 is not needed (redundancy will be provided by DRBD)
- optionally RAID0 over multiple disks in each server (RAID0 will help improve IO speeds)
- two physical NICs are required in each host
- NIC names must be persistent (either by using modern naming or by appropriate entries in /etc/udev/rules.d/70-persistent-net.rules)
Local storage
Virtualisation servers will use DRBD-replicated storage for most VMs. However, occasionally, local space is useful (e.g. for a test VM).
- Create LVs:
lvcreate --name=local --size=200g vg0
- Format for XFS, which supports online growing (see the example below):
mkfs -t xfs -f /dev/vg0/local
- Add fstab entries for them all as below, create mountpoints and mount them:
/dev/mapper/vg0-local /vol/local xfs auto,noatime,nodiratime 0 2
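Since XFS supports online growing, the volume can later be enlarged without unmounting it; a minimal sketch (the extra 50g is illustrative):
lvextend --size=+50g /dev/vg0/local
xfs_growfs /vol/local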
Shared public network interface
Next time I do this, I should look to see if the bridging should be done with virt-manager, since it otherwise does not see the public network as one that it manages.
The VMs that run on each node will need access to the public network interface.
- Reconfigure the stanza for eth0 in /etc/network/interfaces accordingly. E.g.:
iface eth0 inet manual

auto br0
iface br0 inet static
    address 192.168.1.6
    netmask 255.255.255.0
    gateway 192.168.1.1
    bridge_ports eth0
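Once the interfaces have been brought up, the bridge can be checked with something like the following (brctl is in the bridge-utils package, which ifupdown's bridge_ports option depends on):
brctl show br0
ip addr show br0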
Dedicated network interface for cluster communications
It is essential to use a dedicated network card for cluster communications in order to ensure that public traffic does not impact replication.
- Add a suitable entry to /etc/network/interfaces for the NIC you will use for the cluster communications and add an entry for it to /etc/hosts. E.g.:
auto eth1
iface eth1 inet static
    address 192.168.3.6
    netmask 255.255.255.0
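The matching /etc/hosts entry might look like this (the hostname is illustrative):
192.168.3.6    fiori-clu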
- Reboot.
Creating replicated volumes for VMs
Note that this cannot be done from within virt-manager because virt-manager does not support DRBD-based storage.
- On one node of a DRBD/VM cluster (i.e. fiori or torchio) download create-perfect-drbd-vol, make it executable and run it (be sure to specify IPs on the replication network). Either use its
--run-remote
option or observe the requirement to run it a second time on the other node too.
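Whichever way it is run, it is worth checking afterwards that the new resource is connected and syncing; e.g. (resource name illustrative):
drbdadm status fusilli        # DRBD 9
cat /proc/drbd                # DRBD 8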
Hypervisors
This procedure is to be run on both nodes, regardless of whether they are both being configured at the same time or not, unless explicitly stated otherwise.
- Run:
apt-get install qemu-kvm libvirt-bin qemu-utils virt-top
- Create and define the local storage pool:
mkdir /vol/local/vmpool0
virsh pool-define-as --name=vmpool0 --type=dir \
    --target=/vol/local/vmpool0
virsh pool-start vmpool0
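If the pool should also be started at boot, and to confirm its state, something like:
virsh pool-autostart vmpool0
virsh pool-list --all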
- If you have access to ISO images over NFS then define a storage pool for that:
apt-get -y install nfs-common
#  /srv is not the right mountpoint container, but then where is?
mkdir /srv/isoimages
virsh pool-define-as isoimages netfs --source-host 192.168.1.28 \
    --source-path /vol/pub/computing/software/iso-images/os \
    --target /srv/isoimages
virsh pool-start isoimages
(Beware that if the NFS server itself is a VM, then you probably do not want to set autostart on this pool.)
- Remove unwanted pools:
virsh pool-destroy default
virsh pool-undefine default   # unfortunately not persistent
- Register the public network:
# this is not needed because the bridge is managed by the OS, not by libvirt.
- Define a network to allow co-hosted VMs to communicate with each other directly, e.g.:
virsh net-destroy default
#  storage network for VMs (see above)
virsh net-define <(cat <<EOF
<network>
  <name>192.168.10.0</name>
  <uuid>$(uuidgen)</uuid>
  <bridge name='virbr0' stp='on' delay='0'/>
  <mac address='52:54:00:81:cd:08'/>
  <ip address='192.168.10.1' netmask='255.255.255.0'>
  </ip>
</network>
EOF
)
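Note that defining the network does not start it; assuming the name used in the XML above:
virsh net-start 192.168.10.0
virsh net-autostart 192.168.10.0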
- Set up SSH keys to allow the running of virt-manager from a remote system.
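A minimal sketch of that, assuming root is used for the connection and using this page's hostnames:
ssh-keygen -t ed25519
ssh-copy-id root@fiori
ssh-copy-id root@torchio
after which virt-manager on the remote system can connect with a URI such as qemu+ssh://root@fiori/system.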
- If you have existing VM images and definitions to migrate, then migrate them now.
Procedure to create a VM
- Create a replicated volume for the VM’s OS, as described above, naming the volume after the VM (e.g. fusilli).
- Create replicated volumes for large data areas the VM will need (e.g. web pages, mail repository, p2p downloads), prefixing the volume name with the name of the VM (e.g. fusilli_web, fusilli_mail, fusilli_p2p).
- Use virt-manager to define the VM, attach the storage and attach the VM’s NICs to the appropriate networks (normally two).
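For the command-line-inclined, virt-install can do the same job as virt-manager; this is only a sketch, with the memory size, DRBD device, ISO image and OS variant all illustrative:
virt-install --name fusilli --memory 2048 --vcpus 2 \
    --disk path=/dev/drbd0,bus=virtio \
    --network bridge=br0 \
    --network network=192.168.10.0 \
    --cdrom /srv/isoimages/debian-netinst.iso \
    --os-variant debian9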
NFS services from a VM
This procedure is to be run on a single VM, not on the virtualisation servers!
- Install a VM.
- Create a suitably sized replicated volume to store the data to be shared via NFS, as described above, and attach it to the VM. Within the VM, format it for use by LVM, create a suitably sized LV, format that and mount it (e.g. at /vol/pub), as sketched below.
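A minimal sketch of the in-VM steps, assuming the attached volume appears as /dev/vdb and with names and sizes illustrative:
pvcreate /dev/vdb
vgcreate vg_pub /dev/vdb
lvcreate --name=pub --size=190g vg_pub
mkfs -t xfs /dev/vg_pub/pub
mkdir -p /vol/pub
mount /dev/vg_pub/pub /vol/pub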
- Due to Ubuntu bug 1558196, run the following commands:
systemctl add-wants multi-user.target rpcbind.service
(See https://askubuntu.com/questions/771319/in-ubuntu-16-04-not-start-rpcbind-on-boot for more details.)
- If an NFS client is a VM and it is running on the same physical host as the NFS server, then some performance increase can be gained by directing the NFS client to the NIC on the NFS server that is on the shared virtual network. Therefore:
- Ensure the VM has a second interface connected to the virtual network that was created on the virtualisation servers earlier. For the sake of this procedure, let’s assume that the network is 192.168.10.0/24 and that the NFS server will be 192.168.10.28.
- Note that, later, when creating other VMs:
- they will also need a second interface connected to the virtual network that was created on the virtualisation servers earlier.
- they should attempt to mount the NFS share first using the NFS server’s second interface and then fall back to the NFS server’s first interface, as in this example automounter entry:
pub -nordirplus,noatime,nodiratime,nfsvers=3,proto=tcp filer.pasta.net,fettuce.pasta.net:/vol/pub
- Write a suitable /etc/exports file, with exports accessible both over the public network and the “co-hosted VMs” network. As an example, here is my own:
/vol/small/home  192.168.1.0/24(rw,sync,no_root_squash,no_all_squash,no_subtree_check) 192.168.10.9(rw,sync,no_root_squash,no_subtree_check)
/vol/pub         192.168.1.0/24(rw,sync,no_root_squash,no_all_squash,no_subtree_check) 192.168.10.9(rw,sync,no_root_squash,no_subtree_check)
/vol/small/home  192.168.1.8(ro,no_root_squash,no_subtree_check) 192.168.10.8(ro,no_root_squash,no_subtree_check)
/vol/pub         192.168.1.8(ro,no_root_squash,no_subtree_check) 192.168.10.8(ro,no_root_squash,no_subtree_check)
/vol/small/svn   192.168.1.8(rw,sync,no_root_squash,no_all_squash,no_subtree_check) 192.168.10.8(rw,sync,no_root_squash,no_all_squash,no_subtree_check)
/vol/small/mail  192.168.1.29(rw,sync,no_root_squash,no_all_squash,no_subtree_check) 192.168.10.29(rw,sync,no_root_squash,no_all_squash,no_subtree_check)
- Run:
exportfs -av
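and confirm the exports took effect with:
showmount -e localhost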
- Note for NFS clients:
- For reasons I don’t understand, when I try to ‘svn commit’, the NFS server logs:
lockd: cannot monitor <web-server-hostname>
The only fix I’ve been able to find for this is to include the following in the NFS client’s mount options (or in the auto.staging map):
...,nolock,...
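Applied to the automounter entry shown earlier, that gives:
pub -nordirplus,noatime,nodiratime,nfsvers=3,proto=tcp,nolock filer.pasta.net,fettuce.pasta.net:/vol/pub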
- Clients should attempt to mount the NFS share first using the NFS server’s interface on the “co-hosted VMs” network and then fall back to the NFS server’s public interface, as in this example automounter entry:
pub -nordirplus,noatime,nodiratime,nfsvers=3,proto=tcp filer.pasta.net,fettuce.pasta.net:/vol/pub
SMB services from a VM
SMB is useful for allowing smartphones, Windows and Mac machines to transfer files (e.g. to put MP3s onto a smartphone).
- Run:
apt-get install samba
- Convert Unix accounts to SMB accounts as follows:
# pdbedit seems to have no way to pre-lock accounts so we'll use secure passwords
pwgen() { dd if=/dev/urandom bs=1 count=100 2>/dev/null | base64 -w0; }
# we'll need to extract login and fullname from entries in /etc/passwd or getent
fanoutpwent() { perl -pe 's/^([^:]*):([^:]*):([^:]*):([^:]*):([^,]*),([^,]*),([^,]*),([^,]*):([^:]*):([^:]*)\n/"$1" "$2" "$3" "$4" "$5" "$6" "$7" "$8" "$9" "$10"\n/g;' <<<"$1"; }
# generic function to run a shell command after getting ok to run it
shi() { while read -r X; do eval set -- "$X"; read -p "$1: " YESNO < /dev/tty; [ "X$YESNO" != Xy ] || eval "$2"; done; }
UID_MIN=$(sed -n 's/^UID_MIN[\t ]*//p' /etc/login.defs)
UID_MAX=$(sed -n 's/^UID_MAX[\t ]*//p' /etc/login.defs)
getent passwd | awk -F: "{ if ( \$3 >= $UID_MIN && \$3 <= $UID_MAX ) { print } }" | \
    while read PWENT; do
        eval set -- $(fanoutpwent "$PWENT")
        P=$(pwgen)
        echo "$1 '{ echo \"$P\"; echo \"$P\"; } | pdbedit --create --user \"$1\" --fullname \"$5\" --password-from-stdin'"
    done | shi
and follow the prompts regarding which accounts to create.
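The resulting accounts can be listed with:
pdbedit --list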
- Edit /etc/samba/smb.conf and set:
[global]
...
# See http://www.spinics.net/lists/samba/msg69479.html
strict locking = no
# this doesn't work so don't bother uncommenting it
#hide dot files = yes
...
[homes]
...
read only = no
# this doesn't work so don't bother uncommenting it
#hide dot files = yes
...
[pub]
comment = Public Archive
browsable = yes
path = /pub/
#[printers]
#...
#[print$]
#...
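Before reloading, the edited file can be sanity-checked with Samba's own checker:
testparm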
- Run:
service samba reload
- Try to connect from an SMB client using smbclient as follows:
- Edit /etc/samba/smb.conf on the client and change:
syslog = 0
to
logging = syslog@0
(Without this you will see the warning message ‘WARNING: The “syslog” option is deprecated’. Note also that there is no need to make this change on the SMB server.)
- Run:
smbclient '\\fettuce\pub'
and
smbclient '\\fettuce\alexis'