The libvirt library is used to interface with different virtualisation technologies. Before getting started with libvirt it is best to make sure your hardware supports the necessary virtualisation extensions for KVM. Enter the following from a terminal prompt:
kvm-ok
A message will be printed informing you if your CPU does or does not support hardware virtualisation.
On many computers with processors supporting hardware-assisted virtualisation, it is necessary to first activate an option in the BIOS to enable it.
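As a rough cross-check that needs no extra tool, the CPU flags exposed by the kernel can be inspected directly. The sketch below assumes an x86 host, where the `vmx` flag marks Intel VT-x and `svm` marks AMD-V. The helper name `count_virt_cpus` is made up for this illustration.

```shell
# Count the logical CPUs whose "flags" line advertises hardware virtualisation.
# Assumption: x86 host (vmx = Intel VT-x, svm = AMD-V).
# $1 = cpuinfo-style file to read (defaults to /proc/cpuinfo)
count_virt_cpus() {
    grep -E -c '^flags[[:space:]]*:.*[[:space:]](vmx|svm)([[:space:]]|$)' "${1:-/proc/cpuinfo}"
}

# Usage:
#   count_virt_cpus
```

A non-zero count shows that the CPU offers the extensions. It does not by itself confirm that they are enabled in the BIOS.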
There are a few different ways to allow a virtual machine access to the external network. The default virtual network configuration includes bridging and iptables rules implementing usermode networking, which uses the SLiRP protocol. Traffic is NATed through the host interface to the outside network.
To enable external hosts to directly access services on virtual machines a different type of bridge than the default needs to be configured. This allows the virtual interfaces to connect to the outside network through the physical interface, making them appear as normal hosts to the rest of the network.
To install the necessary packages, from a terminal prompt enter:
sudo apt update
sudo apt install qemu-kvm libvirt-daemon-system
After installing libvirt-daemon-system, the user that will be used to manage virtual machines needs to be added to the libvirt group. This is done automatically for members of the sudo group, but needs to be done in addition for anyone else who should access system-wide libvirt resources. Doing so will grant the user access to the advanced networking options.
In a terminal enter:
sudo adduser $USER libvirt
If the chosen user is the current user, you will need to log out and back in for the new group membership to take effect.
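Whether the change has been recorded can be checked before logging out. The following is a minimal sketch; the helper name `in_group` is hypothetical. It relies on `id -nG <user>` reading the group database, so it reports the new membership even while the current session still runs with the old groups.

```shell
# Succeed if user $1 is listed as a member of group $2 in the group database.
in_group() {
    id -nG "$1" | tr ' ' '\n' | grep -qxF -- "$2"
}

# Usage:
#   in_group "$USER" libvirt && echo "membership recorded"
```

Running `id -nG` without a user name shows the groups of the current session instead, which only change after logging in again.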
You are now ready to install a Guest operating system. Installing a virtual machine follows the same process as installing the operating system directly on the hardware.
You will need one of the following:
- A way to automate the installation.
- A keyboard and monitor attached to the physical machine.
- To use cloud images which are meant to self-initialise (see Multipass and UVTool).
In the case of virtual machines, a Graphical User Interface (GUI) is analogous to using a physical keyboard and mouse on a real computer. Instead of installing a GUI, the virt-manager application can be used to connect to a virtual machine’s console using VNC. See Virtual Machine Manager / Viewer for more information.
Virtual machine management
The following section covers the command-line tools around virsh that are part of libvirt itself. But there are various options at different levels of complexity and feature sets, like:
There are several utilities available to manage virtual machines and libvirt. The virsh utility can be used from the command line. Some examples:
To list running virtual machines:
virsh list
To start a virtual machine:
virsh start <guestname>
Similarly, to start a virtual machine at boot:
virsh autostart <guestname>
Reboot a virtual machine with:
virsh reboot <guestname>
The state of virtual machines can be saved to a file in order to be restored later. The following will save the virtual machine state into a file named save-my.state:
virsh save <guestname> save-my.state
Once saved the virtual machine will no longer be running.
A saved virtual machine can be restored using:
virsh restore save-my.state
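If the state file should carry the date in its name, the name can be generated rather than typed. This is only a sketch: the helper `state_file` and the `<guest>-YYYY-MM-DD.state` naming scheme are assumptions made for illustration, not a libvirt convention.

```shell
# Print a date-stamped file name for a saved guest state,
# in the form <guest>-YYYY-MM-DD.state
# $1 = guest name
state_file() {
    printf '%s-%s.state\n' "$1" "$(date +%F)"
}

# Usage (needs a running guest; <guestname> is a placeholder):
#   virsh save <guestname> "$(state_file <guestname>)"
```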
To shut down a virtual machine:
virsh shutdown <guestname>
A CDROM device can be mounted in a virtual machine by entering:
virsh attach-disk <guestname> /dev/cdrom /media/cdrom
To change the definition of a guest virsh exposes the domain via:
virsh edit <guestname>
That will allow one to edit the XML representation that defines the guest and when saving it will apply format and integrity checks on these definitions.
Editing the XML directly certainly is the most powerful way, but also the most complex one. Tools like Virtual Machine Manager / Viewer can help inexperienced users to do most of the common tasks.
If virsh (or other vir* tools) should connect to something other than the default qemu-kvm/system hypervisor, one can find alternatives for the connect option in man virsh or the libvirt docs.
virsh - as well as most other tools to manage virtualisation - can be passed connection strings.
virsh --connect qemu:///system
There are two options for the connection.
- qemu:///system: connect locally as root to the daemon supervising QEMU and KVM domains
- qemu:///session: connect locally as a normal user to their own set of QEMU and KVM domains
The default was always (and still is) qemu:///system as that is the behavior users are accustomed to. But there are a few benefits (and drawbacks) to qemu:///session to consider.
qemu:///session is per user and can, on a multi-user system, be used to separate the people. Most importantly, processes run under the permissions of the user, which means no permission struggle on the just-downloaded image in your $HOME or the just-attached USB-stick.
On the other hand it can’t access system resources very well, which includes network setup that is known to be hard with qemu:///session. It falls back to SLiRP networking, which is functional but slow and makes it impossible to be reached from other systems.
qemu:///system is different in that it is run by the global system-wide libvirt that can arbitrate resources as needed. But you might need to chown files to the right places and change permissions to have them usable.
Applications will usually decide on their primary use-case. Desktop-centric applications often choose qemu:///session while most solutions that involve an administrator anyway continue to default to qemu:///system.
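Instead of passing --connect on every call, the connection can also be chosen once per shell. The sketch below uses the LIBVIRT_DEFAULT_URI environment variable, which is not mentioned in the text above; treat its effect on a given installation as an assumption to verify.

```shell
# Assumption: the installed libvirt client tools honour LIBVIRT_DEFAULT_URI.
# Select the per-user instance for every libvirt tool started from this shell.
export LIBVIRT_DEFAULT_URI='qemu:///session'

# With libvirt installed, the URI actually in use can be printed with:
#   virsh uri
```

An explicit --connect argument still takes precedence for a single command.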
There are different types of migration available depending on the versions of libvirt and the hypervisor being used. In general those types are:
There are various options to those methods, but the entry point for all of them is virsh migrate. Read the integrated help for more detail.
virsh migrate --help
Some useful documentation on the constraints and considerations of live migration can be found at the Ubuntu Wiki.
If, rather than the hotplugging described here, you want to always pass through a device, then add the XML content of the device to your static guest XML representation via virsh edit <guestname>. In that case you won’t need to use attach/detach. There are different kinds of passthrough. Types available to you depend on your hardware and software setup.
Both kinds are handled in a very similar way and while there are various ways to do it (e.g. also via QEMU monitor), driving such a change via libvirt is recommended. That way, libvirt can try to manage all sorts of special cases for you and also somewhat masks version differences.
In general when driving hotplug via libvirt you create an XML snippet that describes the device just as you would do in a static guest description. A USB device is usually identified by vendor/product ID:
<hostdev mode='subsystem' type='usb' managed='yes'>
  <source>
    <vendor id='0x0b6d'/>
    <product id='0x3880'/>
  </source>
</hostdev>
Virtual functions are usually assigned via their PCI ID (domain, bus, slot, function).
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0000' bus='0x04' slot='0x10' function='0x0'/>
  </source>
</hostdev>
Getting the virtual function in the first place is very device dependent and can therefore not be fully covered here. But in general it involves setting up an IOMMU, registering via VFIO and sometimes requesting a number of VFs. Here is an example on ppc64el to get 4 VFs on a device:
$ sudo modprobe vfio-pci
# identify device
$ lspci -n -s 0005:01:01.3
0005:01:01.3 0200: 10df:e228 (rev 10)
# register and request VFs
$ echo 10df e228 | sudo tee /sys/bus/pci/drivers/vfio-pci/new_id
$ echo 4 | sudo tee /sys/bus/pci/devices/0005\:01\:00.0/sriov_numvfs
You then attach or detach the device via libvirt by relating the guest with the XML snippet.
virsh attach-device <guestname> <device-xml>
# Use the Device in the Guest
virsh detach-device <guestname> <device-xml>
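The <device-xml> argument is a file that holds the snippet. A minimal sketch for producing such a file for a USB device follows. The function name `write_usb_hostdev` and the file name `usb-device.xml` are illustrative only, and the IDs are the ones from the USB example above.

```shell
# Write a USB hostdev snippet to a file.
# $1 = vendor ID, $2 = product ID, $3 = output file
write_usb_hostdev() {
    cat > "$3" <<EOF
<hostdev mode='subsystem' type='usb' managed='yes'>
  <source>
    <vendor id='$1'/>
    <product id='$2'/>
  </source>
</hostdev>
EOF
}

# Usage (needs libvirt and a running guest; <guestname> is a placeholder):
#   write_usb_hostdev 0x0b6d 0x3880 usb-device.xml
#   virsh attach-device <guestname> usb-device.xml
#   virsh detach-device <guestname> usb-device.xml
```

Keeping the file around is convenient because the same file is passed to detach-device later.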
Access QEMU Monitor via libvirt
The QEMU Monitor is the way to interact with QEMU/KVM while a guest is running. This interface has many powerful features for experienced users. When running under libvirt, the monitor interface is bound by libvirt itself for management purposes, but a user can still run QEMU monitor commands via libvirt. The general syntax is
virsh qemu-monitor-command [options] [guest] 'command'.
Libvirt covers most use cases needed, but if you ever want/need to work around libvirt or want to tweak very special options you can e.g. add a device as follows:
virsh qemu-monitor-command --hmp focal-test-log 'drive_add 0 if=none,file=/var/lib/libvirt/images/test.img,format=raw,id=disk1'
But since the monitor is so powerful, you can do a lot – especially for debugging purposes like showing the guest registers:
virsh qemu-monitor-command --hmp y-ipns 'info registers'
RAX=00ffffc000000000 RBX=ffff8f0f5d5c7e48 RCX=0000000000000000 RDX=ffffea00007571c0
RSI=0000000000000000 RDI=ffff8f0fdd5c7e48 RBP=ffff8f0f5d5c7e18 RSP=ffff8f0f5d5c7df8
[...]
Using huge pages can help to reduce TLB pressure and page table overhead, and can speed up some further memory-related actions. Furthermore, transparent huge pages, which are used by default, are helpful but can cause quite some overhead, so if it is clear that using huge pages is preferred then making them explicit usually brings some gains.
While huge pages are admittedly harder to manage (especially later in the system’s lifetime if memory is fragmented), they provide a useful boost, especially for rather large guests.
When using device passthrough on very large guests there is an extra benefit of using huge pages as it is faster to do the initial memory clear on VFIO DMA pin.
Huge page allocation
Huge pages come in different sizes. A normal page is usually 4k and huge pages are either 2M or 1G, but depending on the architecture other options are possible.
The simplest yet least reliable way to allocate some huge pages is to just echo a value to sysfs:
echo 256 | sudo tee /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
Be sure to re-check if it worked:
cat /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
256
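The value to write depends on how much guest memory should be backed. The 256 pages of 2048 kB above amount to 512 MiB. The arithmetic can be sketched as follows; `hugepages_needed` is a hypothetical helper name, and the result covers guest memory only, not any additional overhead.

```shell
# Print the number of huge pages needed to back a given amount of memory,
# rounded up to whole pages.
# $1 = memory in MiB, $2 = huge page size in kB (defaults to 2048)
hugepages_needed() {
    echo $(( ($1 * 1024 + ${2:-2048} - 1) / ${2:-2048} ))
}

# Usage:
#   hugepages_needed 4096            # 4 GiB guest with 2M pages
#   hugepages_needed 4096 1048576    # 4 GiB guest with 1G pages
```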
One of these sizes is the “default huge page size”, which is used for the auto-mounted /dev/hugepages. Changing the default size requires a reboot and is set via default_hugepagesz.
You can check the current default size:
grep Hugepagesize /proc/meminfo
Hugepagesize: 2048 kB
But there can be more than one at the same time – so it’s a good idea to check:
$ tail /sys/kernel/mm/hugepages/hugepages-*/nr_hugepages
==> /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages <==
0

==> /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages <==
2
And even that could, on bigger systems, be further split per NUMA node.
Huge pages need to be allocated by the kernel as mentioned above, but to be consumable they also have to be mounted. By default, systemd will make /dev/hugepages available for the default huge page size.
Feel free to add more mount points if you need different sized ones. An overview can be queried with:
hugeadm --list-all-mounts
Mount Point          Options
/dev/hugepages       rw,relatime,pagesize=2M
A one-stop overview of the overall huge page status of the system can be reported with:
hugeadm --explain
Huge page usage in libvirt
With the above in place, libvirt can map guest memory to huge pages. In a guest definition add the simplest form of:
<memoryBacking>
  <hugepages/>
</memoryBacking>
That will allocate the huge pages using the default huge page size from an autodetected mount point.
For more control, e.g. how memory is spread over NUMA nodes or which page size to use, check out the details at the libvirt docs.
Controlling addressing bits
This is a topic that rarely matters on a single computer with virtual machines for generic use; libvirt will automatically use the hypervisor default, which in the case of QEMU is 40 bits. This default aims for compatibility since it will be the same on all systems, which simplifies migration between them and usually is compatible even with older hardware.
However, it can be very important when driving more advanced use cases. If one needs bigger guest sizes with more than a terabyte of memory then controlling the addressing bits is crucial.
-hpb machine types
Since Ubuntu 18.04 the QEMU in Ubuntu has provided special machine types. These are the regular Ubuntu machine types, such as pc-i440fx-jammy, but with a -hpb suffix. The “hpb” abbreviation stands for “host-physical-bits”, which is the QEMU option that this represents.
For example, by using pc-q35-jammy-hpb the guest would use the number of physical bits that the host CPU has available.
Expressing the request for more address bits as a machine type has the benefit that many higher-level management stacks, such as OpenStack, are already able to control it through libvirt.
One can check the bits available to a given CPU via the procfs:
$ cat /proc/cpuinfo | grep '^address sizes'
...
# an older server with a E5-2620
address sizes : 46 bits physical, 48 bits virtual
# a laptop with an i7-8550U
address sizes : 39 bits physical, 48 bits virtual
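The number of bits translates directly into the largest physical address space a guest can use: 2^bits bytes, so the 40-bit default corresponds to 1 TiB. The following sketch makes that conversion explicit. The helper names `phys_bits` and `addressable_gib` are made up for this illustration.

```shell
# Print the physical address bits of the first CPU in a cpuinfo-style file.
# $1 = file to read (defaults to /proc/cpuinfo)
phys_bits() {
    sed -n 's/^address sizes[[:space:]]*:[[:space:]]*\([0-9]*\) bits physical.*/\1/p' "${1:-/proc/cpuinfo}" | head -n 1
}

# Print the addressable physical memory in GiB for a number of address bits.
# Only meaningful for 30 bits or more, since 2^30 bytes = 1 GiB.
addressable_gib() {
    echo $(( 1 << ($1 - 30) ))
}

# Usage:
#   addressable_gib "$(phys_bits)"
```

With the values shown above, 46 bits give 65536 GiB (64 TiB) and 39 bits give 512 GiB.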
maxphysaddr guest configuration
Since libvirt version 8.7.0 (>= Ubuntu 23.04 Lunar), maxphysaddr can be controlled via the CPU model and topology section of the guest configuration.
If one needs just a large guest, like before when using the -hpb types, all that is needed is the following libvirt guest XML configuration:
<maxphysaddr mode='passthrough' />
Since libvirt 9.2.0 and 9.3.0 (>= Ubuntu 23.10 Mantic), an explicit number of emulated bits or a limit to the passthrough can be specified. Combined, this pairing can be very useful for computing clusters where the CPUs have different hardware physical addressing bits. Without these features guests could be large, but potentially unable to migrate freely between all nodes since not all systems would support the same amount of addressing bits.
But now, one can either set a fixed value of addressing bits:
<maxphysaddr mode='emulate' bits='42'/>
Or use the best value available on the given hardware, without going over a certain limit, to retain some compute node compatibility.
<maxphysaddr mode='passthrough' limit='41'/>
By default libvirt will spawn QEMU guests using AppArmor isolation for enhanced security. The AppArmor rules for a guest will consist of multiple elements:
- A static part that all guests share =>
- A dynamic part created at guest start time and modified on hotplug/unplug =>
Of the above, the former is provided and updated by the libvirt-daemon package and the latter is generated on guest start. Neither of the two should be manually edited. They will, by default, cover the vast majority of use cases and work fine. But there are certain cases where users either want to:
- Further lock down the guest, e.g. by explicitly denying access that usually would be allowed.
- Open up the guest isolation. Most of the time this is needed if the setup on the local machine does not follow the commonly used paths.
To do so there are two files. Both are local overrides, which you can modify without having them clobbered or triggering configuration file prompts on package upgrades.
This will be applied to every guest. Therefore it is a rather powerful (if blunt) tool. It is a quite useful place to add additional deny rules.
The above-mentioned dynamic part that is individual per guest is generated by a tool called libvirt.virt-aa-helper. That is under AppArmor isolation as well. This is most commonly used if you want to use uncommon paths as it allows one to have those uncommon paths in the guest XML (see virsh edit) and have those paths rendered to the per-guest dynamic rules.
Sharing files between Host<->Guest
To be able to exchange data, the memory of the guest has to be allocated as “shared”. To do so you need to add the following to the guest config:
<memoryBacking>
  <access mode='shared'/>
</memoryBacking>
For performance reasons (it helps virtiofs, but also is generally wise to consider) it is recommended to use huge pages, which then would look like:
<memoryBacking>
  <hugepages>
    <page size='2048' unit='KiB'/>
  </hugepages>
  <access mode='shared'/>
</memoryBacking>
In the guest definition one can then add filesystem sections to specify host paths to share with the guest. The target dir is a bit special as it isn’t really a directory; instead it is a tag that the guest can use to access this particular filesystem:
<filesystem type='mount' accessmode='passthrough'>
  <driver type='virtiofs'/>
  <source dir='/var/guests/h-virtiofs'/>
  <target dir='myfs'/>
</filesystem>
And in the guest this can now be used based on the tag myfs:
sudo mount -t virtiofs myfs /mnt/
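To have the share mounted on every guest boot, an entry in the guest's /etc/fstab can be used instead of the manual mount. The sketch below only prints such an entry. The helper name `virtiofs_fstab_line` is hypothetical, and the `defaults 0 0` fields are an assumption that suits a simple setup.

```shell
# Print an fstab entry that mounts a virtiofs tag at a mount point.
# $1 = virtiofs tag (the target dir value), $2 = mount point in the guest
virtiofs_fstab_line() {
    printf '%s %s virtiofs defaults 0 0\n' "$1" "$2"
}

# Usage inside the guest:
#   virtiofs_fstab_line myfs /mnt | sudo tee -a /etc/fstab
```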
Compared to other Host/Guest file sharing options – commonly Samba, NFS or 9P –
virtiofs is usually much faster and also more compatible with usual file system semantics. For some extra compatibility in regard to filesystem semantics one can add:
<binary xattr='on'>
  <lock posix='on' flock='on'/>
</binary>
See the libvirt domain/filesystem documentation for further details on these.
virtiofs works with >=20.10 (Groovy); with >=21.04 (Hirsute) it became more comfortable, especially in small environments (no hard requirement to specify guest NUMA topology, no hard requirement to use huge pages). If you need to set it up on 20.10, or are just interested in those details, the libvirt knowledge base about virtiofs holds more information.
See the KVM home page for more details.
For more information on libvirt see the libvirt home page.
Another good resource is the Ubuntu Wiki KVM page.
For basics on how to assign VT-d devices to QEMU/KVM, please see the linux-kvm page.