Data Plane Development Kit
DPDK is a set of libraries and drivers for fast packet processing that runs mostly in Linux userland. Its libraries provide the so-called “Environment Abstraction Layer” (EAL), which hides the details of the environment and provides a standard programming interface. Common use cases are specialized solutions such as network function virtualization and advanced high-throughput network switching. DPDK uses a run-to-completion model for fast data plane performance and accesses devices via polling, which eliminates the latency of interrupt processing at the cost of higher CPU consumption. It was designed to run on any processor. The first supported CPU was Intel x86, and support has since been extended to IBM PPC64 and ARM64.
Ubuntu further provides some infrastructure to ease DPDK's usability.
This package is currently compiled for the lowest possible CPU requirements, which still means the CPU has to support at least SSE3 and everything activated by -march=corei7 (in gcc).
The list of network cards supported by upstream DPDK can be found at supported NICs. However, many of those are disabled by default in the upstream project because they are not yet in a stable state. The subset of network cards that DPDK has enabled in the package as available in Ubuntu 16.04 is:
DPDK has “userspace” drivers for the cards called PMDs.
The packages for these follow the pattern of librte-pmd-<type>-<version>. Therefore the example for an Intel e1000 in 18.11 would be
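Applying the naming pattern is purely mechanical. The sketch below builds a package name from a driver type and a DPDK version. The helper name pmd_package_name is made up for illustration, and whether a given package actually exists in the archive still has to be checked, for example with apt-cache policy.

```shell
# Hypothetical helper: build a PMD package name from the pattern
# librte-pmd-<type>-<version> described above.
pmd_package_name() {
    pmd_type=$1
    pmd_version=$2
    printf 'librte-pmd-%s-%s\n' "$pmd_type" "$pmd_version"
}

pmd_package_name e1000 18.11   # prints librte-pmd-e1000-18.11
```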
The more commonly used, tested and fully supported drivers are installed as dependencies of dpdk. Many more are available in universe and follow the same naming pattern.
Unassigning the default Kernel drivers
Cards have to be unassigned from their kernel driver and instead be assigned to uio_pci_generic or vfio-pci. uio_pci_generic is older and usually easier to get working, but it also offers fewer features and less isolation.
The newer vfio-pci requires that you activate the following kernel parameters to enable the IOMMU.
Or on AMD
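The parameters themselves are not reproduced above. As an assumption rather than a statement of what this guide prescribes, intel_iommu=on together with iommu=pt is a combination commonly used on Intel systems; consult the kernel parameter documentation for your platform. Whatever you choose, you can confirm after a reboot that it reached the kernel. The helper has_kernel_param below is a hypothetical name used for this sketch.

```shell
# Hypothetical helper: succeed if the given kernel command line contains
# the given parameter as a whole word.
has_kernel_param() {
    cmdline=$1
    param=$2
    case " $cmdline " in
        *" $param "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage against the running kernel (intel_iommu=on is an assumed example value):
#   has_kernel_param "$(cat /proc/cmdline)" "intel_iommu=on" && echo "parameter is active"
```

Matching whole words avoids false positives, for example iommu=pt is not treated as present just because intel_iommu=on is.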
On top of that, for vfio-pci you then have to configure and assign the IOMMU groups accordingly. That is mostly done in firmware and by the hardware layout; you can check the group assignment the kernel probed in
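The location is not given in this text. On mainline Linux kernels the IOMMU groups are exposed in sysfs under /sys/kernel/iommu_groups, which the following sketch assumes. The function name list_iommu_groups and its optional root argument are illustrative only.

```shell
# Print one line per device with the IOMMU group it belongs to.
# Assumes the sysfs layout <root>/<group>/devices/<pci-id>.
list_iommu_groups() {
    root=${1:-/sys/kernel/iommu_groups}
    for dev in "$root"/*/devices/*; do
        # Skip the unexpanded glob when no groups exist.
        [ -e "$dev" ] || continue
        group=${dev#"$root"/}
        group=${group%%/*}
        echo "group $group: ${dev##*/}"
    done
}

# Usage: list_iommu_groups
```

If the function prints nothing, the kernel has not set up any IOMMU groups, which usually means the IOMMU is disabled in firmware or on the kernel command line.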
Note: virtio is special; DPDK can work directly on those devices without vfio-pci/uio_pci_generic. But to avoid issues caused by the kernel and DPDK both managing the device, you still have to unassign the kernel driver.
Manual configuration and status checks can be done via sysfs or with the tool dpdk-devbind.py:

dpdk-devbind.py [options] DEVICE1 DEVICE2 ....

where DEVICE1, DEVICE2 etc, are specified via PCI "domain:bus:slot.func" syntax
or "bus:slot.func" syntax. For devices bound to Linux kernel drivers, they may
also be referred to by Linux interface name e.g. eth0, eth1, em0, em1, etc.

Options:
    --help, --usage:
        Display usage information and quit

    -s, --status:
        Print the current status of all known network, crypto, event
        and mempool devices.
        For each device, it displays the PCI domain, bus, slot and function,
        along with a text description of the device. Depending upon whether the
        device is being used by a kernel driver, the igb_uio driver, or no
        driver, other relevant information will be displayed:
        * the Linux interface name e.g. if=eth0
        * the driver being used e.g. drv=igb_uio
        * any suitable drivers not currently using that device
            e.g. unused=igb_uio
        NOTE: if this flag is passed along with a bind/unbind option, the
        status display will always occur after the other operations have taken
        place.

    --status-dev:
        Print the status of given device group. Supported device groups are:
        "net", "crypto", "event", "mempool" and "compress"

    -b driver, --bind=driver:
        Select the driver to use or "none" to unbind the device

    -u, --unbind:
        Unbind a device (Equivalent to "-b none")

    --force:
        By default, network devices which are used by Linux - as indicated by
        having routes in the routing table - cannot be modified. Using the
        --force flag overrides this behavior, allowing active links to be
        forcibly unbound.
        WARNING: This can lead to loss of network connection and should be used
        with caution.
Examples:
---------

To display current device status:
        dpdk-devbind.py --status

To display current network device status:
        dpdk-devbind.py --status-dev net

To bind eth1 from the current driver and move to use igb_uio
        dpdk-devbind.py --bind=igb_uio eth1

To unbind 0000:01:00.0 from using any driver
        dpdk-devbind.py -u 0000:01:00.0

To bind 0000:02:00.0 and 0000:02:00.1 to the ixgbe kernel driver
        dpdk-devbind.py -b ixgbe 02:00.0 02:00.1
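For the sysfs route mentioned above, the driver currently bound to a PCI device can be read from the driver symlink in the device's sysfs directory. The following is a minimal sketch of such a status check. The function name pci_current_driver and the optional sysfs root argument are assumptions made for this example and are not part of DPDK.

```shell
# Print the driver currently bound to a PCI device, or "none".
# The device is given in full "domain:bus:slot.func" form.
pci_current_driver() {
    dev=$1
    sysroot=${2:-/sys}
    link="$sysroot/bus/pci/devices/$dev/driver"
    if [ -L "$link" ]; then
        # The symlink points at the driver directory; its last
        # path component is the driver name.
        basename "$(readlink "$link")"
    else
        echo "none"
    fi
}

# Usage: pci_current_driver 0000:04:00.0
```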
DPDK Device configuration
The package dpdk provides init scripts that ease configuration of device assignment and huge pages. It also makes them persistent across reboots.
The following is an example of the file /etc/dpdk/interfaces configuring two ports of a network card, one with uio_pci_generic and the other with vfio-pci.
# <bus>         Currently only "pci" is supported
# <id>          Device ID on the specified bus
# <driver>      Driver to bind against (vfio-pci or uio_pci_generic)
#
# Be aware that the two DPDK compatible drivers uio_pci_generic and vfio-pci are
# part of linux-image-extra-<VERSION> package.
# This package is not always installed by default - for example in cloud-images.
# So please install it in case you run into missing module issues.
#
# <bus> <id>     <driver>
pci 0000:04:00.0 uio_pci_generic
pci 0000:04:00.1 vfio-pci
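To make the file format concrete, the sketch below reads such a file and prints the manual dpdk-devbind.py call that corresponds to each entry. This is not the packaged init script, only an illustration of how the three columns map to a bind operation; the function name interfaces_to_devbind is hypothetical.

```shell
# Print one dpdk-devbind.py command per entry of an interfaces file.
# Comment lines and empty lines are skipped; only bus "pci" is handled.
interfaces_to_devbind() {
    conf=${1:-/etc/dpdk/interfaces}
    while read -r bus id driver rest; do
        case "$bus" in
            ''|'#'*) continue ;;
        esac
        if [ "$bus" != "pci" ]; then
            echo "skipping unsupported bus: $bus" >&2
            continue
        fi
        echo "dpdk-devbind.py --bind=$driver $id"
    done < "$conf"
}

# Usage: interfaces_to_devbind /etc/dpdk/interfaces
```

Printing the commands instead of running them keeps the sketch safe to try on a machine whose network connection depends on one of the listed cards.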
Cards are identified by their PCI-ID. If you are unsure, you can use the tool dpdk_nic_bind.py to show the currently available devices and the drivers they are assigned to.
dpdk_nic_bind.py --status

Network devices using DPDK-compatible driver
============================================
0000:04:00.0 'Ethernet Controller 10-Gigabit X540-AT2' drv=uio_pci_generic unused=ixgbe

Network devices using kernel driver
===================================
0000:02:00.0 'NetXtreme BCM5719 Gigabit Ethernet PCIe' if=eth0 drv=tg3 unused=uio_pci_generic *Active*
0000:02:00.1 'NetXtreme BCM5719 Gigabit Ethernet PCIe' if=eth1 drv=tg3 unused=uio_pci_generic
0000:02:00.2 'NetXtreme BCM5719 Gigabit Ethernet PCIe' if=eth2 drv=tg3 unused=uio_pci_generic
0000:02:00.3 'NetXtreme BCM5719 Gigabit Ethernet PCIe' if=eth3 drv=tg3 unused=uio_pci_generic
0000:04:00.1 'Ethernet Controller 10-Gigabit X540-AT2' if=eth5 drv=ixgbe unused=uio_pci_generic

Other network devices
=====================
<none>
DPDK HugePage configuration
DPDK makes heavy use of huge pages to reduce pressure on the TLB. Therefore hugepages have to be configured in your system.
The dpdk package has a config file and scripts that try to ease hugepage configuration for DPDK in the form of /etc/dpdk/dpdk.conf. If you have more consumers of hugepages than just DPDK in your system, or very special requirements for how your hugepages are set up, you will likely want to allocate and control them yourself. If not, this can be a great simplification to get DPDK configured for your needs.
Here is an example configuring 1024 hugepages of 2M each and four 1G pages.
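A setup like this could look roughly as follows. The variable names NR_2M_PAGES and NR_1G_PAGES are an assumption about the packaged config file, so verify them against the comments in your installed /etc/dpdk/dpdk.conf. The helper hugepage_total_mib is hypothetical and only shows how much memory such a setup reserves.

```shell
# Assumed variable names; check your installed /etc/dpdk/dpdk.conf.
NR_2M_PAGES=1024
NR_1G_PAGES=4

# Total memory reserved, in MiB: each 2M page counts 2 MiB,
# each 1G page counts 1024 MiB.
hugepage_total_mib() {
    nr_2m=$1
    nr_1g=$2
    echo $(( nr_2m * 2 + nr_1g * 1024 ))
}

hugepage_total_mib "$NR_2M_PAGES" "$NR_1G_PAGES"   # prints 6144
```

So this example takes 2 GiB in 2M pages plus 4 GiB in 1G pages, 6 GiB in total, away from the memory available to everything else on the system.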
As shown this supports configuring 2M and the larger 1G hugepages (or a mix of both). It will make sure there are proper hugetlbfs mountpoints for DPDK to find both sizes no matter what your default huge page size is. The config file itself holds more details on certain corner cases and a few hints if you want to allocate hugepages manually via a kernel parameter.
Which size you want depends on your needs. 1G pages are certainly more effective regarding TLB pressure, but there were reports of them fragmenting inside the DPDK memory allocations. It can also be harder to find enough free space to set up a certain number of 1G pages later in the life cycle of a system.
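Because a late request for 1G pages may be only partially satisfied, it can be worth comparing what you asked for with what the kernel actually allocated. The kernel reports the per-size page count in sysfs under /sys/kernel/mm/hugepages. The following sketch reads it; the function name hugepages_allocated and the optional sysfs root argument are made up for this example.

```shell
# Print how many hugepages of the given size (in kB) are allocated.
# Prints 0 if the kernel does not offer that page size.
hugepages_allocated() {
    size_kb=$1
    sysroot=${2:-/sys}
    file="$sysroot/kernel/mm/hugepages/hugepages-${size_kb}kB/nr_hugepages"
    if [ -r "$file" ]; then
        cat "$file"
    else
        echo 0
    fi
}

# Usage:
#   hugepages_allocated 2048       # 2M pages
#   hugepages_allocated 1048576    # 1G pages
```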
Compile DPDK Applications
Currently there are not many consumers of the DPDK library that are stable and released. OpenVswitch-DPDK is an exception (see below), and more are appearing. In general, though, you might still want to compile an app against the library.
You will often find guides that tell you to fetch the DPDK sources, build them to your needs and eventually build your application based on DPDK by setting RTE_* values for the build system. Since Ubuntu provides an already compiled DPDK for you, you can skip all that.
DPDK provides a valid pkg-config file to simplify setting the proper variables and options.
sudo apt-get install dpdk-dev libdpdk-dev
gcc testdpdkprog.c $(pkg-config --libs --cflags libdpdk) -o testdpdkprog
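In a build script it helps to fail early with a clear message when the pkg-config data is missing instead of letting gcc fail on missing headers. The following is a minimal sketch of that idea; the function name build_dpdk_app is hypothetical, while the pkg-config module name libdpdk is the one used in the command above.

```shell
# Compile a single-file DPDK program, checking for libdpdk first.
build_dpdk_app() {
    src=$1
    out=$2
    if ! command -v pkg-config >/dev/null 2>&1 || ! pkg-config --exists libdpdk; then
        echo "libdpdk not found via pkg-config; install libdpdk-dev first" >&2
        return 1
    fi
    echo "building against DPDK $(pkg-config --modversion libdpdk)"
    # Left unquoted on purpose so the flags are split into separate words.
    gcc "$src" $(pkg-config --libs --cflags libdpdk) -o "$out"
}

# Usage: build_dpdk_app testdpdkprog.c testdpdkprog
```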
An example of a complex (autoconfigure) user of DPDK's pkg-config, including fallbacks to the older non-pkg-config style, can be seen in the OpenVswitch build system.
Depending on what you build, it might be a good idea to install all of DPDK's build dependencies before running make, which on Ubuntu can be done automatically with:
sudo apt-get build-dep dpdk
DPDK in KVM Guests
If you have no access to DPDK supported network cards you can still work with DPDK by using its support for virtio. To do so you have to create guests backed by hugepages (see above).
On top of that, the guest CPU is required to have at least SSE3. The default CPU model qemu/libvirt uses only goes up to SSE2, so you will have to define a model that passes the proper feature flags (or use host-passthrough).
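Inside the guest you can verify that the CPU model exposes SSE3 before trying to start a DPDK application. On x86 Linux the SSE3 feature appears in /proc/cpuinfo under the flag name pni (ssse3 is a different, later extension). The function name cpu_has_sse3 and its optional file argument are assumptions made for this sketch.

```shell
# Succeed if the first "flags" line of cpuinfo lists pni (SSE3).
cpu_has_sse3() {
    cpuinfo=${1:-/proc/cpuinfo}
    grep -m1 '^flags' "$cpuinfo" | grep -qw 'pni'
}

# Usage:
#   cpu_has_sse3 && echo "SSE3 available" || echo "SSE3 missing, adjust the CPU model"
```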
An example can be found in the following snippet for your virsh XML (or the equivalent virsh interface you use).
Also, virtio nowadays supports multiqueue, which DPDK in turn can exploit for better speed. To give a normal virtio NIC multiple queues, to be consumed later, e.g. by DPDK in the guest, add the following to your interface definition.
<driver name="vhost" queues="4"/>
Since DPDK on its own is only a (massive) library, you will most likely want to continue to OpenVswitch-DPDK as an example of putting it to use.