Introducing etrace – a multi-purpose application profiling tool

Ian Johnson

on 30 October 2020

These days, the internal workings of Linux applications involve many different moving parts. Sometimes, it can be rather difficult to debug them when things go wrong or run slower than expected. Tracing an application’s execution is one way of understanding potential issues without diving into the source code. To this end, we wrote an app-tracing tool called etrace, designed to detect performance bottlenecks and runtime issues in snaps.

In this article, we will be taking a look at etrace with an overview of the basic functionality of etrace, and highlight its usage through several representative examples.

Etrace at a glance

Etrace is a generic tracing application, useful for three broad measurement and debugging purposes:

  • How much time it takes for an application to display a (graphical/UI) window on a screen.
  • The sequence of tasks created and executed by the main program during its runtime.
  • The list of files accessed during a program’s execution.

These metrics can be used to debug potential problems in snaps, understand what a snap is trying to do, or find performance bottlenecks in a snap. Of course, it also works with native Linux packages or any executable program as well, albeit with slightly reduced out of the box functionality – it doesn’t reinstall the native package for example – but can still trace the application and measure how long it takes to display a window.

Basic usage

Etrace is available as a snap – first we have to install it. Because etrace is used to run arbitrary programs, including other snaps and even traditional linux packages, it needs system-wide permissions via classic confinement, which can be accepted by using the –classic flag when running the below command. To install etrace:

snap install etrace --candidate --classic 

Note that currently, etrace only works with X11 systems.

The first use case of etrace is to measure how long it takes for a graphical snap application to display a window on a screen. We included this functionality in etrace partly because there were not any sufficiently effective tools or options to achieve this available out there. With etrace, it is very easy to do this.

Let’s start out with a simple snap, gnome-calculator, and run it in a loop 10 times to see how much time this execution takes. Please note you need to have gnome-calculator installed – snap install gnome-calculator. Here we use the –no-trace option because we don’t want the full tracing stack, we just want etrace to measure how long it takes to launch – we will get into the full tracing capabilities later on.

$ etrace --repeat=10 exec --use-snap-run --no-trace  gnome-calculator --cmd-stderr=/dev/null
Total startup time: 1.531152957s
Total startup time: 513.948576ms
Total startup time: 512.980061ms
Total startup time: 515.576753ms
Total startup time: 508.354472ms
Total startup time: 515.734329ms
Total startup time: 508.414271ms
Total startup time: 514.258788ms
Total startup time: 508.407346ms
Total startup time: 511.950964ms

When you run this command, you should see the gnome-calculator window show up very briefly 10 times. If this is the first time you have run gnome-calculator on this boot, it may be slower for reasons we explain in another blog post. The numbers shown indicate elapsed time it takes for the application to display a window. For non-graphical applications, it will be the time it takes before the application exits by itself.

This use case itself is great if you want to check snap startup time. Snap developers can use this first etrace functionality to help optimize their startup times and compare different application settings. Etrace also supports JSON output for integration with other scripts as well as CI, build systems for testing and metrics collection/data gathering.

Useful Snap Options

As mentioned earlier, snap applications may be slower to start on the very first execution due to system caches not being set up. Re-running the application multiple times will usually erase this one-time cost, but when measuring how long it takes for a snap to start, we want to specifically look at this worst case scenario when data is not cached in memory and the system has to do the maximum amount of work before the user sees the application. To accommodate this frequent use-case with etrace, we have two options which make this much easier. The first is the ability to reinstall the snap:

--reinstall-snap 

This option will remove the snap (and thus delete its root-owned data in system directories, but not user-data), and then reinstall the snap. This clears out system caches like fonts, internal databases, and other data that is typically owned by root, but does not delete data that is specific to a user, such as browsing history.

The second option allows you to handle user data:

--clean-snap-user-data

This flag will create a snapshot of the snap before deleting the user data. As mentioned, an example of user data would be a Web browser profile.

With both of these flags specified in the etrace command we clear more of the system’s cache and thus see more consistent startup times for gnome-calculator on the same machine:

$ etrace --repeat=10 exec --use-snap-run --no-trace  gnome-calculator --reinstall-snap --clean-snap-user-data --cmd-stderr=/dev/null
Total startup time: 1.524554867s
Total startup time: 1.531155857s
Total startup time: 1.526059853s
Total startup time: 1.533270808s
Total startup time: 1.535193187s
Total startup time: 1.538566123s
Total startup time: 1.529733654s
Total startup time: 1.528315879s
Total startup time: 1.53689303s
Total startup time: 1.023516385s

Here, we see that the startup times are slower than before as well as more consistent, due to the elimination of the caching that the system was doing in between runs. It is also worth noting that in the previous example, the startup was much slower than all the other ones because it was the first execution of gnome-calculator in this desktop session.

If your application requires additional things to be done before/after starting it up either to test a specific scenario or to clear out more caching that might be taking place, etrace supports running generic scripts via the –prepare-script and –restore-script options as well as –prepare-script-args and –restore-script-args.

Tracing task execution

The second use case for etrace is to check what tasks or processes a snap runs during its execution. This is useful to examine the snap’s behavior and troubleshoot snaps that do run well, fail to launch or provide no meaningful output.

While etrace normally will automatically close the graphical window created by the program, sometimes it’s necessary to not have this behavior, for example to test a specific action in the application or when working with a command line application which does not create a window. For these use cases, there is the –no-window-wait option to instruct etrace to wait for the program to exit by itself. 

Here we use etrace exec with full tracing turned on (by omitting the –no-trace option):

$ etrace exec telegram-desktop --no-window-wait
56 exec calls during snap run:
    Start   Stop     Elapsed       Exec
    0       95417    95.417022ms   /snap/bin/telegram-desktop
    14991   20267    5.276918ms    /usr/lib/snapd/snap-seccomp
    38522   39649    1.127004ms    /usr/lib/snapd/snap-device-helper
    40294   41350    1.055955ms    /usr/lib/snapd/snap-device-helper
    41984   43035    1.051902ms    /usr/lib/snapd/snap-device-helper
    43688   44741    1.053094ms    /usr/lib/snapd/snap-device-helper
...
    82329   83477    1.148939ms    /usr/lib/snapd/snap-device-helper
    88778   93460    4.682064ms    snap-update-ns
    95417   100613   5.196094ms    /usr/lib/snapd/snap-exec
    100613  212749   112.13684ms   /snap/telegram-desktop/1708/bin/desktop-launch
    105275  107645   2.36988ms     /usr/bin/date
    115309  118616   3.30615ms     /usr/bin/getent
...
    120239  122471   2.232074ms    /usr/bin/md5sum
    192968  196316   3.347873ms    /usr/bin/head
    199725  203120   3.395795ms    /usr/bin/ln
    204533  207864   3.331899ms    /usr/bin/rm
    208199  211477   3.277063ms    /usr/bin/ln
    212749  6000720  5.787970066s  /snap/telegram-desktop/1708/usr/bin/telegram-desktop
Total time:  6.000720024s
Total startup time: 6.008373172s

This is a table showing the total set of tasks that were run during the program’s full execution, including the start time and end time (in microseconds) as well as the total elapsed time for the task execution. This may help in debugging problems with broken snaps, for example an intermediate shell script may be misbehaving before the final program is executed, and this would be apparent to the snap developer who has an expectation of what the the snap should be doing in the normal case.

Tracing file accesses

The third use-case of etrace is to see what files are accessed by a snap at runtime. For example, classic snaps, since they are not confined, may access any file on a user’s system and thus can sometimes run into bugs where they are reading the ‘wrong’ file from the host and not those shipped with the snap itself, the way that snaps are designed to work. This third functionality is also helpful if you’re trying to understand the full scope of a program’s execution. It can also be used to get an explicit list of files that a graphical snap tries to access before displaying a window. 

To do this, use the file subcommand. Here we will look at a simple program, hello-world, which accesses a small number of files (many snaps end up accessing hundreds of files):

$ ./etrace file --use-snap-run hello-world --no-window-wait
Hello World!
6 files accessed during snap run:
    Filename                            Size (bytes)
    /dev/pts/0                          0
    /etc/ld.so.cache                    157250
    /home/user/                         4096
    /lib/x86_64-linux-gnu/libc-2.23.so  
    /lib/x86_64-linux-gnu/libc.so.6     2030544
    /snap/hello-world/29/bin/echo       31

Total startup time: 87.038215ms

This will show us the files that were accessed as well as their sizes. There are options to also show what program accessed which file, filter the files by their location, or filter the files by what programs accessed those files, similar to the lsof command. Etrace also supports a rich JSON output format with details such as when the file was accessed, what syscall accessed the file and others.

Summary

Hopefully you find this blog post and etrace useful the next time you need to measure how long snaps take to startup, or debug problems with your snap applications, or even just peek at what is going on under the hood of your favorite snap.

Etrace is undergoing active development and welcomes contributions! We will be posting more about etrace in the coming future with some more use cases and stories. As always, come join us on the snapcraft forum if you have comments or suggestions.

Photo by Steve Harvey on Unsplash.

Ubuntu desktop

Learn how the Ubuntu desktop operating system powers millions of PCs and laptops around the world.

Newsletter signup

Select topics you’re
interested in

In submitting this form, I confirm that I have read and agree to Canonical’s Privacy Notice and Privacy Policy.

Related posts

Popular snaps per distro (2020 edition)

Back in mid-2019, we wrote a blog post detailing and comparing the most popular snaps across multiple distributions – Arch Linux, CentOS, Debian, Fedora,...

The Hunt for Rogue Time – How we investigated and solved the Chromium snap slow startup problem

In October, we shared a blog post detailing significant snap startup time improvements due to the use of a new compression algorithm. In that article, we...

Snap speed improvements with new compression algorithm!

Security and performance are often mutually exclusive concepts. A great user experience is one that manages to blend the two in a way that does not compromise...