Why LZO was chosen as the new compression method
Everyone wants fast applications. Recently, we provided a mechanism to make snap applications launch faster by using the LZO format. We introduced this change because users reported desktop snaps starting more slowly than the same applications distributed via traditional, native Linux packaging formats like Deb or RPM.
After a thorough investigation, we pinpointed the compression method as the primary slowdown. Once we introduced the change, some users started wondering why we chose LZO as the new compression method for snaps, given that there are “better” algorithms available. Here, we want to take you through the journey of understanding why we picked LZO, and what is next for the snap compression story.
The old way
Previously, the only supported compression format for snaps was XZ. This decision was born of two main factors: compatibility and size. One of the primary delivery targets for snaps (in addition to desktop users) is IoT devices, and for those we wanted the smallest possible size. Cross-distro support is also very important for snaps, so we wanted to make sure that the chosen compression format would be compatible with the widest range of kernels. In both cases, XZ fit the bill nicely. Moreover, at the time the decision was made, some of the newer algorithms such as ZSTD did not yet exist.
It’s important to mention a few overarching design goals of snaps before we go much further. The first thing is the choice of squashfs as the packaging mechanism – instead of distributing individual files as tarballs, etc. We wanted both determinism and ease of deployment whereby all users of a snap get the same files, and those files are not modifiable (by the user). To satisfy both goals, we selected squashfs, which is a compressed filesystem format that is mounted read-only.
Orthogonal to that goal was package integrity. It was crucial to have a design where the files delivered to users were also cryptographically verified. We wanted a system where the bits that make up the .snap file are uploaded to the store by a snap developer, and those exact same bits are delivered to the user without modification. A delta upload/download functionality is used to save on network bandwidth, but that merely reconstructs the exact original bits that the snap developer uploaded to the store. Therefore, we aren’t able to recompress the snap either on the user’s computer or on our servers, as that would change the bits of the .snap file. As such, a snap can currently only be uploaded and distributed with a single compression setting.
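To make the bit-for-bit constraint concrete, here is a minimal Python sketch. It compresses the same payload with two different algorithms and compares the digests of the results. The payload, the `digest` helper, and the choice of SHA3-384 as the hash are assumptions made for illustration only; this is not the store's actual verification code.

```python
import hashlib
import lzma
import zlib


def digest(blob: bytes) -> str:
    """Return the SHA3-384 hex digest of a byte string (hash choice is illustrative)."""
    return hashlib.sha3_384(blob).hexdigest()


# Hypothetical stand-in for the contents of a snap; real snaps are squashfs images.
payload = b"example application data " * 1000

# The same payload, compressed two different ways.
as_xz = lzma.compress(payload)    # XZ container
as_zlib = zlib.compress(payload)  # zlib/DEFLATE stream

# Both decompress to identical content...
same_content = lzma.decompress(as_xz) == zlib.decompress(as_zlib) == payload

# ...but the compressed bytes differ, and so do their digests.
same_digest = digest(as_xz) == digest(as_zlib)

print(same_content, same_digest)  # prints: True False
```

The files inside are unchanged, yet any digest taken over the compressed artifact no longer matches. That is why recompressing a snap after upload would conflict with verifying the exact bits the developer uploaded.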
Why we needed to switch
When we started looking into why desktop applications packaged as snaps were slower, we explored multiple hypotheses, and we focused on the difference in startup times between the various packaging formats. For example, here is a graph created at the start of these explorations, demonstrating how many milliseconds it takes to launch a snap application vs the same application packaged natively:
The y-axis in the chart is the time before the application’s window is visible on the screen, as measured by the etrace utility, with each bar corresponding to a single launch. All launches were performed with as much caching turned off as possible.
One hypothesis was centered around the decompression of the squashfs snap taking some time, so we set up tests to run and compare the performance and timing of various supported compression algorithms for squashfs, including: no compression, GZIP, LZO, ZSTD, and of course XZ.
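A comparison like this needs one squashfs image per compression setting. The sketch below only builds the `mksquashfs` command lines for each variant and does not run them. The directory and file names are hypothetical, and these are not necessarily the exact options used in our tests. The `-comp` option and the `-noI`, `-noD`, `-noF` and `-noX` flags are standard squashfs-tools options.

```python
from typing import Dict, List

# Compressor names passed to mksquashfs via -comp.
COMPRESSORS = ["gzip", "lzo", "zstd", "xz"]

# Flags that turn off compression of inodes, data blocks, fragments and xattrs.
NO_COMPRESSION_FLAGS = ["-noI", "-noD", "-noF", "-noX"]


def build_commands(source_dir: str, out_prefix: str) -> Dict[str, List[str]]:
    """Return one mksquashfs argument list per compression variant."""
    commands: Dict[str, List[str]] = {}
    for comp in COMPRESSORS:
        image = f"{out_prefix}-{comp}.squashfs"
        commands[comp] = ["mksquashfs", source_dir, image, "-noappend", "-comp", comp]
    # The "none" variant keeps the default compressor but disables its use everywhere.
    image = f"{out_prefix}-none.squashfs"
    commands["none"] = ["mksquashfs", source_dir, image, "-noappend", *NO_COMPRESSION_FLAGS]
    return commands


# Hypothetical names: an unpacked snap in ./squashfs-root, images named chromium-*.squashfs.
for name, argv in build_commands("squashfs-root", "chromium").items():
    print(name, "->", " ".join(argv))
```

Each resulting image could then be installed and launched repeatedly from a cold start to collect timings for that variant.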
Another idea was that dynamic library search paths may be slowing things down. While optimizing these does provide some improvement in snap startup times, the contribution was less significant than the change of the compression algorithm. We explored this topic in greater detail in a snapcraft.io forum post.
Yet another theory was that perhaps snaps were slow due to desktop “helper scripts”, which are auxiliary shell scripts used to configure applications to ensure compatibility with the graphical system such as X11 or Wayland. Like the dynamic library search paths, the optimization of these scripts did not contribute a significant change or reduction in the overall snap startup times.
Eventually, we narrowed down our investigation to the compression, and found that, indeed, using no compression for the squashfs files resulted in the quickest startup time for many different snaps, reducing it by an order of magnitude in some scenarios on some systems.
Here is a graph of the various types of compression for the chromium snap specifically.
The graph above shows the startup time in milliseconds of the chromium snap with each of the compression algorithms tested. All times were averaged across multiple runs from a “cold start” with no caching activated.
This also demonstrated that in fact the format we initially chose had the worst performance when it comes to this kind of decompression at application startup. At the time when we chose XZ as the format, we were mostly concerned with size and it was expected, if not fully appreciated, that XZ would be slower than other alternatives.
This investigation made clear that we should evaluate the use of a new compression format to improve the startup times for snaps geared towards the desktop experience rather than size-constrained IoT devices.
As a middle ground between size and decompression performance, we decided to allow snaps to “opt in” to a better performing algorithm. Developers can choose an alternative compression format at build time (before uploading to the store). We decided it would be best not to enable every available compression format, because users on older kernels and distros may not have the necessary support for newer algorithms. If such a user tried to install a snap compressed with an unsupported format, they would not be able to install the application, which makes for a poor user experience.
What we switched to
Given all our constraints (the bit-for-bit guarantee, wide cross-distro support, and the security-driven decision not to re-compress on the fly, at least for now), we decided to select LZO as the second allowed compression format for snaps. We chose it because it is widely supported by old kernels and systems, it has relatively good performance (much better than XZ and not much worse than ZSTD, for example), and it still provides a good compression ratio, minimizing disk space usage (and network bandwidth).
Above, you can see a graph of the compression ratio for various representative snaps, where a higher compression ratio is better. We can see that LZO (in purple) has a worse compression ratio than XZ, and in fact is the worst of the four formats (excluding “none”, which by definition has no compression). However, it still has a ratio of around 2.5-3 for reasonably sized snaps such as chromium and mari0. For very large snaps like supertuxkart, the ratio is very close to 1 for all compression formats, so the choice between LZO and XZ makes little difference to size. If you are a snap author and you want to enable your snap to use LZO, have a look at this forum post to see how you can easily make the switch.
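The compression ratio discussed above is the uncompressed size divided by the compressed size, so a value of 1 means no savings at all. A minimal sketch of that arithmetic follows. The byte counts are made-up illustrative numbers, not measurements of any real snap.

```python
def compression_ratio(uncompressed_bytes: int, compressed_bytes: int) -> float:
    """Uncompressed size divided by compressed size; higher is better."""
    if compressed_bytes <= 0:
        raise ValueError("compressed size must be positive")
    return uncompressed_bytes / compressed_bytes


MIB = 1024 * 1024

# Hypothetical sizes: 300 MiB of content squeezed into a 120 MiB image.
print(compression_ratio(300 * MIB, 120 * MIB))  # prints: 2.5

# Hypothetical sizes: content that barely compresses, giving a ratio close to 1.
print(round(compression_ratio(1000 * MIB, 980 * MIB), 2))  # prints: 1.02
```

When the ratio is already close to 1, as in the second case, switching compressors changes the download size very little, so startup speed becomes the deciding factor.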
Eventually, we would like to be able to support some sort of dynamic re-compression either on the store side or on the user’s machine, which could potentially allow users with newer kernels to use better compression formats, but still allow users with older kernels to use legacy formats that are supported there. This could also allow users who want to save space but are not as concerned about performance to choose a compression method with a higher compression ratio, or users who want the best performance and for whom space isn’t an issue to choose a more performant algorithm or even no compression.
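As a rough illustration of that idea, the selection step could look something like the sketch below. Everything in it is hypothetical: no such mechanism exists in snapd today, and the function name, preference lists, and sets of supported formats are invented for the example.

```python
from typing import Iterable, Sequence


def choose_format(preference: Sequence[str], supported: Iterable[str],
                  fallback: str = "xz") -> str:
    """Pick the first preferred format the system supports, else the fallback."""
    supported_set = set(supported)
    for fmt in preference:
        if fmt in supported_set:
            return fmt
    return fallback


# Hypothetical user preferences, ordered from most to least wanted.
speed_first = ["zstd", "lzo", "xz"]
size_first = ["xz", "lzo"]

# Hypothetical sets of formats that two different kernels can mount.
older_kernel = {"xz", "lzo"}
newer_kernel = {"xz", "lzo", "zstd"}

print(choose_format(speed_first, older_kernel))  # prints: lzo
print(choose_format(speed_first, newer_kernel))  # prints: zstd
print(choose_format(size_first, newer_kernel))   # prints: xz
```

The hard part is not this selection but reconciling any re-compression with the bit-for-bit integrity guarantee described earlier.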
User feedback and cross-distro support are both important for snaps. We originally chose XZ for various compatibility reasons, but over time we discovered this to be suboptimal for desktop users looking for performance comparable with traditional packaging mechanisms. We investigated and selected LZO as an alternative that offers a good balance between size and speed for many applications.
We hope you found this article interesting and helpful. If you have any ideas or suggestions, please join the discussion at forum.snapcraft.io and let us know your thoughts. We look forward to improving snaps together with your feedback.