Optimizing boot time for embedded Linux systems

Boot time is a recurring topic for embedded software developers. While customers and product owners may not want boot time to be the main feature of a product, they still care about it significantly.
In large systems, the overall boot time is impacted by the individual boot times of its components. Let’s see how to optimize it to provide the best user experience possible.

Why should you optimize your boot time?

1 – Enhance the user experience

Users, like you and me, dislike waiting. We want our devices to be ready and operational as soon as possible. For instance, many people rarely switch off their smartphones because it takes too long to boot back up.

Additionally, we expect our devices to be responsive and provide immediate feedback. Take TVs or computer monitors as an example: if the power button does not give immediate feedback, users might press it again, causing confusion and a poor user experience.

2 – Make critical features available sooner

Critical features, such as parking sensors in vehicles, need to be available as soon as the vehicle starts moving. We need sensors and beepers to be operational immediately, while non-critical features like infotainment can be deferred. This prioritization ensures that essential functions are ready when needed.

How to improve your boot time?

1 – Understand the boot chain

First, it is essential to understand the boot chain from the bootloader to the application. Identify which services start, their dependencies, and the resources needed at each step.

2 – Measure the time it takes to boot

Second, measure the total boot time and the duration of each step. Use hardware tools such as oscilloscopes and stopwatches, and software tools such as grabserial, bootgraph, and systemd-analyze.

Perform these measurements multiple times to ensure consistency across boots. Note that the initial boot may take longer due to extra steps like initializing random seeds or disk cache. Be aware of these conditions and their impact on boot time.
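As a minimal sketch of how repeated measurements could be summarized, the snippet below averages several boot times and reports their spread, dropping the first sample because the initial boot may include one-off work. The function name `summarize_boot_times` and the sample values are hypothetical, chosen only for illustration.

```python
import statistics


def summarize_boot_times(samples_s, skip_first=True):
    """Return (mean, stdev, min, max) of boot times given in seconds.

    skip_first drops the first sample, since the initial boot may include
    one-off steps such as initializing random seeds or the disk cache.
    """
    data = list(samples_s[1:] if skip_first else samples_s)
    if len(data) < 2:
        raise ValueError("need at least two samples to estimate the spread")
    return (statistics.mean(data), statistics.stdev(data), min(data), max(data))


# Hypothetical measurements in seconds; the first boot is the slow one.
samples = [19.8, 15.2, 15.0, 15.3, 15.1]
mean, stdev, fastest, slowest = summarize_boot_times(samples)
print(f"mean={mean:.2f}s stdev={stdev:.2f}s min={fastest:.1f}s max={slowest:.1f}s")
```

A large standard deviation suggests that something in the boot chain behaves differently from one boot to the next and is worth investigating before you start optimizing.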

3 – Analyze how to optimize your boot time

There are three approaches:

  • Remove unnecessary components: Start with a generic Linux kernel and remove everything not needed for your product. As the saying goes, the fastest code is the code that is not executed.

  • Defer initialization: Defer some initializations to later stages when they are not needed at the beginning. Provide feedback at every stage of the boot sequence.

  • Optimize specific components: If the first two approaches are not enough, optimize specific components. This may involve rewriting algorithms or building a cache for reuse, which requires more effort.

Additionally, hardware choices such as a faster CPU, RAM, or storage medium do impact boot time. For example, on the same board, booting from an SD card took 17 seconds instead of 15 seconds from eMMC, which is about 13% slower. This illustrates the potential impact of hardware on boot time.

Optimization workflow: start by optimizing your application first

Let’s look at the workflow for our optimization. A typical boot sequence for a Linux system starts with the bootloader, followed by the kernel execution, then systemd (or SysV init scripts in simpler systems), and finally the application.

While we discuss these steps in this order in this article, in your development cycle, you might prefer to work in reverse order, starting with optimizing your application first. This approach can be easier and less constraining for developers, making it simpler to follow, test, and investigate.

Bootloader

The bootloader initializes external RAM and sets up other components, such as Arm TrustZone, Cortex-M cores, FPGA bitstreams, and peripherals like USB and Ethernet. It then loads the kernel into RAM and starts it.

To optimize this step, reduce the bootloader time by removing unnecessary drivers and initializations. For example, if USB is not needed at this stage, do not initialize it.

Also, eliminate any unnecessary waiting times, such as default countdowns, which are useful for developers but not needed in the final product. Removing these countdowns can save a few seconds.

Kernel loading

Next, the bootloader loads the kernel into RAM. Keep the kernel as small as possible by removing unnecessary drivers and features.

Some drivers can be compiled as external modules and loaded later. Consider whether a compressed or uncompressed kernel is better for your product and hardware. Compressing the kernel saves storage space but requires CPU cycles to decompress. Benchmark both options to determine the best choice.
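To see why the answer depends on the hardware, here is a rough back-of-the-envelope model: the load time is the time to read the stored image, plus the time to decompress it if it is compressed. The function `load_time_s` and all sizes and throughputs below are hypothetical figures for illustration; only a benchmark on your own board gives real numbers.

```python
def load_time_s(stored_mb, read_mb_s, unpacked_mb=0.0, decompress_mb_s=None):
    """Estimate the time to read an image and, if compressed, unpack it.

    stored_mb:       size of the image as stored on flash
    read_mb_s:       storage read throughput
    unpacked_mb:     size after decompression (compressed images only)
    decompress_mb_s: decompression throughput, None for an uncompressed image
    """
    total = stored_mb / read_mb_s
    if decompress_mb_s is not None:
        total += unpacked_mb / decompress_mb_s
    return total


# Hypothetical kernel: 20 MB uncompressed, 8 MB compressed,
# decompressed at 100 MB/s by the CPU.
for label, read_mb_s in (("slow storage", 40.0), ("fast storage", 200.0)):
    raw = load_time_s(20.0, read_mb_s)
    packed = load_time_s(8.0, read_mb_s, unpacked_mb=20.0, decompress_mb_s=100.0)
    print(f"{label}: uncompressed={raw:.2f}s compressed={packed:.2f}s")
```

With these assumed figures, compression wins on slow storage and loses on fast storage, which is why both options need to be measured on the target.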

Additionally, configuring eMMC as pseudo-SLC can speed up read and write operations, though it reduces capacity.

The kernel then initializes CPUs, memory, DMA, multitasking, and peripherals. Remove unnecessary components and defer initializations that are not needed early. For example, compile Wi-Fi drivers as external modules if Wi-Fi is not needed early. Trim the device tree blob (DTB) by disabling unused peripherals. Disabling the console and logging can save time but may hinder developers, so do this at the end of the development cycle.

Use tools like bootgraph to measure time spent in each part of the kernel initialization and optimize accordingly.
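If you only need a quick ranking rather than a graph, a short script can do a similar job. The sketch below assumes the kernel was booted with the `initcall_debug` parameter, so that the log contains lines of the form `initcall <function> returned <code> after <N> usecs`. The function name `slowest_initcalls` and the sample log are made up for illustration.

```python
import re

# Matches e.g. "initcall foo_init+0x0/0x40 [foo] returned 0 after 1234 usecs".
INITCALL_RE = re.compile(
    r"initcall (\S+?)(?:\+0x[0-9a-f]+/0x[0-9a-f]+)?(?: \[[^\]]+\])?"
    r" returned (-?\d+) after (\d+) usecs"
)


def slowest_initcalls(log_text, top=5):
    """Return up to `top` (function, microseconds) pairs, slowest first."""
    calls = [(m.group(1), int(m.group(3))) for m in INITCALL_RE.finditer(log_text)]
    calls.sort(key=lambda call: call[1], reverse=True)
    return calls[:top]


# Hypothetical excerpt of a kernel log captured with initcall_debug enabled.
sample_log = """\
[    0.412345] initcall clk_setup_init+0x0/0x30 returned 0 after 95 usecs
[    0.598765] initcall display_probe_init+0x0/0x58 returned 0 after 180432 usecs
[    0.601234] initcall wifi_init+0x0/0x1000 [wifi] returned 0 after 2210 usecs
"""

for name, usecs in slowest_initcalls(sample_log):
    print(f"{usecs / 1000:10.3f} ms  {name}")
```

The entries at the top of the list are the first candidates for removal, deferral as modules, or deeper optimization.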

Systemd

After the kernel, systemd starts as the first user-space process. It is designed to boot faster than SysV init scripts because it is written in C and can start services in parallel.

Clean up unused services and organize dependencies to parallelize as much as possible. Be mindful of dependencies specified with the “After=” keyword: it delays a service's start until the units it lists have finished starting, so every unnecessary ordering constraint serializes part of the boot.

Use tools like systemd-analyze plot to identify and optimize time-consuming services.

Focus on services that take the longest time and are on the critical path for boot time improvement.
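To make the idea of a critical path concrete, here is a simplified model: each service has a start-up duration and a list of services it must start after, and the critical path is the longest chain through that graph. The service names, durations, and the function `critical_path` are hypothetical. The model also assumes an acyclic graph and unlimited parallelism, which a real system only approximates.

```python
from functools import lru_cache


def critical_path(durations, after):
    """Return (total_seconds, chain) for the longest dependency chain.

    durations: {unit: start-up time in seconds}
    after:     {unit: [units that must finish starting first]}
    """

    @lru_cache(maxsize=None)
    def finish(unit):
        # A unit starts once the slowest unit it is ordered after has finished.
        deps = [finish(dep) for dep in after.get(unit, ())]
        start, chain = max(deps, key=lambda r: r[0], default=(0.0, ()))
        return start + durations[unit], chain + (unit,)

    return max((finish(unit) for unit in durations), key=lambda r: r[0])


# Hypothetical services and ordering constraints.
durations = {"logging": 0.3, "database": 0.8, "network": 1.2, "app": 2.0}
after = {"database": ["logging"], "app": ["network", "database"]}

total, chain = critical_path(durations, after)
print(f"critical path: {' -> '.join(chain)} ({total:.1f}s)")
```

In this example, speeding up the database would not shorten the boot, because the application is waiting on the network. Only work on the critical path pays off.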

Application

Finally, at the application level, start the business app and render the first screen to the user as quickly as possible.

Consider splitting the application into modules that load at different times. For example, some infotainment systems in vehicles load the Bluetooth module only when the user clicks on it, avoiding unnecessary initialization.
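One way to sketch this kind of on-demand loading is a small wrapper that runs a module's initialization only the first time the module is requested. The class `LazyModule` and the function `init_bluetooth` are hypothetical names for illustration, not part of any particular framework.

```python
class LazyModule:
    """Wraps a loader function and runs it only on first use."""

    def __init__(self, name, loader):
        self.name = name
        self._loader = loader
        self._instance = None

    @property
    def loaded(self):
        return self._instance is not None

    def get(self):
        # Pay the initialization cost on the first request only.
        if self._instance is None:
            self._instance = self._loader()
        return self._instance


def init_bluetooth():
    # Stand-in for an expensive initialization (stack start-up, pairing data).
    return {"name": "bluetooth", "ready": True}


bluetooth = LazyModule("bluetooth", init_bluetooth)
print(bluetooth.loaded)   # nothing has been initialized during boot
bluetooth.get()           # the user opens the Bluetooth screen
print(bluetooth.loaded)
```

The trade-off is that the first use of the module becomes slower, so this suits features that are not needed on the first screen.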

Use disk caching to speed up application start-up. Qt, for example, caches compiled QML on disk so that later launches can skip the compilation step.

To go further

If all the previous steps were not enough, here are some additional ideas to consider.

  • Compile in release mode instead of debug mode to make your code smaller and faster to load.

  • Use suspend-to-RAM or a similar feature so that the product appears to be shut down but resumes very quickly when switched on again. This technique is commonly used in TV set-top boxes and decoders.

  • Compile to a smaller instruction set, or configure packages without unused options. This involves reviewing every package embedded in the root file system and recompiling it with fewer options.

  • Start booting before the user needs the product. For example, in some vehicles, the onboard computer starts when the driver unlocks the car. By the time the driver is seated and ready to drive, the computer has already started, which hides the boot time from the user.

If you need to remember one thing, it would be to determine which features you need and when, remove what is not needed, and make the boot process asynchronous. Start tasks in parallel and provide feedback to the user to let them know everything is working as expected.
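As a closing illustration of starting tasks in parallel while giving feedback, the sketch below uses Python's asyncio to launch several initializations at once and to report each one as soon as it is ready. The task names and durations are hypothetical, and `asyncio.sleep` stands in for real initialization work.

```python
import asyncio


async def start_task(name, seconds, on_ready):
    await asyncio.sleep(seconds)  # stand-in for real initialization work
    on_ready(name)                # feedback as soon as this feature is usable
    return name


async def boot(tasks, on_ready):
    # Start every task at once instead of one after another.
    return await asyncio.gather(
        *(start_task(name, seconds, on_ready) for name, seconds in tasks)
    )


def run_boot(tasks):
    """Run all tasks in parallel and return the order they became ready."""
    ready_order = []
    asyncio.run(boot(tasks, ready_order.append))
    return ready_order


# Hypothetical features: the critical one is ready long before the slow one.
order = run_boot([("infotainment", 0.2), ("parking_sensors", 0.01)])
print(order)
```

The total time is close to that of the slowest task rather than the sum of all tasks, and the critical feature does not have to wait for the non-critical one.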
