26 May 2023 / Last updated: 26 May 2023

A Yearful of balenaEngine Progress

To many of you, balenaEngine is an invisible component hidden somewhere in the balena stack. If you adhere to the standard balena workflows, you don’t even need to deal directly with it: balenaEngine will just do its job–quietly and invisibly.
Now, you might be worried because you heard the news about this emperor who got scammed by tailors selling him invisible new clothes. Well, rest assured not all invisible things are made equal and I want you to have no doubt that balenaEngine is real and keeps getting better! So, here’s a summary to let you all know how it has improved over the last year or so.

What is balenaEngine again?

But before that, please allow me to start with a quick recap on what balenaEngine is and what role it plays.
As a balena user, you know how it goes. It all starts with a cool idea. Then you implement this cool idea as a set of services meant to run in a fleet of devices. balenaEngine runs on each of these devices, and is responsible for executing your services in containers. It also helps to make sure your services keep running smoothly and are kept up to date. In one sentence, balenaEngine is our container engine.
balenaEngine is open source, and is based on the Moby project–the same technology that powers Docker. You may ask, why not simply run Docker on the devices? The main reason is that Docker was created with data centers in mind, so it frequently isn’t ideal for the reality of many IoT and edge applications: unreliable Internet connection, restricted bandwidth, limited storage space, and devices that often have less RAM than your phone.
A Raspberry Pi 3 and a smart phone, side by side.
balenaEngine is designed for small devices living in the edge (and sometimes on the edge)–like this Raspberry Pi 3, which has less RAM and a slower CPU than my definitely-not-fancy phone.
That’s why balenaEngine offers features like delta updates (which enable smaller downloads when updating your services), image pulls that minimize wear-and-tear on storage media (thus protecting the media against corruption), and reduced RAM usage.
With this out of the way, let’s see how the Engine has been improving lately.

Reliability

I said balenaEngine is invisible for most of you, and that’s a great thing in my book. The less you have to worry about it, the more you can focus on your own application. We therefore always look into ways to make the Engine more reliable.
One important set of changes was meant to improve the balenaEngine health checks. We run these checks periodically to verify that the Engine is running as expected, and automatically restart it if we detect issues. We enhanced the health checks in several important ways:
  • Health checks now run much more quickly. In the past we have seen cases in which health checks timed out and thus caused the watchdog to restart the Engine (slower devices under high load were particularly prone to this). The new health checks are much more lightweight, and can therefore avoid this issue.
  • Graceful termination in case of failure. Now, when a health check fails, the Engine gets restarted in a gentler way than before. This allows balenaEngine to stop gracefully, avoiding several cases that previously resulted in corrupted metadata.
  • Fixed some false alarms. We solved a rare bug that could cause a health check to fail even if the Engine was healthy.
  • No more writes to the storage media. Unlike the previous ones, the new health checks do not write any data to the storage media, which helps to increase the lifetime of your storage devices. This is particularly beneficial for devices based on SD cards, which are notoriously susceptible to failure because of wear-and-tear.
A micro-SD card lying in front of a tombstone.
balenaEngine helps to prevent the premature death of SD cards due to wear-and-tear.
Another watchdog-related issue we have seen in the wild was timeouts during initialization. We have therefore disabled timeouts during initialization, so even lower-specced devices can now start up without issues.
We also took action to prevent devices from running out of storage due to accumulating core dump files. We have seen user devices fail because a bug in a user service caused the container to crash periodically. Each crash generated a core dump file, and after some time all storage space was gone. Now core dumps are no longer created by default, but you can still re-enable them if needed.
balenaEngine performs a number of network-related tasks, and we took care of these too:
Finally, we landed some improvements to delta updates:
  • Solved a "slice bounds out of range" error that sometimes caused delta updates to fail on 32-bit devices.
  • Fixed an issue with delta-based Host OS updates that caused full images (instead of deltas) to be downloaded when updating balenaOS.
  • Optimized delta creation. Generating deltas is now faster and uses less memory. Back to the theme of invisibility, most of you don’t create deltas manually but will benefit from this change nevertheless: we use balenaEngine in our back end services to create the deltas that accelerate your updates.

Delta on load

What about people to whom balenaEngine is not invisible? We have users operating in very particular conditions that require them to create customized mechanisms to deploy images. Imagine you have several devices operating in an environment with extremely slow and unreliable networking. Even with delta updates, it may be impractical to make each of these devices download images every time you want to update your services.
By investing some time in development, though, you could solve this issue. For example, you could download the required images only once and load them into each device using a local network or USB stick.
As an additional tool for users dealing with scenarios like this, we added support for applying balena deltas when loading previously saved delta images.
Here’s how it works:
# Create a delta between an old and a new version of your service images.
balena-engine image delta my-service:old my-service:new --tag my-service:delta

# Save the delta to a tarball (*.tar file).
balena-engine image save -o my-service-delta.tar my-service:delta

# And here comes the new part: you copy the tarball to a device that
# contains only the old image. On this device, you can now run this
# command to apply the delta. This recreates the new image locally.
balena-engine image load -i my-service-delta.tar
In summary, this feature gives more flexibility to users implementing workflows that diverge from the standard. It might also come in handy if you use balenaEngine as a standalone tool (that is, outside of the balenaCloud platform).
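If you script this kind of offline update (say, from a USB stick), it may be worth guarding the load step. Here is a minimal sketch, assuming the delta can only be applied when the old image is already on the device, as described above. The `apply_delta` function, the `ENGINE` variable, and the return codes are hypothetical names made up for this illustration; only `image inspect` and `image load -i` are actual engine commands.

```shell
#!/bin/sh
# Hypothetical helper (not part of balenaEngine): apply a saved delta
# only if the tarball exists and its base image is present on the device.

# Engine binary to call; overridable, which is an assumption of this sketch.
ENGINE="${ENGINE:-balena-engine}"

apply_delta() {
    base_image="$1"   # image the delta was created against, e.g. my-service:old
    delta_tar="$2"    # tarball produced by "image save", e.g. my-service-delta.tar

    if [ ! -f "$delta_tar" ]; then
        echo "delta tarball not found: $delta_tar" >&2
        return 1
    fi

    # The delta is applied on top of the old image, so check it is there first.
    if ! "$ENGINE" image inspect "$base_image" >/dev/null 2>&1; then
        echo "base image missing: $base_image" >&2
        return 2
    fi

    # Same load command as shown above; this recreates the new image locally.
    "$ENGINE" image load -i "$delta_tar"
}
```

On a device, you could then call something like `apply_delta my-service:old /mnt/usb/my-service-delta.tar` (the path is just an example) and fall back to a regular download when the function returns a non-zero status.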

Housekeeping

To wrap up, I want to highlight some of the work that feels more like housekeeping, but also has a positive impact on users:
  • We updated the main dependencies balenaEngine is based on–in particular, the Moby project and containerd. This brings a number of bug and security fixes.
  • We are once again making GitHub releases for every new balenaEngine version we create. These come in handy for those of you who use the Engine as a standalone product.
  • Targeting this same audience, we improved our convenience installation script.
  • We now automatically create a new balenaOS release for each new balenaEngine release. This means Engine improvements are reaching your balena devices faster than ever!

Thanks!

I can’t finish without sending a big thank you to all our users who report issues, suggest improvements, discuss all sorts of things in the forums, and are always extremely supportive in our investigations! ♥️ to you all!
by Leandro M. Barros, Product builder at balena