What’s New in Version 13 of Balena Python SDK

New functionality and breaking changes in the balena Python SDK

In this blog post, we will provide you with a concise overview of the new features and breaking changes introduced in Version 13 of the balena Python SDK.

We have focused on enhancing the balena Python SDK to further streamline your IoT development workflow. This release brings a wide range of new features designed to enhance your experience with the SDK, as well as some breaking changes that may require your attention during the upgrade process.

What is the Python SDK?

The balena Python SDK was initially released over 8 years ago. It provides a way to interact with balena resources in a programmatic and automated way, allowing you to build your own customized workflow using balena backend resources.

Balena also provides the Node SDK, which is used by our UIs, CLI and other applications. As time went by, the two SDKs diverged in features and interfaces. The new version of the Python SDK brings them closer together: it adds many functionalities that were exclusive to the Node SDK and matches the Node SDK interface more closely, which causes some breaking changes.

New Features

In this section we will go into more detail on what the new features are and where and how they can be useful.

Customizable queries

The biggest change to the Python SDK is the addition of the PineJS Python client. PineJS is our own API engine that drives balena. The client allows complex queries to hit our database in a more efficient and flexible way. Let’s assume that for all queries in this document the following code was executed to log in to our SDK:

from balena import Balena

balena = Balena()
balena.auth.login(username='[email protected]', password='mysecr3tp4assw0rd')

$select

Let’s start with the simple $select operator. If you are familiar with SQL, this is very similar to a SQL “SELECT” statement, where we can choose which fields the query will return for us. Let’s look into one very basic example first. Suppose we want to build an application with the SDK that allows us to check if all devices of a given fleet are online or not.

app_slug = "myorg/myapp"

devices = balena.models.device.get_all_by_application(app_slug)

table = [{"id": device["id"], "is_online": device["is_online"]} for device in devices]

print(table)

This works, but it is not efficient: each device object carries every field (even the unused ones), wasting bandwidth and time. With Python SDK v13 this code can be improved by changing the second line to:

devices = balena.models.device.get_all_by_application(app_slug, {"$select": ["id", "is_online"]})

This now runs faster and is more explicit. However, there is much more optimization that can be accomplished by using the $filter and $expand operators.

$filter

Similar to the “WHERE” SQL statement, this operator allows us to customize our queries to obtain different results. For example, if we want to modify the example application from $select to only output the devices of a given application that are offline, we could use:

devices = balena.models.device.get_all_by_application(app_slug, {"$filter": {"is_online": False}})

Of course this can be further improved to also contain our $select statement and allow it to fetch only what we use, for example:

devices = balena.models.device.get_all_by_application(app_slug, {
        "$select": ["id", "is_online"],
        "$filter": {"is_online": False}
    }
)

$orderby

Similar to the “ORDER BY” SQL statement, this operator allows us to sort the output query by a given key. For example, in the same application we have used so far we want to order the resulting table based on the device’s last_connectivity_event. This could be achieved with:

devices = balena.models.device.get_all_by_application(app_slug, {"$orderby": "last_connectivity_event desc"})

This can also be combined with the previous statements:

devices = balena.models.device.get_all_by_application(app_slug, {
        "$select": ["id", "is_online", "last_connectivity_event"],
        "$filter": {"is_online": False},
        "$orderby": "last_connectivity_event desc"
    }
)
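
To make the semantics of these three operators concrete, here is a rough, purely local analogy that applies the same idea to a list of dicts. It is only an illustration: the real query is evaluated server side by the API, and both the apply_query helper and the sample data are hypothetical and not part of the SDK. The sketch handles only equality filters and a single "field" or "field desc" sort key.

```python
from typing import Any, Dict, List


def apply_query(rows: List[Dict[str, Any]], query: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Local illustration of $filter, $orderby and $select (hypothetical helper)."""
    result = list(rows)

    # $filter: keep only rows whose fields equal the requested values.
    for key, expected in query.get("$filter", {}).items():
        result = [row for row in result if row[key] == expected]

    # $orderby: "field", "field asc" or "field desc".
    if "$orderby" in query:
        field, _, direction = query["$orderby"].partition(" ")
        result.sort(key=lambda row: row[field], reverse=(direction == "desc"))

    # $select: drop every field that was not asked for.
    if "$select" in query:
        result = [{key: row[key] for key in query["$select"]} for row in result]

    return result


# Made-up sample data, shaped like a small subset of the device fields.
sample_devices = [
    {"id": 1, "is_online": True, "last_connectivity_event": "2023-05-03T10:00:00Z", "device_name": "a"},
    {"id": 2, "is_online": False, "last_connectivity_event": "2023-05-01T08:00:00Z", "device_name": "b"},
    {"id": 3, "is_online": False, "last_connectivity_event": "2023-05-02T09:00:00Z", "device_name": "c"},
]

offline = apply_query(
    sample_devices,
    {
        "$select": ["id", "is_online", "last_connectivity_event"],
        "$filter": {"is_online": False},
        "$orderby": "last_connectivity_event desc",
    },
)
print(offline)
```

With this sample data, only the two offline devices remain, the most recently seen one comes first, and the unrequested device_name field is gone. With the SDK, the same work happens on the server, so the unused fields never travel over the network.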

$expand

Similar to the “JOIN” SQL statement, this operator allows us to include related data/resources in the same request. For example, let’s suppose that in the same example application we now want to know the device type name of that device (e.g. Raspberry Pi, Intel NUC…). If we simply run the initial example code like this:

app_slug = 'myorg/myapp'
devices = balena.models.device.get_all_by_application(app_slug)
print(devices[0]["is_of__device_type"])

The output will be something like {'__id': 30}, which would require another query against the device type model to get the device type name. Not only is this inefficient, it also makes for more error-prone code. With the Python SDK v13 we can instead do the following:

app_slug = 'myorg/myapp'

devices = balena.models.device.get_all_by_application(app_slug, {
    "$expand": "is_of__device_type"
})

print(devices[0]["is_of__device_type"])

Notice that the output now contains not only the device type id but also several other properties, such as name and belongs_to__device_family. The $expand itself can receive a query object, so we can, for example, select only the properties that interest us:

devices = balena.models.device.get_all_by_application(app_slug, {
    "$expand": {
        "is_of__device_type" : {
            "$select": "name"
        }
    }
})

Finally it can all be merged together with the previous examples as:

devices = balena.models.device.get_all_by_application(
    app_slug,
    {
        "$select": ["id", "is_online", "last_connectivity_event"],
        "$expand": {"is_of__device_type": {"$select": "name"}},
        "$filter": {"is_online": False},
        "$orderby": "last_connectivity_event desc",
    },
)
table = [
    {
        "id": device["id"],
        "is_online": device["is_online"],
        "last_connectivity_event": device["last_connectivity_event"],
        "device_type_name": device["is_of__device_type"][0]["name"],
    }
    for device in devices
]

print(table)
print(tabulate(table))  # optional pretty-print; requires "from tabulate import tabulate" (https://pypi.org/project/tabulate/)

The ability to customize queries is very powerful, but it does have a steep learning curve. These examples barely scratch the surface of what can be done with the PineJS client; much more can be achieved using other operators (such as $count and the comparison operators). To learn about the additional key names and properties, please refer to the API Models docs.

Simplified environment switch

Before version 13, switching between balena backend instances (such as balena-cloud and balena-staging or your own balenaMachine) was slightly tricky and did not allow two instances of the SDK pointing to different backends to live in the same script. This is now fixed, and each SDK instance relies on the balena_host parameter that was provided during its initialization:

balena_prod = Balena()
balena_prod.auth.login(username="[email protected]", password="mysecr3tp4assw0rd")

balena_staging = Balena(
    {
        "balena_host": "balena-staging.com",
        "data_directory": "/home/otavio/staging-data",  ## Attention! Create the folder beforehand
    }
)

balena_staging.auth.login(
    username="[email protected]", password="mysecr3tp4assw0rd"
)

Now each object points to a different backend and has its own scope. Be aware that each backend needs its own “data_directory”, created beforehand. The SDK stores information (such as the token for the requests) on disk, so if two SDK instances point to the same folder, one will eventually overwrite the other.
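
One way to avoid forgetting the folder is to create it right where the settings are built. The sketch below assumes a hypothetical settings_for helper (not part of the SDK) and uses a temporary folder as the base path; the host name balena-cloud.com for production is also an assumption here. Only the two setting keys, balena_host and data_directory, come from the example above.

```python
import tempfile
from pathlib import Path
from typing import Dict


def settings_for(host: str, base_dir: Path) -> Dict[str, str]:
    """Create a per-backend data directory and return SDK settings (hypothetical helper)."""
    data_dir = base_dir / host
    # The folder must exist before the SDK instance is created.
    data_dir.mkdir(parents=True, exist_ok=True)
    return {"balena_host": host, "data_directory": str(data_dir)}


# A temporary folder stands in for a permanent location in this sketch.
base = Path(tempfile.mkdtemp())
prod_settings = settings_for("balena-cloud.com", base)  # assumed production host
staging_settings = settings_for("balena-staging.com", base)

# Each backend gets its own folder, so stored tokens cannot overwrite each other.
print(prod_settings["data_directory"] != staging_settings["data_directory"])

# Each dict could then be passed to its own instance, e.g. Balena(staging_settings).
```

Deriving the folder name from the host keeps the one-folder-per-backend rule automatic, at the cost of tying the on-disk layout to the host names.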

Multi-identifier queries

Balena SDK V13 now allows resources to be queried by multiple identifiers. For example, if we have a device with UUID=01ee734daed728a63186229c33ed606a and id=1111387 the device can be queried as:

device = balena.models.device.get(1111387)

device = balena.models.device.get("01ee734daed728a63186229c33ed606a")

device = balena.models.device.get("01ee734") ## short UUID

The same is valid for other models, for example the application model, which can be queried by slug, ID or UUID:

fleet = balena.models.application.get("myorg/myfleet")

fleet = balena.models.application.get(2113112)

fleet = balena.models.application.get("bce0eb86c83f2468b66ab560b6716183")

fleet = balena.models.application.get("bce0e") ## short UUID

NOTE: When possible, always use the ID or full UUID as using the short UUID might be slightly slower.

Interaction with Applications and Blocks

The application model now supports programmatically creating blocks and apps, rather than just fleets, through an additional parameter:

balena.models.application.create("my_fleet", "raspberrypi3", "myorg", "fleet")

balena.models.application.create("my_app", "raspberrypi3", "myorg", "app")

balena.models.application.create("my_block", "raspberrypi3", "myorg", "block")

Type hints

If you were familiar with the previous balena SDK docs, you might have noticed that the documentation had response examples for each function call. For Python SDK V13 we took a different approach: instead of having response examples in the docs, we will maintain Python Type Hints for each model. These are more insightful and useful while programming because they will ensure you have proper typings and most editors will also have auto-complete/suggestions. A few examples can be seen in the pictures below:

[Image: type_hint]

Although these are useful for knowing which fields each device provides, be aware that when you use the $select or $expand operators, the returned data might not exactly match the type hints. Also, especially when using $expand or navigation properties, some type casting might be necessary if you know what output type to expect, for example:

app_slug = "myorg/myapp"

devices = balena.models.device.get_all_by_application(
    app_slug, {"$expand": "is_of__device_type"}
)

dt = devices[0]["is_of__device_type"]

print(dt[0]["name"])  # Type system might complain about this

Now the typing system tells us that dt will be either a dict containing the __id key (which we call a balena.types.models.PineDeferred) or a list with balena.types.models.DeviceTypeType. Because we used the $expand operator on the is_of__device_type key we know that it will be the latter, so we can cast it to make the type system work better, for example:

from typing import cast, List
from balena.types.models import DeviceTypeType

app_slug = "myorg/myapp"
devices = balena.models.device.get_all_by_application(
    app_slug, {"$expand": "is_of__device_type"}
)

dt = cast(List[DeviceTypeType], devices[0]["is_of__device_type"])
print(dt[0]["name"])  # Type system is now happy about it

Keep in mind that the Python type hint system is just a suggestion for writing more readable code, and won’t give any runtime error like statically typed languages such as Java would if an object is of the wrong type at run time.
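
The following sketch illustrates that point with plain dicts and no SDK calls. It shows that cast() returns its argument unchanged, and how a small runtime guard could catch a missing $expand early. The expanded helper and the sample values are hypothetical; the expanded shape in particular is made up for the demonstration.

```python
from typing import Any, Dict, List, cast


def expanded(value: Any) -> List[Dict[str, Any]]:
    """Return an expanded navigation property, or fail loudly at run time (hypothetical helper)."""
    if not isinstance(value, list):
        raise TypeError(f"expected an expanded list, got {value!r}; was $expand used?")
    return value


deferred = {"__id": 30}  # shape returned when the property was not expanded
expanded_value = [{"id": 30, "name": "Raspberry Pi 3"}]  # made-up expanded shape

# cast() only informs the type checker: it returns its argument unchanged.
still_a_dict = cast(List[Dict[str, Any]], deferred)
print(still_a_dict is deferred)

# The guard passes expanded data through untouched.
print(expanded(expanded_value)[0]["name"])

# On a deferred value the guard raises instead of failing later with a confusing KeyError.
try:
    expanded(deferred)
except TypeError as error:
    print(f"caught: {error}")
```

A guard like this trades a few extra lines for an error message that points at the likely cause, which can be worthwhile in scripts where the query options are built dynamically.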

Additional improvements designed to bring our Python interface closer to the Node one include: allowing hostOS updates (HUP) of devices with development OS, exposing direct access to the balena.pine object and several performance improvements like internal queries fetching exactly and only the resources they need.

Breaking Changes

For users of balena SDK versions prior to V13, the following breaking changes should be noted:

Default IDs typing

In previous versions, IDs (even numeric ones) were passed as a string. For example, if your code looked like this:

device = balena.models.device.get('12345')

It now has to use the actual numeric id:

device = balena.models.device.get(12345)

The reason behind this change is that in order to support using UUIDs (as part of the multi-identifier queries feature) we needed a way to differentiate between numeric IDs and UUIDs.
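
If an existing codebase stores numeric IDs as strings (in a config file or a database, for instance), a small conversion step at the boundary may ease the migration. The to_numeric_id helper below is hypothetical and not part of the SDK. It should only be applied to values that are known to be numeric IDs, because an all-digit string could also be a valid short UUID.

```python
from typing import Union


def to_numeric_id(legacy_id: Union[int, str]) -> int:
    """Convert an ID that pre-v13 code kept as a string into an int (hypothetical helper)."""
    if isinstance(legacy_id, bool):
        raise TypeError("booleans are not valid IDs")
    if isinstance(legacy_id, int):
        return legacy_id
    # Accept plain ASCII digits only, so UUID-like strings are rejected.
    if legacy_id.isascii() and legacy_id.isdigit():
        return int(legacy_id)
    raise ValueError(f"{legacy_id!r} is not a numeric ID; pass UUIDs as strings unchanged")


print(to_numeric_id("12345"))
print(to_numeric_id(12345))

try:
    to_numeric_id("01ee734")
except ValueError as error:
    print(f"rejected: {error}")
```

The result can then be passed to calls such as balena.models.device.get(), while UUIDs and slugs keep being passed as strings.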

For the application model specifically, we no longer support querying by application name alone. Use the application slug instead:

fleet = balena.models.application.get('my_app_name') # no longer works

# Now becomes:

fleet = balena.models.application.get('the_app_org/my_app_name')

The reason is that different orgs can have applications with the same name, whereas the slug is always unique, which leads to less error-prone code.

Notice that if you used something like:

apps = balena.models.application.get_all()

balena.models.application.get(apps[0]["id"])

Nothing changes as both the input and output types match.

Function renames/removals

Since one of the main reasons for this improved release was to increase the compatibility between Node SDK and Python SDK, a few functions were renamed, moved, or removed:

  1. Tags and environment models were moved inside the resource they belong to. For example, instead of
    balena.models.tag.application.get_all() we now use balena.models.application.tags.get_all().
    Similar changes happened for TwoFA and other objects, which now match the scope used in the Node SDK.
  2. The device set_to_release was renamed to pin_to_release
  3. The get_setup_key and generate_code methods were removed from TwoFactorAuth model as these should be handled by your 2FA provider and not the SDK.
  4. The Logs model now has a stop function that should be called in order to properly free all used resources.

Dropping support for end-of-life Python versions

As Python 3.7 is now end-of-life, moving forward we will only support Python 3.8.1+. There are no guarantees that older Python versions will work.

These are the known major breaking changes. If you find any unexpected behavior or errors, do not hesitate to open an issue at: https://github.com/balena-io/balena-sdk-python/issues

Ongoing and Future Improvements

This document covered the majority of the external-facing improvements in the latest release of the Python SDK. However, some of the larger improvements were internal and not visible in the public interface (for example, increased test coverage). These improvements make the SDK much more scalable and prepare us for future features, including:

Multi-Resources/Parallel queries

Some functions already allow you to specify just one operation for multiple devices. For example, if you want to deactivate multiple devices you can now do so with only one request:

balena.models.device.deactivate([12345, 13262, 98321, 23198])

This performs a single round trip to our database and a single request, which is much faster and uses less of the API rate limit. However, this is still an ongoing effort which can be improved in the future. To find out which functions already support it, check the function type hint; these usually will contain a possible type of List[int].

Request Throttling

If you run some heavy processes you might get blocked by our API Rate Limit. In the future we plan to add a configuration setting to allow you to specify the maximum requests per period for the SDK. It will then throttle your request rate to stay within this limit.
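
Until such a setting exists, a script can pace its own calls. The sketch below is a generic sliding-window limiter, not an SDK feature and not necessarily how the planned setting will behave. The Throttle class and its parameters are hypothetical, and the clock and sleep functions are injectable so the demonstration runs instantly with a fake clock.

```python
import time
from collections import deque
from typing import Callable, Deque, List


class Throttle:
    """Allow at most max_calls per period seconds (hypothetical helper, not an SDK feature)."""

    def __init__(
        self,
        max_calls: int,
        period: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        if max_calls < 1:
            raise ValueError("max_calls must be at least 1")
        self.max_calls = max_calls
        self.period = period
        self._clock = clock
        self._sleep = sleep
        self._calls: Deque[float] = deque()  # timestamps of recent calls

    def wait(self) -> None:
        """Block until another call fits in the current window, then record it."""
        while True:
            now = self._clock()
            # Forget calls that have left the sliding window.
            while self._calls and now - self._calls[0] >= self.period:
                self._calls.popleft()
            if len(self._calls) < self.max_calls:
                self._calls.append(now)
                return
            # Window is full: sleep until the oldest recorded call expires.
            self._sleep(self.period - (now - self._calls[0]))


# Demonstration with a fake clock, so nothing actually sleeps.
fake_now = [0.0]
slept: List[float] = []


def fake_clock() -> float:
    return fake_now[0]


def fake_sleep(seconds: float) -> None:
    slept.append(seconds)
    fake_now[0] += seconds


throttle = Throttle(max_calls=2, period=1.0, clock=fake_clock, sleep=fake_sleep)
for _ in range(3):
    throttle.wait()  # a real script would make its SDK call right after this line
print(slept)
```

In a real script the default clock and sleep would be used, and throttle.wait() would be called before each SDK request. The limit values are something to pick based on the API rate limit that applies to you.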

Final Thoughts

We’re very pleased to release this improved Python SDK and provide you with the insights and rationale behind it. Please let us know what you think by posting in the comments below or in our Forums and make sure to add to our Roadmap for new features!

