Embedded testing diary #1. Device fleets.

Common problems in embedded testing: how to manage device fleets, heterogeneous test setups, and resource contention across multiple hosts.

In this series of posts, we will describe common problems in embedded testing and share our findings, experiences, and solutions.

This is the first post. Let's start with a simple problem.

You are developing a new device. The firmware grows, and features keep being added. One day you notice that your test setups have become hard to maintain: you need 2–3 computers to test each device, and managing them becomes a huge problem.

Let's say that the minimum set of computers and their configuration required to test a device forms a test setup.

Magic of multi-agent testing

Multi-agent testing resolves this problem by introducing automatic management of these hosts. Once you start testing, all necessary changes in the configuration are done automatically.

Our TS Factory not only changes the configuration, but also tracks the history of those changes, gracefully rolling them back when the test is done.

This is done through an entity called Configurator. Configurator discovers all network interfaces, virtual machines, software agents, WiFi configuration, and serial configuration, and tracks every change made to them for the duration of the test. Any test may change the configuration at any time and with any complexity; after the test, all changes are rolled back.
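To make the track-and-roll-back idea concrete, here is a minimal Python sketch. It is not how Configurator is implemented; the names `ChangeJournal` and `tracked` and the flat key/value configuration are our assumptions for illustration only. The point it shows is that every change is recorded together with the value it replaced, and the records are undone in reverse order when the test ends.

```python
from contextlib import contextmanager


class ChangeJournal:
    """Hypothetical journal: records each change with the value it replaced."""

    def __init__(self, config):
        self.config = config   # live configuration: key -> value
        self._undo = []        # (key, had_key, old_value), oldest first

    def set(self, key, value):
        had_key = key in self.config
        self._undo.append((key, had_key, self.config.get(key)))
        self.config[key] = value

    def rollback(self):
        # Undo in reverse order so repeated changes to one key unwind correctly.
        while self._undo:
            key, had_key, old = self._undo.pop()
            if had_key:
                self.config[key] = old
            else:
                self.config.pop(key, None)


@contextmanager
def tracked(config):
    """Run a test body against config and restore config afterwards."""
    journal = ChangeJournal(config)
    try:
        yield journal
    finally:
        journal.rollback()   # runs even if the test raises


# Usage: the host configuration is back to its original state after the block.
host = {"eth0.mtu": 1500}
with tracked(host) as cfg:
    cfg.set("eth0.mtu", 9000)
    cfg.set("wlan0.ssid", "test-ap")
assert host == {"eth0.mtu": 1500}
```

Keeping the undo records in a stack is the design choice that matters here: a test can touch the same setting several times, and unwinding from the newest record back still restores the original value.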

Heterogeneous test setups

One of the challenges is supporting heterogeneous test setups. Different hosts in your network might have different capabilities and resources.

One host supports firewall setup — the other doesn't.

One device is Windows-based, the other is Linux-based.

Even among Linux devices, slight incompatibilities between glibc versions can render a single agent unable to run on all hosts.

There are obvious ideas like building for musl. It is a good starting point, but it doesn't fully cover the problem of different functionality and different libraries.

To solve this, we built Builder. The testing engine identifies the target system and transparently rebuilds the agent for each host, so you don't have to manage per-host builds yourself.
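One way to picture per-host rebuilding is a build cache keyed by a description of the target. The Python sketch below is only an illustration under our own assumptions: `Target`, `AgentCache`, and the `build` callback are hypothetical names, and nothing here reflects how Builder actually works. It shows that hosts with the same OS, architecture, and libc can share one agent binary, while any difference triggers a separate build.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    """Hypothetical description of what an agent must be built for."""
    os: str
    arch: str
    libc: str


class AgentCache:
    """Builds an agent once per distinct target and reuses it afterwards."""

    def __init__(self, build):
        self._build = build      # callable: Target -> artifact (assumed)
        self._artifacts = {}     # Target -> artifact

    def agent_for(self, target):
        if target not in self._artifacts:
            self._artifacts[target] = self._build(target)
        return self._artifacts[target]


# Usage with a stand-in build function that only records what it was asked for.
built = []

def fake_build(target):
    built.append(target)
    return f"agent-{target.os}-{target.arch}-{target.libc}"

cache = AgentCache(fake_build)
h1 = Target("linux", "x86_64", "glibc-2.31")
h2 = Target("linux", "x86_64", "glibc-2.31")   # same profile as h1
h3 = Target("linux", "armv7", "musl")

cache.agent_for(h1)
cache.agent_for(h2)   # reuses the artifact built for h1
cache.agent_for(h3)   # different profile, so a second build
assert len(built) == 2
```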

Resource management

The other problem is resource management.

Say a host has one WiFi adapter. Can several test configurations use it simultaneously? Most likely not. Each configuration needs exclusive rights to the resource. Therefore, lock the resource while the tests run and release it as soon as they finish.

Sounds easy? Yes. Is it actually easy? It is not.
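The basic lock-and-release rule can be sketched in a few lines of Python. This is a single-process illustration under our own assumptions, not TS Factory's mechanism: `ResourcePool` and the resource names are hypothetical, and a real fleet would need the lock state shared between hosts. One place where it can stop being easy is a configuration that needs several resources at once, so the sketch takes either all of them or none.

```python
import threading


class ResourcePool:
    """Hypothetical registry granting exclusive ownership of named resources."""

    def __init__(self):
        self._guard = threading.Lock()
        self._owners = {}   # resource name -> owning configuration

    def acquire(self, owner, resources):
        """Take all requested resources or none. Returns True on success."""
        with self._guard:
            if any(r in self._owners for r in resources):
                return False   # something is busy: leave everything untouched
            for r in resources:
                self._owners[r] = owner
            return True

    def release(self, owner):
        """Free everything held by owner, for example when its tests finish."""
        with self._guard:
            for r in [r for r, o in self._owners.items() if o == owner]:
                del self._owners[r]


# Usage: configuration B cannot start while A holds the WiFi adapter.
pool = ResourcePool()
assert pool.acquire("A", ["H1:wlan0"])
assert not pool.acquire("B", ["H1:wlan0", "H2:eth1"])
pool.release("A")
assert pool.acquire("B", ["H1:wlan0", "H2:eth1"])
```

The all-or-nothing check means a configuration that fails to start never leaves a half-claimed set of resources behind for other configurations to trip over.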

One of the possible test setups

What we find important is the concept of interleaving setups. It fixes a number of problems related to resource management, especially route management when dealing with networking devices.

Let's say you have 3 test configurations: A, B, and C. Each configuration requires exclusive access to a WiFi adapter. And you have three hosts: H1, H2, and H3.

[Figure: configurations A, B, and C interleaved across hosts H1, H2, and H3 over WAN, LAN, and WiFi links. Every link has its own subnet, labeled in the figure in this order: WAN 10.1.x, LAN 10.2.x, WiFi 10.3.x, WiFi 10.4.x, WAN 10.5.x, LAN 10.6.x, LAN 10.7.x, WiFi 10.8.x, WAN 10.9.x.]

Since LAN/WAN/WiFi subnetworks are configured differently for each configuration, their traffic routes don't intermix. At the same time, it is easy to log into any device during testing and investigate what's happening.
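The subnet separation can be checked mechanically. The sketch below uses Python's standard `ipaddress` module to give every (configuration, link type) pair its own subnet and then verifies that no two overlap. The /16 prefix length, the 10.0.0.0/8 base, and the assignment order are our assumptions; the order in particular is the simplest one and is not meant to reproduce the interleaved layout of the setup above.

```python
import ipaddress
from itertools import combinations, product

CONFIGS = ["A", "B", "C"]
LINKS = ["WAN", "LAN", "WiFi"]


def allocate_subnets(configs, links, base="10.0.0.0/8", prefix=16):
    """Give each (configuration, link) pair its own subnet carved out of base."""
    pool = ipaddress.ip_network(base).subnets(new_prefix=prefix)
    next(pool)   # skip 10.0.0.0/16 so numbering starts at 10.1.x
    return {pair: next(pool) for pair in product(configs, links)}


plan = allocate_subnets(CONFIGS, LINKS)

# Nine pairs, nine subnets: 10.1.0.0/16 through 10.9.0.0/16.
assert len(plan) == 9

# No two subnets overlap, so routes of different configurations cannot collide.
assert not any(a.overlaps(b) for a, b in combinations(plan.values(), 2))
```

Running a check like this before a test session starts would catch an accidental subnet reuse early, before it shows up as a confusing routing failure in the middle of a run.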

That only works if multi-agent testing is used, if heterogeneous test setups are supported, and if resources are managed properly.

Conclusion

We reviewed a few problems that appear during QA device fleet management: resource management, heterogeneous setups, and multi-agent operation. The interleaving setup scheme described above is one approach that works in real-world deployments.

In the next post, we'll dig into another common challenge from our embedded testing practice.