Friday, 15 July 2016

Code Craft – Embedding C++: Multitasking

We’re quite used to multitasking computer systems today. Our desktops run email, a couple of browsers in different workspaces, a word processor, and a few other applications, apparently all at once. Looking behind the scenes using a system monitor or task manager program reveals a multitude of other programs running in support of our activities. Of course, any given CPU is running a maximum of one program at a time. Multitasking is simply the practice of switching between active processes fast enough to give the illusion of simultaneity.

The roots of multitasking go way back. In the early days, when computers cost tons of money, the thought of an idle system was anathema. Teletype IO was slow compared to the processor, and leaving the processor waiting idle for a card reader to slurp in the next card was outrageous. The gurus of the time worked to fill that idle time with productive work. That eventually led to systems that would run multiple programs at one time, and eventually to more finely grained multitasking within a program.

Modern multitasking depends on support from the underlying API of an operating system. Each OS uses its own techniques, making it difficult to write portable code. The C++ 2011 standard increased the portability of the language by adding concurrency routines to the C++ standard library. These routines use the API of the OS. For instance, the Linux version uses the POSIX threading library, pthread. The result is a minimal, but useful, capability for building upon in later standards. The C++ 2017 standard development activities include work on parallelism and concurrency.

In this article, I’ll work through some of the facilities for and pitfalls in writing threaded code in C++.

Creating Threads

To implement multitasking within a single program, the code is broken up into multiple tasks, or threads, that run at the same time. Declaring and running a thread is simple. The hard parts come later while managing them and handling interactions among threads.

A thread is created from anything that is callable. That means a function or a class with, typically, an operator() method. Here is an example with three different callable objects created as threads and some management techniques.

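#include <chrono>
#include <thread>
using namespace std;

// Three callables: a plain function, a function taking an argument,
// and a class with an operator() method.
void fa() {
}

void fb(int a) {
}

class Fc {
public:
        Fc(int a) :
                        mA { a } {
        }
        void operator()() {
        }
private:
        int mA;
};

int main() {
        int value { 0 };

        thread t_fa { fa };
        thread t_fb { fb, value };
        thread t_fc { Fc { value } };

        /* ... */
        // code waits for one second
        this_thread::sleep_for(chrono::seconds(1));

        // code continues in the 'main' thread

        // wait for each thread to complete before main() exits
        t_fa.join();
        t_fb.join();
        t_fc.join();
        return 0;
}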

There is nothing special about these functions or the class. They could be used in a program as functions or as class instances. They become independent threads when passed to the constructor of class thread along with any arguments that are required of the function or the operator() method. The thread constructor is a variadic template, so it accepts a callable followed by any number of arguments of any types. This capability allows the constructor to forward the arguments to the function or class method.

Now that the thread is running, how do you stop it? Unfortunately there is no method provided by thread. But it is easy to create a way. One technique I’ve used is a simple global boolean flag passed as a reference:

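#include <atomic>
#include <functional>
static std::atomic_bool run { true };

void fr(int a, std::atomic_bool& run) {
        while (run) {
                /* ... work done on each pass ... */
        }
}

int value { 0 };
thread t_fr { fr, value, std::ref(run) };

// later, to stop the thread
run = false;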

The thread checks for run becoming false and exits. We’ll see later why the std::atomic_bool is used.
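
As a hedged illustration of the whole start and stop cycle, here is a minimal sketch. The names count_until_stopped, demo_stop_flag, keep_running and passes are made up for this example, and the 20 millisecond run time is an arbitrary assumption.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <functional>  // std::ref
#include <thread>

// Hypothetical worker: counts loop passes until asked to stop.
void count_until_stopped(std::atomic_bool& keep_running,
                         std::atomic<long>& passes) {
    while (keep_running) {
        ++passes;
        std::this_thread::yield();  // let other threads run between passes
    }
}

// Starts the worker, lets it run briefly, then stops and joins it.
// Returns how many passes the worker completed.
long demo_stop_flag() {
    std::atomic_bool keep_running { true };
    std::atomic<long> passes { 0 };

    std::thread worker { count_until_stopped,
                         std::ref(keep_running), std::ref(passes) };

    std::this_thread::sleep_for(std::chrono::milliseconds(20));
    keep_running = false;  // request the stop
    worker.join();         // wait for the loop to notice and exit

    assert(!worker.joinable());
    return passes;
}
```

Calling demo_stop_flag() from main() returns once the worker has seen the flag change. The pass count depends on the scheduler, so it varies from run to run.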

Arguments can be passed by value or pointer just as in calling a regular function. Passing a reference requires the use of std::ref, as demonstrated with run.
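
The three ways of passing arguments might look like the following sketch. The functions add_to and demo_argument_passing are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <functional>  // std::ref
#include <thread>

// Hypothetical worker: takes one argument by value, one by pointer,
// and one by reference.
void add_to(int amount, int* via_pointer, int& via_reference) {
    *via_pointer += amount;
    via_reference += amount;
}

// Hypothetical driver: returns the sum of both results.
int demo_argument_passing() {
    int a { 1 };
    int b { 2 };

    // 'amount' is copied, '&a' is an ordinary pointer, and std::ref(b)
    // wraps b so the thread receives a real reference instead of a copy.
    std::thread t { add_to, 10, &a, std::ref(b) };
    t.join();  // after join() the thread's writes are visible here

    assert(a == 11 && b == 12);
    return a + b;
}
```

Without std::ref the thread constructor would try to copy b, and the code would not compile because a copy cannot bind to the int& parameter.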

Working With Threads

After the threads are created the remainder of main() is executed. When main() exits the situation gets interesting. What happens to the running threads? What happens to the resources they may have allocated, like an open file or serial port? Also consider that you can start a thread in any function, not just main(), so what if fa() created another thread and then exited? What happens to that new thread?

The standard provides two ways of handling this situation: thread::join() and thread::detach(). When a function calls join(), it blocks until the thread completes. In the example this is done just before main() exits.

A detached thread runs independently of the rest of the program. Generally, a call to detach() should be made soon after the thread is created. When the creating function exits, the thread continues running. When main() returns, however, the process ends and any detached threads still running are stopped abruptly, without a chance to clean up. If detached threads are not stopped first by using some technique, like the one shown with run above, program resources can be left in an indeterminate state.

A thread that has been neither joined nor detached is a joinable thread. This can be tested by thread::joinable(). Destroying a thread object that is still joinable calls std::terminate(). An attempt to join a detached thread throws an exception. If the state of the thread is uncertain, it should be checked by calling joinable().
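
A small sketch of these states follows. The helper join_if_possible and the driver demo_joinable are hypothetical names used only here; the exception type, std::system_error, is what the standard library throws when join() is called on a thread that is not joinable.

```cpp
#include <cassert>
#include <system_error>
#include <thread>

// Hypothetical helper: join only when it is legal to do so.
bool join_if_possible(std::thread& t) {
    if (t.joinable()) {
        t.join();
        return true;
    }
    return false;
}

// Walks through the states and counts the checks that behave as described.
int demo_joinable() {
    int checks_passed { 0 };

    std::thread joined { [] {} };
    assert(joined.joinable());  // freshly created: joinable
    if (join_if_possible(joined)) ++checks_passed;   // first join succeeds
    if (!join_if_possible(joined)) ++checks_passed;  // already joined

    std::thread detached { [] {} };
    detached.detach();
    if (!detached.joinable()) ++checks_passed;       // detached: not joinable

    // Joining a thread that is not joinable throws std::system_error.
    try {
        detached.join();
    } catch (const std::system_error&) {
        ++checks_passed;
    }
    return checks_passed;
}
```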

What happens if a thread throws an uncaught exception? The standard specifies that std::terminate() is called, which by default calls std::abort() to end the program. You can avoid this by catching the exception inside the thread. Installing a std::terminate_handler with std::set_terminate() lets you run your own code at that point, but the handler must still end the program. The details for this are available on a C++ reference site or in a book.
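
One way to catch the exception inside the thread and still report it to the creating thread is std::exception_ptr. This is only a sketch of that approach, not something from the example code; risky_task, demo_thread_exception and the error message are invented for illustration.

```cpp
#include <cassert>
#include <exception>
#include <functional>  // std::ref
#include <stdexcept>
#include <string>
#include <thread>

// Hypothetical worker that fails. The try/catch keeps the exception from
// escaping the thread, which would otherwise call std::terminate().
void risky_task(std::exception_ptr& error) {
    try {
        throw std::runtime_error("sensor not responding");
    } catch (...) {
        error = std::current_exception();  // save it for the creating thread
    }
}

// Runs the worker and returns the message of any exception it saved.
std::string demo_thread_exception() {
    std::exception_ptr error;
    std::thread t { risky_task, std::ref(error) };
    t.join();  // after join() it is safe to read 'error'

    if (error) {
        try {
            std::rethrow_exception(error);
        } catch (const std::exception& e) {
            return e.what();
        }
    }
    return "no error";
}
```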

The need to join with a thread, if appropriate, requires diligence akin to management of resources, e.g. files or memory. It is solved by the same approach which underlies one reason for the existence of classes: Resource Acquisition Is Initialization (RAII). A class constructor performs initialization, which includes resource acquisition, while a destructor releases resources. A simple class (see Notes at end) to handle threads is:

struct thread_guard: thread {
        using thread::thread;
        ~thread_guard() {
                if (joinable()) {
                        join();
                }
        }
};

The class thread_guard is a derived class of thread. The using thread::thread declaration tells the compiler to inherit all of thread’s constructors, eliminating the need to write constructors for thread_guard, since we only want to provide a destructor. The destructor tests if the thread is joinable, i.e. that it has been neither joined nor detached, and does a join, if allowed. This provides two capabilities. The function creating a thread no longer needs to explicitly join the thread, although it safely can, before the function exits. If something interrupts the creating function, like an exception, the destructor will do the join.
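
The exception case can be sketched as follows. The thread_guard struct is repeated so the block stands alone; set_done, start_then_throw and demo_thread_guard are hypothetical names for this illustration.

```cpp
#include <atomic>
#include <cassert>
#include <functional>  // std::ref
#include <stdexcept>
#include <thread>

// thread_guard as described above, written with explicit std:: prefixes.
struct thread_guard : std::thread {
    using std::thread::thread;
    ~thread_guard() {
        if (joinable()) {
            join();
        }
    }
};

// Hypothetical worker: records that it ran.
void set_done(std::atomic_bool& done) {
    done = true;
}

// Hypothetical function that starts a thread and then fails.
// There is no explicit join(): the destructor runs during stack unwinding.
void start_then_throw(std::atomic_bool& done) {
    thread_guard worker { set_done, std::ref(done) };
    throw std::runtime_error("something went wrong");
}

// Returns true if the worker finished before the exception was handled.
bool demo_thread_guard() {
    std::atomic_bool done { false };
    try {
        start_then_throw(done);
    } catch (const std::runtime_error&) {
        // By now ~thread_guard() has joined the worker.
    }
    return done;
}
```

With a plain thread in place of thread_guard, the throw would destroy a joinable thread and the program would be terminated.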

In the example code the namespace this_thread is demonstrated with a call to this_thread::sleep_for(). This namespace provides three routines for controlling the timing of threads. You can, as illustrated, sleep for a period of time, sleep until a specified time, or simply yield. The fourth routine in the namespace gets the thread’s identifier. Its appearance in main() points out that it is also a thread. The chrono header is well worth studying since it provides convenient tools for working with time values in the form of clocks, specific points in time, and durations.
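
All four routines appear in this short sketch. The function name demo_this_thread and the delay values are arbitrary choices for the example.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Exercises the four this_thread routines and returns the elapsed
// time in milliseconds.
long long demo_this_thread() {
    using namespace std::chrono;

    const auto start = steady_clock::now();

    std::this_thread::sleep_for(milliseconds(10));            // a duration
    std::this_thread::sleep_until(start + milliseconds(30));  // a time point
    std::this_thread::yield();  // offer the processor to other threads

    // get_id() returns a std::thread::id. A thread running at the same
    // time as this one has a different id.
    const std::thread::id my_id = std::this_thread::get_id();
    std::thread::id other_id;
    std::thread t { [&other_id] { other_id = std::this_thread::get_id(); } };
    t.join();
    assert(my_id != other_id);

    return duration_cast<milliseconds>(steady_clock::now() - start).count();
}
```

The sleeps guarantee only a minimum delay. The scheduler may wake the thread later than requested, so the returned value is at least 30 but can be larger.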

Racing Threads

Let’s go back to atomic_bool, which is defined in the atomic header along with atomic versions of many standard types. Atomic variables are needed to prevent race conditions on a variable. It can sometimes take hundreds of processor cycles to read or write a variable, even something as small as a boolean or character. During that time an interrupt can occur or a new task can be swapped in for the current task. If this newly executing code reads or writes the same variable, the state of the variable is corrupted. For example, a 32-bit integer contains four bytes. The first task reads two bytes, is interrupted, the new routine writes all four bytes, and when the first task is restarted it reads the last two bytes. The original first two bytes it read are now invalid. An atomic operation prevents this from happening.
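
A shared counter is a simple way to see an atomic variable at work. In this sketch the thread count, the increment count and the function name demo_atomic_counter are arbitrary assumptions.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

constexpr int thread_count { 4 };
constexpr int increments_per_thread { 10000 };

// Several threads increment one shared counter. Returns the final count.
int demo_atomic_counter() {
    std::atomic<int> counter { 0 };

    std::vector<std::thread> workers;
    for (int i = 0; i < thread_count; ++i) {
        workers.emplace_back([&counter] {
            for (int n = 0; n < increments_per_thread; ++n) {
                ++counter;  // one indivisible read-modify-write
            }
        });
    }
    for (auto& worker : workers) {
        worker.join();
    }
    return counter;
}
```

Because each increment is indivisible, no update is lost and the result is always 40000. With a plain int the same code would be a data race and the total could come up short.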

Other race conditions occur when attempting to access resources. If the tasks in the example were sending output to cerr, such a race would occur. One task could start outputting text, be interrupted by a task swap, and the new task could also start writing to cerr, causing their outputs to intermix.

The long established technique for handling this is with mutual exclusion, shortened to mutex.

#include <mutex>
   std::mutex cerr_mutex;
   . . .
   cerr_mutex.lock();
   cerr << "Hello Hackday! " << '\n';
   cerr_mutex.unlock();

A mutex is locked when a task wants to access a resource and is unlocked when the task is done. This is a quick and inexpensive operation. If another task requests the mutex, the task is held until the mutex is released.
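
The sketch below shows one way to protect an output stream. It swaps in std::lock_guard from the mutex header instead of calling lock() and unlock() directly, and it writes to a string stream standing in for cerr so the result can be checked. The names log_line and demo_mutex_logging are hypothetical.

```cpp
#include <cassert>
#include <functional>  // std::ref
#include <mutex>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

// Hypothetical logger: the lock_guard locks the mutex in its constructor
// and unlocks it in its destructor, even if the output code throws.
void log_line(std::mutex& m, std::ostringstream& out, int id) {
    std::lock_guard<std::mutex> lock { m };
    out << "task " << id << " says hello" << '\n';
}

// Returns the number of complete, unmixed lines written by four tasks.
int demo_mutex_logging() {
    std::mutex out_mutex;
    std::ostringstream out;  // stands in for cerr

    std::vector<std::thread> tasks;
    for (int id = 0; id < 4; ++id) {
        tasks.emplace_back(log_line, std::ref(out_mutex), std::ref(out), id);
    }
    for (auto& task : tasks) {
        task.join();
    }

    // Check every line for the expected beginning and ending.
    const std::string head { "task " };
    const std::string tail { " says hello" };
    std::istringstream in { out.str() };
    std::string line;
    int intact { 0 };
    while (std::getline(in, line)) {
        if (line.size() == head.size() + 1 + tail.size()
                && line.compare(0, head.size(), head) == 0
                && line.compare(head.size() + 1, tail.size(), tail) == 0) {
            ++intact;
        }
    }
    return intact;
}
```

The order of the lines depends on the scheduler, but each line arrives whole.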

The use of multiple mutexes can lead to a deadlock situation. Task A requests mutex X and Y, in that order. Task B requests Y and then X. Each can gain its first mutex, but neither can obtain the second. The standard provides the function lock(X, Y, …), which locks all the mutexes in the argument list using a deadlock-avoidance algorithm and returns only when it holds them all.
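
Here is a hedged sketch of std::lock with two mutexes requested in opposite orders by two threads. The Account struct, the transfer function and the amounts are invented for this illustration.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical resource pairing a mutex with the data it protects.
struct Account {
    std::mutex guard;
    int balance { 10000 };
};

// Moves 'amount' between accounts while holding both mutexes.
void transfer(Account& from, Account& to, int amount) {
    std::lock(from.guard, to.guard);  // takes both without deadlocking
    // adopt_lock tells each lock_guard the mutex is already locked, so the
    // guards only take over the job of unlocking.
    std::lock_guard<std::mutex> hold_from { from.guard, std::adopt_lock };
    std::lock_guard<std::mutex> hold_to { to.guard, std::adopt_lock };
    from.balance -= amount;
    to.balance += amount;
}

// Two threads transfer in opposite directions, so they ask for the
// mutexes in opposite orders. Returns the final balance of x.
int demo_lock_both() {
    Account x;
    Account y;

    std::thread a { [&x, &y] {
        for (int i = 0; i < 1000; ++i) transfer(x, y, 3);
    } };
    std::thread b { [&x, &y] {
        for (int i = 0; i < 1000; ++i) transfer(y, x, 1);
    } };
    a.join();
    b.join();

    assert(x.balance + y.balance == 20000);  // nothing lost or created
    return x.balance;
}
```

If each thread locked its two mutexes one after the other with plain lock() calls, this pattern could deadlock exactly as described for tasks A and B.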

The mutex header provides more classes and functions for handling race conditions, so it deserves careful study when using threads.

Peeking Behind the Curtains

It helps when doing multitasking to understand a little bit about what is happening behind the curtains. On Windows or Linux we are working with preemptive multitasking, in contrast to cooperative multitasking.

In the former, the system is driven by a timer interrupt to switch, using a scheduling algorithm, among the tasks running on the system. A cooperative multitasking system relies on tasks to voluntarily relinquish control so other tasks may be scheduled.

The simplest preemptive scheduler is time-slicing where each task is allowed to run for a specific amount of time. If a task yields for IO, to wait on a mutex, or voluntarily, the next task is allowed to proceed. More sophisticated algorithms, even with cooperative multitasking, perform priority scheduling where a high priority task gets more time. In a real-time system, tasks marked as real-time might get as much time as they need. With priority multitasking developers must ensure that all tasks receive sufficient time. One reason they might not is when high priority tasks consume all the processing, starving lower priority tasks. A problem, dubbed priority inversion, can occur when a low priority task holds a mutex that a higher priority task needs, preventing the higher priority task from running.

Multitasking Costs

Multitasking consumes time and memory resources. When a task swap occurs, the processor registers used by a task are pushed onto its stack. The next task’s registers are popped from that task’s stack to start it running. This obviously takes time and memory since each task must be allocated a stack. Setting the stack size is almost an art. Enough space must be allocated for the worst case number of function calls and local, stack based, variables the task might need. In addition, when an interrupt occurs to handle external events, say a serial port receives a character, the interrupt requires stack space.

The Arduino ecosystem generally does not support these forms of multitasking because the processors do not have the memory for the stacks required by multiple tasks. There are scheduler techniques usable on Arduinos that are cooperative but do not save the state of the task, pushing that burden onto the task itself. The C++ concurrency libraries are not usable since there is no underlying system to provide a multitasking API. The Arduino community has developed a large number of scheduler libraries to use on those systems.
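
To make the idea of a cooperative scheduler that saves no task state concrete, here is a minimal sketch in plain C++. It is not taken from any particular Arduino library. The Task struct, the task functions and the tick values are all assumptions made for illustration. On an Arduino, run_due_tasks() would typically be called from loop() with the value of millis().

```cpp
#include <cassert>
#include <cstdint>

// Each task is a plain function that does a little work and returns.
// No stack is saved between calls, so a task keeps whatever it needs
// to remember in its own static or global variables.
using TaskFn = void (*)();

struct Task {
    TaskFn run;
    std::uint32_t period;    // ticks between runs
    std::uint32_t next_due;  // tick at which the task runs next
};

// Hypothetical tasks: they only count how often they were run.
int blink_count { 0 };
int sensor_count { 0 };
void blink_led() { ++blink_count; }
void read_sensor() { ++sensor_count; }

Task tasks[] = {
    { blink_led, 5, 0 },
    { read_sensor, 2, 0 },
};

// One pass of the scheduler. This sketch ignores tick counter overflow.
void run_due_tasks(std::uint32_t now) {
    for (Task& task : tasks) {
        if (now >= task.next_due) {
            task.run();  // the task must return promptly
            task.next_due = now + task.period;
        }
    }
}

// Simulates 'ticks' timer ticks and returns the total number of task runs.
int demo_cooperative(std::uint32_t ticks) {
    for (std::uint32_t now = 0; now < ticks; ++now) {
        run_due_tasks(now);
    }
    return blink_count + sensor_count;
}
```

Over ten ticks blink_led runs at ticks 0 and 5 and read_sensor runs at ticks 0, 2, 4, 6 and 8, for seven runs in total. A task that loops for a long time would hold up every other task, which is the burden this style places on the task itself.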

Wrap Up

Multitasking is a useful technique for keeping the processor busy. It must be remembered that it isn’t a panacea and, as always, should be tested to make sure the overhead of multitasking isn’t costing more than it provides.

Another consideration is the impact on the organization of the code. Dividing a program into its logical parts as separate tasks can make creating, testing, and debugging the code easier. Individual developers can work more easily and independently on separate portions of the code. Even a single individual, like myself, finds it easier to concentrate on the code for a specific task while ignoring other processes.

I’ve only touched on the complex requirements for creating a multitasking system. There are many other C++ capabilities available for working with tasks, including safely coordinating their activities and transferring data among them. Once you start hacking with larger code bases, breaking the code into multiple tasks may prove beneficial for your system and sanity.


Notes

The thread_guard class is from Bjarne Stroustrup, “The C++ Programming Language”, 4th Edition. Read this book for all you ever want to know about C++.

from raspberry pi – Hackaday
via Hack a Day
