Chapter 2 Multithreading
2.1 Threads
std::thread has no copy operations. It accepts a callable as its work package, whose return value is ignored.

The creator of a std::thread should manage its lifecycle, i.e. it should invoke join() to wait for the thread to finish or detach() to detach itself from the thread. Before join() or detach() is called, the thread is joinable, and the destructor of a joinable thread calls std::terminate. One thing worth noting is that detached threads terminate together with the executable: when the main thread exits, all detached threads exit as well, even if their work packages haven't completed. Take the following as an example:
```cpp
#include <chrono>
#include <iostream>
#include <thread>

int main() {
    std::thread t([] { std::cout << "hello" << std::endl; });
    t.detach();
    // if the next line stays commented out, "hello" may not be printed
    // std::this_thread::sleep_for(std::chrono::milliseconds(1));
    return 0;
}
```

std::thread's constructor is a variadic template, so if you want to pass an argument by reference, you need to wrap it in std::ref, even if the corresponding parameter of the callable is a reference.

We can use the swap() method to swap (in a move fashion) two threads, and std::thread::native_handle() to get the system-specific implementation handle underlying a std::thread.
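Back to the std::ref point: here is a small sketch (my own, not from the book) showing why it is needed when the work package takes its parameter by reference:

```cpp
#include <functional>  // std::ref
#include <iostream>
#include <thread>

void increment(int& counter) { ++counter; }

int main() {
    int counter = 0;
    // std::ref passes a reference into the thread; without it the argument
    // would be decay-copied and the call would not even compile.
    std::thread t(increment, std::ref(counter));
    t.join();
    std::cout << counter << std::endl;  // prints 1
    return 0;
}
```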
2.2 Shared Data
Insertion to and extraction from the global stream objects (like std::cin and std::cout) are thread-safe, although output statements can interleave. In other words, writing to std::cout is not a data race but a race condition (between output statements).

There are many kinds of mutex. The most basic one is std::mutex, which supports lock(), try_lock() and unlock(). Then there is std::recursive_mutex, which the same thread can lock multiple times and which stays locked until it has been unlocked as many times as it was locked. There are also std::timed_mutex and std::recursive_timed_mutex, which support try_lock_for() and try_lock_until(). std::shared_timed_mutex (since C++14) and std::shared_mutex (since C++17) additionally provide the shared family of methods (lock_shared(), try_lock_shared(), unlock_shared()), which can be used to implement a read-write lock (introduced later).

Cool, right? Since we have mutexes, we can write some code like this:
```cpp
std::mutex m;
m.lock();
sharedVariable = getVar();
m.unlock();
```

However, this is quite prone to deadlock because of getVar(): what if it throws an exception? What if it also acquires the mutex m? What if it is a library function that someday gets upgraded with code you know nothing about? So apparently, it's better to avoid calling functions while holding a lock.
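Before moving on to locks, here is a minimal sketch (my own) of the timed variants mentioned earlier; try_lock_for() simply gives up if the mutex cannot be acquired within the given duration:

```cpp
#include <chrono>
#include <iostream>
#include <mutex>

std::timed_mutex m;

void worker() {
    // Give up if the mutex cannot be acquired within 100 milliseconds.
    if (m.try_lock_for(std::chrono::milliseconds(100))) {
        /* critical section */
        m.unlock();
    } else {
        std::cout << "could not get the lock in time" << std::endl;
    }
}
```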
To solve these deadlocks, we can use locks: std::lock_guard, std::unique_lock, std::shared_lock (since C++14) and std::scoped_lock (since C++17).

First let's look at std::lock_guard. Maybe you've heard about RAII. Yep, that's the mechanism std::lock_guard uses to solve the deadlock which happens when you forget to release the lock (maybe because an exception is thrown):

```cpp
{
    std::mutex m;
    std::lock_guard<std::mutex> lockGuard(m);
    /* critical section */
}   // lockGuard's destructor releases the mutex here
```
Then there is std::unique_lock, which is stronger but more expensive than std::lock_guard. For example, it enables you to create the lock without locking the mutex immediately, to lock a mutex recursively, and so on.

One thing worth noting is that we can use std::lock(), which is a variadic template, to lock multiple mutexes in one atomic step:

```cpp
std::mutex a, b;
std::unique_lock<std::mutex> guard1(a, std::defer_lock);  // associate the mutex without locking it
std::unique_lock<std::mutex> guard2(b, std::defer_lock);
std::lock(guard1, guard2);  // locks both mutexes in one atomic step
```
Here comes std::shared_lock, which behaves like std::unique_lock, except that it is used with a std::shared_mutex or std::shared_timed_mutex (introduced before). It can be used to implement a read-write lock. To be more precise, std::lock_guard<std::shared_mutex> or std::unique_lock<std::shared_mutex> serves as the write lock, while std::shared_lock<std::shared_mutex> serves as the read lock. This works because std::shared_mutex supports both the exclusive lock() family and the lock_shared() family of methods, which are invoked by std::unique_lock and std::shared_lock respectively.
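A minimal sketch of such a read-write lock (my own example, assuming C++17 for std::shared_mutex; the map and function names are made up for illustration):

```cpp
#include <map>
#include <mutex>
#include <shared_mutex>
#include <string>

std::map<std::string, int> table;
std::shared_mutex tableMutex;

int readEntry(const std::string& key) {
    // Many readers may hold the shared (read) lock at the same time.
    std::shared_lock<std::shared_mutex> readLock(tableMutex);
    auto it = table.find(key);
    return it == table.end() ? -1 : it->second;
}

void writeEntry(const std::string& key, int value) {
    // A writer needs exclusive ownership.
    std::unique_lock<std::shared_mutex> writeLock(tableMutex);
    table[key] = value;
}
```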
Finally there is std::scoped_lock. Still remember the std::lock() function? Yep, they are very similar. Actually, std::scoped_lock's constructor is a variadic template, which 1) behaves like a std::lock_guard when there is just one mutex argument, and 2) invokes std::lock() when there are multiple mutex arguments. In other words, std::scoped_lock can lock many mutexes in one atomic step.

Sometimes we need to ensure that objects are initialized in a thread-safe way (imagine the singleton design pattern). Typically there are three ways to do that (OK, if you count initializing the objects in the main thread before any child thread is created, there are four).
The first is to use constexpr to initialize objects as constant expressions at compile time. Note that an object can be declared constexpr only if its class satisfies some restrictions: for example, it must not have virtual base classes or virtual methods; its constructor must have an empty body (apart from the initializer list) and be a constant expression; its base classes and non-static members must all be initialized (in the initializer list); and so on.
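As a small illustration (my own), a class satisfying those restrictions can produce objects at compile time, so no thread can ever observe them half-initialized:

```cpp
class Point {
public:
    constexpr Point(double x, double y) : x_(x), y_(y) {}
    constexpr double x() const { return x_; }
    constexpr double y() const { return y_; }
private:
    double x_;
    double y_;
};

// Evaluated at compile time, hence thread-safe by construction.
constexpr Point origin(0.0, 0.0);
```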
The second is to use std::call_once and std::once_flag. The semantics are easy to understand: std::call_once is a function that accepts two parameters, a std::once_flag and a callable. We can invoke std::call_once many times with the same std::once_flag, and exactly one of the callables will be executed exactly once. We can use this to implement a singleton:
```cpp
#include <mutex>  // std::once_flag, std::call_once

class MySingleton {
private:
    static std::once_flag initInstanceFlag;
    static MySingleton* instance;
    MySingleton() = default;
    ~MySingleton() = default;
public:
    MySingleton(const MySingleton&) = delete;
    MySingleton& operator=(const MySingleton&) = delete;
    static MySingleton* getInstance() {
        std::call_once(initInstanceFlag, MySingleton::initSingleton);
        return instance;
    }
    static void initSingleton() {
        instance = new MySingleton();
    }
};

MySingleton* MySingleton::instance = nullptr;
std::once_flag MySingleton::initInstanceFlag;
```

The third is static variables with block scope. Such static variables are created exactly once and lazily, which means they aren't created until first used. And since C++11 there is another guarantee: static variables with block scope are created in a thread-safe way (provided the compiler implements this guarantee correctly). So we can write a singleton class like this:
```cpp
class MySingleton {
public:
    static MySingleton& getInstance() {
        static MySingleton instance;  // created lazily and, since C++11, in a thread-safe way
        return instance;
    }
private:
    MySingleton() = default;
    ~MySingleton() = default;
    MySingleton(const MySingleton&) = delete;
    MySingleton& operator=(const MySingleton&) = delete;
};
```
2.3 Thread-Local Data
Actually I've never heard about the thread_local keyword in C++ before. This keyword acts like static:

If it qualifies a variable at namespace scope or a static class member, the variable will be created before its first usage:
```cpp
class A {
public:
    thread_local static int x;
};
thread_local int A::x;
```

If it qualifies a variable in a function, the variable will be created at its first usage:
```cpp
void f() {
    thread_local int a;
}
```
The difference between thread_local and static is that a variable qualified with the former has its lifetime bound to the thread that created it (each thread gets its own copy), while one qualified with the latter has its lifetime bound to the whole program, i.e. to the main thread.
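A small sketch (my own) that makes the per-thread copies visible:

```cpp
#include <iostream>
#include <mutex>
#include <string>
#include <thread>

thread_local std::string greeting = "hello from ";
std::mutex coutMutex;

void sayHello(const std::string& name) {
    greeting += name;  // modifies this thread's own copy only
    std::lock_guard<std::mutex> guard(coutMutex);
    std::cout << greeting << std::endl;
}

int main() {
    std::thread t1(sayHello, "t1");
    std::thread t2(sayHello, "t2");
    t1.join();
    t2.join();
    // prints "hello from t1" and "hello from t2" (in some order)
    return 0;
}
```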
2.4 Condition Variables
std::condition_variable is literally a condition variable; it provides methods like notify_one(), notify_all(), wait() and so on.

The wait() method usually accepts two parameters: the first one is a std::unique_lock and the second one is a callable called the predicate. Let's take a closer look.

Why does the lock need to be a std::unique_lock instead of a std::lock_guard? Be aware that when wait() is invoked, the lock is released; in fact, as we will see below, the lock is acquired and released repeatedly, so we need a std::unique_lock instead of a one-shot std::lock_guard.

Then what's the role of the predicate? When talking about condition variables, we should be aware of two phenomena: the lost wakeup and the spurious wakeup. A lost wakeup means the notification arrives before the wait starts, while a spurious wakeup means the waiting thread wakes up even though no notification was sent. The predicate solves both problems:
```cpp
std::unique_lock<std::mutex> lck(mutex_);
condVar.wait(lck, []{ return dataReady; });
// equivalent to
std::unique_lock<std::mutex> lck(mutex_);
while ( ![]{ return dataReady; }() ) {
    condVar.wait(lck);
}
```
The dataReady in the above example is a flag used to synchronize the notification. It doesn't need to be atomic, but it must be protected by a mutex (we can use a std::lock_guard here):

```cpp
{
    std::lock_guard<std::mutex> lck(mutex_);
    dataReady = true;
}
condVar.notify_one();
```

If it is not protected by a mutex, the modification of dataReady and the notification may happen right after the predicate check but before the condition variable starts waiting, which causes the waiting thread to block forever.
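Putting the pieces together, a minimal sender/waiter sketch (reusing the names from the snippets above) could look like this:

```cpp
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <thread>

std::mutex mutex_;
std::condition_variable condVar;
bool dataReady = false;

void waitingForWork() {
    std::unique_lock<std::mutex> lck(mutex_);
    condVar.wait(lck, [] { return dataReady; });  // handles lost and spurious wakeups
    std::cout << "running the work" << std::endl;
}

void setDataReady() {
    {
        std::lock_guard<std::mutex> lck(mutex_);
        dataReady = true;
    }
    condVar.notify_one();
}

int main() {
    std::thread t1(waitingForWork);
    std::thread t2(setDataReady);
    t1.join();
    t2.join();
    return 0;
}
```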
2.5 Tasks
A task is also a mechanism to perform a work package asynchronously. Different from threads, a task does not necessarily run in another thread. The workflow of a task is to perform the work package and fulfil a promise, whose result can be obtained through a future.

std::async is a simple way to create a task, and its return value is the future of the task. Besides the work package, we can pass in a launch policy when invoking std::async, which can be std::launch::deferred for lazy evaluation or std::launch::async for eager evaluation.

Also, it's not necessary to assign the return value of std::async to a variable. In other words, we can just invoke it and discard its return value, in which case the future is called a fire-and-forget future:

```cpp
std::async(std::launch::async, []{ std::cout << "fire and forget" << std::endl; });
```

Note that we need std::launch::async to make sure evaluation is eager, because we have no future to wait on.

However, there is an inconspicuous drawback here: a future blocks in its destructor until its promise is done. Fire-and-forget futures are temporaries, whose destructor is invoked immediately after the
std::async call that created them. So the asynchrony is actually, umm, a fake one:

```cpp
std::async(std::launch::async, [] {
    std::this_thread::sleep_for(std::chrono::seconds(5));
    std::cout << "first thread" << std::endl;
});  // the temporary future blocks here until its promise is done
std::async(std::launch::async, [] {
    std::this_thread::sleep_for(std::chrono::seconds(1));
    // gets printed after 6 seconds instead of 1
    std::cout << "second thread" << std::endl;
});
```

std::packaged_task is another way to create a task; it is not executed immediately. Its usage typically consists of four steps:
```cpp
// 1. create the task
std::packaged_task<int(int, int)> sumTask([](int a, int b){ return a + b; });
// 2. assign to a future
std::future<int> sumResult = sumTask.get_future();
// 3. do the execution
sumTask(2000, 11);
// 4. wait on the future
sumResult.get();
```

To my understanding, std::async combines the first three steps into a single call. Note that arguments for the callable can still be supplied: they are passed as trailing arguments to std::async itself.

If we want to execute the task and wait on the future multiple times, we need to invoke the reset() method of std::packaged_task.

std::promise can set not only a value but also an exception, via its set_exception() method. In that case, the corresponding future will encounter the exception when invoking its get() method.

std::future offers valid() to check whether a shared state is available, and wait_for() or wait_until() to wait with a timeout. The latter two return a std::future_status, which is a scoped enum with the enumerators deferred, ready and timeout (deferred means the shared state contains a deferred task, i.e. one launched with std::launch::deferred that has not started yet).
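For example (my own sketch), a deferred task keeps reporting std::future_status::deferred from wait_for() until get() or wait() forces its evaluation:

```cpp
#include <chrono>
#include <future>
#include <iostream>

int main() {
    auto fut = std::async(std::launch::deferred, [] { return 42; });
    // The task has not started yet, so wait_for() reports deferred.
    if (fut.wait_for(std::chrono::seconds(0)) == std::future_status::deferred) {
        std::cout << "still deferred" << std::endl;
    }
    std::cout << fut.get() << std::endl;  // forces evaluation, prints 42
    return 0;
}
```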
Different from std::future, std::shared_future is copyable and can be queried multiple times.

We have two ways to get a std::shared_future: by initializing it directly from the get_future() method of std::promise, or by calling the share() method of a std::future. Note that after the invocation of share(), the valid() method of the original std::future returns false.
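A small sketch (my own) where two threads query the same result through copies of a std::shared_future:

```cpp
#include <future>
#include <iostream>
#include <thread>

int main() {
    std::promise<int> prom;
    std::shared_future<int> sharedFut = prom.get_future();  // first way: initialize directly
    auto consumer = [sharedFut] { std::cout << sharedFut.get() << std::endl; };
    std::thread t1(consumer);
    std::thread t2(consumer);
    prom.set_value(2011);  // both copies see the same value (output may interleave)
    t1.join();
    t2.join();
    return 0;
}
```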
valid()method ofstd::futureorstd::shared_futureindicates whether an available shared state exists. In another word, if it returns true,wait()method can be called without exception; if it returns false,wait()will result in an exception.For initialized
std::future, before the firstget(),wait()orshare(), thevalid()will return true; while after that,valid()shall return false. And for initializedstd::shared_future,valid()shall always return true, which means you can always query on astd::shared_future.If the callable used in
std::asyncandstd::packaged_taskthrows an exception, it will be stored in the shared state (just like whatset_exception()method ofstd::promisedoes), and rethrown when queried by future. One thing worth noting is thatstd::current_exception()can be used to get the caught exception in the catch block.voidas the template argument,std::promiseandstd::futurecould be used for notification and synchronization. Compared to condition variables, the task-based notification mechanism could not perform synchronization multiple times (sincestd::promisecould only set its value once andstd::futurecould only query once) but needn’t a shared variable or mutex and isn’t prone to lost wakeup or spurious wakeup.So the conclusion is that if multiple synchronization is not needed, task-based notification mechanism is preferred.