Chapter 2 Multithreading
2.1 Threads
`std::thread` has no copy operations. It accepts a callable as its work package, whose return value is ignored. The creator of a `std::thread` should manage its lifecycle, i.e. it should invoke `join()` to wait until the thread ends or `detach()` to detach itself from the thread. Before `join()` or `detach()` is called, the thread is joinable, and the destructor of a joinable thread calls `std::terminate`. One thing worth noting is that detached threads terminate together with the executable: when the main thread exits, all detached threads exit as well, even if their work package hasn't finished. Take the snippet below as an example:
```c++
#include <chrono>
#include <iostream>
#include <thread>

int main() {
    std::thread t([] { std::cout << "hello" << std::endl; });
    t.detach();
    // if this line stays commented out, "hello" may not be printed
    // std::this_thread::sleep_for(std::chrono::milliseconds(1));
    return 0;
}
```
`std::thread`'s constructor is a variadic template. So if you want to pass an argument by reference, you need to wrap it with `std::ref`, even if the corresponding parameter of the callable is declared as a reference. We can use the `swap()` method to swap (in a move way) two threads. We can use `std::thread::native_handle()` to get information about the system-specific implementation of `std::thread`.
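For the `std::ref` point, here is a minimal sketch (the `increment` function and the `counter` variable are invented for illustration):

```c++
#include <functional>
#include <iostream>
#include <thread>

void increment(int& counter) { ++counter; }

int main() {
    int counter = 0;
    // std::thread copies its arguments into its own storage; a plain copy cannot
    // bind to increment()'s non-const reference parameter, hence std::ref
    std::thread t(increment, std::ref(counter));
    t.join();
    std::cout << counter << std::endl;  // prints 1
    return 0;
}
```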
2.2 Shared Data
Insertion to and extraction from global stream objects (like `std::cin` and `std::cout`) are thread-safe, although the output statements can interleave. In other words, writing to `std::cout` is not a data race but a race condition (of output statements). There are many kinds of mutexes. The most basic is `std::mutex`, which supports `lock()`, `try_lock()` and `unlock()`. Then there is `std::recursive_mutex`, which can be locked multiple times by the same thread and stays locked until it has been unlocked as many times as it was locked. There are also `std::timed_mutex` and `std::recursive_timed_mutex`, which support `try_lock_for()` and `try_lock_until()`. `std::shared_timed_mutex` (since C++14) and `std::shared_mutex` (since C++17) additionally provide the `lock_shared()` family of methods, which can be used to implement a read-write lock (introduced later).
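As a small sketch of the timed variants (the mutex name, function name and timeout value are arbitrary):

```c++
#include <chrono>
#include <iostream>
#include <mutex>

std::timed_mutex m;  // hypothetical mutex, just for this sketch

void tryWork() {
    // give up if the mutex cannot be acquired within 100 milliseconds
    if (m.try_lock_for(std::chrono::milliseconds(100))) {
        std::cout << "got the lock" << std::endl;
        m.unlock();
    } else {
        std::cout << "timed out" << std::endl;
    }
}
```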
Cool, right? Since we have a mutex, we can write some code like this:
```c++
std::mutex m;
m.lock();
sharedVariable = getVar();
m.unlock();
```
However, this is quite prone to deadlock because of `getVar()`: what if it throws an exception? What if it also acquires the mutex `m`? What if it's a library function that someday gets upgraded with code you know nothing about? So apparently, it's better to avoid calling functions while holding a lock.
To solve deadlocks, we can use locks: `std::lock_guard`, `std::unique_lock`, `std::shared_lock` (since C++14) and `std::scoped_lock` (since C++17). First let's look at `std::lock_guard`. Maybe you've heard about RAII. Yep, that's the mechanism `std::lock_guard` uses to avoid the deadlock that happens when you forget to release the lock (maybe because an exception is thrown):
```c++
{
    std::mutex m;
    std::lock_guard<std::mutex> lockGuard(m);
    /* critical section */
}
```
Then there is `std::unique_lock`, which is stronger but more expensive than `std::lock_guard`. For example, it enables you to create a lock without locking the mutex immediately, to lock a recursive mutex recursively, and so on. One thing worth noting is that we can use `std::lock()`, which is a variadic template, to lock multiple mutexes in one atomic step:
```c++
std::mutex a, b;
std::unique_lock<std::mutex> guard1(a, std::defer_lock);
std::unique_lock<std::mutex> guard2(b, std::defer_lock);
std::lock(guard1, guard2);
```
Here comes `std::shared_lock`, which behaves like `std::unique_lock`, except when it's used with a `std::shared_mutex` or `std::shared_timed_mutex` (introduced before). It can be used to implement a read-write lock. To be more precise, `std::lock_guard<std::shared_mutex>` or `std::unique_lock<std::shared_mutex>` is used as the write lock while `std::shared_lock<std::shared_mutex>` is used as the read lock. This works because `std::shared_mutex` supports both the exclusive `lock()` methods and the `lock_shared()` methods, which are invoked by `std::unique_lock` and `std::shared_lock` respectively.
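A minimal reader-writer sketch built on this idea (the `PhoneBook` class and its members are made up for illustration):

```c++
#include <shared_mutex>
#include <string>
#include <unordered_map>

class PhoneBook {  // hypothetical class, just to illustrate the read-write lock
public:
    std::string find(const std::string& name) const {
        std::shared_lock<std::shared_mutex> readLock(mutex_);   // many readers may hold this at once
        auto it = book_.find(name);
        return it == book_.end() ? std::string{} : it->second;
    }
    void update(const std::string& name, const std::string& number) {
        std::unique_lock<std::shared_mutex> writeLock(mutex_);  // exclusive: blocks readers and writers
        book_[name] = number;
    }
private:
    mutable std::shared_mutex mutex_;
    std::unordered_map<std::string, std::string> book_;
};
```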
Finally there is `std::scoped_lock`. Still remember the `std::lock()` function? Yep, they are very similar. Actually, `std::scoped_lock`'s constructor is a variadic template, which 1) behaves like a `std::lock_guard` when there is just one mutex argument, and 2) invokes `std::lock()` when there are multiple mutex arguments. In other words, `std::scoped_lock` can lock many mutexes in one atomic step.
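For example, the `std::lock()` snippet above could be written like this (a sketch relying on C++17 class template argument deduction):

```c++
#include <mutex>

std::mutex a, b;

void transfer() {
    // locks a and b atomically and releases both at the end of the scope
    std::scoped_lock guard(a, b);
    /* critical section touching data protected by both mutexes */
}
```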
Sometimes we need to ensure that objects are initialized in a thread-safe way (imagine the singleton design pattern). Typically there are three ways to do that (OK, four if you also count initializing the object in the main thread before any child threads are created).
The first is to use `constexpr` to initialize objects as constant expressions at compile time. Note that an object can be declared `constexpr` only if its class satisfies some restrictions. For example, it cannot have virtual base classes or virtual methods; its constructor must have an empty body and be a constant expression; its base classes and non-static members should all be initialized (in the initialization list); and so on.
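A minimal sketch of such a class (the class name and member are invented for illustration):

```c++
class Constants {  // hypothetical constexpr-friendly class
public:
    constexpr Constants(double v) : value(v) {}   // empty body, member initialized in the init list
    constexpr double get() const { return value; }
private:
    double value;
};

// initialized at compile time, so every thread can safely read it without synchronization
constexpr Constants sharedConst(3.14);
```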
The second is to use `std::call_once` and `std::once_flag`. The semantics are easy to understand: `std::call_once` is a function which accepts two parameters, the first one a `std::once_flag` and the second one a callable. We can invoke `std::call_once` many times with the same `std::once_flag`, and exactly one of those callables will be executed, exactly once. Use this to implement a singleton:
```c++
class MySingleton {
private:
    static std::once_flag initInstanceFlag;
    static MySingleton* instance;
    MySingleton() = default;
    ~MySingleton() = default;
public:
    MySingleton(const MySingleton&) = delete;
    MySingleton& operator=(const MySingleton&) = delete;
    static MySingleton* getInstance() {
        std::call_once(initInstanceFlag, MySingleton::initSingleton);
        return instance;
    }
    static void initSingleton() {
        instance = new MySingleton();
    }
};
MySingleton* MySingleton::instance = nullptr;
std::once_flag MySingleton::initInstanceFlag;
```
The third is static variables with block scope. Such static variables are created exactly once and lazily, which means they won't get created until first used. And since C++11 there is another guarantee: static variables with block scope are created in a thread-safe way (though whether this is honored seems to depend on the compiler implementation). So we can write a singleton class like this:
```c++
class MySingleton {
public:
    static MySingleton& getInstance() {
        static MySingleton instance;
        return instance;
    }
private:
    MySingleton() = default;
    ~MySingleton() = default;
    MySingleton(const MySingleton&) = delete;
    MySingleton& operator=(const MySingleton&) = delete;
};
```
2.3 Thread-Local Data
Actually I had never heard of the `thread_local` keyword in C++ before. This keyword acts like `static`: if it qualifies a variable in namespace scope or a static class member, the variable will be created before its first usage:
```c++
class A {
public:
    thread_local static int x;
};
thread_local int A::x;
```
If it qualifies a variable inside a function, the variable will be created at its first usage:
```c++
void f() {
    thread_local int a;
}
```
The difference between `thread_local` and `static` is that a variable qualified with the former has its lifetime bound to the thread that created it (each thread gets its own copy), while one qualified with the latter has its lifetime bound to the whole program.
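A small sketch of the per-thread-copy behaviour (the `counter` name and `work` function are just for illustration):

```c++
#include <iostream>
#include <thread>

thread_local int counter = 0;  // every thread gets its own instance

void work() {
    ++counter;                           // only modifies this thread's copy
    std::cout << counter << std::endl;   // prints 1 in each thread
}

int main() {
    std::thread t1(work), t2(work);
    t1.join();
    t2.join();
    return 0;
}
```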
2.4 Condition Variables
`std::condition_variable` is literally a condition variable, which provides methods like `notify_one()`, `notify_all()`, `wait()` and so on. The `wait()` method usually accepts two parameters: the first one is a `std::unique_lock` and the second one is a callable called the predicate. Let's take a closer look. Why does the lock need to be a `std::unique_lock` instead of a `std::lock_guard`? Be aware that when `wait()` is invoked, the lock gets released, and we will see below that the lock actually gets acquired and released repeatedly, so we need a `std::unique_lock` instead of a one-time `std::lock_guard`. Then what's the role of the predicate? When talking about condition variables, we should be clear about two phenomena: the lost wakeup and the spurious wakeup. A lost wakeup means the notification arrives before the wait; a spurious wakeup means the waiting thread wakes up even though there was no notification. The predicate solves both problems:
```c++
std::unique_lock<std::mutex> lck(mutex_);
condVar.wait(lck, []{ return dataReady; });
// equivalent to
std::unique_lock<std::mutex> lck(mutex_);
while ( ![]{ return dataReady; }() ) {
    condVar.wait(lck);
}
```
The `dataReady` in the example above is a flag used to synchronize the notification. It doesn't need to be atomic, but it must be protected by a mutex (we can use a `std::lock_guard` here):
```c++
{
    std::lock_guard<std::mutex> lck(mutex_);
    dataReady = true;
}
condVar.notify_one();
```
If it is not protected by a mutex, it may happen that the modification of `dataReady` and the notification are executed right after the predicate check but before the condition variable starts waiting, which causes the waiting thread to wait forever.
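Putting the pieces together, a complete minimal sender/receiver sketch might look like this (the names `dataReady`, `mutex_` and `condVar` follow the snippets above):

```c++
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <thread>

std::mutex mutex_;
std::condition_variable condVar;
bool dataReady = false;

void waitingForWork() {
    std::unique_lock<std::mutex> lck(mutex_);
    condVar.wait(lck, []{ return dataReady; });  // handles lost and spurious wakeups
    std::cout << "work can start" << std::endl;
}

void setDataReady() {
    {
        std::lock_guard<std::mutex> lck(mutex_);
        dataReady = true;
    }
    condVar.notify_one();
}

int main() {
    std::thread t1(waitingForWork), t2(setDataReady);
    t1.join();
    t2.join();
    return 0;
}
```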
2.5 Tasks
A task is another mechanism to perform a work package asynchronously. Different from threads, tasks do not necessarily run in another thread. The workflow of a task is that the work package is executed and its result is put into a promise; the result can then be retrieved, in a synchronized way, through the associated future.
`std::async` is a simple way to create a task, and its return value is the future of the task. Besides the work package, we can pass a launch policy when invoking `std::async`, which can be `std::launch::deferred` for lazy evaluation or `std::launch::async` for eager evaluation.
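A quick sketch of the ordinary usage, where the result is fetched through the future:

```c++
#include <future>
#include <iostream>

int main() {
    // eager evaluation: the work package may start running in another thread right away
    std::future<int> fut = std::async(std::launch::async, [] { return 6 * 7; });
    std::cout << fut.get() << std::endl;  // blocks until the promise is fulfilled, prints 42
    return 0;
}
```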
Also, it's not necessary to assign the return value of `std::async` to a variable. In other words, we can just invoke it and discard its return value, in which case the future is called a fire-and-forget future:
```c++
std::async(std::launch::async, []{ std::cout << "fire and forget" << std::endl; });
```
Note that we need `std::launch::async` here to force eager evaluation, because we have no future to wait on. However, there is an inconspicuous drawback: the future returned by `std::async` blocks in its destructor until its promise is done. In the case of fire-and-forget futures, the futures are temporaries whose destructors get invoked immediately after the `std::async` call that created them. So the async is actually, umm, a fake one:
```c++
std::async(std::launch::async, [] {
    std::this_thread::sleep_for(std::chrono::seconds(5));
    std::cout << "first thread" << std::endl;
});
/* the temporary future's destructor blocks here until its promise is done */
std::async(std::launch::async, [] {
    std::this_thread::sleep_for(std::chrono::seconds(1));
    // gets printed after 6 seconds instead of 1
    std::cout << "second thread" << std::endl;
});
```
`std::packaged_task` is another way to create a task, one which is not executed immediately. Its usage typically consists of four steps:
```c++
// 1. create the task
std::packaged_task<int(int, int)> sumTask([](int a, int b){ return a + b; });
// 2. assign to a future
std::future<int> sumResult = sumTask.get_future();
// 3. do the execution
sumTask(2000, 11);
// 4. wait on the future
sumResult.get();
```
To my understanding, `std::async` combines the first three steps into a single call (the arguments for the callable are passed directly to `std::async`). If we want to execute the task and wait on the future multiple times, we need to invoke the `reset()` method of `std::packaged_task`.
`std::promise` can set not only a value but also an exception, with its `set_exception()` method. In that case, the corresponding future will encounter the exception when invoking its `get()` method.
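A sketch of a promise/future pair including the exception path (the `divide` function is made up for illustration):

```c++
#include <future>
#include <iostream>
#include <stdexcept>
#include <thread>

void divide(std::promise<int> prom, int a, int b) {
    try {
        if (b == 0) throw std::runtime_error("division by zero");
        prom.set_value(a / b);
    } catch (...) {
        prom.set_exception(std::current_exception());  // store the exception in the shared state
    }
}

int main() {
    std::promise<int> prom;
    std::future<int> fut = prom.get_future();
    std::thread t(divide, std::move(prom), 20, 0);
    try {
        std::cout << fut.get() << std::endl;
    } catch (const std::exception& e) {
        std::cout << "caught: " << e.what() << std::endl;  // the stored exception is rethrown by get()
    }
    t.join();
    return 0;
}
```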
`std::future` can use the `valid()` method to check whether the shared state is available, and `wait_for()` or `wait_until()` to wait with a timeout. The latter two return a `std::future_status`, which is a scoped enum with the enumerators `deferred`, `ready` and `timeout` (`deferred` means the shared state holds a lazily evaluated work package, e.g. one created with `std::launch::deferred`, that hasn't started yet).
Different from `std::future`, `std::shared_future` is copyable and can be queried multiple times. We have two ways to get a `std::shared_future`: constructing it from the future returned by the `get_future()` method of `std::promise`, or calling the `share()` method of a `std::future`. Note that after the invocation of `share()`, the `valid()` method of the `std::future` returns false.
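A small sketch of several consumers sharing one result (the thread count and the printed value are arbitrary):

```c++
#include <future>
#include <iostream>
#include <thread>
#include <vector>

int main() {
    std::promise<int> prom;
    std::shared_future<int> sharedFut = prom.get_future().share();

    std::vector<std::thread> consumers;
    for (int i = 0; i < 3; ++i) {
        // each thread holds its own copy of the shared future and can query the same result
        consumers.emplace_back([sharedFut] { std::cout << sharedFut.get() << std::endl; });
    }

    prom.set_value(2011);  // all three consumers wake up and print 2011
    for (auto& t : consumers) t.join();
    return 0;
}
```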
valid()
method ofstd::future
orstd::shared_future
indicates whether an available shared state exists. In another word, if it returns true,wait()
method can be called without exception; if it returns false,wait()
will result in an exception.For initialized
std::future
, before the firstget()
,wait()
orshare()
, thevalid()
will return true; while after that,valid()
shall return false. And for initializedstd::shared_future
,valid()
shall always return true, which means you can always query on astd::shared_future
.If the callable used in
std::async
andstd::packaged_task
throws an exception, it will be stored in the shared state (just like whatset_exception()
method ofstd::promise
does), and rethrown when queried by future. One thing worth noting is thatstd::current_exception()
can be used to get the caught exception in the catch block.void
as the template argument,std::promise
andstd::future
could be used for notification and synchronization. Compared to condition variables, the task-based notification mechanism could not perform synchronization multiple times (sincestd::promise
could only set its value once andstd::future
could only query once) but needn’t a shared variable or mutex and isn’t prone to lost wakeup or spurious wakeup.So the conclusion is that if multiple synchronization is not needed, task-based notification mechanism is preferred.