Portable Parallelism using Modern C++ and Threading Building Blocks is a two-day online training course with programming exercises, taught by Michael Voss and Pablo Reble. It runs from 11 AM to 3 PM Eastern Time (EDT) on Monday, September 21st and Tuesday, September 22nd, 2020 (after the conference).
Course Description
Threading Building Blocks (TBB) is a portable, open-source C++ library for threading that has been widely used since 2006. Over the years, the developers of TBB and its users have learned many lessons about implementing composable parallelism in real-world C++ applications, including what practices to use and what mistakes to avoid. This course will introduce the TBB library and discuss its relationship with modern C++. Attendees will become familiar with TBB’s generic parallel algorithms, flow graph, concurrent containers and scalable memory allocator. They will also learn about TBB’s ongoing relationship with the ISO C++ standard including the fundamental features introduced in C++11 (std::thread, std::mutex and std::atomic), the parallel execution policies introduced in C++17, the newest features introduced in C++20, as well as proposed features, like executors.
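For orientation, the C++11 building blocks named above (std::thread, std::mutex and std::atomic) can be combined roughly as in the sketch below. The `Summary` struct and the `summarize` function are hypothetical names chosen for illustration; they are not taken from the course materials or from TBB.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <limits>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical result type for this sketch.
struct Summary {
    std::size_t evens;  // number of even values
    int max_value;      // largest value seen (INT_MIN for empty input)
};

// Hypothetical helper: splits `data` into contiguous chunks, one per thread.
// The even count is accumulated in a std::atomic; the running maximum is
// protected by a std::mutex.
Summary summarize(const std::vector<int>& data, unsigned num_threads) {
    assert(num_threads > 0);
    std::atomic<std::size_t> evens{0};
    int max_value = std::numeric_limits<int>::min();
    std::mutex max_mutex;

    const std::size_t chunk = (data.size() + num_threads - 1) / num_threads;
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        const std::size_t begin = std::min(data.size(), t * chunk);
        const std::size_t end = std::min(data.size(), begin + chunk);
        workers.emplace_back([&, begin, end] {
            // Work on thread-local values first to keep shared updates rare.
            std::size_t local_evens = 0;
            int local_max = std::numeric_limits<int>::min();
            for (std::size_t i = begin; i < end; ++i) {
                if (data[i] % 2 == 0) ++local_evens;
                local_max = std::max(local_max, data[i]);
            }
            evens.fetch_add(local_evens);
            std::lock_guard<std::mutex> lock(max_mutex);
            max_value = std::max(max_value, local_max);
        });
    }
    for (auto& w : workers) w.join();  // join before reading shared results
    return Summary{evens.load(), max_value};
}
```

Calling `summarize(values, 4)` on a vector of integers returns the even count and the maximum. Note that the caller picks the thread count by hand here, which is the kind of fixed resource assumption that the composability sessions on Day 2 discuss.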
Threading Building Blocks is also part of the specification for oneAPI, a recently announced cross-industry, open, standards-based unified programming model for heterogeneous programming. Attendees will be introduced to oneAPI and learn about the role of oneAPI’s Threading Building Blocks (oneTBB) in this cross-industry effort.
Finally, the class will provide tips on how to architect applications and libraries to provide composable parallel performance, based on the experiences of the TBB development team and its customer support teams. The instructors will demonstrate these tips with examples using TBB and the parallel algorithms introduced in C++17.
This course will mix presentations with hands-on exercises. Course materials will be provided via GitHub and hands-on exercises will be supported through Intel® DevCloud for oneAPI.
Goals
- Learn about the features of TBB
- Learn about TBB’s complementary relationship with the parallelism features in C++
- Learn about oneTBB’s role in oneAPI
- Learn how to create applications with portable, composable parallelism
Prerequisites
- Knowledge of C++11/14 (including templates)
- As an online class:
  - A reliable internet connection is necessary
  - The exact conference-call software will be announced later
- To participate in hands-on examples:
  - Option 1 (the primary delivery method): Using the Intel® DevCloud for oneAPI
    - A limited, but hopefully sufficient, number of instances will be pre-arranged for attendee use
    - All necessary software will be pre-installed on those systems
  - Option 2: On the attendee’s local system
    - A C++ compiler supporting C++14 or later
    - git
    - Due to time constraints, we will offer limited assistance for students who choose this option
Outline
Day 1 (3 total hours)
- An introduction to Threading Building Blocks (~1.5 hrs)
  - A brief overview of TBB
  - Hands-on:
    - Setting up Intel® DevCloud for oneAPI
    - Or, installing TBB and the examples locally
  - The library’s components
    - The generic algorithms
    - The flow graph API
    - Concurrent containers
    - Scalable memory allocation
  - Hands-on:
    - Using TBB features
- Modern standard C++ parallelism features with TBB (~1.5 hrs)
  - Features that TBB had that have been displaced by standard C++
    - tbb::thread, tbb::mutex and tbb::atomic
  - The C++17 parallel algorithms with TBB as an execution engine
    - Understanding execution policies
      - seq, par, unseq, par_unseq
  - TBB’s relationship to upcoming features: coroutines, executors and more
  - Hands-on:
    - Using C++17 parallel algorithms with oneTBB
Day 2 (3 total hours)
- Composability (~1.5 hrs)
  - What is composability and why is it important
  - The types of composability (nested, parallel and sequential)
  - Pitfalls when creating parallel applications and libraries
    - Tuning for an assumed set of resources
    - Oversubscription
    - Affinity and locality
    - Priorities
  - Hands-on:
    - Exploring oneTBB’s performance features
- Techniques for creating composable applications and libraries (~1.5 hrs)
  - Data parallelism
  - Relaxed sequential semantics
  - Cache-oblivious algorithms
  - Work stealing and recursive subdivision
  - Avoiding oversubscription in nested parallelism
  - Is it OK to sacrifice composability for performance?
  - Demonstration
    - Composability case studies
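
As a rough illustration of the "recursive subdivision" and "avoiding oversubscription" topics listed for Day 2, the sketch below sums a range by splitting it in half recursively. It is a minimal example using only standard C++ (`std::async`), not TBB's work-stealing scheduler. The function name `parallel_sum`, the `depth` parameter and the grain size of 1024 are assumptions made for this example.

```cpp
#include <cstddef>
#include <future>
#include <numeric>
#include <vector>

// Hypothetical helper: sums data[begin, end) by recursive subdivision.
// `depth` limits how many levels may be split in parallel, so the range is
// divided into at most 2^depth leaf pieces regardless of input size. This
// bound is a simple guard against creating too many threads.
long long parallel_sum(const std::vector<int>& data,
                       std::size_t begin, std::size_t end, unsigned depth) {
    const std::size_t grain = 1024;  // assumed cutoff for serial execution
    if (depth == 0 || end - begin <= grain) {
        const auto first = data.begin() + static_cast<std::ptrdiff_t>(begin);
        const auto last = data.begin() + static_cast<std::ptrdiff_t>(end);
        return std::accumulate(first, last, 0LL);
    }
    const std::size_t mid = begin + (end - begin) / 2;
    // Left half runs as a separate task; right half runs on this thread.
    auto left = std::async(std::launch::async, [&data, begin, mid, depth] {
        return parallel_sum(data, begin, mid, depth - 1);
    });
    const long long right = parallel_sum(data, mid, end, depth - 1);
    return left.get() + right;
}
```

A call such as `parallel_sum(values, 0, values.size(), 3)` splits the work into at most eight pieces. Unlike this sketch, which asks for a new asynchronous task at each split, a work-stealing scheduler distributes the subranges over a fixed pool of worker threads.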