OS Note - Thread & Concurrency

The process model introduced in Chapter 3 assumed that a process was an executing program with a single thread of control. Virtually all modern operat- ing systems, however, provide features enabling a process to contain multiple threads of control. Identifying opportunities for parallelism through the use of threads is becoming increasingly important for modern multicore systems that provide multiple CPUs.

In this chapter, we introduce many concepts, as well as challenges, associ- ated with multithreaded computer systems, including a discussion of the APIs for the Pthreads, Windows, and Java thread libraries. Additionally, we explore several new features that abstract the concept of creating threads, allowing developers to focus on identifying opportunities for parallelism and letting language features and API frameworks manage the details of thread creation and management. We look at a number of issues related to multithreaded pro- gramming and its effect on the design of operating systems. Finally, we explore how the Windows and Linux operating systems support threads at the kernel level.

1. Why Thread ?

process 要做 share memory 比較麻煩（透過 global variable）
process 間要溝通比較麻煩（pipe…)
process 間會互相競爭資源
thread context switch 的 overhead 比較小
注意：就算開三個 threads 還是只能一次執行一個 instruction（透過 context switch 來切換）

2. Recap: Server/Client Socket - Thread Version

每一個 thread 需要有自己的 stack area, # of stack pointer = # of threads
heap, data, text section 都是共享的

3. Multi-thread on Multi-processor

可以以類似 multiple processors OS 的方式實作
- 每個 core run 一個 thread (master-slave)
- 每個 core 可以 run multiple threads (symmetric)

Parallelism

Data parallelism: 把資料分成很多小塊給每個 thread
Task parallelism: 讓每個 thread 做不同的 task

4. Thread Control Block (TCB)

Register
Program Counter
Stack Pointer

5. Amdahl’s Law

\[\text{speedup}≤\frac{1}{S+\frac{1-S}{N}}\]

\(S\) is serial portion (不能平行化的)
\(N\) is # of cores

6. Multi-threading Model

ULT(User Level Thread) Scheduler 決定在 process 中哪一個 thread 可以執行
Scheduler（OS 的 scheduler）決定 User-space 的 process 跟 kernel-thread 的排程
如果某個 User-level thread 需要進入 kernel space（i.e. 呼叫 system call）他就會被 map 到一個 kernel-level thread，再由 OS scheduler 決定什麼時候要把它放進 processor 執行

Mapping between User and Kernel Threads

Many-to-One

好處：kernel space 不用 implement multi-thread（早期 OS 只有 user space 的 multi-thread）
問題：當 kernel-thread 被 block 住時，user-threads 就沒救了

One-to-One

好處：相較於 many-to-one 可以同時進行更多事
問題：Overhead 較高，因此一個 process 不能有太多的 thread

Many-to-many

通常 user thread 的數目會大於 kernel thread 的數目
解決 one-to-one overhead 過高的問題

Two-level Model (Hybrid)

雖然 many-to-many 看起來很屌，但因為他很難實作(要多做一個 scheduler)，所以目前大部分 OS 還是使用 one-to-one 的 model

7. Implicit Threading

讓 programmer 不用煩惱要開幾個 thread，一切由系統決定

A. Thread Pool

事先準備好一定數量的 worker thread 存在 thread pool 裡，需要的時候再 assign task 給他們
好處：通常會比一般的 thread 快一點，因為已經事先創建好了，沒有創建 thread 的 overhead
好處：藉由 pool 的大小，可以控管 thread 的最大數量
好處：動態決定每個 thread 要執行的 task，同一個 thread 不會有固定的工作

B. Fork-Join Parallelism

使用 Divide-and-Conquer 來實作平行化

C. Open MP

在程式碼中，使用# pragma omp parallel來提示 compiler 可平行化的區段
通常應用於科學/數學計算

D. Grand Central Dispatch

在程式碼中，使用^{}來提示 compiler 可平行化的區段

8. Threading Issue

fork() and exec()

有些 Unix 有兩種版本的 fork()，一種只會 fork 呼叫的 thread 給 child，另一種會 fork 全部的 thread
任一 thread 呼叫 exec()，皆會使全部 thread 被 kill 掉（因為要換 program 了）

Signal Handling

It’s system dependent.

用 pthread-version 的 kill(tid, signo) 把 signal 送給特定的 thread
把 signal 送給每個 thread (有些 thread 可能會 block 這個 signal)
讓一個 available 的 thread 來 handle
指派一個特定的 thread 來 handle signal

Thread Cancellation

Asychronous cancellation:馬上 cancel（不太好，可能會有 race condition）
Deferred cancellation:週期性檢查是否可以 cancel 或到達 cancellation point 時再 cancle

Thread Local Storage (TLS)

有時候 thread 會需要額外的空間，所以才需要 TLS
TLS 是跟隨著 thread 的，意思是就算 thread function 改變了（回想 thread pool），TLS 還是會是同一塊，直到 thread 結束
利用 static 宣告 TLS（才會每次都在同一個），並存放於 heap 中

Light Weight Process (LWP)

Light Weight Process 通常被放在 User / Kernel thread 之間，manage 兩個 thread 的對應關係與溝通
在 User thread 這邊，從 LWP 接收到可以執行的命令，因此會覺得 LWP 就像是一個 Virtual processor 一樣
每一個 LWP 是跟著 Kernel thread 走的