Operating System - Concurrency

Processes

Processes

The OS provides each process with a virtual execution environment. From the viewpoint of a process, it has a virtual CPU that executes only the process's own instructions (except privileged ones, which trap to the OS), as well as a virtual memory that holds only the process's code and data.

In the limited direct execution model, the physical CPU directly executes the process's instructions, while the OS retains control (via traps and timer interrupts) so that it can switch execution from one process to another.

Process Control Block

A process contains all of the state for a program in execution:

  • address space: code, data, stack
  • program counter: next instruction to be run
  • a set of general-purpose registers (GPRs) and their current values, so that they can be saved and restored across an interrupt
  • a set of OS resources: open file descriptors, process priority, page tables, resource usage info, etc.
  • process ID
  • process state

This information is stored in a data structure called the process control block (PCB).

struct task_struct {
    volatile long state;              /* -1 unrunnable, 0 runnable, >0 stopped */
    unsigned int flags;               /* per task flags */
    /* Scheduling priority and state */
    int prio; int static_prio; int normal_prio;
    struct list_head tasks;           /* for building task queues */
    int exit_state; int exit_code; int exit_signal;
    pid_t pid;
    struct files_struct *files;       /* Open file information */
    /* Signal handling */
    struct signal_struct *signal;
    struct sighand_struct *sighand;
    sigset_t blocked;
    ...
};

Process Lifecycle

3 main states: Ready, Running, Blocked. Also New and Exited.

  • Ready: runnable, waiting to be scheduled by the OS
  • Running: currently running on the CPU under limited direct execution
  • Blocked: waiting for an external event (e.g. I/O, networking)

Running a Process

  1. Create new process: create PCB + user address space
  2. Load executable: initialize start state, change state to Ready
  3. Dispatch: OS selects a process for execution, change state to Running

State Queues

The OS maintains a collection of state queues that represent the state of all processes in the system. In Linux, these are implemented as doubly linked lists (struct list_head). As a process changes state, its PCB is unlinked from one queue and linked onto another.
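Schematically, moving a PCB between queues might look like the sketch below, using the Linux kernel's list API (ready_queue is a hypothetical list head; tasks is the list_head field of task_struct shown above):

// unlink the PCB from the queue for its old state
list_del(&task->tasks);
// link it onto the queue for its new state, e.g. the ready queue
list_add_tail(&task->tasks, &ready_queue);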

Context Switch

This can happen when:

  • Current process calls yield() system call - voluntary context switch
  • Current process makes a blocking system call
  • Timer interrupt handler (of the OS) decides to switch processes

This is illustrated by the following pseudocode:

TimerInterrupt - Hardware
    Set to kernel mode
    Save user registers of A to kernel stack for A
    Set PC to TrapHandler

TrapHandler - OS
    Call switch() routine
        Save kernel registers for A to proc_struct (PCB) of A
        Restore kernel registers for B from proc_struct of B
        Switch to kernel stack for B
    Call IRET

ReturnFromInterrupt (IRET) - Hardware
    Restore user registers of B from kernel stack for B
    Set PC to B's PC
    Set mode to user mode

Creation of Process

A process is created by another process via a (privileged) system call. init (pid 1) is the first process, created by the kernel at boot; every other process descends from it.

In some systems, the parent defines its resources and privileges for its children. In Unix, Process User ID is inherited - children of the shell execute with the privilege level of the shell.

Fork

  1. Creates and initializes a new PCB
  2. Copies most content from the parent PCB to the child
    • creates a new address space and initializes it with the contents of the parent's address space
    • initializes kernel resources to point to the resources used by the parent (e.g. open file descriptors)
  3. Places the PCB on the ready queue
  4. Sets the return register for the two processes (the parent gets the PID of the child; the child gets 0)
  5. Returns

Exec

  1. Stops the current process
  2. Loads the new program into the address space of the process
  3. Initializes the hardware context and arguments for the new program
  4. Places the PCB onto the ready queue

exec() does not create a new process. To run a new program as a new process, we fork() and then exec(), as the sketch below shows.
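A minimal sketch of the fork-then-exec pattern, running ls -l as an example (error handling mostly omitted):

#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    pid_t pid = fork();                    // both processes return from fork()
    if (pid == 0) {
        // child: replace this address space with a new program
        char *argv[] = {"ls", "-l", NULL};
        execvp("ls", argv);
        perror("execvp");                  // only reached if exec fails
        exit(1);
    }
    // parent: fork() returned the child's PID
    waitpid(pid, NULL, 0);                 // reap the child
    return 0;
}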

Exit

On exit(), a process voluntarily releases all of its resources, but the OS does not discard everything immediately: before cleaning up, the OS must first switch out of the exiting process. Until its exit information is collected, the exited process is a "zombie". Its remaining state is cleaned up either by (1) the parent calling waitpid() to collect (reap) the exit information, as sketched below; or (2) the OS, once it recognizes that the parent of the process has also exited.
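A sketch of reaping, assuming child_pid holds the PID of a previously forked child:

#include <stdio.h>
#include <sys/wait.h>

void reap(pid_t child_pid) {
    int status;
    pid_t pid = waitpid(child_pid, &status, 0);  // blocks until the child exits, then reaps it
    if (WIFEXITED(status))
        printf("child %d exited with status %d\n", (int)pid, WEXITSTATUS(status));
}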

Threads

Threads

Different processes each have their own private address spaces. To coordinate and communicate between processes we need pipes, signals, sockets, etc.

Creating (forking) new processes is inefficient:

  • Space: PCB, page tables, address space
  • Time: creating all the data structures, fork and copy address spaces
  • Interprocess communication: complicated to share and communicate across isolated processes

Idea of threads: separate the address space from the execution state, so that tasks can run in parallel while sharing an address space. The benefits of doing so are:

  • implicit sharing: solve a single problem concurrently while sharing code, heap, and global variables
  • lighter weight: faster to create and destroy, potentially faster context switches
  • concurrent programming performance: overlapping computation and I/O

A program with multiple control flows is multithreaded.
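As a concrete illustration, a minimal sketch with POSIX threads, where two control flows share the process's globals and heap:

#include <pthread.h>
#include <stdio.h>

/* Each thread runs this function in the shared address space. */
void *worker(void *arg) {
    long id = (long)arg;
    printf("thread %ld running in the shared address space\n", id);
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, (void *)1L);
    pthread_create(&t2, NULL, worker, (void *)2L);
    pthread_join(t1, NULL);   // wait for both control flows to finish
    pthread_join(t2, NULL);
    return 0;
}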

Memory Space

(Figure: memory space of a multithreaded program. Code, data, and heap are shared; each thread has its own stack and registers.)

Kernel-Level Threads

Not the same as “kernel threads”.

Kernel-level threads, or lightweight processes, are threads managed by the kernel. They make concurrency cheaper than processes, but for fine-grained concurrency, kernel-level threads still suffer from too much overhead.

User-Level Threads

User-level threads are managed entirely by the runtime system (a user-level thread library). They are small and fast.

A thread is represented by:

  • PC
  • registers
  • stack
  • small thread control block (TCB)

Creating user-level threads does not require kernel involvement. Creation, switching, and synchronization are done entirely with procedure calls (instead of system calls).

But user-level threads may not be well integrated with the kernel/OS. As a result, the OS can make bad decisions, such as scheduling a process whose threads are all idle. There can also be security issues, since the library's protection mechanisms may not be as strong as those enforced by the kernel: a thread crash can take down all other threads, and a memory buffer overflow can create security problems for all of them.

Lock

Condition Variable

With locks, we can enforce mutual exclusion. Sometimes we also need to enforce an order of thread execution.

General idea: put threads to sleep until the condition is satisfied. We must hold a lock while checking the shared state to see whether the condition is already satisfied; if not, we release the lock while putting the thread to sleep. When the thread wakes up, it must re-acquire the lock and re-check that the condition is now satisfied.

Implementation of Condition Variables

3 operations:

  • pthread_cond_wait
    • release the mutex, add the thread to the cv's waiting queue, sleep
    • re-acquire the mutex and return
  • pthread_cond_signal
    • wake one enqueued thread
  • pthread_cond_broadcast
    • wake all enqueued threads

Each sub-bullet is an atomic operation. If no thread is waiting, signal and broadcast do nothing.

Lost Wake-up Problem

// parent
thread_create(child);
lock_acquire(lock);
cv_wait(cv, lock);
lock_release(lock);

// child
cv_signal(cv);
thread_exit();
  • Issue: the child may run and signal before the parent calls cv_wait, so the signal is lost and the parent may never wake up.

  • Solution: add a shared variable, protected by the lock, that records whether the signal was already sent; the parent checks it before waiting.

Important: Condition variables are always used with locks.

General Usage Pattern of CV

pthread_mutex_lock(mutex);
while (condition != 1) {
    pthread_cond_wait(cond, mutex);
}
// do stuff
pthread_cond_signal(cond);
pthread_mutex_unlock(mutex);

Difference Between Signal and Broadcast

Signal picks a single "lucky" thread to wake up. Broadcast wakes up all of the threads in the cv queue, but only one of them will acquire the mutex; every other awoken thread waits, no longer in the cv queue but simply to acquire the mutex. This means that the condition the threads were waiting for may no longer be true by the time they eventually acquire the mutex, which is why the usage pattern above re-checks the condition in a while loop rather than an if.

Implementing pthread_join

#include <pthread.h>
#include <stdbool.h>

#define MAX_THREADS 128   /* assumed bound on thread ids */

/* Shared join state, protected by join_mutex. */
pthread_mutex_t join_mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t join_cond = PTHREAD_COND_INITIALIZER;
struct { bool finished; void *exit_status; } thread_status[MAX_THREADS];

void thr_join(long int thread_id, void **retval) {
    pthread_mutex_lock(&join_mutex);
    while (thread_status[thread_id].finished == false) {
        pthread_cond_wait(&join_cond, &join_mutex);
    }
    *retval = thread_status[thread_id].exit_status;
    pthread_mutex_unlock(&join_mutex);
}

void thr_exit(long int thread_id, void *exit_status) {
    pthread_mutex_lock(&join_mutex);
    thread_status[thread_id].exit_status = exit_status;
    thread_status[thread_id].finished = true;
    // multiple threads might be waiting to join this thread
    pthread_cond_broadcast(&join_cond);
    pthread_mutex_unlock(&join_mutex);
}

Producer-Consumer Paradigm

Classic synchronization problem where the producer and consumer exchange data using a shared bounded buffer.

Implementing Bounded Buffer

#define MAX 10
typedef struct buf_s {
    int data[MAX];
    int in_position;
    int out_position;
    int num_elements;
    pthread_mutex_t buflock;
    pthread_cond_t not_full;
    pthread_cond_t not_empty;
} buf_t;

buf_t buffer;

void add_to_buffer(buf_t *b, int value) {
    pthread_mutex_lock(&b->buflock);
    while (b->num_elements == MAX) {
        // wait until a consumer reads and the buffer is no longer full
        pthread_cond_wait(&b->not_full, &b->buflock);
    }
    b->data[b->in_position] = value;
    b->in_position = (b->in_position + 1) % MAX;
    b->num_elements++;
    // signal a consumer that the buffer is no longer empty
    pthread_cond_signal(&b->not_empty);
    pthread_mutex_unlock(&b->buflock);
}

int remove_from_buffer(buf_t *b) {
    int val;
    pthread_mutex_lock(&b->buflock);
    while (b->num_elements == 0) {
        // wait until the buffer is non-empty
        pthread_cond_wait(&b->not_empty, &b->buflock);
    }
    val = b->data[b->out_position];
    b->out_position = (b->out_position + 1) % MAX;
    b->num_elements--;
    // signal the producer that the buffer is no longer full
    pthread_cond_signal(&b->not_full);
    pthread_mutex_unlock(&b->buflock);
    return val;
}

Semaphores

Semaphores are an abstract data type that provides synchronization. They consist of:

  • internal data: an integer counter, accessible only through two atomic operations
  • operations: wait (P or down) and signal (V or up)

Pseudocode:

Wait(Sem) {
    while (Sem.count <= 0)
        ;  // spin
    Sem.count--;
}

Signal(Sem) {
    Sem.count++;
}

except that they are actually implemented as atomic operations.

In addition to test_and_set, there is also the atomic operation fetch_and_add, as illustrated by the pseudocode below:

int fetch_and_add(int *old, int inc) {
    int oldval = *old;
    *old += inc;
    return oldval;
}

Again, this is a hardware-level atomic instruction. Using the faa instruction, we can write a more realistic implementation of wait and signal for semaphores.

Wait(Sem) {
    while (faa(&Sem.count, -1) <= 0) {
        // no count was available: undo the decrement and retry
        faa(&Sem.count, 1);
    }
}

Signal(Sem) {
    faa(&Sem.count, 1);
}

wait decrements the internal counter. signal increments the internal counter.

Semaphore vs. Lock

A binary semaphore with an initial value of 1 can be used in the same way as a lock. The difference is ownership: a lock has an "owner" and can only be released by that owner, whereas any thread can call wait and signal on a semaphore.
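In the pseudocode style used above, a binary semaphore standing in for a lock looks like this (the critical section is a placeholder):

Sem mutex = sem_init(1);  // binary semaphore acting as a lock

sem_wait(mutex);          // "acquire"
// critical section
sem_signal(mutex);        // "release" (any thread could call this, unlike a real lock)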

Rendezvous Problem with Semaphores

Semaphores do not suffer from the lost wake-up problem: the internal count variable records whether signal was called in the past.

// init
Sem semX = sem_init(0);
Sem semY = sem_init(0);

In thread A:

a1();
sem_signal(semX);
sem_wait(semY);
a2();

In thread B:

b1();
sem_signal(semY);
sem_wait(semX);
b2();

This ensures that a1 happens before b2, and b1 happens before a2.

Types of Semaphores

Mutex/Binary Semaphores

  • initialized to 1
  • allows single access to a resource
  • provides mutual exclusion to a critical section just like a lock

Counting Semaphore

  • initialized to N > 1
  • represents a resource with many available units, or a resource that permits certain kinds of unsynchronized concurrent access (such as reading)
  • multiple threads can pass the semaphore
  • the maximum number of concurrent threads is determined by the semaphore's initial value, as the sketch after this list illustrates
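A minimal sketch in the same pseudocode style (use_resource() is a hypothetical stand-in for the protected operation):

Sem pool = sem_init(N);   // N units of the resource available

// in each thread:
sem_wait(pool);           // claim one unit; blocks if all N are taken
use_resource();
sem_signal(pool);         // return the unit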

Producer/Consumer Pattern with Semaphores

Sem mutex = sem_init(1);  // protects the buffer
Sem empty = sem_init(0);  // counts items in the buffer
Sem full = sem_init(N);   // counts free slots in the buffer

Producer:

Producer {
    // wait if buffer is full
    sem_wait(full);
    sem_wait(mutex);
    add_to_buffer();
    sem_signal(mutex);
    // buffer is no longer empty
    sem_signal(empty);
}

Consumer:

Consumer {
    // block if buffer is empty
    sem_wait(empty);
    sem_wait(mutex);
    remove_from_buffer();
    sem_signal(mutex);
    // buffer is no longer full
    sem_signal(full);
}

Reader and Writer with Semaphores

// number of readers
int readcount = 0;
// mutex for readcount
Sem mutex = sem_init(1);
// exclusive writer or readers
Sem w_or_r = sem_init(1);

Writer:

sem_wait(w_or_r);
Write();
sem_signal(w_or_r);

Reader:

// entry protocol: the first reader locks out writers
sem_wait(mutex);
readcount += 1;
if (readcount == 1) {
    sem_wait(w_or_r);
}
sem_signal(mutex);

Read();

// exit protocol: the last reader lets writers back in
sem_wait(mutex);
readcount -= 1;
if (readcount == 0) {
    sem_signal(w_or_r);
}
sem_signal(mutex);

Limitations of Semaphores

Semaphores can make synchronization hard to reason about: the reason for waiting is embedded in the sem_wait() operation itself (count <= 0). If we need to enforce a more complex wait condition, the condition has to be checked outside of sem_wait, and, just as importantly, that check requires a mutex. In cases like this, it is better to just use locks and condition variables, as the sketch below shows.
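For instance, waiting for an arbitrary predicate such as count >= threshold is straightforward with a condition variable (count, threshold, cv, and m are hypothetical names):

pthread_mutex_lock(&m);
while (count < threshold) {
    // the predicate is re-checked under the mutex each time the thread wakes
    pthread_cond_wait(&cv, &m);
}
// count >= threshold holds here, and the mutex is still held
pthread_mutex_unlock(&m);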