
Threading

About

Threading in Python is a way to achieve concurrency by running multiple threads within a single process. Python's threading module provides tools to create and manage threads, allowing multiple tasks to be performed concurrently.

Thread

A thread is the smallest unit of execution that can be scheduled within a process. Threads in the same process share its memory space, which makes communication between them efficient but also introduces hazards such as race conditions. As a simplification, a thread can be thought of as one flow of execution inside a process; every process has at least one.

Important Notes

❗️❗️

While it might be tempting to think of threading as several processors running the same program simultaneously, in CPython (the standard Python implementation) threads within a single process DO NOT actually execute Python code in parallel.

Threads do not truly run simultaneously in Python

  • Only one thread executes Python bytecode at a time in CPython
  • This is caused by the Global Interpreter Lock (GIL), a lock that ensures only one thread can execute Python code at a time, even on a multi-core processor
  • Threads are actually just taking turns executing Python code
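A rough way to observe this turn-taking is to time a CPU-bound function run twice in a row and then in two threads. The sketch below assumes CPython with the GIL enabled; `count_down` and the value of `N` are made up for illustration, and the timings depend on the machine, so no specific numbers are claimed.

```python
import threading
import time

N = 2_000_000  # arbitrary amount of CPU-bound work (illustrative value)

def count_down(n):
    # Pure-Python loop: the thread running it must hold the GIL
    while n > 0:
        n -= 1

def time_sequential():
    # Run the work twice, one call after the other, in the main thread
    start = time.perf_counter()
    count_down(N)
    count_down(N)
    return time.perf_counter() - start

def time_threaded():
    # Run the same two pieces of work in two threads
    threads = [threading.Thread(target=count_down, args=(N,)) for _ in range(2)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start

sequential_seconds = time_sequential()
threaded_seconds = time_threaded()
print(f"sequential: {sequential_seconds:.2f}s, two threads: {threaded_seconds:.2f}s")
```

On a GIL-enabled CPython build the two timings are usually close to each other, since the threads take turns rather than run side by side.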

Parallelism requires special assistance

  • If we want multiple tasks to truly run at the same time (parallelism), we have to get around the GIL. This can be achieved with alternative Python implementations that have no GIL (such as Jython) or with the multiprocessing module, which creates separate processes instead of threads. Each process has its own Python interpreter and memory space, so processes can run in parallel without being affected by the GIL
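As a minimal sketch of the multiprocessing route, the example below hands CPU-bound work to a pool of worker processes. The helper `count_primes_below` and the chosen limits are hypothetical, picked only to give the workers something to compute; the `if __name__ == "__main__":` guard is there because some platforms re-import the main module in each worker process.

```python
import multiprocessing

def count_primes_below(limit):
    # Deliberately naive CPU-bound work (hypothetical helper for illustration)
    count = 0
    for candidate in range(2, limit):
        if all(candidate % d for d in range(2, int(candidate ** 0.5) + 1)):
            count += 1
    return count

if __name__ == "__main__":
    limits = [10_000, 20_000, 30_000]
    # Each worker is a separate process with its own interpreter and its own GIL
    with multiprocessing.Pool(processes=3) as pool:
        results = pool.map(count_primes_below, limits)
    print(dict(zip(limits, results)))
```

The trade-off is that processes do not share memory the way threads do: arguments and results are copied between processes, so this approach suits work that is heavy on computation and light on data exchange.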

Implementation

Creating A Thread

To start a separate thread, we create a Thread instance and then call its .start() method.

Example from Codecademy:


import random
import threading
import time

# simulates waiting time (e.g., an API call/response)
def slow_function(thread_index):
    time.sleep(random.randint(1, 10))
    print("Thread {} done!".format(thread_index))

def run_threads():
    threads = []

    for thread_index in range(5):
        individual_thread = threading.Thread(target=slow_function, args=(thread_index,))
        threads.append(individual_thread)
        individual_thread.start()

    # at this point threads are running independently from the main flow of application and each other
    print("Main flow of application")

    # This ensures that all threads finish before the main flow of application continues
    for individual_thread in threads:
        individual_thread.join()

    print("All threads are done")

run_threads()

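This produces output like the following. The order of the "Thread N done!" lines varies from run to run because each thread sleeps for a random time: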

Main flow of application
Thread 1 done!
Thread 4 done!
Thread 3 done!
Thread 2 done!
Thread 0 done!
All threads are done

Synchronization

Threads share the same memory space, so data inconsistencies can arise when several threads read and modify the same data at once. Thread synchronization is the mechanism that ensures two or more concurrent threads do not simultaneously execute a particular program segment, known as the critical section.

In Python, synchronization tools like Lock help manage access to shared resources:

import threading

lock = threading.Lock()

def safe_increment(counter):
    # the lock is acquired on entry and released on exit, even if an error occurs
    with lock:
        counter[0] += 1
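To see the lock at work, here is a small sketch in which several threads call `safe_increment` on the same counter. The `worker` function, the thread count, and the repetition count are assumptions added for illustration.

```python
import threading

lock = threading.Lock()
counter = [0]  # one-element list so all threads mutate the same shared object

def safe_increment(counter):
    with lock:  # only one thread at a time may run this block
        counter[0] += 1

def worker(counter, repetitions):
    # Each worker performs many increments on the shared counter
    for _ in range(repetitions):
        safe_increment(counter)

threads = [threading.Thread(target=worker, args=(counter, 10_000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter[0])  # 4 threads x 10,000 increments = 40000
```

Because every increment happens while the lock is held, no update can be lost and the final value always equals the total number of increments.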

Daemon Threads

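A daemon thread runs in the background and does not keep the program alive: Python exits once all non-daemon threads have finished, and any daemon threads still running are stopped abruptly at that point.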

import threading
import time

def background_task():
    while True:
        print("Running in the background...")
        time.sleep(2)

# daemon=True marks the thread as a daemon before it starts
daemon_thread = threading.Thread(target=background_task, daemon=True)
daemon_thread.start()
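The following sketch makes the behaviour easier to observe in a short script. It is a variation on the snippet above under a few assumptions: the `ticks` list, the shorter sleep intervals, and the 0.2 second pause standing in for the main program's work are all illustrative.

```python
import threading
import time

ticks = []  # records each time the background task runs

def background_task():
    # Loops forever; it only stops because the program itself exits
    while True:
        ticks.append(time.time())
        time.sleep(0.05)

daemon_thread = threading.Thread(target=background_task, daemon=True)
daemon_thread.start()

time.sleep(0.2)  # stand-in for the main program's real work
print("daemon flag:", daemon_thread.daemon)
print("still alive:", daemon_thread.is_alive())
print("main program exiting; the daemon thread is stopped with it")
```

Since a daemon thread can be cut off at any point, it is best kept for work that is safe to abandon midway, rather than for tasks such as writing files that must finish cleanly.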

Use Cases

  • I/O-Bound Tasks: Network requests, file I/O, or database operations that spend time waiting for external resources
  • Non-CPU-Bound Operations: Tasks where the GIL won't significantly impact performance, as threads can perform concurrent operations while waiting
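A minimal sketch of the I/O-bound case: each task below only waits, which is simulated with `time.sleep` (a blocking call that releases the GIL while waiting). The `fake_download` function, the task names, and the 0.2 second delay are assumptions for illustration and stand in for real network or disk operations.

```python
import threading
import time

def fake_download(name, results):
    time.sleep(0.2)  # stands in for waiting on a network response
    results[name] = f"{name}: done"

names = ["a", "b", "c", "d"]
results = {}

start = time.perf_counter()
threads = [threading.Thread(target=fake_download, args=(n, results)) for n in names]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

print(f"{len(results)} tasks finished in {elapsed:.2f}s")
```

Run one after another, the four waits would add up to about 0.8 seconds; with threads the waits overlap, so the total is typically much closer to a single wait.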

Multithreading

Multithreading in Python means running multiple threads concurrently within a single process. Threading is the broader concept of working with threads at all, and multithreading is the specific case in which a process runs more than one thread.