Chapter 19: Advanced Python Programming – Threading


19.1 Introduction

Threading is an essential aspect of advanced programming that enables a program to run multiple operations concurrently. In Python, threads let a program keep working while other operations wait, which makes applications more responsive and efficient, especially for I/O-bound tasks such as network requests and file access. This chapter explores threading in Python, covering basic to advanced concepts with examples, use cases, and caveats.


19.2 What is Threading?

A thread is the smallest unit of execution that an operating system can schedule. Multithreading is a programming technique in which a single process spawns multiple threads to execute tasks concurrently; the threads share the process's memory. Python’s threading module provides a way to create and manage threads.

Benefits of Threading

  • Improves performance for I/O-bound tasks

  • Uses fewer system resources than separate processes, because threads share one address space

  • Enhances responsiveness in GUI and network applications


19.3 The threading Module

Python’s built-in threading module, part of the standard library, is the standard way to create and manage threads.

Creating a Thread

import threading

def print_hello():
    print("Hello from thread!")

# Creating a Thread
t1 = threading.Thread(target=print_hello)
t1.start()
t1.join()

Explanation:

  • Thread(target=function_name) creates a thread object; the thread does not run until start() is called

  • start() begins the thread’s activity

  • join() waits for the thread to complete
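
Most target functions need arguments. The following minimal sketch shows how the args and kwargs parameters of Thread pass them along; the greet function, the greetings list, and the thread name "greeter" are hypothetical names chosen for illustration.

```python
import threading

greetings = []  # hypothetical list used to collect the thread's output

def greet(name, punctuation="!"):
    # Runs in the worker thread; arguments arrive via args/kwargs.
    greetings.append(f"Hello, {name}{punctuation}")

t = threading.Thread(
    target=greet,
    args=("Ada",),                  # positional arguments (note the 1-tuple)
    kwargs={"punctuation": "?"},    # keyword arguments
    name="greeter",                 # optional, useful in logs and debuggers
)
t.start()
t.join()

print(t.name, greetings)
```

Note the trailing comma in args=("Ada",): without it, the parentheses would not form a tuple.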


19.4 Thread Class and Object-Oriented Threading

Threads can also be created by inheriting from the Thread class.

Example:

import threading

class MyThread(threading.Thread):
    def run(self):
        print(f"Thread {self.name} is running")

t = MyThread()
t.start()
t.join()
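
A subclass becomes more useful when it carries its own input and output. The sketch below is one possible pattern, assuming a hypothetical SumThread class that stores its result on the instance so the caller can read it after join().

```python
import threading

class SumThread(threading.Thread):
    """Hypothetical worker that sums a list of numbers."""

    def __init__(self, numbers):
        super().__init__()      # the base initializer must run first
        self.numbers = numbers
        self.result = None      # filled in by run()

    def run(self):
        # run() executes in the new thread once start() is called.
        self.result = sum(self.numbers)

worker = SumThread([1, 2, 3, 4])
worker.start()
worker.join()                   # after join(), result is safe to read
print(worker.result)
```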

19.5 Daemon Threads

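Daemon threads run in the background and do not keep the program alive: the interpreter exits once all non-daemon threads have finished, and any daemon threads still running are stopped abruptly at that point. Because they get no chance to clean up, daemon threads should not hold resources such as open files or database transactions.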

Example:

import threading
import time

def background_task():
    while True:
        print("Running in the background...")
        time.sleep(1)

# Pass daemon=True at construction; setDaemon() is deprecated since Python 3.10
daemon = threading.Thread(target=background_task, daemon=True)
daemon.start()

time.sleep(3)
print("Main thread exits")
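
When a background thread needs to finish its work cleanly, an alternative to a daemon thread is a regular thread that checks a stop signal. The sketch below is one way to do this, assuming a hypothetical stop_requested event and a ticks list standing in for real work.

```python
import threading
import time

stop_requested = threading.Event()  # hypothetical stop signal
ticks = []                          # stands in for real background work

def background_task():
    # wait() returns False on timeout and True once the event is set,
    # so the loop does one unit of work roughly every 0.05 seconds.
    while not stop_requested.wait(timeout=0.05):
        ticks.append(time.monotonic())

worker = threading.Thread(target=background_task)
worker.start()

time.sleep(0.2)         # main thread does other work
stop_requested.set()    # ask the worker to finish
worker.join()           # wait until it has actually stopped

print(f"Worker stopped cleanly after {len(ticks)} ticks")
```

Unlike a daemon thread, this worker always completes its current iteration before the program exits.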

19.6 Synchronization with Locks

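Race conditions occur when multiple threads read and modify shared data at the same time and the result depends on how their operations interleave. A Lock prevents this by allowing only one thread at a time to run the protected section. In the example below, counter += 1 is a read-modify-write sequence, so without the lock some updates could be lost.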

Using Locks:

import threading

lock = threading.Lock()
counter = 0

def increment():
    global counter
    for _ in range(100000):
        with lock:
            counter += 1

t1 = threading.Thread(target=increment)
t2 = threading.Thread(target=increment)

t1.start()
t2.start()
t1.join()
t2.join()

print("Final counter:", counter)
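
The with lock: statement is shorthand for acquiring the lock and releasing it in a finally clause. The sketch below spells out that equivalent form, using a hypothetical deposit function and balance variable.

```python
import threading

lock = threading.Lock()
balance = 0  # hypothetical shared state

def deposit(amount):
    global balance
    lock.acquire()          # blocks until the lock is free
    try:
        balance += amount   # protected read-modify-write
    finally:
        lock.release()      # runs even if the body raises

threads = [threading.Thread(target=deposit, args=(10,)) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("Balance:", balance)
```

The with form is usually preferable because it is shorter and cannot forget the release.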

19.7 Thread Communication with Event, Condition, and Queue

Event Object:

import threading

event = threading.Event()

def task():
    print("Waiting for event...")
    event.wait()
    print("Event triggered!")

thread = threading.Thread(target=task)
thread.start()

input("Press Enter to trigger event...\n")
event.set()
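
A Condition combines a lock with the ability to wait until some shared state changes. The following is a minimal sketch, assuming hypothetical items and received lists; wait_for() checks the predicate before sleeping, so the example works regardless of which thread runs first.

```python
import threading

condition = threading.Condition()
items = []      # hypothetical shared state guarded by the condition
received = []

def waiter():
    with condition:
        # Releases the lock while waiting and re-checks the predicate
        # each time the condition is notified.
        condition.wait_for(lambda: len(items) > 0)
        received.append(items.pop())

t = threading.Thread(target=waiter)
t.start()

with condition:
    items.append("payload")
    condition.notify()  # wake one waiting thread

t.join()
print(received)
```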

Using Queue:

Thread-safe communication between threads can be done using the queue module.

import threading
import queue

q = queue.Queue()

def producer():
    for i in range(5):
        q.put(i)
        print(f"Produced {i}")

def consumer():
    while True:
        item = q.get()
        print(f"Consumed {item}")
        q.task_done()

t1 = threading.Thread(target=producer)
t2 = threading.Thread(target=consumer, daemon=True)

t1.start()
t2.start()
t1.join()
q.join()
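
The consumer above never returns, which is why it has to be a daemon thread. A common alternative is to send a sentinel value that tells the consumer to stop. The sketch below assumes a hypothetical STOP object as that sentinel.

```python
import queue
import threading

STOP = object()     # hypothetical sentinel marking the end of the stream
q = queue.Queue()
consumed = []

def producer():
    for i in range(5):
        q.put(i)
    q.put(STOP)     # signal that no more items will follow

def consumer():
    while True:
        item = q.get()
        if item is STOP:
            break   # leave the loop so the thread can finish
        consumed.append(item)

threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
for t in threads:
    t.start()
for t in threads:
    t.join()        # both threads end on their own, no daemon needed

print(consumed)
```

With several consumers, one sentinel per consumer would be needed.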

19.8 Thread Pooling with concurrent.futures

Thread pools allow for easier thread management.

from concurrent.futures import ThreadPoolExecutor

def square(n):
    return n * n

with ThreadPoolExecutor(max_workers=3) as executor:
    results = executor.map(square, [1, 2, 3, 4, 5])

print(list(results))
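
When tasks can fail or finish at different times, submit() and as_completed() give finer control than map(). The sketch below is illustrative only; the reciprocal function and the results and errors dictionaries are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def reciprocal(n):
    return 1 / n    # raises ZeroDivisionError for n == 0

results = {}
errors = {}

with ThreadPoolExecutor(max_workers=3) as executor:
    # Map each Future back to its input so results can be labelled.
    futures = {executor.submit(reciprocal, n): n for n in [1, 2, 0, 4]}
    for future in as_completed(futures):
        n = futures[future]
        try:
            results[n] = future.result()    # re-raises the task's exception
        except ZeroDivisionError as exc:
            errors[n] = exc

print("Results:", results)
print("Failed inputs:", list(errors))
```

An exception raised inside a task is stored on its Future and only surfaces when result() is called, so one failing task does not stop the others.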

19.9 Global Interpreter Lock (GIL) and Python Threads

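CPython, the reference implementation of Python, has a Global Interpreter Lock (GIL) that allows only one thread to execute Python bytecode at a time within a process. The GIL is released while a thread waits on blocking I/O, so threads remain effective for I/O-bound tasks but give little or no speedup for CPU-bound ones.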

Alternatives for CPU-bound tasks:

  • Use the multiprocessing module for parallel processing

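  • Use an implementation without a GIL, such as Jython or IronPython, keeping in mind that both lag behind CPython in supported language version (Jython targets Python 2.7); CPython 3.13 also introduced an experimental free-threaded build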
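
The effect of the GIL on I/O-bound work can be observed with a small timing experiment. The sketch below uses time.sleep() as a stand-in for blocking I/O (an assumption made for illustration; sleep releases the GIL in CPython much as a blocking read would), and fake_io is a hypothetical name.

```python
import threading
import time

def fake_io(delay):
    # Stand-in for a blocking network or disk call.
    time.sleep(delay)

start = time.perf_counter()

threads = [threading.Thread(target=fake_io, args=(0.2,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

elapsed = time.perf_counter() - start
print(f"4 waits of 0.2 s took {elapsed:.2f} s in total")
```

Run one after another, the four waits would take about 0.8 seconds; with threads they overlap, so the total stays close to a single wait.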


19.10 Common Pitfalls and Best Practices

Pitfalls:

  • Misuse of shared resources without locks

  • Deadlocks due to improper lock acquisition

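  • Starvation when some threads hold a lock for long periods and others rarely get a chance to run (Python's threading module has no thread priorities)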
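
One widely used way to avoid the deadlock pitfall is to make every thread acquire locks in the same order. The sketch below illustrates that rule with two hypothetical locks, lock_a and lock_b, and a hypothetical update function.

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()
log = []

def update(label):
    # Every thread takes lock_a first and lock_b second. If one thread
    # used the opposite order, each could end up holding one lock while
    # waiting forever for the other.
    with lock_a:
        with lock_b:
            log.append(label)

threads = [threading.Thread(target=update, args=(f"t{i}",)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(log))
```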

Best Practices:

  • Use join() to wait for threads whose work must finish before the program continues or exits

  • Use the with lock: form so that the lock is released even if an exception occurs

  • Prefer ThreadPoolExecutor for simple tasks

  • Avoid using threads for heavy computation


19.11 Real-world Applications

  • Web Scraping: Using multiple threads to fetch URLs simultaneously

  • GUI Applications: Background threads to prevent UI freeze

  • Network Servers: Handle multiple connections using threading


19.12 Summary

Threading is a powerful tool in Python that allows concurrent execution of code and is mainly suited to I/O-bound tasks. Although the GIL limits true parallelism for CPU-bound tasks, threading can greatly improve responsiveness and efficiency when used correctly. With synchronization primitives such as locks, events, and queues, and with higher-level tools such as thread pools, Python developers can use multithreading safely and effectively.


19.13 Exercises

  1. Write a Python program that creates two threads: one to print even numbers and another to print odd numbers up to 50.

  2. Modify the counter increment example to remove the lock and observe the output.

  3. Develop a producer-consumer model using queue.Queue with multiple producers and consumers.

  4. Implement a thread pool using ThreadPoolExecutor to download contents from multiple URLs (hint: use requests.get()).

  5. Simulate a real-world scenario using Event where a worker thread waits for a signal from the main thread.


19.14 Further Reading
