Python Concurrency Mastery


Interactive guide to mastering Threading vs Multiprocessing with hands-on experiments

Understanding Python Concurrency Fundamentals

The GIL Reality

The Global Interpreter Lock (GIL) is a mutex in CPython that prevents multiple native threads from executing Python bytecode simultaneously. Even on a 64-core machine, only one thread per process can run Python code at any given moment.

GIL Impact on Different Workloads:

  • CPU-bound: Threads take turns, adding overhead
  • I/O-bound: GIL released during waits, enabling true concurrency
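This distinction is easy to observe directly. A minimal sketch, where a 0.2-second `time.sleep` stands in for a network or disk wait (sleeping releases the GIL, so the four threads overlap):

```python
import threading
import time

def fake_io(duration=0.2):
    # time.sleep releases the GIL, so other threads can run meanwhile
    time.sleep(duration)

start = time.perf_counter()
threads = [threading.Thread(target=fake_io) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

# Four 0.2 s waits overlap: total is ~0.2 s, not ~0.8 s
print(f"elapsed: {elapsed:.2f}s")
```

Replace the sleep with a pure-Python loop and the speedup disappears: the threads serialize on the GIL.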

Threading vs Multiprocessing

Aspect       | Threading | Multiprocessing
-------------|-----------|----------------
Memory       | Shared    | Isolated
GIL Impact   | Affected  | None
Startup Cost | Low       | High
Best For     | I/O-bound | CPU-bound
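The memory row is the easiest to demonstrate. A minimal sketch, using the Unix "fork" start method so it runs as a plain script: a thread mutates the parent's global directly, while a child process only mutates its own copy.

```python
import multiprocessing
import threading

counter = 0

def increment():
    global counter
    counter += 1

# A thread shares the parent's memory: the change is visible
t = threading.Thread(target=increment)
t.start()
t.join()
print(counter)  # 1

# A process gets an isolated copy: the parent's counter is unchanged
ctx = multiprocessing.get_context("fork")  # fork keeps this a plain Unix script
p = ctx.Process(target=increment)
p.start()
p.join()
print(counter)  # still 1 — the child incremented its own copy
```

Isolation is why multiprocessing needs pickling or shared memory to exchange data, which is part of its higher startup and communication cost.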

Workload Type Identification

🧮 CPU-bound Tasks

  • Mathematical calculations
  • Image/video processing
  • Machine learning
  • Data crunching
Use Multiprocessing
🌐 I/O-bound Tasks

  • Network requests
  • File operations
  • Database queries
  • Web serving
Use Threading/Asyncio
⚖️ Mixed Workloads

  • Web scraping + processing
  • API calls + computation
  • File I/O + analysis
  • Real-time processing
Use Hybrid Approach
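Whichever category applies, `concurrent.futures` exposes the same API for both pools, so switching execution models is close to a one-line change. A sketch with an illustrative `work` function (the "fork" context is used so the snippet runs as a plain Unix script):

```python
import multiprocessing
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

def work(n):
    # stand-in task: sum of squares (CPU-bound when n is large)
    return sum(i * i for i in range(n))

# Threads: cheap to start, shared memory — best for I/O-bound work
with ThreadPoolExecutor(max_workers=4) as pool:
    thread_results = list(pool.map(work, [10, 20, 30]))

# Processes: each with its own GIL — best for CPU-bound work
# (on Windows/macOS, use the default context inside an
# if __name__ == "__main__" guard instead of "fork")
with ProcessPoolExecutor(
        max_workers=4,
        mp_context=multiprocessing.get_context("fork")) as pool:
    process_results = list(pool.map(work, [10, 20, 30]))

print(thread_results == process_results)  # True — same results either way
```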

GIL Behavior Interactive Visualizer

Threading with GIL

[Animation: four threads (T1–T4) take turns holding the GIL]

Notice how only one thread can execute at a time due to the GIL, even for CPU-bound tasks.

Multiprocessing (No GIL)

[Animation: four processes (P1–P4) run side by side]

Multiple processes can run simultaneously because each has its own GIL and Python interpreter.

I/O vs CPU Bound Simulation

The simulation compares an I/O-bound scenario (network requests) with a CPU-bound scenario (prime calculation).

Performance Benchmarking Laboratory

The interactive benchmark runs the same workload (default: 5000 tasks across 4 workers) under both threading and multiprocessing, then reports elapsed time for each, the speedup ratio, and parallel efficiency, charting historical results across runs.

Interactive Code Examples & Patterns

Threading Examples

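A representative threading pattern: worker threads append to a shared list guarded by a `threading.Lock`, with `time.sleep` standing in for network latency (the `fetch` helper and URLs are illustrative):

```python
import threading
import time

results = []
lock = threading.Lock()

def fetch(url):
    # simulate a network request; sleeping releases the GIL
    time.sleep(0.1)
    with lock:  # protect the shared list from concurrent appends
        results.append(f"fetched {url}")

urls = [f"https://example.com/page/{i}" for i in range(5)]
threads = [threading.Thread(target=fetch, args=(u,)) for u in urls]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(results))  # 5
```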

Multiprocessing Examples

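A representative multiprocessing pattern, matching the prime-calculation scenario above: the work is split across a pool of processes, each running with its own GIL (chunk sizes are illustrative; the "fork" context keeps the snippet runnable as a plain Unix script):

```python
import multiprocessing

def is_prime(n):
    if n < 2:
        return False
    for d in range(2, int(n ** 0.5) + 1):
        if n % d == 0:
            return False
    return True

def count_primes(limit):
    # CPU-bound: pure-Python trial division over one chunk
    return sum(1 for n in range(limit) if is_prime(n))

# Each chunk is checked in a separate process, in true parallel
# (on Windows/macOS, use the default context inside an
# if __name__ == "__main__" guard instead of "fork")
ctx = multiprocessing.get_context("fork")
with ctx.Pool(processes=4) as pool:
    counts = pool.map(count_primes, [10_000] * 4)

print(counts[0])  # 1229 primes below 10,000
```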

Best Practices & Patterns

Hybrid Approach

# Use both pools for mixed workloads
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

def hybrid_processing(cpu_data, urls):
    # CPU-bound work in processes (bypasses the GIL)
    with ProcessPoolExecutor() as cpu_executor:
        cpu_futures = [cpu_executor.submit(cpu_task, data)
                       for data in cpu_data]
        cpu_results = [f.result() for f in cpu_futures]

    # I/O-bound work in threads (GIL released during waits)
    with ThreadPoolExecutor() as io_executor:
        io_futures = [io_executor.submit(io_task, url)
                      for url in urls]
        io_results = [f.result() for f in io_futures]

    return cpu_results, io_results

Memory Sharing

# Avoid pickle overhead for large arrays
from multiprocessing import shared_memory
import numpy as np

def use_shared_memory():
    # Create a shared block sized for the array
    arr = np.random.random((1000, 1000))
    shm = shared_memory.SharedMemory(
        create=True,
        size=arr.nbytes,
        name='large_array'
    )

    # View the shared buffer as an ndarray and copy the data in
    shared_arr = np.ndarray(
        arr.shape, dtype=arr.dtype, buffer=shm.buf
    )
    shared_arr[:] = arr[:]

    # Worker processes attach with
    # shared_memory.SharedMemory(name='large_array');
    # when done: shm.close() and, in the owner, shm.unlink()

Error Handling

# Robust concurrency patterns
from concurrent.futures import ThreadPoolExecutor, as_completed

def robust_processing(tasks):
    results = []
    failed = []

    with ThreadPoolExecutor() as executor:
        future_to_task = {
            executor.submit(process_task, task): task
            for task in tasks
        }

        # Handle each result as it finishes; one failure
        # doesn't abort the rest
        for future in as_completed(future_to_task):
            task = future_to_task[future]
            try:
                results.append(future.result())
            except Exception as exc:
                failed.append((task, exc))

    return results, failed

Concurrency Decision Assistant

The interactive assistant takes a description of your workload (e.g. 10 concurrent tasks) and returns a tailored recommendation together with a performance prediction.

Architecture Patterns Decision Matrix

Scenario | Best Choice | Why? | Example
---------|-------------|------|--------
Web scraping thousands of pages | Threading + Asyncio | I/O-bound; GIL released during network waits | requests + concurrent.futures
Image processing pipeline | Multiprocessing | CPU-bound; benefits from true parallelism | ProcessPoolExecutor + PIL
Real-time web API serving | Asyncio | Many concurrent connections, minimal CPU per request | FastAPI + uvicorn
Scientific computing with NumPy | Multiprocessing | CPU-bound; NumPy releases the GIL, but multiprocessing scales better | ProcessPoolExecutor + numpy
Database-heavy operations | Threading | I/O-bound with connection pooling | ThreadPoolExecutor + SQLAlchemy
Machine learning training | Hybrid | Mixed CPU + I/O: data loading plus computation | Multiprocessing + Threading
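For the rows recommending asyncio, the core pattern is many concurrent waits multiplexed on a single thread. A minimal sketch, with `asyncio.sleep` standing in for an HTTP request (the URLs are illustrative):

```python
import asyncio
import time

async def fetch(url):
    # stand-in for an HTTP request; await yields to the event loop
    await asyncio.sleep(0.1)
    return f"response from {url}"

async def main():
    urls = [f"https://example.com/api/{i}" for i in range(10)]
    # gather runs all coroutines concurrently on one thread
    return await asyncio.gather(*(fetch(u) for u in urls))

start = time.perf_counter()
responses = asyncio.run(main())
elapsed = time.perf_counter() - start

# Ten 0.1 s waits overlap: total is ~0.1 s, not ~1 s
print(len(responses), f"{elapsed:.2f}s")
```

Unlike threads, there is no per-connection stack or lock contention, which is why asyncio scales to thousands of concurrent connections when CPU work per request is small.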

Quick Decision Flowchart

Is the workload CPU-bound? Use multiprocessing. I/O-bound? Use threading or asyncio. Mixed? Use a hybrid approach.
