Concurrency vs Parallelism: Key Differences, Examples & When to Use Each (2026)
Concurrency and parallelism are two of the most misunderstood terms in software engineering: they are used interchangeably in casual conversation but describe fundamentally different things. As Rob Pike, one of Go's creators, put it: "Concurrency is about dealing with lots of things at once. Parallelism is about doing lots of things at once." One is a program design property. The other is an execution property. Getting this distinction right determines whether you reach for asyncio, threads, or multiprocessing; getting it wrong wastes CPU cycles or introduces unnecessary complexity.
⚡ Key Takeaways
- Concurrency: structuring a program to manage multiple tasks whose execution overlaps in time. Does not require multiple cores. Can run on a single CPU via context switching.
- Parallelism: executing multiple tasks truly simultaneously. Requires multiple CPU cores or processors. Impossible on a single-core machine.
- Concurrency is a design property; parallelism is an execution property.
- Use concurrency for I/O-bound tasks (network requests, disk reads, database calls).
- Use parallelism for CPU-bound tasks (image processing, ML training, numerical computation).
- In Python, concurrency means asyncio or threading; parallelism means multiprocessing (which bypasses the GIL).
- In web scraping, concurrency dramatically reduces total crawl time; parallelism helps with heavy data processing pipelines.
Core Definitions
🔵 Concurrency
- Multiple tasks are in progress at the same time
- Tasks may not execute at the exact same instant
- A single CPU switches between tasks via context switching
- Creates an illusion of simultaneity
- About program structure and design
- Works on single-core and multi-core hardware
- Examples: event loop, async/await, coroutines, threads (on 1 core)
🟢 Parallelism
- Multiple tasks execute at the exact same physical instant
- Requires multiple CPU cores or processors
- True simultaneous execution, not an illusion
- About program execution and hardware utilisation
- Requires work that can be decomposed into independent chunks
- Introduces overhead: synchronisation, inter-process communication
- Examples: multiprocessing, GPU compute, distributed systems
A useful mental model: concurrency is a single chef efficiently switching between four pots (sautéing, then stirring, then chopping) so that all dishes make progress. Parallelism is four chefs, each working on one dish simultaneously. The single chef is concurrent; the four chefs are parallel. Both approaches can produce four complete dishes, but through entirely different mechanisms and subject to different constraints.[1]
Visual Explanation: Task Execution Timeline
The clearest way to see the difference is a timeline of two tasks (A and B) on a single core vs two cores:
Single Core: Concurrent (interleaved)
Tasks A and B take turns on one core: concurrent but not parallel. Both make progress, but they never run at the same instant.
Two Cores: Parallel (simultaneous)
Tasks A and B run on separate cores at the same physical instant. This is true parallelism.
A concurrent program can run in parallel (if hardware allows) or not (on a single core). A parallel program is inherently concurrent, but not all concurrent programs are parallel. Parallelism is a subset of concurrency at the hardware execution level.[2]
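The single-core timeline can be reproduced in a few lines. This is a minimal sketch rather than anything from the original article: the `worker` helper, the task names `A` and `B`, and the shared `log` list are all illustrative assumptions. Both coroutines run on one thread and hand control back to the event loop after every step, so their steps interleave without ever executing at the same instant.

```python
import asyncio

async def worker(name, steps, log):
    """Record each step, then yield to the event loop so the other task can run."""
    for i in range(steps):
        log.append(f"{name}{i}")
        await asyncio.sleep(0)  # cooperative yield point; no real waiting

async def run_interleaved():
    log = []
    # Both tasks are in progress at once, but only one runs at any instant.
    await asyncio.gather(worker("A", 3, log), worker("B", 3, log))
    return log

timeline = asyncio.run(run_interleaved())
print(timeline)
```

The recorded order alternates between the two tasks (A0, B0, A1, B1, ...), which is the interleaved pattern of the single-core timeline. Nothing here uses a second core.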
The Decisive Factor: I/O-Bound vs CPU-Bound Tasks
The practical question is never "concurrency or parallelism in the abstract"; it is "what type of work am I doing?" The answer determines which model applies:
| Task Type | Definition | Examples | Right Model | Why |
|---|---|---|---|---|
| I/O-Bound | Program spends most time waiting for external resources | HTTP requests, database queries, file reads, DNS lookups | Concurrency (async/await, threads) | While one task waits, another can use the CPU. Adding cores gives no benefit: the bottleneck is I/O speed, not compute. |
| CPU-Bound | Program spends most time computing; the CPU is always busy | Image processing, ML training, video encoding, numerical computation | Parallelism (multiprocessing, GPU) | More cores = faster completion. Concurrency alone cannot speed up CPU-saturated work on a single core. |
Task type classification per freeCodeCamp concurrency guide; I/O vs CPU analysis per Bright Data (Dec 2025).
Applying parallelism to I/O-bound tasks wastes resources: the work is not CPU-limited, so adding cores does not help. Applying concurrency to CPU-bound tasks on a single core does not speed up computation; you are just context-switching between tasks that all need the CPU. The mismatch between task type and execution model is one of the most common performance mistakes in production systems.
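The effect of overlapping waits is easy to measure. The sketch below is an illustration under stated assumptions: `fake_io` is a hypothetical stand-in that sleeps for 50 ms instead of performing a real network or disk call, and the job count of 8 is arbitrary. It compares running the waits back to back with running them on a thread pool.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_io(delay=0.05):
    """Stand-in for a network or disk call: the thread just waits."""
    time.sleep(delay)
    return delay

def run_sequential(n):
    start = time.perf_counter()
    for _ in range(n):
        fake_io()
    return time.perf_counter() - start

def run_threaded(n):
    start = time.perf_counter()
    # One thread per job: all waits overlap instead of queuing up.
    with ThreadPoolExecutor(max_workers=n) as pool:
        list(pool.map(lambda _: fake_io(), range(n)))
    return time.perf_counter() - start

sequential = run_sequential(8)
threaded = run_threaded(8)
print(f"sequential: {sequential:.2f}s, threaded: {threaded:.2f}s")
```

Because each job only waits, the threaded run finishes in roughly the time of one wait rather than eight. Replacing `time.sleep` with real computation would not show the same gain on a single core.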
Concurrency vs Parallelism: Direct Comparison
| Dimension | Concurrency | Parallelism |
|---|---|---|
| Core question | Can multiple tasks make progress? | Can multiple tasks execute simultaneously? |
| Hardware requirement | Single core sufficient | Multiple cores required |
| Execution model | Context switching: one task at a time per core | True simultaneous execution on separate cores |
| Primary benefit | Responsiveness, efficient use of waiting time | Throughput, raw computational speed |
| Best for | I/O-bound tasks: HTTP, DB, file I/O | CPU-bound tasks: computation, encoding, ML |
| Overhead | Context switch overhead (lightweight) | Process creation, synchronisation, IPC (heavier) |
| Python primitives | asyncio, threading | multiprocessing, concurrent.futures.ProcessPoolExecutor |
| Node.js model | Event loop: concurrent by default for all I/O | worker_threads, child_process |
| Go model | Goroutines on M:N scheduler | Goroutines across multiple OS threads (GOMAXPROCS) |
| Relationship | Parallelism is a subset of concurrency. All parallel programs are concurrent; not all concurrent programs are parallel. | |
Common Myths Debunked
"Concurrency = parallelism"
Concurrency is about program structure; parallelism is about hardware execution. A concurrent program running on a single core is not parallel.
"Multi-threading is parallelism"
On a single-core CPU, threads run concurrently via context switching, not in parallel. True thread-level parallelism requires multiple cores. Python threads additionally face the GIL.
"Parallelism is always faster"
For I/O-bound tasks, parallelism adds process/thread management overhead without benefit: the bottleneck is waiting, not computing. Concurrency (async I/O) is faster here.
"Async/await is parallelism"
async/await is concurrency. It allows a single thread to manage multiple I/O operations by yielding control while waiting; no simultaneous CPU execution occurs.
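A quick way to check the single-thread claim is to ask each coroutine which thread it is running on. This is a small sketch, not part of the original article: `report_thread` and `collect_thread_ids` are hypothetical helper names, and the 10 ms sleep stands in for an I/O wait.

```python
import asyncio
import threading

async def report_thread(delay):
    await asyncio.sleep(delay)  # simulated I/O wait
    return threading.get_ident()  # id of the thread running this coroutine

async def collect_thread_ids():
    # Five coroutines wait concurrently on the same event loop.
    return await asyncio.gather(*(report_thread(0.01) for _ in range(5)))

thread_ids = asyncio.run(collect_thread_ids())
print(len(set(thread_ids)))
```

All five coroutines report the same thread id, so the set contains a single value: the waits overlapped, but the Python code never ran on more than one thread.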
"Python can't do parallelism"
In standard CPython builds, the GIL prevents threads from executing Python bytecode in parallel, but the multiprocessing module spawns separate processes (each with its own interpreter and GIL), enabling real CPU parallelism.
"You must choose one or the other"
Most production systems combine both. A web server handles concurrent I/O requests (event loop) while offloading CPU-heavy processing to a parallel worker pool.
Python Examples: Concurrency and Parallelism
Concurrency: asyncio for I/O-Bound Tasks
Fetching 100 URLs concurrently with asyncio. A single thread manages all 100 connections, yielding between them while waiting for responses:
```python
import asyncio
import aiohttp

# Concurrency: a single thread manages 100 HTTP requests concurrently.
# Total time ≈ slowest single request (not 100× a single request).
async def fetch(session, url):
    async with session.get(url) as resp:
        return await resp.text()

async def main(urls):
    async with aiohttp.ClientSession() as session:
        # All 100 requests are started together; the thread yields on each await
        tasks = [fetch(session, url) for url in urls]
        results = await asyncio.gather(*tasks)
        return results

# With proxy integration for scraping
async def fetch_with_proxy(session, url, proxy):
    async with session.get(url, proxy=proxy) as resp:
        return await resp.text()

# proxy = "http://user:pass@gate.nstproxy.io:8080"
```
Parallelism: multiprocessing for CPU-Bound Tasks
Processing 1 million data points across all available CPU cores in parallel:
```python
from multiprocessing import Pool
import os

# Parallelism: each worker process can run on a separate CPU core.
# Every process has its own interpreter and GIL, so the work runs simultaneously.
def compute_chunk(chunk):
    # CPU-intensive work: e.g. parsing, transforming, ML feature extraction
    return [item ** 2 + item ** 0.5 for item in chunk]

def parallel_process(data):
    cpu_count = os.cpu_count() or 1  # os.cpu_count() can return None
    # Ceiling division: at most cpu_count chunks, and never a chunk size of 0
    chunk_size = max(1, -(-len(data) // cpu_count))
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    with Pool(processes=cpu_count) as pool:
        results = pool.map(compute_chunk, chunks)
    # Flatten results from all workers
    return [item for sublist in results for item in sublist]

# The guard is required where workers are started with "spawn" (Windows, macOS)
if __name__ == "__main__":
    data = list(range(1_000_000))
    results = parallel_process(data)
    print(f"Processed {len(results)} items on {os.cpu_count()} cores")
```
Concurrency in Node.js: The Event Loop
Node.js is the clearest real-world example of concurrency without parallelism in its default model. A single JavaScript thread powers the event loop, keeping thousands of HTTP connections, database queries, and file reads in flight at once while executing only one JavaScript function at a time.
```javascript
// Node.js event loop: concurrent I/O on a single JS thread
const { Worker } = require('worker_threads');

const fetchData = async (url) => {
  const res = await fetch(url); // yields control while waiting
  return res.json();
};

// Wrapped in a function because top-level await is not available in CommonJS
async function main() {
  // Concurrent: all three fetch() calls start before any result is awaited.
  // A single thread manages all three via the event loop.
  const [prices, inventory, competitors] = await Promise.all([
    fetchData('https://api.example.com/prices'),
    fetchData('https://api.example.com/inventory'),
    fetchData('https://api.example.com/competitors'),
  ]);
  return { prices, inventory, competitors };
}

main().catch(console.error);

// For CPU-bound work in Node.js, use worker_threads for parallelism.
// new Worker(...) spawns a new OS thread with its own V8 isolate.
```
Real-World Use Cases
Web Server Handling Requests (Concurrency)
A Node.js or Python asyncio web server receives 10,000 simultaneous HTTP requests. Each request triggers a database query (I/O-bound). The event loop handles all 10,000 concurrently, yielding between DB waits. Adding more CPU cores helps only marginally; the bottleneck is the database, not the CPU.
Image Processing Pipeline (Parallelism)
A media platform resizes 50,000 uploaded images to five different resolutions. Each resize operation is CPU-bound. Distributing the work across 8 cores with multiprocessing.Pool reduces wall-clock time from 80 minutes to ~10 minutes, assuming near-linear scaling. Concurrency alone would not help: the CPU is the bottleneck, not I/O.
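The 80-to-10-minute figure corresponds to perfect scaling across 8 cores. As a rough sanity check, Amdahl's law gives an upper bound on speedup when part of the job stays serial. The sketch below is only an illustration: the 95% parallel fraction is an assumed value, not a measurement from the article.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Upper bound on speedup when only part of the job can run in parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)

baseline_minutes = 80
for fraction in (1.0, 0.95):
    speedup = amdahl_speedup(fraction, workers=8)
    print(f"parallel fraction {fraction:.2f}: "
          f"{speedup:.2f}x speedup, about {baseline_minutes / speedup:.1f} min")
```

With a fully parallel job the bound is 8x, which matches the 10-minute figure. If 5% of the work were serial (for example, reading the file list or writing a manifest), the same 8 cores would finish in about 13.5 minutes instead.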
Data Pipeline: Fetch + Process (Both Combined)
A market intelligence platform fetches 50,000 product pages (I/O-bound, so concurrency with asyncio), then parses and transforms each response (CPU-bound, so parallelism with ProcessPoolExecutor). The two stages use different execution models, each optimised for its task type.
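A two-stage pipeline of this kind might be wired together as follows. This is a sketch under stated assumptions, not the platform's actual code: `fetch_page` simulates network latency instead of making a real request, `parse_page` is a trivial placeholder for CPU-heavy parsing, and the URLs are made up. The executor is passed in as a parameter so the same function works with any `concurrent.futures` executor.

```python
import asyncio
from concurrent.futures import ProcessPoolExecutor

async def fetch_page(url):
    """Hypothetical fetch: simulates network latency instead of a real request."""
    await asyncio.sleep(0.01)
    return f"<html>{url}</html>"

def parse_page(html):
    """Placeholder CPU-bound transform; top-level so worker processes can import it."""
    return len(html)

async def run_pipeline(urls, executor):
    # Stage 1 (I/O-bound): all fetches overlap on a single thread.
    pages = await asyncio.gather(*(fetch_page(u) for u in urls))
    # Stage 2 (CPU-bound): parsing is handed to the executor's workers.
    loop = asyncio.get_running_loop()
    jobs = [loop.run_in_executor(executor, parse_page, page) for page in pages]
    return await asyncio.gather(*jobs)

if __name__ == "__main__":
    urls = [f"https://example.com/item/{i}" for i in range(4)]
    with ProcessPoolExecutor() as pool:
        print(asyncio.run(run_pipeline(urls, pool)))
```

The event loop stays responsible for the waiting, and `run_in_executor` moves the computation onto separate processes so it does not block the loop.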
ML Model Training (Parallelism + Distributed)
Training a large language model distributes matrix multiplication across hundreds of GPUs simultaneously (data parallelism). Within each GPU, thousands of CUDA cores execute in parallel. This is parallelism at massive scale: the work is computationally intensive, with little meaningful I/O waiting.
When to Use Concurrency vs Parallelism
🔵 Use Concurrency When:
- Tasks spend significant time waiting (network, disk, DB)
- You need to handle many simultaneous users or requests
- Responsiveness matters more than raw throughput
- You are constrained to a single-core environment
- You are using Node.js, Python asyncio, or Go goroutines
- You are web scraping with many concurrent HTTP requests
🟢 Use Parallelism When:
- Work is computationally intensive (CPU never idle)
- Problems decompose into independent, equal-sized chunks
- You need to maximise use of multi-core hardware
- You are doing image/video processing or numerical computation
- You are in Python and need multiprocessing to bypass the GIL
- You are training ML models or processing large datasets
Concurrency & Parallelism in Web Scraping
Web scraping is primarily an I/O-bound workflow: your scraper spends most of its time waiting for HTTP responses, not computing. This makes concurrency the correct model for the request phase, and parallelism potentially useful for the processing phase.
- Request phase (I/O-bound): concurrency. Use asyncio + aiohttp in Python, or Promise.all() in Node.js. A single thread managing 100 concurrent requests completes the crawl in roughly the time of the slowest single request, versus 100× that time sequentially. Adding more CPU cores here provides negligible benefit.
- Parse/transform phase (CPU-bound): parallelism. If each scraped page requires heavy parsing, NLP extraction, or data transformation, distribute this work across CPU cores using ProcessPoolExecutor.
- Proxy rotation. Concurrency means more requests per second, which means faster IP depletion on a single proxy. Pair concurrent scrapers with a large residential proxy pool to ensure sufficient IP variety across the concurrent request volume. See the IP rotation guide for patterns.
```python
# Production scraping: concurrency for I/O + residential proxies
import asyncio
import random

import aiohttp

PROXIES = [
    "http://user:pass@gate.nstproxy.io:8080",
    "http://user:pass@gate.nstproxy.io:8081",
    # Add more proxy endpoints or use a rotating gateway
]

async def fetch(session, url, semaphore):
    async with semaphore:  # limit concurrent requests to avoid rate limits
        proxy = random.choice(PROXIES)
        try:
            async with session.get(
                url, proxy=proxy, timeout=aiohttp.ClientTimeout(total=15)
            ) as resp:
                return await resp.text()
        except Exception:
            return None  # failed fetches come back as None

async def crawl(urls, concurrency=50):
    semaphore = asyncio.Semaphore(concurrency)  # 50 concurrent requests max
    async with aiohttp.ClientSession() as session:
        tasks = [fetch(session, url, semaphore) for url in urls]
        return await asyncio.gather(*tasks)
```
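The semaphore is what keeps the crawl from opening every connection at once. The following sketch isolates that mechanism without any network access; `bounded_job`, `measure_peak`, and the `state` dictionary are hypothetical names introduced for illustration, and the 10 ms sleep stands in for an HTTP round trip. It counts how many jobs are inside the semaphore at the same time.

```python
import asyncio

async def bounded_job(semaphore, state):
    async with semaphore:  # at most `limit` jobs get past this line at once
        state["in_flight"] += 1
        state["peak"] = max(state["peak"], state["in_flight"])
        await asyncio.sleep(0.01)  # stand-in for the HTTP round trip
        state["in_flight"] -= 1

async def measure_peak(total_jobs, limit):
    semaphore = asyncio.Semaphore(limit)
    state = {"in_flight": 0, "peak": 0}
    await asyncio.gather(*(bounded_job(semaphore, state) for _ in range(total_jobs)))
    return state["peak"]

peak = asyncio.run(measure_peak(total_jobs=20, limit=5))
print(f"peak concurrent jobs: {peak}")
```

All 20 jobs are scheduled immediately, but the observed peak never exceeds the limit of 5. Tuning that limit is how a scraper trades crawl speed against rate limits and proxy pool size.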
FAQ
What is the difference between concurrency and parallelism?
Concurrency: one chef switching between four pots. All dishes make progress, but only one is being worked on at any given moment. Parallelism: four chefs, each working on one dish at the same time. Concurrency is about managing multiple tasks in overlapping time periods. Parallelism is about executing multiple tasks at the exact same physical instant.
Can you have concurrency without parallelism?
Yes, and this is the common case on single-core hardware or in event-driven runtimes. A Node.js event loop handles thousands of concurrent requests on a single thread. An OS running multiple applications on a single-core CPU time-slices between them. Both are concurrent; neither is parallel. Concurrency only requires that tasks' execution periods overlap, not that they occur simultaneously.
How does Python's GIL affect concurrency and parallelism?
Python's Global Interpreter Lock (GIL) allows only one thread to execute Python bytecode at a time, even on multi-core hardware. This makes Python threads concurrent but not parallel for CPU-bound work. The solution is multiprocessing, which spawns separate OS processes, each with its own interpreter and GIL, enabling genuine CPU parallelism. For I/O-bound tasks, threads still work well because the GIL is released during I/O waits.
Should I use asyncio or multiprocessing for web scraping?
Web scraping is primarily I/O-bound, so async/await with asyncio is the natural choice for the request phase: a single thread managing 50 to 200 concurrent HTTP connections typically outperforms 50 threads or 50 processes on the same task, with significantly less overhead. Use multiprocessing only for CPU-intensive post-processing steps (heavy parsing, data transformation, ML inference) once the pages have been fetched.
Are Go goroutines concurrent or parallel?
Both, depending on runtime configuration. Goroutines run on Go's M:N scheduler, which multiplexes many goroutines onto a smaller number of OS threads. By default, Go sets GOMAXPROCS to the number of available CPU cores, so goroutines can run in parallel across multiple threads. On a single core (GOMAXPROCS=1), goroutines are concurrent but not parallel. This is why Go is often cited as a practical model for combining concurrency and parallelism.

