Rotate Proxies in Python: requests, Scrapy & Smart Rotation Strategies (2026)
Proxy rotation is the practice of switching IP addresses across requests to distribute load, avoid rate limits, and prevent any single IP from accumulating enough request volume to get flagged. A naive implementation — picking a random proxy from a list — works for small projects but breaks down at scale. This guide covers proxy rotation in Python from basic random selection through weighted health-based rotation, subnet/ASN diversity, sticky sessions, Scrapy middleware, async rotation, and the gateway vs. self-managed pool tradeoff.
⚡ Key Takeaways
- Basic rotation uses random.choice() on a proxy list: sufficient for small projects, insufficient at scale.
- Weighted rotation with random.choices() prioritises healthy proxies over failing ones using tracked success rates.
- Anti-bot systems check subnet and ASN patterns. Randomising only by IP (not by subnet) lets consecutive requests fall in the same /24 block, undermining rotation.[1]
- Sticky sessions (same IP for a sequence of related requests) are required for multi-step flows like logins or checkouts. Pure rotation breaks these flows.
- For Scrapy, the scrapy-rotating-proxies middleware handles rotation, dead-proxy detection, and automatic backoff out of the box.[2]
- A rotating gateway (single endpoint, provider rotates IP per connection) is simpler to integrate; a self-managed pool (list of individual proxies) gives more granular health control.[3]
Why Rotate Proxies?
Web scraping at scale requires distributing requests across many IP addresses to avoid rate limiting, throttling, or outright blocking. A single IP making hundreds of requests per minute is trivially detectable — proxy rotation makes each request appear to originate from a different user, spreading the load and the risk across a pool.[4] Beyond avoiding blocks, rotation also enables geo-targeted scraping (different IPs in different regions) and reduces the financial/operational impact of any single proxy going bad.
Basic Proxy Rotation with the requests Library
The simplest implementation picks a random proxy from a list for every outgoing request:
```python
import random
import requests

PROXIES = [
    "http://user:pass@proxy1.example.com:8000",
    "http://user:pass@proxy2.example.com:8000",
    "http://user:pass@proxy3.example.com:8000",
]

# Placeholder target list; replace with the pages you actually need
urls = ["https://example.com/page1", "https://example.com/page2"]


def get_random_proxy():
    return random.choice(PROXIES)


def fetch(url):
    proxy = get_random_proxy()
    proxies = {"http": proxy, "https": proxy}
    try:
        response = requests.get(url, proxies=proxies, timeout=15)
        return response
    except requests.RequestException as e:
        print(f"Proxy {proxy} failed: {e}")
        return None


# Each call to fetch() may use a different proxy
for url in urls:
    fetch(url)
```
This works for a handful of proxies on small, unprotected targets — but has no concept of which proxies are dead, no weighting toward better-performing IPs, and no awareness that consecutive requests from the same subnet can still trigger detection even when the exact IP changes.
Weighted Rotation: Prioritise Healthy Proxies
Tracking proxy performance and biasing selection toward IPs with higher success rates dramatically improves reliability versus pure random selection. Python's random.choices() (plural) accepts a weights parameter for exactly this purpose:[1]
```python
import random
import time


class ProxyPool:
    def __init__(self, proxies):
        self.proxies = proxies
        # Track success/fail counts per proxy
        self.stats = {p: {"success": 1, "fail": 0, "banned_until": 0} for p in proxies}

    def _weight(self, proxy):
        s = self.stats[proxy]
        if time.time() < s["banned_until"]:
            return 0.0  # temporarily banned: zero weight
        total = s["success"] + s["fail"]
        return s["success"] / total if total > 0 else 1.0

    def get_proxy(self):
        weights = [self._weight(p) for p in self.proxies]
        if sum(weights) == 0:
            raise RuntimeError("All proxies are currently banned")
        return random.choices(self.proxies, weights=weights, k=1)[0]

    def report_success(self, proxy):
        self.stats[proxy]["success"] += 1

    def report_failure(self, proxy, ban_seconds=300):
        self.stats[proxy]["fail"] += 1
        # Temporarily ban a proxy after repeated failures
        if self.stats[proxy]["fail"] >= 3:
            self.stats[proxy]["banned_until"] = time.time() + ban_seconds


pool = ProxyPool(PROXIES)
proxy = pool.get_proxy()
# ... make request ...
# pool.report_success(proxy) or pool.report_failure(proxy)
```
This pattern self-corrects: proxies that consistently fail get a lower weight (and a temporary ban after repeated failures), while reliable proxies are selected more often — without ever fully removing a proxy permanently, since temporary blocks often resolve over time.
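One way to wire such a pool into a request loop is sketched below. It assumes only that the pool object exposes get_proxy(), report_success() and report_failure() as in the class above. The helper name fetch_with_pool, the max_attempts parameter, and the caller-supplied send callable are illustrative additions, not part of the original code.

```python
def fetch_with_pool(pool, url, send, max_attempts=3):
    """Try a URL through up to max_attempts proxies drawn from the pool.

    pool: any object with get_proxy(), report_success(p), report_failure(p).
    send: hypothetical callable send(url, proxy) that returns an object with
          a status_code attribute, or raises OSError on connection problems
          (requests.RequestException is a subclass of OSError).
    """
    for _ in range(max_attempts):
        proxy = pool.get_proxy()
        try:
            response = send(url, proxy)
        except OSError:
            pool.report_failure(proxy)  # connection error or timeout
            continue
        if response.status_code == 200:
            pool.report_success(proxy)
            return response
        pool.report_failure(proxy)  # non-200: count it against this proxy
    return None  # every attempt failed
```

With requests, send could be a small wrapper that calls requests.get(url, proxies={"http": proxy, "https": proxy}, timeout=15). If every proxy is banned, the RuntimeError raised by get_proxy() propagates to the caller rather than being swallowed.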
Subnet and ASN Diversity: The Overlooked Detail
Random selection across a proxy list does not guarantee subnet diversity. If your pool contains many IPs from the same provider, consecutive "random" selections can still land in the same /24 subnet or the same Autonomous System Number (ASN) — and many anti-bot systems specifically check for clustering at the subnet/ASN level, not just per-IP.[1]
```python
import random
from collections import deque
from urllib.parse import urlsplit


class SubnetAwarePool:
    def __init__(self, proxies, history_size=5):
        self.proxies = proxies
        self.recent_subnets = deque(maxlen=history_size)

    def _subnet(self, proxy_ip):
        # Extract the first 3 octets: the /24 subnet identifier
        return ".".join(proxy_ip.split(".")[:3])

    def get_proxy(self):
        for _ in range(10):  # try up to 10 times to find subnet diversity
            candidate = random.choice(self.proxies)
            # urlsplit handles URLs with or without credentials. This assumes
            # each proxy is a full URL (with scheme) addressed by an IPv4
            # literal, not a hostname.
            ip = urlsplit(candidate).hostname
            subnet = self._subnet(ip)
            if subnet not in self.recent_subnets:
                self.recent_subnets.append(subnet)
                return candidate
        # Fallback: accept a repeat subnet if no diverse option was found
        return random.choice(self.proxies)
```
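Before relying on subnet-aware selection, it can help to measure how clustered a pool actually is. The following is a minimal sketch using the standard-library ipaddress module. It assumes each proxy URL contains an IPv4 literal (a hostname would raise ValueError), and the function names and sample addresses (taken from documentation-reserved ranges) are illustrative.

```python
import ipaddress
from collections import Counter
from urllib.parse import urlsplit


def subnet_of(proxy_url, prefix=24):
    """Return the /prefix network that contains the proxy's IPv4 address."""
    host = urlsplit(proxy_url).hostname
    return ipaddress.ip_network(f"{host}/{prefix}", strict=False)


def subnet_histogram(proxy_urls, prefix=24):
    """Count how many proxies fall into each /prefix block."""
    return Counter(str(subnet_of(p, prefix)) for p in proxy_urls)


sample = [
    "http://user:pass@203.0.113.10:8000",
    "http://user:pass@203.0.113.77:8000",
    "http://user:pass@198.51.100.5:8000",
]
histogram = subnet_histogram(sample)
# Two of the three sample proxies share the 203.0.113.0/24 block
```

A histogram dominated by a few blocks suggests that random selection will often repeat a subnet, however many individual IPs the pool contains.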
Sticky Sessions: When NOT to Rotate
Some workflows require the same IP across a sequence of requests — logins, multi-step checkouts, paginated scraping with session cookies. Rotating mid-sequence breaks these flows or triggers security alerts (a login from one IP followed immediately by a cart action from a different IP looks like account takeover).[5]
```python
import requests


class StickySession:
    def __init__(self, proxy_pool):
        self.pool = proxy_pool
        self.sessions = {}  # slot_id -> (session, proxy)

    def get_session(self, slot_id):
        if slot_id not in self.sessions:
            proxy = self.pool.get_proxy()
            session = requests.Session()
            session.proxies = {"http": proxy, "https": proxy}
            self.sessions[slot_id] = (session, proxy)
        return self.sessions[slot_id][0]

    def release(self, slot_id):
        # Call after the multi-step flow completes; frees the slot for reuse
        if slot_id in self.sessions:
            del self.sessions[slot_id]


sticky = StickySession(pool)

# All requests for "account_42" use the SAME proxy until released
session = sticky.get_session("account_42")
session.get("https://example.com/login")
session.post("https://example.com/cart/add", data={...})
session.post("https://example.com/checkout", data={...})
sticky.release("account_42")
```
Most commercial residential proxy providers, including Nstproxy, offer sticky session support natively at the gateway level (a session ID parameter that pins the same exit IP for a configurable duration), eliminating the need to manage this manually.
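For a self-managed pool, a time-limited pin similar in spirit to a gateway's session duration could be sketched as follows. This is an assumption-laden illustration, not how any provider implements it: the class name TimedStickyMap, the ttl default, and the injectable clock are all hypothetical.

```python
import time


class TimedStickyMap:
    """Pin one proxy per slot for ttl seconds, then pick a fresh one."""

    def __init__(self, get_proxy, ttl=600, clock=time.monotonic):
        self._get_proxy = get_proxy  # callable returning a proxy URL
        self._ttl = ttl              # pin duration in seconds
        self._clock = clock          # injectable so expiry can be tested
        self._pins = {}              # slot_id -> (proxy, expires_at)

    def proxy_for(self, slot_id):
        now = self._clock()
        pinned = self._pins.get(slot_id)
        if pinned is None or now >= pinned[1]:
            # No pin yet, or the pin expired: draw a new proxy
            pinned = (self._get_proxy(), now + self._ttl)
            self._pins[slot_id] = pinned
        return pinned[0]
```

Passing pool.get_proxy as the get_proxy argument would reuse the weighted selection shown earlier, while keeping each slot on one proxy until its pin expires.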
Proxy Health Tracking and Ban Detection
Detecting whether a proxy is "dead" or banned is site-specific — a 403 might mean the proxy is fine but the site detected the request pattern, while a connection timeout might mean the proxy itself is offline. scrapy-rotating-proxies' default heuristic — non-200 status, empty body, or exception means dead — is a reasonable starting point that you should customise per target.[2]
```python
def is_banned(response, exception=None) -> bool:
    if exception:
        return True  # connection failure, timeout, etc.: likely a dead proxy
    if response.status_code in (403, 429, 503):
        return True  # common anti-bot block codes
    if not response.text.strip():
        return True  # empty body: proxy likely returned nothing useful
    if "captcha" in response.text.lower():
        return True  # CAPTCHA challenge: request pattern flagged
    return False
```
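Because a block by the target and a dead proxy call for different reactions, a three-way classification can be more useful than a single boolean. The sketch below reuses the same signals as is_banned(); the Outcome enum, the classify name, and the mapping of each signal to an outcome are assumptions that should be tuned per target.

```python
from enum import Enum


class Outcome(Enum):
    OK = "ok"
    BLOCKED = "blocked"  # proxy reachable, but the target refused the request
    DEAD = "dead"        # the proxy itself appears to be unusable


def classify(status_code=None, body="", exception=None):
    """Classify one attempt from its status code, body text and exception."""
    if exception is not None:
        return Outcome.DEAD  # connection failure or timeout
    if status_code in (403, 429, 503) or "captcha" in body.lower():
        return Outcome.BLOCKED  # the site answered, but with a block page
    if not body.strip():
        return Outcome.DEAD  # empty body: nothing useful came back
    return Outcome.OK
```

A BLOCKED result might justify slowing down or a short cooldown, while a DEAD result might justify a longer ban of that proxy.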
Rotating Proxies in Scrapy
Scrapy has two main approaches to proxy rotation: a custom middleware, or the widely used scrapy-rotating-proxies package.
Method 1: Custom Middleware
```python
# middlewares.py
import random


class ProxyMiddleware:
    def __init__(self, proxy_list):
        self.proxy_list = proxy_list

    @classmethod
    def from_crawler(cls, crawler):
        return cls(proxy_list=crawler.settings.get('ROTATING_PROXY_LIST'))

    def process_request(self, request, spider):
        request.meta['proxy'] = random.choice(self.proxy_list)
```
```python
# settings.py
DOWNLOADER_MIDDLEWARES = {
    'myproject.middlewares.ProxyMiddleware': 350,
}

ROTATING_PROXY_LIST = [
    'http://user:pass@proxy1.example.com:8000',
    'http://user:pass@proxy2.example.com:8000',
]
```
Method 2: scrapy-rotating-proxies (Recommended)
```shell
pip install scrapy-rotating-proxies
```

```python
# settings.py
ROTATING_PROXY_LIST = [
    'proxy1.com:8000',
    'https://proxy2.com:8000',
    'login:password@proxy3.com:8031',
]
# Or load from a file instead:
# ROTATING_PROXY_LIST_PATH = 'proxies.txt'

DOWNLOADER_MIDDLEWARES = {
    'rotating_proxies.middlewares.RotatingProxyMiddleware': 610,
    'rotating_proxies.middlewares.BanDetectionMiddleware': 620,
}

ROTATING_PROXY_BACKOFF_BASE = 300  # 5 min base backoff for dead proxies
ROTATING_PROXY_BACKOFF_CAP = 3600  # 60 min max backoff
```
This package automatically tracks working vs. non-working proxies, retries failed proxies after a backoff period, and exposes a customisable BanDetectionPolicy for site-specific ban logic, so you write no rotation code yourself.[2] Crucially, Scrapy's standard concurrency settings (DOWNLOAD_DELAY, CONCURRENT_REQUESTS_PER_DOMAIN) become per-proxy for proxied requests once this middleware is enabled: CONCURRENT_REQUESTS_PER_DOMAIN=2 allows 2 concurrent connections to each proxy, regardless of the target domain, not 2 in total.[6]
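To give a feel for what a base and a cap mean for retry timing, here is a generic sketch of capped exponential backoff with full jitter, using the same 300 and 3600 second values as the settings above. It is not taken from the package's source, and the package's actual schedule may differ. The function names are hypothetical.

```python
import random


def backoff_upper_bound(attempt, base=300, cap=3600):
    """Upper bound in seconds: doubles per attempt from base, capped at cap."""
    return min(cap, base * 2 ** attempt)


def backoff_seconds(attempt, base=300, cap=3600, rng=random):
    """Full jitter: wait a uniform random time between 0 and the upper bound."""
    return rng.uniform(0, backoff_upper_bound(attempt, base, cap))


# Upper bounds for the first five consecutive failures (attempt 0..4)
bounds = [backoff_upper_bound(a) for a in range(5)]
```

Under these assumptions the bound grows as 300, 600, 1200, 2400 seconds and then stays at the 3600 second cap, so a proxy that keeps failing is retried less and less often without ever being dropped for good.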
Custom Ban Detection Policy
```python
# myproject/policy.py
from rotating_proxies.policy import BanDetectionPolicy


class MyPolicy(BanDetectionPolicy):
    def response_is_ban(self, request, response):
        ban = super().response_is_ban(request, response)
        ban = ban or b'captcha' in response.body
        return ban

    def exception_is_ban(self, request, exception):
        return None  # don't treat exceptions as bans for this target
```
```python
# settings.py
ROTATING_PROXY_BAN_POLICY = 'myproject.policy.MyPolicy'
```
Async Proxy Rotation with aiohttp
For high-throughput scraping, combining proxy rotation with asyncio concurrency (rather than threads or processes) is the standard pattern — see Nstproxy's IP rotation guide for additional context on this approach:
```python
import asyncio
import aiohttp
import random

PROXIES = [
    "http://user:pass@gate.nstproxy.io:8080",
    "http://user:pass@gate.nstproxy.io:8081",
]


async def fetch(session, url, semaphore):
    async with semaphore:
        proxy = random.choice(PROXIES)
        try:
            async with session.get(
                url, proxy=proxy, timeout=aiohttp.ClientTimeout(total=15)
            ) as resp:
                return await resp.text()
        except Exception:
            return None


async def crawl(urls, concurrency=50):
    semaphore = asyncio.Semaphore(concurrency)
    async with aiohttp.ClientSession() as session:
        tasks = [fetch(session, url, semaphore) for url in urls]
        return await asyncio.gather(*tasks)
```
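The fetch() coroutine above gives up as soon as one proxy fails. A possible extension, sketched here with only the standard library, retries the same URL through a few distinct proxies before returning None. The request parameter is a hypothetical coroutine standing in for the aiohttp call, and the function name and max_attempts default are illustrative.

```python
import asyncio
import random


async def fetch_with_rotation(url, proxies, request, semaphore, max_attempts=3):
    """Try a URL through up to max_attempts distinct proxies.

    request: hypothetical coroutine request(url, proxy) that returns the
             response body or raises on failure.
    """
    # Sample without replacement so no proxy is tried twice for one URL
    candidates = random.sample(proxies, k=min(max_attempts, len(proxies)))
    async with semaphore:
        for proxy in candidates:
            try:
                return await request(url, proxy)
            except Exception:
                continue  # this proxy failed; move on to the next one
    return None  # all sampled proxies failed
```

The semaphore is held across the retries, so the concurrency limit still bounds the number of URLs in flight rather than the number of individual attempts.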
Jitter and Request Timing
Proxy rotation alone is not sufficient if request timing remains mechanical. Anti-bot systems look for the unnaturally regular cadence of automated requests — if requests arrive at exactly 1.0-second intervals regardless of which IP sends them, the pattern itself is the detection signal, not the IP.[5]
```python
import random
import time


def jittered_delay(base=2.0, jitter_std=0.5):
    # Gaussian jitter around a base delay; mimics human variability
    delay = random.gauss(base, jitter_std)
    return max(0.1, delay)  # floor to avoid negative/zero delays


for url in urls:
    fetch(url)
    time.sleep(jittered_delay(base=2.0, jitter_std=0.5))
```
Combine jitter with proxy rotation, varying User-Agent strings, and realistic header ordering for a request pattern that resists both IP-based and behavioural detection layers.
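As a rough illustration of combining these signals, the sketch below picks a proxy, a User-Agent, and a jittered delay together for each request. The USER_AGENTS values are illustrative examples only, the build_request_plan name is hypothetical, and header ordering is not addressed here.

```python
import random

# Illustrative values only; a real deployment would keep these current
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/17.0 Safari/605.1.15",
]


def build_request_plan(proxies, base_delay=2.0, jitter_std=0.5, rng=random):
    """Choose a proxy, a User-Agent and a pre-request delay in one step."""
    proxy = rng.choice(proxies)
    return {
        "proxies": {"http": proxy, "https": proxy},
        "headers": {"User-Agent": rng.choice(USER_AGENTS)},
        # Gaussian jitter around the base delay, floored at 0.1 seconds
        "delay": max(0.1, rng.gauss(base_delay, jitter_std)),
    }
```

A caller would sleep for plan["delay"] and then pass plan["proxies"] and plan["headers"] to requests.get().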
Rotating Gateway vs. Self-Managed Pool
There are two fundamentally different architectures for proxy rotation, and the right choice depends on how much control you need versus how much you want to manage yourself:[5]
| Factor | Rotating Gateway | Self-Managed Pool |
|---|---|---|
| Setup | Single endpoint — provider rotates IP per connection automatically | Maintain a list of individual proxy endpoints yourself |
| Code complexity | Minimal — point at one URL, rotation is invisible | Higher — you implement selection, health tracking, retry logic |
| Granular health control | Limited — you can't see or control individual IP health | Full — track success/fail per IP, blacklist specific bad ones |
| Sticky sessions | Usually supported via session ID parameter | You implement session pinning manually (see above) |
| Best for | Most use cases — simpler, less to maintain | High-scale operations needing fine-grained control and custom ban logic |
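The difference between the two architectures can be made concrete by putting both behind one small interface. This is a sketch under assumptions: the class names GatewaySource and PoolSource, the report() method, and the max_failures threshold are all hypothetical.

```python
import random


class GatewaySource:
    """Rotating gateway: always the same endpoint; the provider rotates IPs."""

    def __init__(self, endpoint):
        self.endpoint = endpoint

    def get_proxy(self):
        return self.endpoint

    def report(self, proxy, ok):
        pass  # nothing to track: individual exit IPs are not visible


class PoolSource:
    """Self-managed pool: we choose the endpoint and track failures per proxy."""

    def __init__(self, proxies, max_failures=3):
        self.proxies = list(proxies)
        self.max_failures = max_failures
        self.failures = {p: 0 for p in self.proxies}

    def get_proxy(self):
        healthy = [p for p in self.proxies if self.failures[p] < self.max_failures]
        # Fall back to the full list if every proxy has hit the threshold
        return random.choice(healthy or self.proxies)

    def report(self, proxy, ok):
        # A success resets the counter; a failure increments it
        self.failures[proxy] = 0 if ok else self.failures[proxy] + 1
```

Scraping code that only calls get_proxy() and report() can switch between the two without other changes, which makes it cheap to start with a gateway and move to a pool later if finer control is needed.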
Residential Proxy Integration for Rotation
The rotation logic above is independent of proxy type: it works the same whether the underlying proxies are datacenter or residential. What changes is the success rate. Rotating a pool of datacenter IPs across a protected target still produces frequent bans because each individual IP carries low trust regardless of how cleverly you rotate it. Residential IPs raise the baseline trust of every single request in the rotation.
```python
# Nstproxy residential rotating gateway: simplest integration
import requests

PROXY = "http://username:password@gate.nstproxy.io:8080"

# Each new connection through this gateway gets a different residential IP
# automatically; no rotation logic is needed in your code


def fetch(url):
    proxies = {"http": PROXY, "https": PROXY}
    response = requests.get(url, proxies=proxies, timeout=15)
    return response


# Sticky session: append a session ID to pin the same IP for N minutes
STICKY_PROXY = "http://username-session-abc123:password@gate.nstproxy.io:8080"
```
Nstproxy's 110M+ residential IP pool spans 195 countries and thousands of ISPs and subnets — meaning the subnet/ASN diversity problem described earlier is effectively solved at the infrastructure layer. See the residential proxy overview and high-anonymity proxy guide for further integration detail.
FAQ
How do I rotate proxies in Python with requests?
Maintain a list of proxy URLs and use random.choice() to pick a different one before each request with the requests library. This works for small projects, but lacks health tracking, weighting, and subnet awareness. For production use, implement weighted selection based on tracked success rates, or use a commercial rotating gateway that handles this automatically.
How do I rotate proxies in Scrapy?
The recommended approach is the scrapy-rotating-proxies package: install it with pip, set ROTATING_PROXY_LIST in settings.py with your proxy URLs, and add RotatingProxyMiddleware and BanDetectionMiddleware to DOWNLOADER_MIDDLEWARES. It automatically tracks dead proxies, applies exponential backoff, and retries, with no custom rotation code needed. For full control over ban detection logic, write a custom BanDetectionPolicy class.
Why am I still getting blocked even though I rotate proxies?
Three common causes: (1) subnet/ASN clustering: your "randomised" proxies may share the same /24 subnet or ASN, which anti-bot systems detect as a pattern even though individual IPs differ; (2) mechanical request timing: fixed intervals between requests are detectable regardless of IP rotation; (3) datacenter IP reputation: rotating among datacenter IPs doesn't help if every individual IP in the pool carries low trust on a heavily protected target. Residential proxy rotation addresses the third cause directly.
Should I rotate the proxy on every request?
Not always. Rotate per-request for high-volume, independent scraping tasks (SERP scraping, price monitoring) where each request stands alone. Use sticky sessions (same IP across a sequence) for multi-step workflows such as logins, checkouts, and paginated browsing with session cookies, where switching IPs mid-sequence breaks the flow or looks like account takeover to the target site's security systems.
What is the difference between a rotating gateway and a self-managed pool?
A rotating gateway is a single endpoint URL where the provider automatically assigns a different IP on each new connection, so you write almost no rotation code. A self-managed pool is a list of individual proxy endpoints that you select from yourself, giving you full visibility and control over individual IP health, but requiring you to implement selection logic, ban detection, and retry handling. Most use cases are well served by a rotating gateway; high-scale operations needing granular control benefit from a self-managed pool.
Sources
1. Scrapfly: How to Rotate Proxies in Web Scraping (April 2026)
2. PyPI: scrapy-rotating-proxies Documentation
3. DEV Community: Engineering Resilient Proxy Rotation Systems (March 2026)
4. Webshare: Proxy Rotator in Python Requests
5. DEV Community: Contextual Persistence & Jitter Techniques
6. GitHub: TeamHG-Memex/scrapy-rotating-proxies

