Rotate Proxies in Python: requests, Scrapy & Smart Rotation Strategies (2026)

Proxy rotation is the practice of switching IP addresses across requests to distribute load, avoid rate limits, and prevent any single IP from accumulating enough request volume to get flagged. A naive implementation — picking a random proxy from a list — works for small projects but breaks down at scale. This guide covers proxy rotation in Python from basic random selection through weighted health-based rotation, subnet/ASN diversity, sticky sessions, Scrapy middleware, async rotation, and the gateway vs. self-managed pool tradeoff.

⚡ Key Takeaways

Basic rotation uses random.choice() on a proxy list — sufficient for small projects, insufficient at scale.
Weighted rotation with random.choices() prioritises healthy proxies over failing ones using tracked success rates.
Anti-bot systems check subnet and ASN patterns — randomising only by IP (not by subnet) lets consecutive requests fall in the same /24 block, undermining rotation.^[1]
Sticky sessions (same IP for a sequence of related requests) are required for multi-step flows like logins or checkouts — pure rotation breaks these flows.
For Scrapy, the scrapy-rotating-proxies middleware handles rotation, dead-proxy detection, and automatic backoff out of the box.^[2]
A rotating gateway (single endpoint, provider rotates IP per connection) is simpler to integrate; a self-managed pool (list of individual proxies) gives more granular health control.^[3]

Why Rotate Proxies?

Web scraping at scale requires distributing requests across many IP addresses to avoid rate limiting, throttling, or outright blocking. A single IP making hundreds of requests per minute is trivially detectable — proxy rotation makes each request appear to originate from a different user, spreading the load and the risk across a pool.^[4] Beyond avoiding blocks, rotation also enables geo-targeted scraping (different IPs in different regions) and reduces the financial/operational impact of any single proxy going bad.

Basic Proxy Rotation with the requests Library

The simplest implementation picks a random proxy from a list for every outgoing request:

import random
import requests

PROXIES = [
    "http://user:pass@proxy1.example.com:8000",
    "http://user:pass@proxy2.example.com:8000",
    "http://user:pass@proxy3.example.com:8000",
]

def get_random_proxy():
    return random.choice(PROXIES)

def fetch(url):
    proxy = get_random_proxy()
    proxies = {"http": proxy, "https": proxy}
    try:
        response = requests.get(url, proxies=proxies, timeout=15)
        return response
    except requests.RequestException as e:
        print(f"Proxy {proxy} failed: {e}")
        return None

# Each call to fetch() may use a different proxy
urls = ["https://example.com/page1", "https://example.com/page2"]  # placeholder targets
for url in urls:
    fetch(url)

This works for a handful of proxies on small, unprotected targets — but has no concept of which proxies are dead, no weighting toward better-performing IPs, and no awareness that consecutive requests from the same subnet can still trigger detection even when the exact IP changes.
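
One small step up from the basic version is to retry a failed request through a different proxy instead of giving up. The following is a minimal sketch, not a production pattern. It assumes a hypothetical `send(url, proxy)` callable that wraps your HTTP call (for example `requests.get` with the `proxies` argument) and raises on failure; `fetch_with_retry` and `max_attempts` are illustrative names that do not appear in the code above.

```python
import random


def fetch_with_retry(url, proxies, send, max_attempts=3):
    """Try up to max_attempts distinct proxies; return (proxy, result) or (None, None).

    `send` is a hypothetical callable send(url, proxy) that performs the HTTP
    request and raises an exception on failure.
    """
    # random.sample picks distinct proxies, so a failed proxy is never retried
    candidates = random.sample(proxies, k=min(max_attempts, len(proxies)))
    for proxy in candidates:
        try:
            return proxy, send(url, proxy)
        except Exception as exc:  # sketch only: narrow this to your client's errors
            print(f"Proxy {proxy} failed: {exc}")
    return None, None
```

Returning the proxy alongside the result lets the caller record which proxy succeeded, which becomes useful once health tracking is added.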

Weighted Rotation: Prioritise Healthy Proxies

Tracking proxy performance and biasing selection toward IPs with higher success rates routes more traffic through proxies that are currently working, which pure random selection cannot do. Python's random.choices() (plural) accepts a weights parameter for exactly this purpose:^[1]

import random
import time

class ProxyPool:
    def __init__(self, proxies):
        self.proxies = proxies
        # Track success/fail counts per proxy; "streak" counts consecutive failures
        self.stats = {p: {"success": 1, "fail": 0, "streak": 0, "banned_until": 0} for p in proxies}

    def _weight(self, proxy):
        s = self.stats[proxy]
        if time.time() < s["banned_until"]:
            return 0.0  # temporarily banned, so zero weight
        total = s["success"] + s["fail"]
        return s["success"] / total if total > 0 else 1.0

    def get_proxy(self):
        weights = [self._weight(p) for p in self.proxies]
        if sum(weights) == 0:
            raise RuntimeError("All proxies are currently banned")
        return random.choices(self.proxies, weights=weights, k=1)[0]

    def report_success(self, proxy):
        self.stats[proxy]["success"] += 1
        self.stats[proxy]["streak"] = 0  # a success ends the failure streak

    def report_failure(self, proxy, ban_seconds=300):
        s = self.stats[proxy]
        s["fail"] += 1
        s["streak"] += 1
        # Temporarily ban a proxy after 3 consecutive failures
        if s["streak"] >= 3:
            s["banned_until"] = time.time() + ban_seconds
            s["streak"] = 0  # start a fresh count once the ban expires

pool = ProxyPool(PROXIES)
proxy = pool.get_proxy()
# ... make request ...
# pool.report_success(proxy)  or  pool.report_failure(proxy)

This pattern self-corrects: proxies that consistently fail get a lower weight (and a temporary ban after repeated failures), while reliable proxies are selected more often — without ever fully removing a proxy permanently, since temporary blocks often resolve over time.
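
To see what the weighting does in practice: the probability of picking a proxy is its weight divided by the sum of all weights. The sketch below computes those shares from success and failure counts. The helper name `selection_shares` and the sample counts are made up for illustration.

```python
def selection_shares(stats):
    """Map each proxy to its probability of being picked by weighted selection.

    `stats` maps proxy -> (successes, failures). The weight is the success
    rate, and random.choices() picks in proportion to weight / sum(weights).
    """
    weights = {}
    for proxy, (ok, failed) in stats.items():
        total = ok + failed
        weights[proxy] = ok / total if total else 1.0
    weight_sum = sum(weights.values())
    if weight_sum == 0:
        raise ValueError("every proxy has zero weight")
    return {proxy: w / weight_sum for proxy, w in weights.items()}


# Made-up counts: proxy "a" succeeds 90% of the time, "b" 50%, "c" 10%
shares = selection_shares({"a": (90, 10), "b": (50, 50), "c": (10, 90)})
for proxy, share in shares.items():
    print(f"{proxy}: {share:.0%}")
```

With these counts the shares come out to roughly 60%, 33% and 7%, so the weakest proxy still receives occasional traffic and can recover its weight if it starts succeeding again.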

Subnet and ASN Diversity: The Overlooked Detail

Random selection across a proxy list does not guarantee subnet diversity. If your pool contains many IPs from the same provider, consecutive "random" selections can still land in the same /24 subnet or the same Autonomous System Number (ASN) — and many anti-bot systems specifically check for clustering at the subnet/ASN level, not just per-IP.^[1]

import random
from collections import deque
from urllib.parse import urlsplit

class SubnetAwarePool:
    def __init__(self, proxies, history_size=5):
        self.proxies = proxies
        self.recent_subnets = deque(maxlen=history_size)

    def _subnet(self, proxy_url):
        # Parse the host out of the proxy URL (works with or without credentials),
        # then keep the first 3 octets: the /24 subnet identifier.
        # Assumes the host is an IPv4 literal; hostnames would need a DNS lookup first.
        host = urlsplit(proxy_url).hostname or ""
        return ".".join(host.split(".")[:3])

    def get_proxy(self):
        for _ in range(10):  # try up to 10 times to find subnet diversity
            candidate = random.choice(self.proxies)
            subnet = self._subnet(candidate)
            if subnet not in self.recent_subnets:
                self.recent_subnets.append(subnet)
                return candidate
        # Fallback: accept a repeat subnet if no diverse option was found
        return random.choice(self.proxies)

💡 In practice: This level of subnet management matters most when you maintain your own proxy list across multiple datacenter providers. With a residential proxy provider offering a large, geographically diverse IP pool (like Nstproxy's 110M+ IPs), subnet clustering is far less of a concern because the underlying pool already spans thousands of different ISPs and subnets.
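
Before relying on rotation, it can help to check how clustered an existing proxy list actually is. The sketch below counts proxies per /24 network using the standard-library `ipaddress` module. It assumes proxy URLs that include a scheme and an IPv4 literal as the host; `subnet_histogram` is a hypothetical helper name, and hostnames are simply reported as unresolved rather than looked up.

```python
import ipaddress
from collections import Counter
from urllib.parse import urlsplit


def subnet_histogram(proxies, prefix=24):
    """Count proxies per network of the given prefix length (IPv4 /24 by default).

    Proxies whose host is a DNS name rather than an IP literal are counted
    under "unresolved", since their subnet is unknown without a DNS lookup.
    """
    counts = Counter()
    for proxy in proxies:
        host = urlsplit(proxy).hostname
        try:
            network = ipaddress.ip_network(f"{host}/{prefix}", strict=False)
            counts[str(network)] += 1
        except ValueError:
            counts["unresolved"] += 1
    return counts
```

If one network accounts for a large share of the list, random selection will land in it often no matter how the individual IPs are shuffled.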

Sticky Sessions: When NOT to Rotate

Some workflows require the same IP across a sequence of requests — logins, multi-step checkouts, paginated scraping with session cookies. Rotating mid-sequence breaks these flows or triggers security alerts (a login from one IP followed immediately by a cart action from a different IP looks like account takeover).^[5]

import requests

class StickySession:
    def __init__(self, proxy_pool):
        self.pool = proxy_pool
        self.sessions = {}  # slot_id -> (session, proxy)

    def get_session(self, slot_id):
        if slot_id not in self.sessions:
            proxy = self.pool.get_proxy()
            session = requests.Session()
            session.proxies = {"http": proxy, "https": proxy}
            self.sessions[slot_id] = (session, proxy)
        return self.sessions[slot_id][0]

    def release(self, slot_id):
        # Call after the multi-step flow completes: closes the session and frees the slot
        entry = self.sessions.pop(slot_id, None)
        if entry is not None:
            entry[0].close()

sticky = StickySession(pool)

# All requests for "account_42" use the SAME proxy until released
session = sticky.get_session("account_42")
session.get("https://example.com/login")
session.post("https://example.com/cart/add", data={...})
session.post("https://example.com/checkout", data={...})
sticky.release("account_42")

Most commercial residential proxy providers, including Nstproxy, offer sticky session support natively at the gateway level (a session ID parameter that pins the same exit IP for a configurable duration), eliminating the need to manage this manually.
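
If you manage the pool yourself, the "configurable duration" behaviour can be approximated client-side by pinning a proxy to a key for a fixed time window. This is a sketch under assumptions, not any provider's mechanism: `TimedStickyAssigner`, `ttl`, and the injectable `clock` are illustrative names.

```python
import random
import time


class TimedStickyAssigner:
    """Pin a proxy to a key for ttl seconds, then allow a fresh pick."""

    def __init__(self, proxies, ttl=600.0, clock=time.monotonic):
        self.proxies = list(proxies)
        self.ttl = ttl
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._pinned = {}   # key -> (proxy, expires_at)

    def get(self, key):
        now = self.clock()
        entry = self._pinned.get(key)
        if entry is not None and now < entry[1]:
            return entry[0]  # pin still valid: reuse the same proxy
        proxy = random.choice(self.proxies)
        self._pinned[key] = (proxy, now + self.ttl)
        return proxy
```

Calling `get("account_42")` repeatedly returns the same proxy until the window expires, after which a new proxy is chosen and pinned for the next window.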

Proxy Health Tracking and Ban Detection

Detecting whether a proxy is "dead" or banned is site-specific — a 403 might mean the proxy is fine but the site detected the request pattern, while a connection timeout might mean the proxy itself is offline. scrapy-rotating-proxies' default heuristic — non-200 status, empty body, or exception means dead — is a reasonable starting point that you should customise per target.^[2]

def is_banned(response, exception=None) -> bool:
    if exception:
        return True  # connection failure, timeout, etc — likely a dead proxy
    if response.status_code in (403, 429, 503):
        return True  # common anti-bot block codes
    if not response.text.strip():
        return True  # empty body — proxy likely returned nothing useful
    if "captcha" in response.text.lower():
        return True  # CAPTCHA challenge — request pattern flagged
    return False
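
Because a block by the target and a dead proxy call for different reactions, it can be useful to return a category rather than a single boolean. The sketch below is one possible split, with illustrative names (`Outcome`, `classify`) and rules you would tune per target. Treating an empty body as a proxy problem follows the comment in the function above and is an assumption, not a general rule.

```python
from enum import Enum


class Outcome(Enum):
    OK = "ok"
    PROXY_DEAD = "proxy_dead"  # the proxy itself looks unreachable or unusable
    BLOCKED = "blocked"        # the target refused or challenged the request


def classify(status_code=None, body="", exception=None):
    """Classify one request outcome; the rules here are illustrative."""
    if exception is not None or status_code is None:
        return Outcome.PROXY_DEAD  # connection error, timeout, no response
    if status_code == 407:
        return Outcome.PROXY_DEAD  # 407 Proxy Authentication Required
    if status_code in (403, 429, 503) or "captcha" in body.lower():
        return Outcome.BLOCKED     # the proxy worked, the target pushed back
    if not body.strip():
        return Outcome.PROXY_DEAD  # empty body: assumed to be a proxy problem
    return Outcome.OK
```

A caller could then penalise the proxy only on `PROXY_DEAD` and respond to `BLOCKED` by slowing down or changing headers instead.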

Rotating Proxies in Scrapy

Scrapy has two main approaches to proxy rotation: a custom middleware, or the widely used scrapy-rotating-proxies package.

Method 1: Custom Middleware

# middlewares.py
import random

class ProxyMiddleware:
    def __init__(self, proxy_list):
        self.proxy_list = proxy_list

    @classmethod
    def from_crawler(cls, crawler):
        return cls(proxy_list=crawler.settings.getlist('ROTATING_PROXY_LIST'))

    def process_request(self, request, spider):
        request.meta['proxy'] = random.choice(self.proxy_list)

# settings.py
DOWNLOADER_MIDDLEWARES = {
    'myproject.middlewares.ProxyMiddleware': 350,
}
ROTATING_PROXY_LIST = [
    'http://user:pass@proxy1.example.com:8000',
    'http://user:pass@proxy2.example.com:8000',
]

Method 2: scrapy-rotating-proxies (Recommended)

pip install scrapy-rotating-proxies

# settings.py
ROTATING_PROXY_LIST = [
    'proxy1.com:8000',
    'https://proxy2.com:8000',
    'login:password@proxy3.com:8031',
]
# Or load from a file instead:
# ROTATING_PROXY_LIST_PATH = 'proxies.txt'

DOWNLOADER_MIDDLEWARES = {
    'rotating_proxies.middlewares.RotatingProxyMiddleware': 610,
    'rotating_proxies.middlewares.BanDetectionMiddleware': 620,
}

ROTATING_PROXY_BACKOFF_BASE = 300   # 5 min base backoff for dead proxies
ROTATING_PROXY_BACKOFF_CAP   = 3600 # 60 min max backoff

This package automatically tracks working vs. non-working proxies, retries failed proxies after a backoff period, and exposes a customisable BanDetectionPolicy for site-specific ban logic — without writing any rotation code yourself.^[2] Crucially, all default Scrapy concurrency settings (DOWNLOAD_DELAY, CONCURRENT_REQUESTS_PER_DOMAIN) automatically become per-proxy once this middleware is enabled — meaning CONCURRENT_REQUESTS_PER_DOMAIN=2 allows 2 concurrent connections to each proxy, not 2 total.^[6]
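
The two backoff settings describe a capped exponential schedule: the wait roughly doubles with each consecutive failure until it reaches the cap. The sketch below illustrates that idea with the same base and cap values. It is a generic illustration, not the package's source code, and the function names are hypothetical.

```python
import random


def backoff_ceiling(attempt, base=300, cap=3600):
    """Upper bound of the wait, in seconds, after `attempt` consecutive failures."""
    return min(cap, base * 2 ** attempt)


def backoff_with_jitter(attempt, base=300, cap=3600):
    """Pick a random wait in [0, ceiling] so dead proxies are not all rechecked at once."""
    return random.uniform(0, backoff_ceiling(attempt, base, cap))


for attempt in range(6):
    print(attempt, backoff_ceiling(attempt))
```

With a base of 300 and a cap of 3600, the ceiling goes 300, 600, 1200, 2400 seconds and then stays at 3600.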

Custom Ban Detection Policy

# myproject/policy.py
from rotating_proxies.policy import BanDetectionPolicy

class MyPolicy(BanDetectionPolicy):
    def response_is_ban(self, request, response):
        ban = super().response_is_ban(request, response)
        ban = ban or b'captcha' in response.body
        return ban

    def exception_is_ban(self, request, exception):
        return None  # don't treat exceptions as bans for this target

# settings.py
ROTATING_PROXY_BAN_POLICY = 'myproject.policy.MyPolicy'

Async Proxy Rotation with aiohttp

For high-throughput scraping, combining proxy rotation with asyncio concurrency (rather than threads or processes) is the standard pattern — see Nstproxy's IP rotation guide for additional context on this approach:

import asyncio
import aiohttp
import random

PROXIES = [
    "http://user:pass@gate.nstproxy.io:8080",
    "http://user:pass@gate.nstproxy.io:8081",
]

async def fetch(session, url, semaphore):
    async with semaphore:
        proxy = random.choice(PROXIES)
        try:
            async with session.get(url, proxy=proxy,
                                   timeout=aiohttp.ClientTimeout(total=15)) as resp:
                return await resp.text()
        except (aiohttp.ClientError, asyncio.TimeoutError):
            # Network-level failure or timeout: skip this URL
            return None

async def crawl(urls, concurrency=50):
    semaphore = asyncio.Semaphore(concurrency)
    async with aiohttp.ClientSession() as session:
        tasks = [fetch(session, url, semaphore) for url in urls]
        return await asyncio.gather(*tasks)

Jitter and Request Timing

Proxy rotation alone is not sufficient if request timing remains mechanical. Anti-bot systems look for the unnaturally regular cadence of automated requests — if requests arrive at exactly 1.0-second intervals regardless of which IP sends them, the pattern itself is the detection signal, not the IP.^[5]

import random
import time

def jittered_delay(base=2.0, jitter_std=0.5):
    # Gaussian jitter around a base delay — mimics human variability
    delay = random.gauss(base, jitter_std)
    return max(0.1, delay)  # floor to avoid negative/zero delays

for url in urls:
    fetch(url)
    time.sleep(jittered_delay(base=2.0, jitter_std=0.5))

Combine jitter with proxy rotation, varying User-Agent strings, and realistic header ordering for a request pattern that resists both IP-based and behavioural detection layers.

Rotating Gateway vs. Self-Managed Pool

There are two fundamentally different architectures for proxy rotation, and the right choice depends on how much control you need versus how much you want to manage yourself:^[5]

Factor	Rotating Gateway	Self-Managed Pool
Setup	Single endpoint — provider rotates IP per connection automatically	Maintain a list of individual proxy endpoints yourself
Code complexity	Minimal — point at one URL, rotation is invisible	Higher — you implement selection, health tracking, retry logic
Granular health control	Limited — you can't see or control individual IP health	Full — track success/fail per IP, blacklist specific bad ones
Sticky sessions	Usually supported via session ID parameter	You implement session pinning manually (see above)
Best for	Most use cases — simpler, less to maintain	High-scale operations needing fine-grained control and custom ban logic
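
One way to keep the choice reversible is to hide both architectures behind the same small interface, so scraping code does not care which one is in use. The sketch below assumes hypothetical names (`ProxySource`, `GatewaySource`, `PoolSource`, `proxies_for`) and a placeholder gateway URL; it requires Python 3.9 or later for the built-in generic type hints.

```python
import random
from typing import Protocol


class ProxySource(Protocol):
    def next_proxy(self) -> str: ...


class GatewaySource:
    """Rotating gateway: always the same URL; the provider rotates the exit IP."""

    def __init__(self, gateway_url: str):
        self.gateway_url = gateway_url

    def next_proxy(self) -> str:
        return self.gateway_url


class PoolSource:
    """Self-managed pool: the client chooses an individual proxy each time."""

    def __init__(self, proxies: list[str]):
        if not proxies:
            raise ValueError("proxy list is empty")
        self.proxies = proxies

    def next_proxy(self) -> str:
        return random.choice(self.proxies)


def proxies_for(source: ProxySource) -> dict[str, str]:
    """Build the mapping that requests expects in its `proxies` argument."""
    proxy = source.next_proxy()
    return {"http": proxy, "https": proxy}
```

Swapping `GatewaySource("http://user:pass@gateway.example.com:8080")` for a `PoolSource` built from your own list then changes the architecture without touching the request code.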

Residential Proxy Integration for Rotation

The rotation logic above is agnostic to proxy type: it works the same whether the underlying proxies are datacenter or residential. What changes is the success rate. Rotating a pool of datacenter IPs against a protected target still produces frequent bans, because each individual IP carries low trust regardless of how cleverly you rotate it. Residential IPs raise the baseline trust of every request in the rotation.

# Nstproxy residential rotating gateway — simplest integration
import requests

PROXY = "http://username:password@gate.nstproxy.io:8080"
# Each new connection through this gateway gets a different residential IP
# automatically — no rotation logic needed in your code

def fetch(url):
    proxies = {"http": PROXY, "https": PROXY}
    response = requests.get(url, proxies=proxies, timeout=15)
    return response

# Sticky session: append a session ID to pin the same IP for N minutes
STICKY_PROXY = "http://username-session-abc123:password@gate.nstproxy.io:8080"

Nstproxy's 110M+ residential IP pool spans 195 countries and thousands of ISPs and subnets — meaning the subnet/ASN diversity problem described earlier is effectively solved at the infrastructure layer. See the residential proxy overview and high-anonymity proxy guide for further integration detail.

FAQ

Q: What is the simplest way to rotate proxies in Python?

Maintain a list of proxy URLs and use random.choice() to pick a different one before each request with the requests library. This works for small projects, but lacks health tracking, weighting, and subnet awareness — for production use, implement weighted selection based on tracked success rates, or use a commercial rotating gateway that handles this automatically.

Q: How do I rotate proxies in Scrapy?

The recommended approach is the scrapy-rotating-proxies package: install it with pip, set ROTATING_PROXY_LIST in settings.py with your proxy URLs, and add RotatingProxyMiddleware and BanDetectionMiddleware to DOWNLOADER_MIDDLEWARES. It automatically tracks dead proxies, applies exponential backoff, and retries — no custom rotation code needed. For full control over ban detection logic, write a custom BanDetectionPolicy class.

Q: Why does proxy rotation still get blocked sometimes?

Three common causes: (1) subnet/ASN clustering — your "randomised" proxies may share the same /24 subnet or ASN, which anti-bot systems detect as a pattern even though individual IPs differ; (2) mechanical request timing — fixed intervals between requests are detectable regardless of IP rotation; (3) datacenter IP reputation — rotating among datacenter IPs doesn't help if every individual IP in the pool carries low trust on a heavily protected target. Residential proxy rotation addresses the third cause directly.

Q: Should I rotate proxies for every request?

Not always. Rotate per-request for high-volume, independent scraping tasks (SERP scraping, price monitoring) where each request stands alone. Use sticky sessions (same IP across a sequence) for multi-step workflows — logins, checkouts, paginated browsing with session cookies — where switching IPs mid-sequence breaks the flow or looks like account takeover to the target site's security systems.

Q: What is the difference between a rotating gateway and a proxy pool?

A rotating gateway is a single endpoint URL where the provider automatically assigns a different IP on each new connection — you write almost no rotation code. A self-managed pool is a list of individual proxy endpoints that you select from yourself, giving you full visibility and control over individual IP health, but requiring you to implement selection logic, ban detection, and retry handling. Most use cases are well served by a rotating gateway; high-scale operations needing granular control benefit from a self-managed pool.