Node Unblocker for Web Scraping: Setup, Middleware & Proxy Integration (2026)

Node Unblocker is an open-source Node.js proxy middleware that sits between your scraper and target websites — forwarding HTTP requests, rewriting URLs and headers, and masking your real IP. It was originally built for bypassing internet censorship, but its Express-compatible API and streaming architecture make it a useful tool in lightweight Node.js scraping stacks. This guide covers setup, middleware customisation, IP rotation, Puppeteer integration, and where managed residential proxies outperform a self-hosted approach.

⚡ Key Takeaways

Node Unblocker is an npm package (unblocker) that creates an HTTP proxy server using Express — it is not a proxy service.
It streams responses without buffering, rewrites URLs and cookies automatically, and supports request/response middleware hooks.
Best for: lightweight scraping, learning proxy concepts, small projects with static or AJAX-based sites.
Not suitable for: JavaScript-heavy SPAs, sites requiring TLS fingerprint evasion (Cloudflare, Akamai), or high-volume production pipelines.
A single Node Unblocker instance has one IP — adding a commercial residential proxy behind it dramatically extends its effectiveness against IP-based blocks.^[1]
In 2026, Node Unblocker works for simpler targets but requires augmentation with residential proxies and headless browsers for serious anti-bot systems.

What Is Node Unblocker?

Node Unblocker is an open-source npm package (package name: unblocker) that provides proxy middleware for Node.js and Express applications. Originally created for bypassing internet censorship, it functions as a programmable HTTP/HTTPS proxy server — receiving requests from your scraper, fetching the target page on your behalf, rewriting all internal URLs, and streaming the response back.^[2]

The key distinction from a standard HTTP proxy is its URL-prefix model: instead of configuring proxy settings in your scraper's network client, you route requests through a URL pattern like http://your-proxy-server/proxy/https://target.com. This lets it work with any HTTP client library (fetch, axios, or a headless browser) without proxy protocol configuration.
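As a rough illustration of the URL pattern described above, the helper below builds the address a client would request. The function name toProxiedUrl and the localhost base are assumptions made for this sketch; they are not part of the unblocker package.

```javascript
// Hypothetical helper (not part of unblocker): builds the URL a client
// requests in order to fetch `targetUrl` through the proxy prefix.
function toProxiedUrl(proxyBase, targetUrl) {
  // Validate the target early; the URL constructor throws on malformed input.
  const target = new URL(targetUrl);
  if (target.protocol !== 'http:' && target.protocol !== 'https:') {
    throw new Error(`Unsupported protocol: ${target.protocol}`);
  }
  // Trim trailing slashes so the prefix and target are joined by exactly one.
  return `${proxyBase.replace(/\/+$/, '')}/${target.href}`;
}

console.log(toProxiedUrl('http://localhost:8080/proxy/', 'https://books.toscrape.com'));
```

Validating the target before building the URL keeps malformed or non-HTTP addresses from ever reaching the proxy.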

Node Unblocker processes data as a stream rather than buffering full responses, which keeps memory usage low and latency minimal even on large pages. It automatically handles URL rewriting in HTML, CSS, and JavaScript so that relative links on the proxied page continue working correctly through the proxy.

How Node Unblocker Works

The architecture is a URL-rewriting web proxy with middleware hooks layered on top:

Your scraper sends a request to http://proxy-server/proxy/{target-url}.
Node Unblocker strips the /proxy/ prefix, extracts the target URL, and opens a new outbound HTTP request to that URL using the proxy server's IP.
The response streams back through Node Unblocker. Before delivery, it rewrites all URLs in the HTML/CSS/JS to maintain the /proxy/ prefix — so all subsequent resource requests (images, scripts, AJAX) also route through the proxy.
Middleware hooks fire on both the request (before forwarding) and the response (before delivering to the scraper), allowing header injection, cookie manipulation, or rate-limit logic.
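The prefix handling and link rewriting in the steps above can be modelled in a few lines. This is a deliberately simplified illustration, not the unblocker package's actual implementation: the function names are hypothetical, and a regular expression stands in for the real streaming rewriter.

```javascript
// Illustrative model only; names and logic are assumptions for this sketch.
const PREFIX = '/proxy/';

// Recover the target URL from the incoming request path.
function extractTarget(requestPath, prefix = PREFIX) {
  if (!requestPath.startsWith(prefix)) return null;
  try {
    return new URL(requestPath.slice(prefix.length)).href;
  } catch {
    return null; // the remainder is not an absolute URL
  }
}

// Rewrite href/src attributes so follow-up requests keep the prefix.
function rewriteLinks(html, pageUrl, prefix = PREFIX) {
  return html.replace(/\b(href|src)="([^"]*)"/g, (match, attr, value) => {
    try {
      const absolute = new URL(value, pageUrl); // resolves relative links
      if (!/^https?:$/.test(absolute.protocol)) return match; // skip mailto:, data:, etc.
      return `${attr}="${prefix}${absolute.href}"`;
    } catch {
      return match; // leave unparseable values untouched
    }
  });
}

const page = 'https://example.com/catalogue/index.html';
console.log(extractTarget('/proxy/https://example.com/catalogue/index.html'));
console.log(rewriteLinks('<a href="page-2.html">next</a> <img src="/img/a.png">', page));
```

Resolving each link against the page URL before prefixing is what lets relative links keep working once the page is served from the proxy's own origin.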

The target website sees the proxy server's IP address, not your real one. Because Node Unblocker runs on a server you control, you decide what headers to send, what cookies to manipulate, and how to handle redirects.^[3]

Basic Setup: Node Unblocker in 5 Minutes

Step 1 — Initialise the project

Create a new directory, initialise npm, and install dependencies.

mkdir scraper-proxy && cd scraper-proxy
npm init -y
npm install express unblocker

Step 2 — Create the proxy server

Create index.js with the following Express + unblocker setup.

// index.js — basic Node Unblocker proxy server
const express  = require('express');
const Unblocker = require('unblocker');

const app = express();

// Mount unblocker middleware — all /proxy/* requests go through it
const unblocker = new Unblocker({ prefix: '/proxy/' });
app.use(unblocker);

const PORT = process.env.PORT || 8080;

// app.listen() returns the underlying http.Server. The 'upgrade' event is
// emitted on that server, not on the Express app, so the WebSocket handler
// must be attached to the server.
app
  .listen(PORT, () => {
    console.log(`Proxy running on http://localhost:${PORT}`);
    console.log(`Test: http://localhost:${PORT}/proxy/https://httpbin.org/ip`);
  })
  .on('upgrade', unblocker.onUpgrade);

Step 3 — Start and test

Run the server and verify the proxy is working by checking the exit IP.

node index.js

# In a second terminal — should return the server's IP, not yours
curl http://localhost:8080/proxy/https://httpbin.org/ip

The proxy is now running. Any URL prefixed with /proxy/ will be fetched through the server's IP. If the server is a remote VPS or cloud instance, your real IP is hidden from all target websites.

Using Node Unblocker for Web Scraping

With the proxy running, integrate it into any Node.js scraping workflow using fetch or axios:

// scraper.js — fetch through Node Unblocker proxy
const axios = require('axios');

const PROXY_BASE = 'http://localhost:8080/proxy';
const TARGET     = 'https://books.toscrape.com';

async function scrape(url) {
  const proxiedUrl = `${PROXY_BASE}/${url}`;
  const response   = await axios.get(proxiedUrl);
  return response.data;
}

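// Fetch book listing through proxy; the target sees the proxy IP
scrape(TARGET)
  .then(html => {
    console.log('Page length:', html.length);
    // Pass to Cheerio, JSDOM, or any HTML parser
  })
  .catch(err => {
    // Without a catch, a failed request becomes an unhandled rejection
    console.error('Request failed:', err.message);
  });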

All resource requests embedded in the returned HTML (stylesheets, scripts, images) will also use the proxy URL prefix automatically — Node Unblocker's URL rewriting handles this without any additional configuration.
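One practical consequence of this rewriting is that links extracted from a proxied page carry the proxy prefix. A scraper that stores or deduplicates URLs will usually want the originals back. The sketch below shows one way to do that. The helper names are hypothetical, and the regular expression is a stand-in for a proper HTML parser such as Cheerio.

```javascript
// Hypothetical helper: maps a rewritten link back to the original URL.
function unproxyUrl(link, prefix = '/proxy/') {
  // Rewritten links may be path-only ("/proxy/https://...") or absolute
  // ("http://localhost:8080/proxy/https://..."), so search for the prefix.
  const index = link.indexOf(prefix);
  if (index === -1) return link; // not rewritten
  const candidate = link.slice(index + prefix.length);
  return /^https?:\/\//.test(candidate) ? candidate : link;
}

// Collect the original targets of all href attributes in a proxied page.
function extractOriginalLinks(html, prefix = '/proxy/') {
  const links = [];
  for (const match of html.matchAll(/href="([^"]+)"/g)) {
    links.push(unproxyUrl(match[1], prefix));
  }
  return links;
}

const sample = '<a href="/proxy/https://books.toscrape.com/catalogue/page-2.html">next</a>';
console.log(extractOriginalLinks(sample));
```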

Middleware: Customising Requests and Responses

Node Unblocker's most powerful feature for scraping is its middleware system. Middleware functions fire on every request (before forwarding to the target) and every response (before delivering to the scraper), giving you full control over headers, cookies, and content.^[4]

const Unblocker = require('unblocker');

const unblocker = new Unblocker({
  prefix: '/proxy/',

  // REQUEST MIDDLEWARE — fires before forwarding to target
  requestMiddleware: [
    function setHeaders(data) {
      // Per the unblocker README, data.headers holds the headers that will
      // be sent to the target; data.clientRequest.headers is the incoming copy
      // Inject realistic browser headers
      data.headers['user-agent'] =
        'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 Chrome/124.0 Safari/537.36';
      data.headers['accept-language'] = 'en-US,en;q=0.9';
      data.headers['accept'] =
        'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8';
      // Remove headers that reveal proxy usage
      delete data.headers['x-forwarded-for'];
      delete data.headers['via'];
    }
  ],

  // RESPONSE MIDDLEWARE — fires before returning to scraper
  responseMiddleware: [
    function logStatus(data) {
      // data.remoteResponse is the target's response; data.url is the target URL
      console.log(
        `[${new Date().toISOString()}]`,
        data.remoteResponse.statusCode,
        data.url
      );
    }
  ]
});

Common middleware use cases for scraping:

User-Agent rotation — rotate through a list of realistic browser UA strings per request to avoid UA-based fingerprinting.
Referrer injection — set the Referer header to simulate organic navigation from a search engine.
Cookie management — persist session cookies across requests to maintain login state.
Rate limiting — add request throttling logic to avoid triggering rate limit thresholds on target servers.
X-Forwarded-For stripping — remove headers that reveal the original client IP or proxy usage.
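As an example of the first use case, here is a minimal sketch of a User-Agent rotation middleware. It assumes the data object passed to request middleware exposes the outgoing headers as data.headers, as the unblocker README describes. Check this against the version you install. The factory name and the second User-Agent string are illustrative values chosen for this sketch.

```javascript
// Hypothetical middleware factory. Assumes `data.headers` holds the headers
// that will be sent to the target (verify for your unblocker version).
function createUserAgentRotator(userAgents) {
  if (!Array.isArray(userAgents) || userAgents.length === 0) {
    throw new Error('Provide at least one User-Agent string');
  }
  let next = 0;
  // Runs synchronously, as unblocker middleware is expected to.
  return function rotateUserAgent(data) {
    data.headers['user-agent'] = userAgents[next];
    next = (next + 1) % userAgents.length; // round-robin through the list
  };
}

// Example values only; keep your own list current.
const rotateUserAgent = createUserAgentRotator([
  'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 Chrome/124.0 Safari/537.36',
  'Mozilla/5.0 (X11; Linux x86_64; rv:125.0) Gecko/20100101 Firefox/125.0'
]);

// Simulate one middleware call with a stand-in data object.
const fakeData = { headers: { 'user-agent': 'axios/1.x' } };
rotateUserAgent(fakeData);
console.log(fakeData.headers['user-agent']);
```

To use it, add rotateUserAgent to the requestMiddleware array. Round-robin is predictable and easy to test; random selection is a simple alternative if a fixed order is a concern.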

⚠️ Middleware runs synchronously. For async operations like database lookups, rate limit checks, or external API calls, implement them at the Express application level before the unblocker middleware — not inside the unblocker middleware functions themselves, which do not support async/await.
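A minimal sketch of that Express-level approach follows: an async gate that runs before the unblocker middleware and rejects requests over a limit. The names createWindowLimiter and createAsyncGate are hypothetical, and the in-memory Map stands in for whatever async store (Redis, a database) a real deployment would query.

```javascript
// Hypothetical async limiter: fixed window per key. The function is async
// only to mirror a real lookup against an external store.
function createWindowLimiter(limit, windowMs, now = Date.now) {
  const hits = new Map(); // key -> { start, count }
  return async function checkAllowed(key) {
    const t = now();
    const entry = hits.get(key);
    if (!entry || t - entry.start >= windowMs) {
      hits.set(key, { start: t, count: 1 }); // start a new window
      return true;
    }
    entry.count += 1;
    return entry.count <= limit;
  };
}

// Express-style middleware: awaits the check, then either continues to the
// next handler (the unblocker middleware) or answers 429 directly.
function createAsyncGate(checkAllowed) {
  return async function gate(req, res, next) {
    try {
      const key = req.ip ?? req.socket?.remoteAddress ?? 'unknown';
      if (await checkAllowed(key)) return next();
      res.statusCode = 429;
      res.end('Too Many Requests');
    } catch (err) {
      next(err); // let Express error handling deal with store failures
    }
  };
}

// Wiring, assuming `app` and `unblocker` from the setup section:
//   app.use(createAsyncGate(createWindowLimiter(60, 60000)));
//   app.use(unblocker);
```

Because the gate is registered first, the async work finishes before unblocker's synchronous middleware chain starts.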

IP Rotation Strategy

A single Node Unblocker instance has one IP: that of the server it runs on. Targets with even basic rate limiting tend to flag and block a single IP after a moderate number of requests. The standard solution is to deploy multiple Node Unblocker instances across different cloud servers and implement a rotation pool in your scraper:^[3]

// Proxy pool rotation across multiple Node Unblocker instances
const axios = require('axios');

const proxyPool = [
  'http://server1.example.com:8080/proxy',
  'http://server2.example.com:8080/proxy',
  'http://server3.example.com:8080/proxy',
];

function getRandomProxy() {
  return proxyPool[Math.floor(Math.random() * proxyPool.length)];
}

async function scrapeWithRotation(targetUrl) {
  const proxy = getRandomProxy();
  const proxiedUrl = `${proxy}/${targetUrl}`;
  try {
    const res = await axios.get(proxiedUrl, { timeout: 15000 });
    return res.data;
  } catch (err) {
    // Remove failed proxy from pool or log for review
    console.error(`Proxy failed: ${proxy}`, err.message);
    throw err;
  }
}

Each Node Unblocker instance deployed on a separate VPS or cloud instance (DigitalOcean droplets from $4/month, Railway from $5/month) gives you a different datacenter IP. This works for targets without aggressive anti-bot systems but reaches its limits quickly on protected sites: IP reputation services commonly classify addresses by ASN, so traffic from major cloud providers can be challenged or blocked on the very first request.
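The random selection shown earlier keeps sending traffic to instances that have already been blocked. One possible refinement, sketched below, is a round-robin pool that benches a failing instance for a cooldown period. The ProxyPool class, its method names, and the 60-second default are assumptions for illustration only.

```javascript
// Hypothetical pool: round-robin selection with a cooldown for failures.
class ProxyPool {
  constructor(proxies, { cooldownMs = 60000, now = Date.now } = {}) {
    this.entries = proxies.map((base) => ({ base, blockedUntil: 0 }));
    this.cooldownMs = cooldownMs;
    this.now = now; // injectable clock, which makes the pool easy to test
    this.cursor = 0;
  }

  // Returns the next proxy that is not cooling down, or null if none is free.
  next() {
    for (let i = 0; i < this.entries.length; i += 1) {
      const index = (this.cursor + i) % this.entries.length;
      if (this.entries[index].blockedUntil <= this.now()) {
        this.cursor = (index + 1) % this.entries.length;
        return this.entries[index].base;
      }
    }
    return null;
  }

  // Call after a timeout, 403, or 429 to bench the proxy temporarily.
  reportFailure(base) {
    const entry = this.entries.find((e) => e.base === base);
    if (entry) entry.blockedUntil = this.now() + this.cooldownMs;
  }
}

const pool = new ProxyPool([
  'http://server1.example.com:8080/proxy',
  'http://server2.example.com:8080/proxy'
]);
const first = pool.next();
pool.reportFailure(first); // pretend server1 just returned a 429
console.log(pool.next());  // server2 is the only proxy still available
```

In a scraper, call pool.next() in place of getRandomProxy() and pool.reportFailure(proxy) inside the catch block.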

Puppeteer Integration

For JavaScript-heavy targets where Node Unblocker alone is insufficient (SPAs, AJAX-loaded content), combine it with Puppeteer by routing browser traffic through the proxy:

const puppeteer = require('puppeteer');

async function scrapeWithPuppeteer(targetUrl) {
  const browser = await puppeteer.launch({
    // No --proxy-server flag here: Node Unblocker is not a standard HTTP
    // proxy, so traffic is routed by navigating to the /proxy/ URL below
    headless: true
  });

  try {
    const page = await browser.newPage();

    // Set realistic viewport and user-agent
    await page.setViewport({ width: 1366, height: 768 });
    await page.setUserAgent(
      'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 Chrome/124.0 Safari/537.36'
    );

    // Navigate via the proxy prefix URL
    await page.goto(`http://localhost:8080/proxy/${targetUrl}`, {
      waitUntil: 'networkidle2',
      timeout: 30000
    });

    return await page.content();
  } finally {
    // Close the browser even if navigation fails or times out
    await browser.close();
  }
}

Pairing Node Unblocker with Residential Proxies

The single most effective upgrade to any Node Unblocker setup is placing a commercial residential proxy behind it — routing Node Unblocker's outbound requests through a residential IP pool instead of the server's datacenter IP. This combines Node Unblocker's URL rewriting and middleware capabilities with the trust level of a genuine consumer ISP connection.

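Configure the outbound proxy in Node Unblocker by passing an agent as the httpsAgent and httpAgent options. The example below uses the https-proxy-agent package (install it with npm install https-proxy-agent) and an Nstproxy residential proxy: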

const express      = require('express');
const Unblocker    = require('unblocker');
const { HttpsProxyAgent } = require('https-proxy-agent');

const app = express();

// Nstproxy residential proxy credentials
const NSTPROXY_HOST = 'gate.nstproxy.io';
const NSTPROXY_PORT = '8080';
const NSTPROXY_USER = 'your_username';
const NSTPROXY_PASS = 'your_password';

const proxyUrl   = `http://${NSTPROXY_USER}:${NSTPROXY_PASS}@${NSTPROXY_HOST}:${NSTPROXY_PORT}`;
const proxyAgent = new HttpsProxyAgent(proxyUrl);

const unblocker = new Unblocker({
  prefix: '/proxy/',
  // Route all outbound requests through residential proxy
  httpsAgent: proxyAgent,
  httpAgent:  proxyAgent,
  requestMiddleware: [
    function removeProxyHeaders(data) {
      delete data.clientRequest.headers['x-forwarded-for'];
      delete data.clientRequest.headers['via'];
    }
  ]
});

app.use(unblocker);
app.listen(8080, () => console.log('Proxy with residential IPs running on :8080'));

With this setup, every request forwarded by Node Unblocker exits through a clean Nstproxy residential IP — making it appear as genuine household traffic to the target website while retaining all of Node Unblocker's URL rewriting and middleware capabilities.
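Hard-coding credentials into a template string breaks as soon as a username or password contains characters such as @, : or /. The sketch below builds the proxy URL from environment variables and percent-encodes the credentials. The variable names (PROXY_HOST and so on) and both function names are assumptions for this example, and you should confirm that your agent library decodes percent-encoded credentials before relying on it.

```javascript
// Hypothetical helper: builds an upstream proxy URL with encoded credentials.
function buildProxyUrl({ host, port, username, password }) {
  const user = encodeURIComponent(username);
  const pass = encodeURIComponent(password);
  return `http://${user}:${pass}@${host}:${port}`;
}

// Reads settings from an env-like object; the variable names are assumptions.
function loadProxyConfig(env = process.env) {
  const required = ['PROXY_HOST', 'PROXY_PORT', 'PROXY_USER', 'PROXY_PASS'];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }
  return {
    host: env.PROXY_HOST,
    port: env.PROXY_PORT,
    username: env.PROXY_USER,
    password: env.PROXY_PASS
  };
}

// Example with placeholder values instead of real credentials.
const exampleUrl = buildProxyUrl(
  loadProxyConfig({
    PROXY_HOST: 'proxy.example.com',
    PROXY_PORT: '8080',
    PROXY_USER: 'your_username',
    PROXY_PASS: 'p@ss:w/rd'
  })
);
console.log(exampleUrl);
```

The resulting string can be passed to the HttpsProxyAgent constructor in place of the hand-built proxyUrl, which also keeps secrets out of source control.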

Node Unblocker vs Managed Proxy Solutions

Factor	Node Unblocker (Self-Hosted)	Commercial Residential Proxy (e.g. Nstproxy)
IP pool size	1 per server instance — datacenter IP	110M+ residential IPs, rotating automatically
IP trust level	Low — datacenter ASN, easily flagged	Very high — ISP-assigned residential IPs
Infrastructure management	You manage: servers, updates, monitoring, failover	Provider manages everything — zero admin overhead
URL rewriting	Yes — built-in for HTML/CSS/JS	No — raw proxy forwarding only
Middleware customisation	Yes — full request/response hooks	No — configure at the application level
Anti-bot bypass	Limited — no TLS fingerprint spoofing	Partial — residential IPs pass IP reputation checks
Cost	VPS cost ($4–$10/server/month) + engineering time	From $0.4/GB, no infrastructure overhead
Best for	Learning, small projects, custom proxy logic	Production scraping, geo-targeting, scale

Comparison based on ScrapingBee's Node Unblocker analysis and IPRoyal's 2026 Node Unblocker tutorial.

Node Unblocker Limitations in 2026

🚫 Single Datacenter IP

One server, one IP. Any protected target will block it quickly. Scaling requires deploying multiple VPS instances — significant infrastructure and management overhead versus a commercial proxy pool.

🚫 No TLS Fingerprint Spoofing

Cloudflare, Akamai, and DataDome fingerprint the TLS handshake. Node Unblocker's outbound TLS is Node.js's default — identifiable as a non-browser client by sophisticated anti-bot systems regardless of IP quality.

🚫 Limited SPA Support

Single-page applications built with React, Next.js, or Vue rely heavily on client-side JavaScript execution. Node Unblocker fetches the server response but does not execute JavaScript — dynamically loaded content remains inaccessible without Puppeteer/Playwright.

🚫 postMessage and OAuth Incompatibility

Sites using the postMessage() API for cross-origin communication (OAuth flows, many modern login systems) do not work correctly through Node Unblocker's URL rewriting model.

🚫 Maintenance Burden

Self-hosted proxies require ongoing updates, monitoring for server health, handling IP bans, rotating VPS IPs, and debugging proxy-specific failures — all engineering time that adds up quickly for production workloads.

🚫 No CAPTCHA Handling

Node Unblocker has no built-in CAPTCHA solving capability. For targets deploying reCAPTCHA or hCaptcha, a separate solving service must be integrated at the application layer.

Pros & Cons Snapshot

✅ When Node Unblocker Works Well

Learning proxy concepts hands-on
Small projects scraping static or basic AJAX sites
Custom URL rewriting logic not available in standard proxies
Full control over request/response middleware pipeline
WebSocket support for real-time data sources
Free and open-source — zero licensing cost

❌ Where Node Unblocker Falls Short

Single datacenter IP gets blocked quickly on protected targets
No TLS fingerprint evasion — fails against Cloudflare Bot Management
JavaScript-rendered content requires Puppeteer addition
OAuth and postMessage-based sites are incompatible
Significant infrastructure and maintenance overhead at scale
No built-in CAPTCHA solving or session management

Upgrade Node Unblocker with Clean Residential IPs

Pair Node Unblocker's middleware flexibility with Nstproxy's 110M+ residential IP pool — route outbound requests through clean ISP-assigned IPs that pass anti-bot reputation checks automatically.


FAQ

Q: What is Node Unblocker?

Node Unblocker is an open-source npm package (unblocker) that creates an HTTP/HTTPS proxy server using Node.js and Express. It intercepts outbound requests from your scraper, fetches target pages on your behalf using the server's IP, rewrites all internal URLs to maintain the proxy routing, and streams responses back. It was originally designed for bypassing internet censorship but is widely used as a programmable proxy layer in Node.js scraping workflows.

Q: Does Node Unblocker work in 2026?

Yes — for simpler, static, or basic AJAX-based sites. For targets protected by Cloudflare Bot Management, Akamai, or DataDome, Node Unblocker alone is insufficient because it cannot spoof TLS fingerprints and runs on a single datacenter IP that is easily flagged. For these targets, pair Node Unblocker with a residential proxy (to address the IP layer) and Puppeteer/Playwright (to address the JavaScript execution layer).

Q: How do I use Node Unblocker with a residential proxy?

Install https-proxy-agent and pass an HttpsProxyAgent instance configured with your residential proxy credentials as the httpsAgent and httpAgent options in the Unblocker constructor. This routes all of Node Unblocker's outbound requests through the residential IP, while your scraper still talks to Node Unblocker locally through the /proxy/ URL prefix.

Q: What is Node Unblocker middleware used for in scraping?

Middleware hooks in Node Unblocker fire on every request (before forwarding to the target) and every response (before delivering to the scraper). For scraping, common uses are: injecting realistic browser headers (User-Agent, Accept-Language), removing proxy-identifying headers (X-Forwarded-For, Via), logging request status for monitoring, and managing cookies across sessions.

Q: What is the difference between Node Unblocker and a standard HTTP proxy?

A standard HTTP proxy requires you to configure proxy settings in your HTTP client and simply forwards raw requests. Node Unblocker uses a URL-prefix interface instead: you append the target URL to a /proxy/ prefix in your request URL, and it actively rewrites all internal links in the proxied HTML/CSS/JS to maintain the proxy chain for subsequent resource requests. It also provides request/response middleware hooks that standard proxies do not.