Python try except is the difference between a scraper that stops after one bad request and a crawler that keeps working through network failures. In production scraping, errors are normal. A server may time out, a proxy may fail, a page may return 403, or a selector may break after a layout change. This guide explains try, except, else, and finally through the lens of high-availability crawlers. It is written for Python developers who already send HTTP requests and now need safer failure handling. You will learn how to catch specific exceptions, retry with backoff, rotate proxies, release resources, and use Nstproxy as part of a stable proxy workflow.
Key Takeaways
Use Python try except to handle expected crawler failures without hiding bugs.
Catch specific exceptions such as Timeout, ProxyError, and HTTPError.
Use else for parsing only after a successful request.
Use finally for cleanup, session closing, and metrics.
Pair retry logic with proxy rotation when network failures repeat.
Common Exceptions in Web Scraping
Scrapers fail in patterns, so exception handling should match those patterns. Treat network errors, proxy errors, HTTP status errors, and parsing errors as different events.
The official requests documentation lists exceptions such as Timeout, TooManyRedirects, and HTTPError in its errors and exceptions guide. Python also documents exception handling in Errors and Exceptions.
The key rule is simple. Catch what you can recover from. Let unknown bugs surface during development.
Python Try Except Basics for Crawlers
Python try except should protect only the risky operation. In a crawler, that usually means the request, status check, parser step, or storage step. Keep each block small enough to explain.
import requests

url = "https://example.com/products"

try:
    response = requests.get(url, timeout=10)
    response.raise_for_status()
except requests.exceptions.Timeout:
    print("Request timed out; schedule a retry.")
except requests.exceptions.HTTPError as exc:
    print(f"HTTP error: {exc.response.status_code}")
except requests.exceptions.RequestException as exc:
    print(f"Network error: {exc}")
else:
    html = response.text
    print("Safe to parse the page now.")
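This pattern is better than a bare except:. It separates timeouts, HTTP errors, and general request failures. Order matters: Timeout and HTTPError are subclasses of RequestException, so the specific branches must come before the general one. Parsing stays in the else block, which runs only after the request succeeds.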
Avoid this pattern in production:
try:
    response = requests.get(url)
except:
    pass
It hides real bugs. It also creates silent data gaps, which are harder to fix than visible errors.
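To make the silent data gap concrete, here is a minimal sketch. The helper names (extract_title, scrape_silently, scrape_safely) are hypothetical, and the bug is planted on purpose: a caller passes None instead of HTML. The bare except version returns nothing and says nothing, while the targeted version handles only the expected failure and lets the bug raise.

```python
def extract_title(html):
    # Planted bug for illustration: callers sometimes pass None instead of a string.
    return html.split("<title>")[1].split("</title>")[0]


def scrape_silently(html):
    # Anti-pattern: the bare except swallows everything, including the
    # AttributeError caused by the bug (and even KeyboardInterrupt).
    try:
        return extract_title(html)
    except:  # noqa: E722
        pass


def scrape_safely(html):
    # Only the expected failure (a page without a <title> tag) is handled.
    try:
        return extract_title(html)
    except IndexError:
        return None
```

With this sketch, scrape_silently(None) quietly returns None, so the record simply goes missing. scrape_safely(None) raises AttributeError, which points straight at the caller that passed the wrong value.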
Catch Proxy Errors and Rotate IPs
Proxy errors need their own branch because the fix is different from a normal retry. If a proxy endpoint fails, repeating the same request through the same proxy may waste time. The scraper should mark the proxy as unhealthy and try another one.
import requests


def fetch_with_proxy(url, proxy):
    proxies = {
        "http": proxy,
        "https": proxy,
    }
    try:
        response = requests.get(url, proxies=proxies, timeout=12)
        response.raise_for_status()
    except requests.exceptions.ProxyError:
        return {"ok": False, "reason": "proxy_error", "retry": True}
    except requests.exceptions.Timeout:
        return {"ok": False, "reason": "timeout", "retry": True}
    except requests.exceptions.HTTPError as exc:
        status = exc.response.status_code
        return {"ok": False, "reason": f"http_{status}", "retry": status in (403, 407, 429)}
    except requests.exceptions.RequestException as exc:
        return {"ok": False, "reason": str(exc), "retry": True}
    else:
        return {"ok": True, "html": response.text}
This structure makes retry decisions explicit. A ProxyError can trigger proxy replacement. A 429 can trigger slower pacing. A 403 can trigger review of headers, session behavior, or proxy quality.
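One way to act on those result dictionaries is a small rotation loop. The sketch below is an illustration, not a fixed recipe: fetch_with_rotation is a hypothetical name, the fetch argument is assumed to be any callable that returns the same {"ok", "reason", "retry"} shape as fetch_with_proxy, and the proxy strings are placeholders.

```python
def fetch_with_rotation(url, proxies, fetch, max_attempts=3):
    """Try up to max_attempts proxies in order; return (result, unhealthy_proxies)."""
    unhealthy = []
    result = {"ok": False, "reason": "no_proxy_available", "retry": False}
    for proxy in proxies[:max_attempts]:
        result = fetch(url, proxy)
        if result["ok"] or not result["retry"]:
            break  # success, or a failure that another attempt will not fix
        if result["reason"] in ("proxy_error", "http_407"):
            # Proxy-level failure: report it so the caller can take it out of rotation.
            unhealthy.append(proxy)
    return result, unhealthy
```

Passing fetch as a parameter keeps the loop testable without network access. Every retry moves to the next proxy, but only proxy-level failures are reported as unhealthy, so a single timeout does not condemn a working endpoint.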
Nstproxy fits naturally here. If your scraper uses a rotating proxy pool, Nstproxy can supply the proxies that your retry workflow rotates through. Its IP rotation and global coverage can reduce the likelihood of blocks, CAPTCHAs, and rate limits when scrapers access public information at scale.
Use Else and Finally in Crawlers
The else block is best for work that should run only after no exception occurs. In scraping, put parsing or extraction there. This prevents your parser from running on a missing or failed response.
The finally block is best for cleanup. Use it to close sessions, release browser pages, update metrics, or return a proxy token to a pool. Avoid complex business logic in finally.
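import logging

import requests

logger = logging.getLogger(__name__)

# parse_title and save_record stand in for your own parsing and storage helpers.
session = requests.Session()
try:
    response = session.get(url, timeout=10)
    response.raise_for_status()
except requests.exceptions.RequestException as exc:
    logger.warning("fetch_failed", extra={"url": url, "error": str(exc)})
else:
    title = parse_title(response.text)
    save_record(url, title)
finally:
    session.close()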
This reads like a crawler lifecycle. Try the request. Handle known failures. Parse only on success. Clean up every time.
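The same finally pattern covers returning a proxy to a pool. The sketch below assumes a hypothetical in-memory ProxyPool and a fetch callable supplied by the caller; a real pool would likely also track health and cooldowns.

```python
from collections import deque


class ProxyPool:
    """Hypothetical in-memory pool used only for this illustration."""

    def __init__(self, proxies):
        self._available = deque(proxies)

    def acquire(self):
        return self._available.popleft()

    def release(self, proxy):
        self._available.append(proxy)

    def __len__(self):
        return len(self._available)


def fetch_with_lease(pool, url, fetch):
    # Acquire before the try block so finally never releases a proxy we do not hold.
    proxy = pool.acquire()
    try:
        return fetch(url, proxy)
    finally:
        # Runs on a normal return and on an exception alike, so the pool never leaks.
        pool.release(proxy)
```

Because the release sits in finally, the pool keeps its size even when fetch raises. The exception still propagates to the caller, which keeps the cleanup separate from the decision about what to do next.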
The Python documentation also supports re-raising exceptions when the caller should decide what happens next. That is useful when a low-level fetch function should not hide a failure from the job scheduler.
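A short sketch of both re-raising styles follows. FetchError and fetch_or_raise are hypothetical names, and the built-in TimeoutError and ConnectionError stand in for the requests exceptions so the example runs without a network; in a requests-based crawler you would catch requests.exceptions.Timeout and requests.exceptions.ConnectionError instead.

```python
import logging

logger = logging.getLogger("crawler.fetch")


class FetchError(Exception):
    """Hypothetical domain error raised by the fetch layer."""

    def __init__(self, url, reason):
        super().__init__(f"{reason} while fetching {url}")
        self.url = url
        self.reason = reason


def fetch_or_raise(url, fetch):
    try:
        return fetch(url)
    except TimeoutError as exc:
        # Translate and chain: the scheduler sees one error type,
        # and the original exception stays attached as __cause__.
        raise FetchError(url, "timeout") from exc
    except ConnectionError:
        logger.warning("connection failed for %s", url)
        raise  # a bare raise re-raises the original exception unchanged
```

The first branch suits a scheduler that wants a single error type to route on. The second suits cases where the low-level function only needs to log and should otherwise stay out of the way.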
Production Retry Strategy
Production retry logic should be limited, observable, and polite. Infinite retries can overload a target site and waste proxy bandwidth. A safer pattern is exponential backoff with jitter, retry caps, and status-aware decisions.
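As a rough sketch of that pattern, the helper below uses the "full jitter" variant of exponential backoff, which is one common choice among several. The function names, the default base and cap, and the use of built-in TimeoutError and ConnectionError as the retryable set are all assumptions made for illustration.

```python
import random
import time


def backoff_delay(attempt, base=1.0, cap=30.0):
    """Full-jitter delay: a random value between 0 and min(cap, base * 2**attempt)."""
    return random.uniform(0, min(cap, base * (2 ** attempt)))


def retry_with_backoff(operation, max_attempts=4, sleep=time.sleep,
                       retry_on=(TimeoutError, ConnectionError)):
    """Call operation() until it succeeds or max_attempts is reached."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # retry cap reached: let the caller see the last error
            sleep(backoff_delay(attempt))
```

The sleep parameter is injectable so tests can record delays instead of waiting. The jitter spreads retries from many workers over time, which avoids synchronized bursts against the target site.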
The urllib3 project documents a Retry utility for handling retry behavior in HTTP clients. See urllib3 Retry for the underlying concepts.
| Retry Trigger | Retry? | Extra Action |
| --- | --- | --- |
| Timeout | Yes | Increase backoff |
| ProxyError | Yes | Replace proxy |
| 403 | Sometimes | Review headers and proxy reputation |
| 407 | Yes | Check proxy authentication |
| 429 | Yes | Slow rate and rotate IP |
| 404 | No | Record missing page |
| Parser error | No immediate retry | Log sample HTML |
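The table above can be expressed as a small lookup. In this sketch the reason strings follow the format returned by fetch_with_proxy ("timeout", "proxy_error", "http_403"), while "parser_error", the action labels, and the reading of "Sometimes" as a fixed number of 403 retries are assumptions made for illustration.

```python
RETRY_POLICY = {
    "timeout": (True, "increase_backoff"),
    "proxy_error": (True, "replace_proxy"),
    "http_407": (True, "check_proxy_auth"),
    "http_429": (True, "slow_down_and_rotate_ip"),
    "http_404": (False, "record_missing_page"),
    "parser_error": (False, "log_sample_html"),
}


def decide(reason, attempt, max_403_retries=1):
    """Return (should_retry, action) for a failure reason string."""
    if reason == "http_403":
        # "Sometimes" is assumed to mean a small number of retries before manual review.
        return attempt < max_403_retries, "review_headers_and_proxy"
    # Unknown reasons are not retried, so new failure modes surface instead of looping.
    return RETRY_POLICY.get(reason, (False, "escalate"))
```

Keeping the policy in data rather than in nested if statements makes it easy to review and to adjust per target site.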
Production code should log each attempt. Include URL, status code, exception type, proxy ID, retry count, and final outcome. These fields help you separate bad proxies from bad parsers.
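One simple way to capture those fields is a JSON line per attempt, written through the standard logging module. The function name, the logger name, and the outcome values below are hypothetical; the field list mirrors the one above.

```python
import json
import logging

logger = logging.getLogger("crawler.attempts")


def log_attempt(url, proxy_id, retry_count, outcome, status_code=None, exc=None):
    """Emit one JSON line per attempt so logs can be filtered by any field."""
    record = {
        "url": url,
        "status_code": status_code,
        "exception_type": type(exc).__name__ if exc else None,
        "proxy_id": proxy_id,
        "retry_count": retry_count,
        "outcome": outcome,
    }
    level = logging.INFO if outcome == "ok" else logging.WARNING
    logger.log(level, json.dumps(record))
    return record
```

With records in this shape, grouping failures by proxy_id highlights bad proxies, while grouping by exception_type or outcome highlights parser problems.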
Nstproxy can reduce network-layer noise by giving crawlers a managed proxy source instead of random free proxies. Teams can also use the free proxy checker during diagnostics and residential proxies for workflows that need realistic network profiles.
Comparison Summary: Simple Try Except vs Production Handling
Simple Python try except examples are fine for learning. Production scrapers need more structure because the same exception can call for different actions.
| Area | Beginner Pattern | Production Scraper Pattern |
| --- | --- | --- |
| Exception type | Catch all errors | Catch specific exceptions |
| Proxy handling | Retry same request | Replace proxy on proxy failure |
| HTTP status | Ignore or print | Route by 403, 407, 429, 5xx |
| Logging | Console output | Structured logs with proxy ID |
| Retry | Manual loop | Backoff, jitter, max attempts |
| Parsing | Parse inside try | Parse in else after success |
| Cleanup | Often skipped | finally closes sessions |
Practical Workflow for a High-Availability Scraper
A stable crawler should treat failures as data. Each failed request should update the next decision.
Use this workflow:
Pick a URL and proxy.
Send the request with a timeout.
Catch specific network and proxy exceptions.
Retry with backoff when the error is recoverable.
Rotate proxy on ProxyError, 407, repeated 403, or repeated 429.
Parse only after a clean response.
Log the final outcome.
Store failed URLs for later review.
This workflow improves survival rate without hiding real defects. It also helps teams decide when the issue is the target page, the parser, the request pattern, or the proxy pool.
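The workflow steps can be sketched as one loop. This is an illustration under stated assumptions, not a complete crawler: run_job is a hypothetical name, fetch is assumed to return the same {"ok", "reason", "retry"} dictionaries as fetch_with_proxy, parse is assumed to raise ValueError on unusable HTML, and the backoff has no jitter to keep the example short.

```python
import logging
import time

logger = logging.getLogger("crawler.workflow")

# Failure reasons that point at the proxy rather than the page.
# Simplification: the workflow rotates on *repeated* 403/429; this sketch rotates on the first.
ROTATE_ON = {"proxy_error", "http_407", "http_403", "http_429"}


def run_job(urls, proxies, fetch, parse, max_attempts=3, sleep=time.sleep):
    """Return (records, failed) after walking the workflow for every URL."""
    records, failed = [], []
    rotation = 0  # index of the proxy currently in use
    for url in urls:
        result = {"ok": False, "reason": "not_attempted", "retry": False}
        for attempt in range(max_attempts):
            proxy = proxies[rotation % len(proxies)]
            result = fetch(url, proxy)
            if result["ok"] or not result["retry"]:
                break
            if result["reason"] in ROTATE_ON:
                rotation += 1
            if attempt < max_attempts - 1:
                sleep(min(30.0, 2.0 ** attempt))  # capped exponential backoff
        if result["ok"]:
            try:
                data = parse(result["html"])
            except ValueError:
                outcome = "parser_error"
            else:
                outcome = "ok"
                records.append({"url": url, "data": data})
        else:
            outcome = result["reason"]
        if outcome != "ok":
            failed.append({"url": url, "reason": outcome})  # kept for later review
        logger.info("url=%s outcome=%s", url, outcome)
    return records, failed
```

The failed list is the "failures as data" part of the workflow: each entry records which URL failed and why, so a later pass can retry, re-parse, or discard it.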
FAQ
What does Python try except do?
Python try except lets you run risky code in a try block and handle expected failures in one or more except blocks. In scrapers, it prevents one failed request from stopping the whole job.
Should I catch Exception in a scraper?
Catch specific exceptions first. Use Exception only at a boundary where you log the error and keep the job alive. Avoid bare except: because it can hide bugs.
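A boundary handler of that kind might look like the sketch below. The names run_all and process are hypothetical; process stands for whatever per-URL function your job runs.

```python
import logging

logger = logging.getLogger("crawler.jobs")


def run_all(urls, process):
    """Process every URL; one unexpected failure must not stop the rest."""
    failed = []
    for url in urls:
        try:
            process(url)
        except Exception:
            # Boundary handler: logger.exception records the full traceback,
            # so the bug stays visible even though the job continues.
            logger.exception("unhandled error while processing %s", url)
            failed.append(url)
    return failed
```

Unlike a bare except:, except Exception does not catch KeyboardInterrupt or SystemExit, so the job can still be stopped cleanly.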
How do I handle proxy errors in Python requests?
Catch requests.exceptions.ProxyError, mark the proxy as unhealthy, then retry with a different proxy. Also log the proxy ID, URL, and retry count.
Should parsing code go inside try or else?
Place parsing in else when it depends on a successful request. This keeps failed responses from reaching your parser.
How does Nstproxy help scraper reliability?
Nstproxy provides proxy infrastructure that can support retry and IP rotation workflows. It helps reduce failures caused by low-quality or unstable proxy sources.
Conclusion
Python try except is more than beginner syntax when you build web scrapers. It is the control layer that keeps jobs alive through timeouts, proxy failures, HTTP blocks, and parser changes.
Start small. Catch specific exceptions. Use else for parsing and finally for cleanup. Add capped retries with backoff. Rotate proxies when the failure is network-related. For teams that need steadier proxy infrastructure, Nstproxy is a practical fit for Python scraping workflows.