Web Scraping API
Structured JSON from any URL in one request. The same AI extraction as the app — no selectors, no maintenance — behind a plain REST API. Included on Pro plans and above; keys live in Settings → API keys.
Two details worth knowing before you benchmark: repeat scrapes of recently fetched pages come back near-instantly, and the fetching layer underneath is independently audited (SOC 2 Type II).
Authentication
Pass your key as a bearer token. Keys are shown once at creation and stored hashed.
curl https://websitescraper.io/api/v1/scrape \
-H "Authorization: Bearer ws_live_..." \
-H "Content-Type: application/json" \
-d '{"url": "https://books.toscrape.com/"}'
Scrape a page
POST /api/v1/scrape — body {url, prompt?, schema?}. Add a prompt to steer extraction, or a column schema to lock the output shape. One credit per page; failed jobs are refunded automatically.
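A minimal Python sketch of this endpoint, using only the standard library. The helper names `build_scrape_request` and `scrape`, the `columns` convenience argument, and the 60-second timeout are illustrative assumptions, not part of the API; the URL, headers, and body fields follow the description above.

```python
import json
import urllib.request

API_BASE = "https://websitescraper.io/api/v1"


def build_scrape_request(api_key, url, prompt=None, columns=None):
    """Build (but do not send) a POST /scrape request.

    `columns` is a hypothetical convenience: a list of column names that is
    expanded into the {"columns": [{"name": ...}]} schema shape.
    """
    body = {"url": url}
    if prompt is not None:
        body["prompt"] = prompt
    if columns is not None:
        body["schema"] = {"columns": [{"name": name} for name in columns]}
    return urllib.request.Request(
        f"{API_BASE}/scrape",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def scrape(api_key, url, prompt=None, columns=None, timeout=60):
    """Send the request and return the decoded JSON response body."""
    request = build_scrape_request(api_key, url, prompt, columns)
    # urlopen raises urllib.error.HTTPError for non-2xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Splitting request construction from sending keeps the payload easy to inspect or unit-test without spending a credit.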
curl https://websitescraper.io/api/v1/scrape \
-H "Authorization: Bearer ws_live_..." \
-H "Content-Type: application/json" \
-d '{
"url": "https://books.toscrape.com/",
"prompt": "book name and price",
"schema": {"columns": [{"name": "book_name"}, {"name": "price"}]}
}'
# 200 OK
{
"run_id": "d3f6…",
"columns": ["book_name", "price"],
"rows": [{"book_name": "A Light in the Attic", "price": "£51.77"}, …],
"row_count": 20,
"credits_charged": 1
}
Check a job
GET /api/v1/jobs/:id — status and results for any run, including multi-page crawls started in the app.
curl https://websitescraper.io/api/v1/jobs/RUN_ID \
-H "Authorization: Bearer ws_live_..."
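For longer runs such as multi-page crawls, a pipeline will usually poll this endpoint until the job settles. The sketch below is one way to do that in Python. The job payload's field names and status values are not documented in this section, so the completion check is passed in as a predicate rather than hard-coded; `wait_for_job`, the 2-second interval, and the 30-attempt cap are all assumptions.

```python
import time


def wait_for_job(fetch_job, run_id, is_finished,
                 interval=2.0, max_attempts=30, sleep=time.sleep):
    """Call fetch_job(run_id) until is_finished(job) is true, then return job.

    fetch_job:   callable that performs GET /api/v1/jobs/<run_id> and returns
                 the decoded JSON body.
    is_finished: callable deciding whether the job has settled; which field
                 it inspects depends on the actual response shape.
    """
    for attempt in range(max_attempts):
        job = fetch_job(run_id)
        if is_finished(job):
            return job
        if attempt < max_attempts - 1:
            sleep(interval)  # pause between polls, but not after the last one
    raise TimeoutError(f"job {run_id} not finished after {max_attempts} polls")
```

A caller might pass something like `lambda job: job.get("status") in {"done", "failed"}`, where both the `status` field and its values are hypothetical and should be checked against a real response first.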
Run a saved scraper
Configure a scraper once in the dashboard, then trigger it from your pipeline.
# trigger
curl -X POST https://websitescraper.io/api/v1/scrapers/SCRAPER_ID/run \
-H "Authorization: Bearer ws_live_..."
# history
curl https://websitescraper.io/api/v1/scrapers/SCRAPER_ID/runs \
-H "Authorization: Bearer ws_live_..."
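From a Python pipeline, the two saved-scraper calls might be built as follows. This is a sketch under assumptions: `saved_scraper_request` and its `action` argument are hypothetical names, and only the paths and methods come from the commands above.

```python
import urllib.parse
import urllib.request

API_BASE = "https://websitescraper.io/api/v1"

# Path suffix -> HTTP method, as used in the curl commands above.
_ACTIONS = {"run": "POST", "runs": "GET"}


def saved_scraper_request(api_key, scraper_id, action):
    """Build a request that triggers ('run') or lists history ('runs')."""
    if action not in _ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    safe_id = urllib.parse.quote(scraper_id, safe="")  # keep the ID in one path segment
    return urllib.request.Request(
        f"{API_BASE}/scrapers/{safe_id}/{action}",
        headers={"Authorization": f"Bearer {api_key}"},
        method=_ACTIONS[action],
    )
```

The returned object can be passed to `urllib.request.urlopen` to send it.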
Errors
Errors are typed and human-readable, with the shape {"error": {"code": "...", "message": "..."}}.
| Code | HTTP | Meaning |
|---|---|---|
| INVALID_URL | 400 | Malformed, private, or unreachable URL |
| INVALID_KEY | 401 | Missing or revoked API key |
| INSUFFICIENT_CREDITS | 402 | Balance too low — buy a pack or upgrade |
| FETCH_BLOCKED | 403 | Site is on our do-not-scrape list |
| EXTRACTION_EMPTY | 422 | Page had no extractable data (refunded) |
| RATE_LIMITED | 429 | Over 20 scrapes/min — check X-RateLimit-* headers |
| ENGINE_ERROR | 502 | Upstream failure (refunded) |
Rate limit: 20 scrapes per minute per account, reported via the X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset headers, plus Retry-After on 429 responses. Failed jobs are always refunded automatically; the pricing page spells out the trust rules.
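A client can lean on the error envelope and the Retry-After header to decide when to back off. The Python sketch below assumes Retry-After carries a plain number of seconds (the format is not stated here) and falls back to 3 seconds, which is simply 60 s divided by the 20-per-minute limit. The names `parse_error`, `retry_delay`, and `REFUNDED_CODES` are illustrative.

```python
import json

# Codes the error table explicitly marks as "(refunded)".
REFUNDED_CODES = {"EXTRACTION_EMPTY", "ENGINE_ERROR"}


def parse_error(body):
    """Return (code, message) from an {"error": {...}} body, else (None, None)."""
    try:
        payload = json.loads(body)
    except (TypeError, ValueError):
        return None, None
    error = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(error, dict):
        return None, None
    return error.get("code"), error.get("message")


def retry_delay(status, headers, default=3.0):
    """Seconds to wait before retrying a 429, or None for any other status."""
    if status != 429:
        return None
    value = headers.get("Retry-After")
    try:
        return max(0.0, float(value))
    except (TypeError, ValueError):
        # Header missing or not a plain number of seconds: use the fallback.
        return default
```

With `urllib`, the status, headers, and body are available on the raised `urllib.error.HTTPError` as `err.code`, `err.headers`, and `err.read()`. A plain `dict` is case-sensitive, so pass the response's own header object where possible.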