API v1 · Live

ScrapeKit API Reference

Scrape Google Maps business leads and verify emails programmatically. JSON over HTTPS, Bearer authentication, predictable errors, and an async job model with optional webhooks.

Overview

The ScrapeKit API lets you run the same tools available in the dashboard — lead scraping and email verification — directly from your own backend or automation stack.

All requests are JSON over HTTPS. Scrape and bulk-verify operations are asynchronous: you create a job, poll for status, and fetch the result payload when it completes. Single-email verification is synchronous and returns in under a second.

You'll need an active ScrapeKit account with credits. The API uses the same credit balance as the dashboard — 0.015 credits per lead scraped, 0.005 credits per email verified.

Base URL

All API requests go to a single host. There is no sandbox — use small max_leads values to test without burning credits.

https://scrapekit.ai

Authentication

Every request must include an Authorization header with a Bearer token. Requests without a valid token return 401 Unauthorized.

Authorization: Bearer YOUR_API_KEY

Keep keys server-side. Never expose a key in browser code, mobile apps, or client-rendered HTML. Anyone with your key can burn your credits.

API Keys

Manage your keys from the dashboard. Each key is tied to your account and can be rotated or revoked at any time.

  • Generate: go to Dashboard → API and click Create key. You'll see the full key once — copy it immediately.
  • Format: keys start with sk_live_ followed by 40 random characters.
  • Scope: a single key has access to all endpoints. Use separate keys per environment or integration so you can revoke individually.
  • Rotate: deleting a key invalidates it instantly. In-flight jobs continue but no new requests with that key will succeed.

Rate Limits

Rate limits are enforced per API key and vary by endpoint class.

Endpoint class | Limit | Window
Job creation and cancellation (POST /api/v1/scrape/jobs, POST /api/v1/verify/jobs, POST /api/v1/jobs/{job_id}/cancel) | 60 requests | per minute
Job status and results (GET /api/v1/scrape/jobs/…, GET /api/v1/verify/jobs/…) | 300 requests | per minute
Single email verification (POST /api/v1/verify/email) | 120 requests | per minute

Every response includes three headers so you can track usage without a probe request:

  • X-RateLimit-Limit — your cap for this endpoint class.
  • X-RateLimit-Remaining — requests left in the current window.
  • X-RateLimit-Reset — Unix timestamp when the window resets.

Exceeding the limit returns 429 Too Many Requests with a Retry-After header in seconds. Back off exponentially on repeated 429s.
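
The backoff guidance above can be sketched as a small helper. This is a minimal illustration, not an official client: the function name retry_delay, the 1-second base, and the 60-second cap are assumptions. Only the 429 status and the Retry-After header (in seconds) come from this reference.

```python
import random

def retry_delay(attempt, retry_after=None, base=1.0, cap=60.0):
    """Seconds to wait before retry number `attempt` (0-based) after a 429.

    Honors the Retry-After header value (seconds) when present, and otherwise
    uses an exponentially growing delay. `base` and `cap` are illustrative
    defaults, not values published by ScrapeKit.
    """
    backoff = min(cap, base * (2 ** attempt))
    # Jitter spreads out clients that were throttled at the same moment.
    jittered = random.uniform(backoff / 2, backoff)
    if retry_after is not None:
        try:
            # Never wait less than the server asked for.
            return max(float(retry_after), jittered)
        except ValueError:
            pass  # Unparseable header: fall back to plain backoff.
    return jittered
```

A caller would sleep for retry_delay(attempt, response.headers.get("Retry-After")) before each retry and give up after a fixed number of attempts.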

Errors

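Errors use standard HTTP status codes and a single consistent JSON body. Most 4xx responses mean the request itself was invalid, so retrying the same call will fail the same way. The exceptions are 429 (retry after the Retry-After interval) and 409 job_not_ready (retry once the job has completed). 5xx responses are generally safe to retry with backoff; for job-creation endpoints, read Idempotency first, because a retried POST can create a duplicate job.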

Error response body

Every error — from any endpoint, at any status code — returns this exact shape. The top-level object always has one key: error.

{
  "error": {
    "code": "insufficient_credits",
    "message": "Not enough credits. 0.015 required per lead; balance is 0.01.",
    "request_id": "req_7f2a9c4e1b"
  }
}
Field | Type | Description
error.code | string | Machine-readable code. See table below. Always present.
error.message | string | Human-readable explanation. Safe to display to end users. Always present.
error.request_id | string | Unique request identifier. Prefix req_. Also returned as the X-Request-Id response header on every request (success or failure). Include this when contacting support.

HTTP status codes

Status | Meaning
200 | Request succeeded.
202 | Async job accepted and queued. Body contains job_id.
400 | Malformed request or invalid body.
401 | Missing or invalid API key.
402 | Insufficient credits.
403 | Account suspended.
404 | Resource (e.g. job) not found.
409 | Conflicting state (e.g. job is not in a cancellable state, or results requested before completion).
429 | Rate limit exceeded. Check Retry-After header.
500 | Server error. Safe to retry with exponential backoff.

Common error codes

Code | HTTP | Meaning
missing_authorization | 401 | Authorization header missing.
invalid_api_key | 401 | Key is malformed, revoked, or unknown.
account_suspended | 403 | The owning account has been suspended.
insufficient_credits | 402 | Balance too low to run the request. See Credit Accounting below.
invalid_query | 400 | query or location is missing, or query, location, or max_leads is out of range.
invalid_emails | 400 | Bulk verify emails is empty, too large (>100,000), or contains no valid addresses.
invalid_email | 400 | Single-email verification received an unparseable address.
invalid_webhook | 400 | webhook_url is not an HTTPS URL.
too_many_active_jobs | 429 | You already have 5 active scrape jobs. Wait for one to finish.
rate_limited | 429 | Endpoint-class rate limit hit. See Retry-After.
job_not_found | 404 | The job_id does not exist or belongs to a different account.
job_not_ready | 409 | Results requested before status=completed.
job_not_cancellable | 409 | Cancel requested on a job that is already completed, failed, or cancelled.
job_create_failed | 500 | Internal failure while creating a job. Safe to retry.

Examples

{
  "error": {
    "code": "invalid_api_key",
    "message": "API key is invalid or has been revoked.",
    "request_id": "req_8b3d1fa4c9"
  }
}
{
  "error": {
    "code": "insufficient_credits",
    "message": "Not enough credits. 0.005 required per email.",
    "request_id": "req_5d22ab7e31"
  }
}
{
  "error": {
    "code": "job_not_found",
    "message": "Job not found.",
    "request_id": "req_1c44ef902b"
  }
}
{
  "error": {
    "code": "job_not_ready",
    "message": "Job is running. Wait until completed before requesting results.",
    "request_id": "req_90fe12ab3c"
  }
}
{
  "error": {
    "code": "rate_limited",
    "message": "Rate limit exceeded. Max 60 requests per 60s on this endpoint.",
    "request_id": "req_ac3f7b012d"
  }
}
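
Because every error shares one body shape, a client can funnel all failures through a single parser. The sketch below is one possible approach: the names ScrapeKitError, parse_error, and RETRYABLE_CODES are hypothetical, and the retryable set is a reading of the tables above, not an official list.

```python
import json

# Codes this reference describes as worth retrying: rate limits, the
# active-job ceiling, a results call made too early, and the creation
# failure marked "Safe to retry".
RETRYABLE_CODES = {
    "rate_limited",
    "too_many_active_jobs",
    "job_not_ready",
    "job_create_failed",
}

class ScrapeKitError(Exception):
    """Carries the three documented fields of the error body."""

    def __init__(self, status, code, message, request_id):
        super().__init__(f"{status} {code}: {message} (request_id={request_id})")
        self.status = status
        self.code = code
        self.message = message
        self.request_id = request_id

    @property
    def retryable(self):
        return self.code in RETRYABLE_CODES

def parse_error(status, body_text):
    """Build a ScrapeKitError from an HTTP status and the raw response body."""
    err = json.loads(body_text)["error"]
    return ScrapeKitError(status, err["code"], err["message"], err["request_id"])
```

Logging the request_id from the exception keeps the identifier support asks for next to the failure.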

Credit Accounting

Scraping costs 0.015 credits per lead. Email verification costs 0.005 credits per email. Credits never expire.

When credits are charged

  • Scrape jobs: charged per lead as rows are written to the result CSV. If a job produces 312 leads, you are billed 312 × 0.015 = 4.68 credits.
  • Bulk verify jobs: charged per email for every address that got an actionable verdict (see Verification result schema). unknown results are auto-refunded, so you are only billed for emails we classified (safe / role / catch_all / invalid / disabled / inbox_full / disposable).
  • Single email verification: charged 0.005 at response time, with the same unknown-refund rule. The response's credits_used field tells you exactly what was billed.

Creation-time credit check

When you create a job, we check your balance and may clamp the request to what you can afford:

  • Scrape: if max_leads × 0.015 exceeds your balance, the effective max_leads is reduced to floor(balance / 0.015). The response's max_leads reflects the clamped value.
  • Bulk verify: if the input list is larger than you can afford, the excess is dropped before queueing. The response's total reflects what was actually queued.
  • If your balance is zero or you can't afford even one unit of work, the call fails with 402 insufficient_credits.
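
The scrape clamp can be estimated client-side before submitting. This is a sketch under stated assumptions: the function name effective_max_leads is hypothetical, and treating max_leads = 0 as "limited only by balance" is an interpretation of the request-body description. The per-lead cost and the floor(balance / 0.015) rule come from this section.

```python
from decimal import Decimal, ROUND_FLOOR

LEAD_COST = Decimal("0.015")  # credits per scraped lead (from this reference)

def effective_max_leads(requested, balance):
    """Estimate the max_leads the API would report after the credit check.

    Returns None when not even one lead is affordable, which is the case
    this reference says fails with 402 insufficient_credits.
    """
    # Decimal avoids float rounding surprises such as 1.2 / 0.015 != 80.
    ratio = Decimal(str(balance)) / LEAD_COST
    affordable = int(ratio.to_integral_value(rounding=ROUND_FLOOR))
    if affordable < 1:
        return None
    if requested == 0:
        # Assumption: 0 means "no cap", so the balance alone sets the limit.
        return affordable
    return min(requested, affordable)
```

For example, a request for 500 leads with a balance of 1.20 credits would be clamped to 80.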

Terminal-state accounting

Final status | Scrape billing | Verify billing
completed | 0.015 × rows produced | 0.005 × emails with an actionable verdict
failed | 0.015 × rows already written before the failure. A job that fails before producing any rows costs 0 credits. | 0.005 × emails completed before the failure.
cancelled | 0.015 × rows already written. Cancelling mid-run does not refund work already done. | 0.005 × emails completed before cancel. Pending emails are not charged.
partial (hit max_leads) | 0.015 × rows up to the cap, not the original estimate. | Not applicable

Every job-status response includes a live credits_used field reflecting billing so far. The completion-time value is the final amount deducted from your balance.

Pagination

In v1, results endpoints do not paginate — they return the full payload inline in a single response.

Endpoint | Behaviour | Ceiling
GET /api/v1/scrape/jobs/{id}/results | Returns every lead inline under results[]. No cursor, offset, or page param. | Capped by the job's max_leads (hard max 500,000).
GET /api/v1/verify/jobs/{id}/results | Returns every per-email verdict inline under results[]. | Capped by the bulk-verify input size (hard max 100,000).

Handling large jobs. A 100,000-row verify result is a ~10–15 MB JSON response. Stream the body and parse lazily if memory-constrained. For CSV output, pass ?format=csv; the response will use Content-Type: text/csv with the same columns described in the result schemas below.

Cursor-based pagination may be introduced in a future API version. If it is, it will be an opt-in query parameter — existing clients that do not pass a cursor will continue to receive the full inline response.
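
For large verify results, the CSV form lends itself to row-at-a-time processing. The sketch below assumes the CSV header row uses the same names as the Verdict fields (in particular a status column); the function name count_statuses is hypothetical.

```python
import csv
from collections import Counter

def count_statuses(lines):
    """Tally the `status` column from an iterable of CSV text lines.

    Rows are consumed one at a time, so memory use stays flat even for a
    100,000-row result.
    """
    counts = Counter()
    for row in csv.DictReader(lines):
        counts[row["status"]] += 1
    return counts
```

With the requests library, one way to feed this function is a GET with params={"format": "csv"} and stream=True, setting response.encoding = "utf-8" and then passing response.iter_lines(decode_unicode=True).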

Idempotency

Job creation is not idempotent in v1. An Idempotency-Key header is not supported. Retrying a failed POST /api/v1/scrape/jobs or POST /api/v1/verify/jobs will create a second job and charge credits again.

Patterns to avoid duplicate charges

  • Only retry on network-level failures (DNS errors, connection resets, read timeouts before any response). These usually mean the server never received the request, although a reset or timeout that happens after the request was sent cannot rule it out.
  • Do not retry a 5xx response from a job-creation endpoint without first reconciling against your own records (for example, a client-side request ID you store per submission) and the job status endpoints. A 5xx may mean the job was already created.
  • Use a short in-flight lock on your side, keyed by the job parameters you intend to submit, so a double-click or queue retry can't fire the same POST twice.
  • Store the X-Request-Id from every response. If support needs to trace a suspected duplicate charge, that's the only way we can do it.

Read-only endpoints (GET /…) are naturally idempotent and safe to retry freely.
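
The in-flight lock from the list above can be sketched as follows. All names here (fingerprint, submit_once) are hypothetical, and the lock is process-local: a deployment with several workers would need a shared store instead of an in-memory set.

```python
import hashlib
import json
import threading

_in_flight = set()
_guard = threading.Lock()

def fingerprint(params):
    """Stable key for a set of job parameters, independent of dict ordering."""
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def submit_once(params, submit):
    """Call submit(params) unless an identical submission is already running.

    Returns submit's result, or None when a duplicate was suppressed.
    """
    key = fingerprint(params)
    with _guard:
        if key in _in_flight:
            return None
        _in_flight.add(key)
    try:
        # The guard is released here, so slow HTTP calls do not block others.
        return submit(params)
    finally:
        with _guard:
            _in_flight.discard(key)
```

Here submit would be whatever function performs the POST to the job-creation endpoint; the key is released once that call returns or raises.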

Scraping Endpoints

Run async lead-scrape jobs over Google Maps. Each job takes a niche + location and returns structured business contact records.

POST /api/v1/scrape/jobs
Create a scrape job
Queue a new Google Maps lead scrape for a niche + location. Returns immediately with a job_id; poll status or receive a webhook on completion.
Request body
Field | Type | Required | Description
query | string | Yes | Niche or business type. 1–120 chars. Examples: "dentists", "plumbers", "law firms".
location | string | Yes | City, region, or postal area. 1–120 chars. Examples: "Austin, Texas", "SW1A 1AA, UK".
max_leads | integer | No | Cap on leads returned. 0–500000 (0 = no cap, use full available budget). If you can't afford the cap, it is silently clamped to what your balance can cover. Default: no cap.
verify_emails | boolean | No | If true, run email verification on every scraped lead and set email_status in the result. Adds 0.005 credits per email verified. Default: false.
webhook_url | string | No | HTTPS URL to POST the completion event to. Must start with https://. See Webhooks.
Example request body
{
  "query": "dentists",
  "location": "Austin, Texas",
  "max_leads": 500,
  "verify_emails": true,
  "webhook_url": "https://example.com/webhooks/scrapekit"
}
Example request
curl
curl https://scrapekit.ai/api/v1/scrape/jobs \
  -H "Authorization: Bearer $SCRAPEKIT_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query": "dentists", "location": "Austin, Texas", "max_leads": 500, "verify_emails": true}'
JavaScript
const res = await fetch("https://scrapekit.ai/api/v1/scrape/jobs", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${process.env.SCRAPEKIT_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    query: "dentists",
    location: "Austin, Texas",
    max_leads: 500,
    verify_emails: true,
  }),
});
const job = await res.json();
Python
import os, requests

res = requests.post(
    "https://scrapekit.ai/api/v1/scrape/jobs",
    headers={"Authorization": f"Bearer {os.environ['SCRAPEKIT_KEY']}"},
    json={
        "query": "dentists",
        "location": "Austin, Texas",
        "max_leads": 500,
        "verify_emails": True,
    },
)
job = res.json()
Example response
200 OK
{
  "job_id": "sk_job_4f7a9b1c",
  "status": "queued",
  "query": "dentists",
  "location": "Austin, Texas",
  "max_leads": 500,
  "verify_emails": true,
  "estimated_credits": 7.5,
  "created_at": "2026-04-21T10:32:14Z"
}
Response fields
Field | Type | Description
job_id | string | Opaque identifier. Prefix sk_job_. Use for status/results/cancel calls.
status | string | Always "queued" at creation.
query | string | Echoed from the request.
location | string | Echoed from the request.
max_leads | integer | Effective cap, possibly clamped by your available credits.
verify_emails | boolean | Echoed from the request.
estimated_credits | number | max_leads × 0.015. Actual billing is per-lead as rows are produced.
created_at | string | ISO-8601 UTC.
GET /api/v1/scrape/jobs/{job_id}
Retrieve scrape job status
Get the current status and progress of a scrape job. Poll this until status is completed, failed, or cancelled.
Parameters
Field | Type | Required | Description
job_id | string (path) | Yes | The job ID returned from job creation. Prefix sk_job_.
Example request
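curl
curl https://scrapekit.ai/api/v1/scrape/jobs/sk_job_4f7a9b1c \
  -H "Authorization: Bearer $SCRAPEKIT_KEY"
JavaScript
const res = await fetch(
  "https://scrapekit.ai/api/v1/scrape/jobs/sk_job_4f7a9b1c",
  { headers: { "Authorization": `Bearer ${process.env.SCRAPEKIT_KEY}` } },
);
const job = await res.json();
Python
import os, requests
job = requests.get(
    "https://scrapekit.ai/api/v1/scrape/jobs/sk_job_4f7a9b1c",
    headers={"Authorization": f"Bearer {os.environ['SCRAPEKIT_KEY']}"},
).json()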
Example response
200 OK
{
  "job_id": "sk_job_4f7a9b1c",
  "status": "running",
  "query": "dentists",
  "location": "Austin, Texas",
  "progress": {
    "scraped": 214,
    "verified": 198,
    "target": 500
  },
  "credits_used": 3.21,
  "error": null,
  "created_at": "2026-04-21T10:32:14Z",
  "updated_at": "2026-04-21T10:34:51Z"
}
Response fields
Field | Type | Description
job_id | string | Opaque identifier.
status | string | One of queued, running, completed, failed, cancelled.
query | string | Echoed from job creation.
location | string | Echoed from job creation.
progress.scraped | integer | Lead rows produced so far.
progress.verified | integer | Leads with a resolved email (if verify_emails was true). Otherwise 0.
progress.target | integer | Total businesses discovered for this query (the denominator). 0 before discovery completes.
credits_used | number | Credits billed so far. Equal to 0.015 × progress.scraped.
error | string or null | Non-null only when status=failed. Human-readable reason.
created_at | string | ISO-8601 UTC.
updated_at | string | ISO-8601 UTC. Always the current server time when you read this endpoint.
GET /api/v1/scrape/jobs/{job_id}/results
Fetch scrape job results
Return the completed lead payload. Only valid once status=completed; otherwise returns 409 job_not_ready. Results are not paginated: the full list is returned inline (see Pagination).
Parameters
Field | Type | Required | Description
job_id | string (path) | Yes | The completed job to read.
format | string (query) | No | json (default) or csv. CSV returns text/csv with exactly the columns described in the result schema.
Example request
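curl
curl https://scrapekit.ai/api/v1/scrape/jobs/sk_job_4f7a9b1c/results \
  -H "Authorization: Bearer $SCRAPEKIT_KEY"
JavaScript
const res = await fetch(
  "https://scrapekit.ai/api/v1/scrape/jobs/sk_job_4f7a9b1c/results",
  { headers: { "Authorization": `Bearer ${process.env.SCRAPEKIT_KEY}` } },
);
const { results, results_count } = await res.json();
Python
import os, requests
data = requests.get(
    "https://scrapekit.ai/api/v1/scrape/jobs/sk_job_4f7a9b1c/results",
    headers={"Authorization": f"Bearer {os.environ['SCRAPEKIT_KEY']}"},
).json()
leads = data["results"]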
Example response
200 OK
{
  "job_id": "sk_job_4f7a9b1c",
  "status": "completed",
  "results_count": 487,
  "credits_used": 7.305,
  "results": [
    {
      "place_id": "ChIJN1t_tDeuEmsRUsoyG83frY4",
      "name": "Smile Austin Dental",
      "address": "1200 Congress Ave, Austin, TX 78701",
      "phone": "+1 512 555 0142",
      "website": "https://smileaustin.com",
      "rating": 4.8,
      "reviews_count": 312,
      "lat": 30.2672,
      "lng": -97.7431,
      "category": "Dentist",
      "emails_found": "hello@smileaustin.com;admin@smileaustin.com"
    },
    {
      "place_id": "ChIJP3yTRiTtEmsReNnn3KhLp8s",
      "name": "Downtown Dental Group",
      "address": "500 W 5th St, Austin, TX 78701",
      "phone": "+1 512 555 0177",
      "website": "https://downtown-dental.com",
      "rating": 4.6,
      "reviews_count": 208,
      "lat": 30.2681,
      "lng": -97.7465,
      "category": "Dentist",
      "emails_found": "info@downtown-dental.com"
    }
  ]
}
Response fields
Field | Type | Description
job_id | string | Echoed from job creation.
status | string | Always "completed" when this endpoint returns 200.
results_count | integer | Exact length of results[].
credits_used | number | Final credits billed. Equal to 0.015 × results_count.
results[] | Lead[] | All leads. See Lead schema below. Order is the discovery order: generally most-central-to-query first, but not a stable sort.
Lead object
Field | Type | Description
place_id | string | Google Maps Place ID. Always present. Use for deduping across jobs.
name | string | Business display name. Always present.
address | string | Full formatted address. Always present.
phone | string or empty | Local or international phone number. Empty string if not listed.
website | string or empty | Business website URL. Empty string if not listed.
rating | number or empty | Aggregate Google rating (0–5). Empty string if the business has fewer reviews than Google requires to show a rating.
reviews_count | integer or empty | Number of Google reviews. Empty string if unknown.
lat | number | Latitude. Always present.
lng | number | Longitude. Always present.
category | string or empty | Google-assigned primary category. Empty string if not known.
emails_found | string | Semicolon-separated list of emails discovered on the business website. Empty string if none found. Present only after the extract phase; in csv mode this column is always present even if empty.
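
Two small helpers illustrate how the Lead fields are meant to be consumed: splitting emails_found and deduplicating across jobs by place_id. The function names are hypothetical, and keeping the first lead seen per place_id is one possible merge policy, not something this reference prescribes.

```python
def split_emails(emails_found):
    """Turn the semicolon-separated emails_found string into a list."""
    return [e.strip() for e in emails_found.split(";") if e.strip()]

def merge_leads(*result_sets):
    """Combine results[] lists from several jobs.

    Keeps the first lead seen for each place_id and preserves the order in
    which place_ids first appear.
    """
    merged = {}
    for leads in result_sets:
        for lead in leads:
            merged.setdefault(lead["place_id"], lead)
    return list(merged.values())
```

An empty emails_found string yields an empty list, so callers do not need a separate check for leads without emails.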

Email Verification Endpoints

Verify a single email in real time, or submit a list as an async bulk job. Both use MX lookup plus SMTP mailbox probing for ~99% accuracy. See the Verify Status Taxonomy for the nine possible verdicts.

POST /api/v1/verify/email
Verify a single email
Synchronous verification of one address. Returns in <1s. unknown results are auto-refunded: credits_used will be 0.
Request body
Field | Type | Required | Description
email | string | Yes | Email address to verify. 3–254 chars. Must contain @.
Example request body
{
  "email": "hello@smileaustin.com"
}
Example request
curl
curl https://scrapekit.ai/api/v1/verify/email \
  -H "Authorization: Bearer $SCRAPEKIT_KEY" \
  -H "Content-Type: application/json" \
  -d '{"email": "hello@smileaustin.com"}'
JavaScript
const res = await fetch("https://scrapekit.ai/api/v1/verify/email", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${process.env.SCRAPEKIT_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({ email: "hello@smileaustin.com" }),
});
const result = await res.json();
Python
import os, requests
result = requests.post(
    "https://scrapekit.ai/api/v1/verify/email",
    headers={"Authorization": f"Bearer {os.environ['SCRAPEKIT_KEY']}"},
    json={"email": "hello@smileaustin.com"},
).json()
Example response
200 OK
{
  "email": "hello@smileaustin.com",
  "status": "safe",
  "score": 95,
  "reason": null,
  "checks": {
    "syntax": "pass",
    "mx": "pass",
    "smtp": "pass",
    "disposable": false,
    "role_account": false,
    "catch_all": false,
    "free_provider": false
  },
  "credits_used": 0.005
}
Response fields
Field | Type | Description
email | string | Lowercased input.
status | enum | See Verification status taxonomy below.
score | integer | Overall confidence 0–95. See scoring table below.
reason | string or null | Short reason code or message for non-deliverable results (e.g. "mailbox not found"). null for deliverable results.
checks.syntax | string | pass or fail. Address parsed successfully.
checks.mx | string | pass or fail. Domain has MX records.
checks.smtp | string | pass, timeout, or fail. SMTP RCPT outcome.
checks.disposable | boolean | true if the domain is on our 9,000+ disposable-domain list.
checks.role_account | boolean | true if the local-part matches a role prefix (info@, support@, sales@, admin@, …).
checks.catch_all | boolean | true if the domain accepts mail for any local-part (we probe with a random address).
checks.free_provider | boolean | true if the domain is a consumer email host (Gmail, Outlook, Yahoo, etc.).
credits_used | number | 0.005 if status ≠ unknown, else 0 (auto-refunded).
POST /api/v1/verify/jobs
Create a bulk verification job
Submit up to 100,000 emails for async verification. Returns a job_id; poll status or receive a webhook on completion.
Request body
Field | Type | Required | Description
emails | string[] | Yes | Array of email addresses. Max 100,000 per job. Each element must be 3–254 chars and contain @.
webhook_url | string | No | HTTPS URL to POST the completion event to. Must start with https://.
Example request body
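{
  "emails": [
    "a@example.com",
    "b@example.com",
    "c@example.com"
  ],
  "webhook_url": "https://example.com/webhooks/scrapekit"
}
Example request
curl
curl https://scrapekit.ai/api/v1/verify/jobs \
  -H "Authorization: Bearer $SCRAPEKIT_KEY" \
  -H "Content-Type: application/json" \
  -d '{"emails": ["a@example.com", "b@example.com", "c@example.com"], "webhook_url": "https://example.com/webhooks/scrapekit"}'
JavaScript
const res = await fetch("https://scrapekit.ai/api/v1/verify/jobs", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${process.env.SCRAPEKIT_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    emails: ["a@example.com", "b@example.com", "c@example.com"],
    webhook_url: "https://example.com/webhooks/scrapekit",
  }),
});
const job = await res.json();
Python
import os, requests
job = requests.post(
    "https://scrapekit.ai/api/v1/verify/jobs",
    headers={"Authorization": f"Bearer {os.environ['SCRAPEKIT_KEY']}"},
    json={
        "emails": ["a@example.com", "b@example.com", "c@example.com"],
        "webhook_url": "https://example.com/webhooks/scrapekit",
    },
).json()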
Example response
200 OK
{
  "job_id": "sk_verify_c1d2e3f4",
  "status": "queued",
  "total": 12500,
  "estimated_credits": 62.5,
  "created_at": "2026-04-21T10:38:01Z"
}
Response fields
Field | Type | Description
job_id | string | Opaque identifier. Prefix sk_verify_.
status | string | Always "queued" at creation.
total | integer | Number of emails that will actually be processed. May be lower than your input length if you don't have enough credits (the excess is silently dropped).
estimated_credits | number | total × 0.005. Actual billing is per-email excluding unknowns.
created_at | string | ISO-8601 UTC.
GET /api/v1/verify/jobs/{job_id}
Retrieve bulk verify job status
Current status and progress for a bulk verify job.
Parameters
Field | Type | Required | Description
job_id | string (path) | Yes | The job ID returned from /verify/jobs. Prefix sk_verify_.
Example request
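curl
curl https://scrapekit.ai/api/v1/verify/jobs/sk_verify_c1d2e3f4 \
  -H "Authorization: Bearer $SCRAPEKIT_KEY"
JavaScript
const res = await fetch(
  "https://scrapekit.ai/api/v1/verify/jobs/sk_verify_c1d2e3f4",
  { headers: { "Authorization": `Bearer ${process.env.SCRAPEKIT_KEY}` } },
);
const job = await res.json();
Python
import os, requests
job = requests.get(
    "https://scrapekit.ai/api/v1/verify/jobs/sk_verify_c1d2e3f4",
    headers={"Authorization": f"Bearer {os.environ['SCRAPEKIT_KEY']}"},
).json()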
Example response
200 OK
{
  "job_id": "sk_verify_c1d2e3f4",
  "status": "running",
  "total": 12500,
  "processed": 7842,
  "valid": 6310,
  "credits_used": 39.21,
  "error": null,
  "created_at": "2026-04-21T10:38:01Z",
  "updated_at": "2026-04-21T10:42:17Z"
}
Response fields
Field | Type | Description
job_id | string | Echoed from job creation.
status | string | One of queued, running, completed, failed, cancelled.
total | integer | Number of emails in this job.
processed | integer | Emails processed so far (all statuses, including unknown).
valid | integer | Emails classified as deliverable so far (safe + role + legacy valid). Catch-all and other statuses are not counted here; see the richer bucketing on the results endpoint.
credits_used | number | Credits billed so far. Equal to 0.005 × processed (unknowns are auto-refunded at completion, so this number may drop slightly when the job finishes).
error | string or null | Non-null only when status=failed.
created_at | string | ISO-8601 UTC.
updated_at | string | ISO-8601 UTC. Current server time at read.
GET /api/v1/verify/jobs/{job_id}/results
Fetch bulk verify results
Return the per-email verdict payload. Only valid once status=completed; otherwise returns 409 job_not_ready. Not paginated: full results are returned inline.
Parameters
Field | Type | Required | Description
job_id | string (path) | Yes | The completed bulk verify job.
format | string (query) | No | json (default) or csv. CSV columns are exactly the result fields below.
Example request
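curl
curl https://scrapekit.ai/api/v1/verify/jobs/sk_verify_c1d2e3f4/results \
  -H "Authorization: Bearer $SCRAPEKIT_KEY"
JavaScript
const res = await fetch(
  "https://scrapekit.ai/api/v1/verify/jobs/sk_verify_c1d2e3f4/results",
  { headers: { "Authorization": `Bearer ${process.env.SCRAPEKIT_KEY}` } },
);
const { summary, results } = await res.json();
Python
import os, requests
data = requests.get(
    "https://scrapekit.ai/api/v1/verify/jobs/sk_verify_c1d2e3f4/results",
    headers={"Authorization": f"Bearer {os.environ['SCRAPEKIT_KEY']}"},
).json()
summary, results = data["summary"], data["results"]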
Example response
200 OK
{
  "job_id": "sk_verify_c1d2e3f4",
  "status": "completed",
  "summary": {
    "total": 12500,
    "safe": 8120,
    "role": 1544,
    "catch_all": 987,
    "disposable": 112,
    "invalid": 1403,
    "disabled": 58,
    "inbox_full": 34,
    "unknown": 242
  },
  "credits_used": 61.29,
  "results": [
    {
      "email": "a@example.com",
      "status": "safe",
      "reason": null,
      "role": false,
      "catch_all": false,
      "free_provider": true,
      "overall_score": 88
    },
    {
      "email": "b@example.com",
      "status": "invalid",
      "reason": "mailbox not found",
      "role": false,
      "catch_all": false,
      "free_provider": true,
      "overall_score": 0
    },
    {
      "email": "c@example.com",
      "status": "catch_all",
      "reason": null,
      "role": false,
      "catch_all": true,
      "free_provider": false,
      "overall_score": 55
    },
    {
      "email": "d@example.com",
      "status": "role",
      "reason": null,
      "role": true,
      "catch_all": false,
      "free_provider": false,
      "overall_score": 80
    },
    {
      "email": "e@example.com",
      "status": "disposable",
      "reason": "disposable domain",
      "role": false,
      "catch_all": false,
      "free_provider": false,
      "overall_score": 0
    },
    {
      "email": "f@example.com",
      "status": "unknown",
      "reason": "smtp unreachable",
      "role": false,
      "catch_all": false,
      "free_provider": true,
      "overall_score": 15
    }
  ]
}
Response fields
Field | Type | Description
job_id | string | Echoed.
status | string | Always "completed" when this endpoint returns 200.
summary.total | integer | Total emails processed.
summary.<bucket> | integer | Per-status counts. Buckets: safe, role, catch_all, disposable, invalid, disabled, inbox_full, unknown. Sum of buckets = total.
credits_used | number | Final credits billed. Equal to 0.005 × (total − unknown).
results[] | Verdict[] | Per-email verdict. See Verdict schema below.
Verdict object
Field | Type | Description
email | string | Lowercased input.
status | enum | One of: safe, role, catch_all, disposable, invalid, disabled, inbox_full, unknown. See Verification status taxonomy below.
reason | string or null | Short reason for non-deliverable statuses (e.g. "mailbox not found", "smtp unreachable"). null when deliverable.
role | boolean | true if the local-part is a role prefix.
catch_all | boolean | true if the domain accepts mail for any local-part.
free_provider | boolean | true if the domain is a consumer email host.
overall_score | integer | 0–95. See scoring table in the Verification status taxonomy.
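
The two invariants stated in the response fields (bucket counts sum to total, and credits_used equals 0.005 × (total − unknown)) can be checked client-side. This is an illustrative sanity check only; the function name check_summary is hypothetical.

```python
from decimal import Decimal

EMAIL_COST = Decimal("0.005")  # credits per verified email (from this reference)
BUCKETS = ("safe", "role", "catch_all", "disposable",
           "invalid", "disabled", "inbox_full", "unknown")

def check_summary(summary, credits_used):
    """Return True when summary and credits_used agree with the documented rules."""
    bucket_sum = sum(summary.get(name, 0) for name in BUCKETS)
    billable = summary["total"] - summary.get("unknown", 0)
    expected = EMAIL_COST * billable
    # Decimal(str(...)) keeps 61.29 exact instead of a binary float approximation.
    return bucket_sum == summary["total"] and Decimal(str(credits_used)) == expected
```

Applied to the example response above, the buckets sum to 12500 and the expected charge is 0.005 × 12258 = 61.29.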

Jobs — Cancellation

A single endpoint cancels either a scrape or a bulk verify job. See Credit Accounting for how credits already consumed are handled.

POST /api/v1/jobs/{job_id}/cancel
Cancel a job
Cancel a scrape or bulk verify job that is queued or running. Returns 409 job_not_cancellable if the job has already reached a terminal state. Credits already consumed for partially completed work are not refunded (see Credit Accounting).
Parameters
Field | Type | Required | Description
job_id | string (path) | Yes | Either a scrape job ID (sk_job_…) or a bulk verify job ID (sk_verify_…). A single endpoint works for both.
Example request
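curl
curl -X POST https://scrapekit.ai/api/v1/jobs/sk_job_4f7a9b1c/cancel \
  -H "Authorization: Bearer $SCRAPEKIT_KEY"
JavaScript
const res = await fetch(
  "https://scrapekit.ai/api/v1/jobs/sk_job_4f7a9b1c/cancel",
  { method: "POST", headers: { "Authorization": `Bearer ${process.env.SCRAPEKIT_KEY}` } },
);
const r = await res.json();
Python
import os, requests
r = requests.post(
    "https://scrapekit.ai/api/v1/jobs/sk_job_4f7a9b1c/cancel",
    headers={"Authorization": f"Bearer {os.environ['SCRAPEKIT_KEY']}"},
).json()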
Example response
200 OK
{
  "job_id": "sk_job_4f7a9b1c",
  "status": "cancelled",
  "cancelled_at": "2026-04-21T10:36:02Z",
  "note": "Credits already consumed for partially-completed work are not refunded."
}
Response fields
Field | Type | Description
job_id | string | Echoed.
status | string | Always "cancelled".
cancelled_at | string | ISO-8601 UTC timestamp at which cancellation was recorded.
note | string | Human-readable refund-policy reminder.

Async Job Flow

Scrape jobs and bulk verify jobs follow the same lifecycle. Pick either polling or webhooks — or use webhooks with polling as a fallback.

Status | Terminal? | Meaning
queued | No | Accepted by the API, waiting for a worker.
running | No | Worker is actively processing. This status collapses the internal discovering / scraping / extracting phases into one public value.
completed | Yes | Finished. Call GET /api/v1/scrape/jobs/{id}/results (or the verify equivalent) to fetch the payload.
failed | Yes | Terminal error. The status payload's error field holds a human-readable reason. A job that failed before producing any rows is billed 0 credits.
cancelled | Yes | You called POST /api/v1/jobs/{id}/cancel. Any rows already produced are preserved and billed.

Recommended polling cadence

  • 0 – 60s: poll every 5 seconds.
  • 1 – 10 min: poll every 15 seconds.
  • 10+ min: poll every 60 seconds, or just wait for the webhook.

A typical scrape job of 500 leads finishes in 2–4 minutes. A bulk verify of 10,000 emails takes 8–15 minutes.
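
The cadence above translates into a short polling loop. This is a sketch, not a supplied client: poll_interval and wait_for_job are hypothetical names, and get_status stands in for whatever function calls the status endpoint and returns the parsed JSON.

```python
def poll_interval(elapsed_seconds):
    """Seconds to wait before the next status poll, per the cadence above."""
    if elapsed_seconds < 60:
        return 5
    if elapsed_seconds < 600:
        return 15
    return 60

TERMINAL = {"completed", "failed", "cancelled"}

def wait_for_job(get_status, sleep, clock):
    """Poll get_status() until the job reaches a terminal status.

    sleep and clock are injected (time.sleep and time.monotonic in normal
    use) so the loop can be exercised without real waiting.
    """
    start = clock()
    while True:
        job = get_status()
        if job["status"] in TERMINAL:
            return job
        sleep(poll_interval(clock() - start))
```

A production version would likely add an overall deadline so a stuck job cannot keep the loop alive indefinitely.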

Verification Status Taxonomy

Every verified email is classified into one of nine statuses, each carrying a different deliverability signal. The single-email endpoint and bulk-verify results use the same taxonomy.

Status | Score | Billed? | Meaning
safe | 95 (88 if free provider) | Yes | Personal mailbox on a non-catch-all domain. SMTP accepted. Deliverable.
role | 80 (70 if free provider) | Yes | Role-based local-part (info@, support@, sales@, admin@, …) on a non-catch-all domain. Deliverable; lower personal-response rate.
catch_all | 55 (50 if role) | Yes | Domain accepts mail for any local-part. We can't distinguish real mailboxes from any other string at this domain, so treat as lower confidence.
disposable | 0 | Yes | Temporary-email domain (e.g. mailinator.com, guerrillamail.com, 10minutemail.com). Do not send.
invalid | 0 | Yes | Mail server responded with a permanent failure: the mailbox does not exist, the address is rejected, or the syntax is unparseable.
disabled | 5 | Yes | The mail server responded with a message indicating the account is disabled, suspended, or terminated. Do not send.
inbox_full | 35 | Yes | The mailbox exists but cannot currently receive mail (over quota, full, 552 / 5.2.2). May recover.
unknown | 15 | No (refunded) | SMTP was unreachable, timed out, or greylisted, so we couldn't form a verdict. You are not billed for these.
valid (legacy) | 85 | Yes | Back-compat alias produced by older jobs before the taxonomy split. Equivalent to a non-enriched safe/role. New jobs no longer return this value.

Grouping for deliverability decisions

  • Deliverable: safe, role, legacy valid — these populate valid on the status endpoint.
  • Lower confidence, still deliverable: catch_all, inbox_full — generally safe to send but expect soft bounces or lower reply rates.
  • Do not send: disposable, invalid, disabled.
  • Retry later: unknown — not billed, safe to re-submit.
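
The four groups above map directly onto a lookup. The action labels ("send", "send_with_caution", and so on) and the function name action_for are illustrative choices, not values returned by the API.

```python
SEND = {"safe", "role", "valid"}  # "valid" is the legacy alias
SEND_WITH_CAUTION = {"catch_all", "inbox_full"}
DO_NOT_SEND = {"disposable", "invalid", "disabled"}
RETRY_LATER = {"unknown"}

def action_for(status):
    """Map a verification status to one of the four decision groups above."""
    if status in SEND:
        return "send"
    if status in SEND_WITH_CAUTION:
        return "send_with_caution"
    if status in DO_NOT_SEND:
        return "do_not_send"
    if status in RETRY_LATER:
        return "retry_later"
    # Fail loudly so a status added in a future version is not silently mailed.
    raise ValueError(f"unrecognized status: {status!r}")
```

Raising on an unrecognized status is a deliberate choice here: it is safer to stop than to guess how a new verdict should be treated.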

Webhooks

Pass a webhook_url (HTTPS required) when creating a scrape or bulk verify job and we'll POST a signed JSON payload as soon as the job reaches a terminal state — no polling required.

Event types

Event | Sent when
scrape.job.completed | Scrape finished with status=completed.
scrape.job.failed | Scrape ended with status=failed or cancelled.
verify.job.completed | Bulk verification finished with status=completed.
verify.job.failed | Bulk verification ended with status=failed or cancelled.

Payload shape

The body is always an object with three top-level keys: event, created_at, and data. The data object mirrors the fields on the corresponding status endpoint and also includes a results_url. Results are never inlined in the webhook — fetch them from results_url with your API key.

scrape.job.completed

{
  "event": "scrape.job.completed",
  "created_at": "2026-04-21T10:41:02Z",
  "data": {
    "job_id": "sk_job_4f7a9b1c",
    "status": "completed",
    "query": "dentists",
    "location": "Austin, Texas",
    "results_count": 487,
    "results_url": "https://scrapekit.ai/api/v1/scrape/jobs/sk_job_4f7a9b1c/results",
    "credits_used": 7.305
  }
}

scrape.job.failed

{
  "event": "scrape.job.failed",
  "created_at": "2026-04-21T10:43:11Z",
  "data": {
    "job_id": "sk_job_9eb012fa",
    "status": "failed",
    "query": "unicorn breeders",
    "location": "Reykjavik, Iceland",
    "results_count": 0,
    "results_url": "https://scrapekit.ai/api/v1/scrape/jobs/sk_job_9eb012fa/results",
    "credits_used": 0,
    "error": "No businesses found for this query in the target region."
  }
}

verify.job.completed

{
  "event": "verify.job.completed",
  "created_at": "2026-04-21T10:52:44Z",
  "data": {
    "job_id": "sk_verify_c1d2e3f4",
    "status": "completed",
    "total": 12500,
    "processed": 12500,
    "valid": 9664,
    "results_url": "https://scrapekit.ai/api/v1/verify/jobs/sk_verify_c1d2e3f4/results",
    "credits_used": 61.29
  }
}

verify.job.failed

{
  "event": "verify.job.failed",
  "created_at": "2026-04-21T10:52:44Z",
  "data": {
    "job_id": "sk_verify_d2e3f4c1",
    "status": "failed",
    "total": 12500,
    "processed": 842,
    "valid": 621,
    "results_url": "https://…",
    "credits_used": 4.21,
    "error": "Worker crashed mid-run"
  }
}
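
Since all four payloads share the same envelope, a receiver can route them with one small function. The following is a sketch only, assuming the signature has already been verified; summarize_event and the keys of the dictionary it returns are hypothetical names chosen for illustration.

```python
import json

# The four event types documented above.
KNOWN_EVENTS = {
    "scrape.job.completed",
    "scrape.job.failed",
    "verify.job.completed",
    "verify.job.failed",
}


def summarize_event(raw_body: bytes) -> dict:
    """Parse a webhook body and pull out the fields most handlers need."""
    payload = json.loads(raw_body)
    event = payload["event"]
    if event not in KNOWN_EVENTS:
        raise ValueError(f"unexpected event type: {event}")
    data = payload["data"]
    kind, _, outcome = event.split(".")  # e.g. ("scrape", "job", "completed")
    return {
        "kind": kind,                        # "scrape" or "verify"
        "job_id": data["job_id"],
        "succeeded": outcome == "completed",
        "status": data.get("status"),        # may be "cancelled" on *.failed events
        "results_url": data.get("results_url"),
        "error": data.get("error"),          # only present on failures
    }
```

A handler would typically fetch results_url with the API key only when succeeded is true, and log error otherwise.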

Headers

Header | Value
Content-Type | application/json
User-Agent | ScrapeKit-Webhook/1.0
X-ScrapeKit-Event | The event type, e.g. scrape.job.completed.
X-ScrapeKit-Timestamp | Unix seconds (UTC) at send time. Reject if older than 5 minutes.
X-ScrapeKit-Signature | Hex-encoded HMAC-SHA256 of {timestamp}.{raw_body} keyed with your webhook secret.

Signing secret

  • Scope: one secret per account (not per key). All webhooks triggered by any of your API keys are signed with the same secret.
  • Format: prefix whsec_ followed by 43 URL-safe characters.
  • Retrieve / rotate (UI): go to Dashboard → API. A secret is auto-generated the first time you open that panel, and you can rotate it with one click.
  • Retrieve / rotate (HTTP): dashboard-session endpoints — authenticate with your logged-in browser session (JWT cookie), not a sk_live_ API key:
    • GET https://scrapekit.ai/api/webhook/secret returns { "webhook_secret": "whsec_..." } (auto-creates one if none exists).
    • POST https://scrapekit.ai/api/webhook/secret/rotate returns { "webhook_secret": "whsec_...", "rotated_at": "..." } (replaces the current secret immediately).
  • Rotation behaviour: rotating is immediate. In-flight webhooks already queued for dispatch are signed with whichever secret is current at send time — if you rotate during a burst, receivers may need to accept either the previous or new secret for a short window.
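
One way to handle the rotation window is to check the signature against both the previous and the new secret for a short time. This is a minimal sketch, assuming the receiver keeps the old secret around briefly after rotating; verify_any is a hypothetical helper, and the timestamp freshness check still needs to be applied separately.

```python
import hashlib
import hmac
from typing import Sequence


def verify_any(raw_body: bytes, signature: str, timestamp: str,
               secrets: Sequence[bytes]) -> bool:
    """Return True if the signature matches under any of the candidate secrets."""
    # Signed string is "{timestamp}.{raw_body}", as described for X-ScrapeKit-Signature.
    signed = timestamp.encode() + b"." + raw_body
    received = signature.encode()
    matched = False
    for secret in secrets:
        expected = hmac.new(secret, signed, hashlib.sha256).hexdigest().encode()
        # Evaluate every candidate so timing does not reveal which secret matched.
        matched |= hmac.compare_digest(expected, received)
    return matched
```

Pass the secrets as [new_secret, previous_secret] during the window, then drop the previous one once deliveries signed with it have stopped arriving.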

Signature verification

The signed string is {timestamp}.{raw_body} (the literal timestamp string, a period, then the raw request body bytes before any JSON parsing). The signature is the lowercase hex HMAC-SHA256 of that string keyed with your webhook secret. Compare with a constant-time check and reject any request where the signature doesn't match or the timestamp is more than 5 minutes old (replay protection).

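import hmac, hashlib, time

SECRET = b"whsec_..."  # your webhook signing secret

def verify(raw_body: bytes, signature: str, timestamp: str) -> bool:
    # 1. Reject malformed or stale timestamps (replay protection).
    try:
        ts = int(timestamp)
    except ValueError:
        return False
    if abs(time.time() - ts) > 300:
        return False
    # 2. Recompute the signature over "{timestamp}.{raw_body}".
    signed = timestamp.encode() + b"." + raw_body
    expected = hmac.new(SECRET, signed, hashlib.sha256).hexdigest()
    # 3. Constant-time comparison. Comparing bytes avoids a TypeError
    #    if the header contains non-ASCII characters.
    return hmac.compare_digest(expected.encode(), signature.encode())

# Example: Flask endpoint
from flask import Flask, request, abort
app = Flask(__name__)

@app.post("/webhooks/scrapekit")
def on_webhook():
    raw = request.get_data()  # raw bytes, before any JSON parsing
    sig = request.headers.get("X-ScrapeKit-Signature", "")
    ts  = request.headers.get("X-ScrapeKit-Timestamp", "")
    if not verify(raw, sig, ts):
        abort(401)
    event = request.get_json()
    # Handle the event (ideally by pushing it onto a queue), then acknowledge.
    return ("", 204)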

// Node.js (Express)
import crypto from "node:crypto";

const SECRET = "whsec_...";  // your webhook signing secret

export function verify(rawBody, signature, timestamp) {
  // 1. Reject malformed or stale timestamps (replay protection).
  const ts = Number(timestamp);
  if (!Number.isFinite(ts) || Math.abs(Date.now() / 1000 - ts) > 300) return false;
  // 2. Recompute the signature over "{timestamp}.{raw_body}".
  const signed = Buffer.concat([Buffer.from(timestamp), Buffer.from("."), rawBody]);
  const expected = crypto.createHmac("sha256", SECRET).update(signed).digest("hex");
  // 3. Constant-time comparison (timingSafeEqual requires equal lengths).
  const a = Buffer.from(expected), b = Buffer.from(signature);
  return a.length === b.length && crypto.timingSafeEqual(a, b);
}

// Example: Express endpoint. express.raw() keeps the body as an unparsed Buffer.
import express from "express";
const app = express();
app.post("/webhooks/scrapekit",
  express.raw({ type: "application/json" }),
  (req, res) => {
    const ok = verify(
      req.body,
      req.get("X-ScrapeKit-Signature") || "",
      req.get("X-ScrapeKit-Timestamp") || "",
    );
    if (!ok) return res.sendStatus(401);
    const event = JSON.parse(req.body.toString("utf8"));
    // Handle the event (ideally by pushing it onto a queue), then acknowledge.
    res.sendStatus(204);
  });

Delivery guarantees

  • At-most-once, 10-second timeout. We POST the payload one time with a 10-second socket timeout. If your endpoint returns non-2xx or times out, the delivery is considered failed and is not retried in v1 — you should treat webhooks as a latency optimisation on top of polling, not the sole signal.
  • Always have a fallback. For safety-critical flows, pair webhooks with a periodic poll of GET /api/v1/{scrape|verify}/jobs/{id} to reconcile any missed deliveries.
  • Respond fast. Return 2xx within 10 seconds. Push heavy work into a queue.
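
The "respond fast" advice can be sketched with an in-process queue and a worker thread. This is an illustration under stated assumptions, not a recommended production setup: accept, handle, and processed are hypothetical names, and an in-memory queue loses events on restart, which is one more reason to keep the polling fallback.

```python
import logging
import queue
import threading

events = queue.Queue()
processed = []  # stands in for whatever your slow handler produces


def handle(event: dict) -> None:
    # Placeholder for slow work, such as fetching results_url and storing leads.
    processed.append(event["data"]["job_id"])


def worker() -> None:
    while True:
        event = events.get()
        try:
            handle(event)
        except Exception:
            # Keep the worker alive; a failed event can be reconciled by polling.
            logging.exception("webhook handling failed")
        finally:
            events.task_done()


threading.Thread(target=worker, daemon=True).start()


def accept(event: dict) -> int:
    """Call from the webhook route after signature verification."""
    events.put(event)  # returns immediately; heavy work runs on the worker thread
    return 204         # status code to send back well within the 10-second timeout
```

In a real deployment a durable queue would usually replace queue.Queue so that accepted events survive a process restart.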

Quickstart

Go from zero to fetched leads in three copy-paste-runnable calls. Uses a small max_leads so you can try without burning credits.

01

Create an API key

Go to Dashboard → API, click Create key, and copy the sk_live_… value once. Export it as SCRAPEKIT_KEY.

02

Submit a scrape job

POST to /api/v1/scrape/jobs with query, location, and a small max_leads. You get back a job_id.

03

Fetch results

Poll /api/v1/scrape/jobs/{job_id} until status=completed, then GET /api/v1/scrape/jobs/{job_id}/results.

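# 1. Export your API key
export SCRAPEKIT_KEY="sk_live_..."

# 2. Submit a scrape job
curl https://scrapekit.ai/api/v1/scrape/jobs \
  -H "Authorization: Bearer $SCRAPEKIT_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query": "plumbers", "location": "Dublin, Ireland", "max_leads": 50, "verify_emails": true}'
# The response contains a job_id. Set JOB_ID to that value for the next calls.

# 3. Poll until status=completed
curl https://scrapekit.ai/api/v1/scrape/jobs/$JOB_ID \
  -H "Authorization: Bearer $SCRAPEKIT_KEY"

# 4. Fetch the results
curl https://scrapekit.ai/api/v1/scrape/jobs/$JOB_ID/results \
  -H "Authorization: Bearer $SCRAPEKIT_KEY"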
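
// JavaScript (Node.js)
const BASE = "https://scrapekit.ai/api/v1";
const headers = {
  "Authorization": `Bearer ${process.env.SCRAPEKIT_KEY}`,
  "Content-Type": "application/json",
};

// 1. Submit a scrape job
const create = await fetch(`${BASE}/scrape/jobs`, {
  method: "POST",
  headers,
  body: JSON.stringify({
    query: "plumbers",
    location: "Dublin, Ireland",
    max_leads: 50,
    verify_emails: true,
  }),
});
const { job_id } = await create.json();

// 2. Poll every 5 seconds until the job reaches a terminal state (10-minute cap)
const deadline = Date.now() + 10 * 60 * 1000;
let job;
while (Date.now() < deadline) {
  await new Promise(r => setTimeout(r, 5000));
  job = await fetch(`${BASE}/scrape/jobs/${job_id}`, { headers }).then(r => r.json());
  if (["completed", "failed", "cancelled"].includes(job.status)) break;
}
if (job.status !== "completed") {
  throw new Error(`Job ended with status=${job.status}: ${job.error ?? "no error message"}`);
}

// 3. Fetch the results
const { results } = await fetch(
  `${BASE}/scrape/jobs/${job_id}/results`,
  { headers },
).then(r => r.json());
console.log(`Got ${results.length} leads`);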

# Python
import os, time, requests

BASE = "https://scrapekit.ai/api/v1"
headers = {"Authorization": f"Bearer {os.environ['SCRAPEKIT_KEY']}"}

# 1. Submit a scrape job
create = requests.post(
    f"{BASE}/scrape/jobs",
    headers=headers,
    json={
        "query": "plumbers",
        "location": "Dublin, Ireland",
        "max_leads": 50,
        "verify_emails": True,
    },
)
create.raise_for_status()
job_id = create.json()["job_id"]

# 2. Poll every 5 seconds until the job reaches a terminal state (10-minute cap)
deadline = time.time() + 10 * 60
while time.time() < deadline:
    time.sleep(5)
    job = requests.get(f"{BASE}/scrape/jobs/{job_id}", headers=headers).json()
    if job["status"] in ("completed", "failed", "cancelled"):
        break
if job["status"] != "completed":
    raise RuntimeError(f"Job ended with status={job['status']}: {job.get('error') or 'no error message'}")

# 3. Fetch the results
results = requests.get(
    f"{BASE}/scrape/jobs/{job_id}/results",
    headers=headers,
).json()["results"]
print(f"Got {len(results)} leads")

Support

We typically reply to API questions within a few business hours.

Every response — success or error — includes an X-Request-Id header (e.g. req_7f2a9c4e1b). Error responses also include the same value as error.request_id in the body. Always paste this ID into support emails — it's the only way we can trace the exact call on our side.
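
To make sure the ID is always at hand, a client can pull it out of every response and log it. The following is a sketch only: extract_request_id is a hypothetical helper that takes the response headers and, optionally, the parsed JSON body, and it assumes error is an object carrying request_id as described above.

```python
from typing import Mapping, Optional


def extract_request_id(headers: Mapping[str, str],
                       body: Optional[dict] = None) -> Optional[str]:
    """Return the request ID from the X-Request-Id header, else from error.request_id."""
    # HTTP header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    request_id = lowered.get("x-request-id")
    if request_id is None and isinstance(body, dict):
        error = body.get("error")
        if isinstance(error, dict):
            request_id = error.get("request_id")
    return request_id
```

Logging the returned value alongside each failed call means the ID is already available when you write to support.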

Stuck on an integration?

Email us with the request_id or the X-Request-Id header value and we'll trace it end-to-end. Feature requests and feedback are also welcome.