feat: add scrapfly-webhooks skill#55

Merged
leggetter merged 2 commits into main from feat/scrapfly-webhooks
May 11, 2026
Conversation

@leggetter
Collaborator

Summary

Adds a complete scrapfly-webhooks provider skill. Scrapfly is a web-scraping API platform with three products — Scrape API, Extraction API, and Screenshot API — that share a single webhook system. One registered webhook URL receives deliveries from all three; the product is distinguished by an X-Scrapfly-Webhook-Resource-Type header.

What's included

  • skills/scrapfly-webhooks/SKILL.md — entry point with frontmatter
  • skills/scrapfly-webhooks/references/ — overview, setup, verification (raw-body HMAC-SHA256, uppercase + lowercase header support, idempotency-by-job-id recommendation)
  • skills/scrapfly-webhooks/examples/ — Express, Next.js App Router, FastAPI handlers that dispatch on X-Scrapfly-Webhook-Resource-Type (scrape / extraction / screenshot)
  • providers.yaml and README.md integration entries
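As a rough illustration of the dispatch the example handlers perform, the sketch below routes a delivery on the X-Scrapfly-Webhook-Resource-Type header. The handler functions, the `dispatch` name, and the "ignored" fallback are hypothetical and are not taken from the skill's example code.

```python
from typing import Callable, Dict, Mapping

RESOURCE_TYPE_HEADER = "x-scrapfly-webhook-resource-type"


# Hypothetical per-product handlers; real ones would process the payload.
def handle_scrape(payload: dict) -> str:
    return "scrape"


def handle_extraction(payload: dict) -> str:
    return "extraction"


def handle_screenshot(payload: dict) -> str:
    return "screenshot"


HANDLERS: Dict[str, Callable[[dict], str]] = {
    "scrape": handle_scrape,
    "extraction": handle_extraction,
    "screenshot": handle_screenshot,
}


def dispatch(headers: Mapping[str, str], payload: dict) -> str:
    """Route one delivery to the handler for its resource type."""
    # HTTP header names are case-insensitive, so normalise them before lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    # Lowercasing the value is a defensive assumption, not documented behaviour.
    resource_type = normalised.get(RESOURCE_TYPE_HEADER, "").strip().lower()
    handler = HANDLERS.get(resource_type)
    if handler is None:
        # Unknown or missing type: acknowledge without processing.
        return "ignored"
    return handler(payload)
```

Returning a value for unknown types, rather than raising, keeps the endpoint able to answer 2xx for deliveries it does not understand.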

Notes

  • Paid plan required — webhooks are unavailable on Scrapfly's FREE plan (queue size 0). The skill states this prominently as a prerequisite.
  • One shared webhook system, three products — one registered URL receives all three product deliveries. Each API call references the registered webhook by name via webhook_name=… query param; the destination URL cannot be passed per-call and there is no API for managing webhooks programmatically (dashboard only).
  • Signature verification: HMAC-SHA256 over the raw request body bytes. Do not JSON.parse and re-stringify the body before hashing, because that changes the byte sequence and the digest will no longer match. Two headers carry the digest: X-Scrapfly-Webhook-Signature (uppercase hex) and X-Scrapfly-Webhook-Signature-Lowercase (lowercase hex). The skill's verifier accepts either, using constant-time comparison.
  • No replay envelope. Scrapfly's signing format doesn't include a timestamp. The skill recommends application-level idempotency keyed on X-Scrapfly-Webhook-Job-Id instead of inventing a t=… window.
  • Secret leak warning: the payload's context.webhook.secret field echoes the signing secret. The skill warns handlers to never log or echo it.
  • Delivery semantics: retry 30s → 1min → 5min → 30min → 1h → 1d; auto-disabled after 100 consecutive failures. Handlers should return 2xx fast and surface errors out-of-band.
  • No SDK construct for verification — Scrapfly doesn't publish a constructEvent-style helper. The skill uses stdlib HMAC (crypto.createHmac in Node, hmac/hashlib in Python) — no third-party HMAC library introduced.
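A minimal sketch of the verification and idempotency notes above, using only the Python standard library. It is not the skill's actual verifier: the function and class names are hypothetical, encoding the secret as UTF-8 is an assumption, and the in-memory set stands in for whatever shared store a real deployment would use.

```python
import hashlib
import hmac
from typing import Mapping, Set

UPPER_HEADER = "x-scrapfly-webhook-signature"
LOWER_HEADER = "x-scrapfly-webhook-signature-lowercase"


def verify_signature(raw_body: bytes, headers: Mapping[str, str], secret: str) -> bool:
    """Return True if either signature header matches the HMAC of the raw body."""
    # Hash the bytes exactly as received; never a re-serialised JSON object.
    # Assumption: the signing secret is used as UTF-8 bytes.
    digest = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    normalised = {name.lower(): value for name, value in headers.items()}
    upper = normalised.get(UPPER_HEADER, "")
    lower = normalised.get(LOWER_HEADER, "")
    # compare_digest runs in constant time for equal-length inputs.
    ok_upper = hmac.compare_digest(upper.encode("utf-8"), digest.upper().encode("utf-8"))
    ok_lower = hmac.compare_digest(lower.encode("utf-8"), digest.encode("utf-8"))
    return ok_upper or ok_lower


class JobDeduplicator:
    """In-memory idempotency guard keyed on X-Scrapfly-Webhook-Job-Id.

    Illustrative only: state is lost on restart and not shared across workers.
    """

    def __init__(self) -> None:
        self._seen: Set[str] = set()

    def first_delivery(self, job_id: str) -> bool:
        """Return True the first time a job id is seen, False on repeats."""
        if job_id in self._seen:
            return False
        self._seen.add(job_id)
        return True
```

A handler would call `verify_signature` first, then skip processing when `first_delivery` returns False, so that retried deliveries of the same job are acknowledged but not handled twice.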

Test plan

  • cd skills/scrapfly-webhooks/examples/express && npm test
  • cd skills/scrapfly-webhooks/examples/nextjs && npm test
  • cd skills/scrapfly-webhooks/examples/fastapi && pytest test_webhook.py -v
  • Verify dispatch on X-Scrapfly-Webhook-Resource-Type across the three products
  • Confirm header names, payload shape, and retry semantics against the live docs (https://scrapfly.io/docs/scrape-api/webhook + sibling extraction/screenshot pages)
  • Locally: npx hookdeck-cli listen 3000 scrapfly --path /webhooks/scrapfly
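For context on how a test can exercise a handler without a live Scrapfly delivery, the sketch below builds a signed request the way the signing scheme is described in this PR. The helper name `build_signed_request`, the compact JSON serialisation, and the UTF-8 secret encoding are assumptions for illustration; the actual test files may construct fixtures differently.

```python
import hashlib
import hmac
import json
from typing import Dict, Tuple


def build_signed_request(
    payload: dict, secret: str, resource_type: str = "scrape"
) -> Tuple[bytes, Dict[str, str]]:
    """Serialise a payload once and sign those exact bytes."""
    # The body is serialised a single time; the same bytes are signed and sent.
    raw_body = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    digest = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    headers = {
        "Content-Type": "application/json",
        "X-Scrapfly-Webhook-Resource-Type": resource_type,
        # Uppercase and lowercase hex forms of the same digest.
        "X-Scrapfly-Webhook-Signature": digest.upper(),
        "X-Scrapfly-Webhook-Signature-Lowercase": digest,
    }
    return raw_body, headers
```

A test can then post `raw_body` with `headers` to the handler under test and assert a 2xx response, or alter one byte of the body and assert a rejection.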

Generation details

  • Generated via ./scripts/generate-skills.sh generate scrapfly --config providers.yaml --model claude-opus-4-7
  • 2 iterations (initial gen + 1 review fix). The review fix corrected:
    • X-Scrapfly-Webhook-Env enum values (test/live, not production)
    • Payload envelope (context.webhook, context.job) and the secret-leak warning
    • Default vs opt-in crawler events
    • Retry schedule + 100-failure auto-disable behavior
    • Paid-plan prerequisite
    • Scrape URL location in payload (payload.result.url, not the webhook context overlay)
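Because the payload envelope echoes the signing secret at context.webhook.secret, a handler that logs payloads needs to strip that field first. The sketch below shows one way to do so; the `redact_secret` name and the "[REDACTED]" placeholder are hypothetical, and only the context.webhook.secret path is taken from this PR.

```python
import copy


def redact_secret(payload: dict) -> dict:
    """Return a deep copy of the payload that is safe to log."""
    # Copy first so the caller's payload is left untouched.
    safe = copy.deepcopy(payload)
    context = safe.get("context")
    if isinstance(context, dict):
        webhook = context.get("webhook")
        if isinstance(webhook, dict) and "secret" in webhook:
            webhook["secret"] = "[REDACTED]"
    return safe
```

Logging `redact_secret(payload)` instead of `payload` keeps the rest of the envelope, such as context.job and result.url, available for debugging.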

https://claude.ai/code/session_01NNTgQRJss1V7gyzzJ9rjnB


Generated by Claude Code

claude and others added 2 commits May 11, 2026 23:30
Adds a complete provider skill covering Scrapfly's HMAC-SHA256 webhook
signature scheme (uppercase hex over raw body, X-Scrapfly-Webhook-Signature
header), routing by X-Scrapfly-Webhook-Resource-Type (scrape, extraction,
screenshot), and Crawler API lifecycle events. Includes Express, Next.js,
and FastAPI examples with tests that generate real Scrapfly signatures.
…viders.yaml

- Correct X-Scrapfly-Webhook-Env header values (test/live, not production)
- Document actual payload envelope (context.webhook, context.job)
- Warn that payload echoes the signing secret at context.webhook.secret
- Distinguish default vs opt-in crawler events
- Add retry schedule and 100-failure auto-disable behavior
- Note paid-plan requirement (FREE plan has webhook queue size 0)
- Read scrape URL from payload.result.url (not the webhook context overlay)
- Add Scrapfly row to README Provider Skills table
- Add scrapfly entry to providers.yaml (docs URLs, notes, testScenario)

https://claude.ai/code/session_01NNTgQRJss1V7gyzzJ9rjnB
@leggetter leggetter marked this pull request as ready for review May 11, 2026 22:30
@leggetter leggetter force-pushed the feat/scrapfly-webhooks branch from a337f50 to 96f2c4b Compare May 11, 2026 22:30
@leggetter leggetter merged commit e189167 into main May 11, 2026
6 checks passed
@leggetter leggetter deleted the feat/scrapfly-webhooks branch May 11, 2026 22:31