Overview
The Agent Traffic tool in Scrunch lets you monitor how much traffic your site receives from LLM bots, including ChatGPT, Perplexity, Gemini, Grok, and others.
If your website is hosted on a platform not natively supported by Scrunch (such as a custom server, internal proxy, legacy CDN, or a bespoke infrastructure setup), you can use the Custom API to send traffic logs directly to Scrunch. This gives you full control: you decide what to capture, when to send it, and how to integrate it into your existing stack.
The Custom API accepts traffic events in JSON or NDJSON (newline-delimited JSON) format and automatically classifies bots based on user agent strings—no extra configuration needed on your end.
What You'll See
Once your Custom API integration is connected, the Agent Traffic dashboard will show:
Total Bot Traffic in the last period
Bot traffic over time
Traffic distribution between Retrieval, Indexer, and Training LLM Bots
Comparison between the current period and the last period (%)
Top bot agents and when they were last seen
Top content pages accessed by LLM bots
Recent bot requests
A date filter to see data from the last 24 hours, last 7 days, or last 30 days
Scrunch AI's Agent Traffic feature lets customers track, in granular detail, which AI platforms are consuming their content and for what purpose. This helps them understand how their content:
will be surfaced in AI platforms like ChatGPT
drives AI responses to relevant questions
and ultimately influences how AI describes and recommends their brand, products, and services, and drives click-throughs to their site(s).
Adding Your Website
1. Open the Scrunch app.
2. Navigate to the Agent Traffic menu.
3. You'll see the list of websites already connected to Agent Traffic.
4. Click + Connect Site at the top.
5. Select API as your platform.
6. A dedicated instructions page will appear, showing your Site ID, Webhook URL, and API Key.
ℹ️ Each site has its own endpoint and key. Don't reuse them across different sites or integrations.
Integrating via the Custom API
Step 1: Locate your credentials
From the instructions page in Scrunch, copy your:
Site ID — a unique identifier for your site (ULID format)
API Key — your authentication token (JWT)
You'll use these in every request you send.
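How you store these credentials is up to you. One common pattern (a sketch, not a Scrunch requirement; the SCRUNCH_SITE_ID and SCRUNCH_API_KEY variable names are invented here) is to read them from environment variables so they stay out of source control:

```python
import os

def load_scrunch_credentials():
    """Read the Site ID and API key from environment variables.

    SCRUNCH_SITE_ID and SCRUNCH_API_KEY are example variable names,
    not names mandated by Scrunch.
    """
    site_id = os.environ.get("SCRUNCH_SITE_ID")
    api_key = os.environ.get("SCRUNCH_API_KEY")
    if not site_id or not api_key:
        raise RuntimeError("Set SCRUNCH_SITE_ID and SCRUNCH_API_KEY first")
    return site_id, api_key
```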
Step 2: Instrument your traffic
In your server, proxy, or logging pipeline, capture the following fields for each incoming HTTP request and its response:
Field | Type | Required | Description
domain | string | Yes | The domain of the site (e.g. example.com).
user_agent | string | Yes | The full User-Agent string of the request.
url | string | Yes | The full URL that was requested (e.g. https://example.com/blog/post).
path | string | Yes | The URL path (e.g. /blog/post).
method | string | Yes | The HTTP method (e.g. GET).
status_code | integer | Yes | The HTTP response status code (e.g. 200).
timestamp | integer / float | Yes | Unix epoch timestamp in seconds (e.g. 1700000000).
response_time | integer | No | Response time in milliseconds.
ip | string | No | The IP address of the client making the request.
⚠️ Omitting required fields will cause the event to fail validation (422). Always pass the original User-Agent string—Scrunch uses it to automatically identify and classify the bot.
ℹ️ For the full API reference, including all parameters, response codes, and schema details, see the Custom Web Traffic API reference.
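The fields above can be assembled in whatever language your pipeline uses. As one illustration (a hedged sketch; build_event is a hypothetical helper, not part of any Scrunch SDK), a Python function that mirrors the Step 2 schema and rejects incomplete events before they reach the API:

```python
import time

# Fields the API requires; omitting any of them returns a 422.
REQUIRED_FIELDS = ("domain", "user_agent", "url", "path",
                   "method", "status_code", "timestamp")

def build_event(domain, user_agent, url, path, method, status_code,
                timestamp=None, response_time=None, ip=None):
    """Assemble one traffic event matching the Step 2 schema.

    Optional fields are omitted entirely rather than sent as null.
    """
    event = {
        "domain": domain,
        "user_agent": user_agent,          # pass through unmodified
        "url": url,
        "path": path,
        "method": method,
        "status_code": int(status_code),   # integer, not the string "200"
        # Epoch seconds; default to "now" if the caller has no timestamp.
        "timestamp": timestamp if timestamp is not None else time.time(),
    }
    if response_time is not None:
        event["response_time"] = response_time
    if ip is not None:
        event["ip"] = ip
    missing = [f for f in REQUIRED_FIELDS if event.get(f) in (None, "")]
    if missing:
        # Catch locally what the API would reject with a 422.
        raise ValueError(f"missing required fields: {missing}")
    return event
```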
Step 3: Send traffic events to Scrunch
You can send events one at a time (JSON) or in batches (NDJSON).
Single event
Use Content-Type: application/json and send one JSON object per request.
curl -X POST "https://webhooks.scrunchai.com/v1/sites/{site_id}/platforms/custom/web-traffic" \
  -H "Content-Type: application/json" \
  -H "X-Api-Key: YOUR_API_KEY" \
  -d '{
    "domain": "example.com",
    "user_agent": "Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)",
    "url": "https://example.com/blog/post",
    "path": "/blog/post",
    "method": "GET",
    "status_code": 200,
    "timestamp": 1700000000,
    "response_time": 120,
    "ip": "203.0.113.1"
  }'
Batch events (NDJSON)
Use Content-Type: application/x-ndjson and send multiple events, one JSON object per line. This is the recommended approach for high-traffic environments.
curl -X POST "https://webhooks.scrunchai.com/v1/sites/{site_id}/platforms/custom/web-traffic" \
  -H "Content-Type: application/x-ndjson" \
  -H "X-Api-Key: YOUR_API_KEY" \
  -d '{"domain":"example.com","user_agent":"Mozilla/5.0 (compatible; GPTBot/1.0)","url":"https://example.com/page-1","path":"/page-1","method":"GET","status_code":200,"timestamp":1700000000}
{"domain":"example.com","user_agent":"Mozilla/5.0 (compatible; ClaudeBot/1.0)","url":"https://example.com/page-2","path":"/page-2","method":"GET","status_code":200,"timestamp":1700000060,"response_time":95,"ip":"198.51.100.42"}'
A successful request returns:
{ "status": "ok" }
Step 4: Verify your integration
Wait up to 5 minutes for your site to show as Active in Scrunch. If you don't see traffic, test with a known bot User-Agent string:
curl -X POST "https://webhooks.scrunchai.com/v1/sites/{site_id}/platforms/custom/web-traffic" \
  -H "Content-Type: application/json" \
  -H "X-Api-Key: YOUR_API_KEY" \
  -d '{
    "domain": "yourdomain.com",
    "user_agent": "Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)",
    "url": "https://yourdomain.com/test-page",
    "path": "/test-page",
    "method": "GET",
    "status_code": 200,
    "timestamp": 1700000000
  }'
This will send a sample event to confirm your credentials and pipeline are working.
👉 Once configured, your integration will continuously stream traffic logs to Scrunch, giving you real-time visibility into how LLM bots interact with your content.
Troubleshooting and Tips
Don't see any traffic after integrating?
Ensure the Webhook URL and API Key match exactly what's shown in your Scrunch app.
Check that your Content-Type header matches the body format (application/json for single events, application/x-ndjson for batches).
Confirm your timestamp is a Unix epoch in seconds, not milliseconds.
Confirm you included all required fields (see Step 2).
Wait 5–10 minutes after the first successful request.
Getting a 422 error?
Validate your request body against the schema in Step 2.
Make sure status_code is an integer (not a string like "200").
Make sure timestamp is a number, not an ISO string.
Getting a 429 error?
You're being rate limited. Implement exponential backoff and respect the Retry-After response header.
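A backoff policy might look like the following Python sketch (an assumption about a reasonable client, not Scrunch-mandated behavior; retry_after is the value of the server's Retry-After header when present):

```python
import random

def backoff_delay(attempt, retry_after=None, base=1.0, cap=60.0):
    """Seconds to wait before retrying a rate-limited (429) request.

    Honors the server's Retry-After header when provided; otherwise
    uses capped exponential backoff with jitter to avoid retry storms.
    """
    if retry_after is not None:
        return float(retry_after)
    delay = min(cap, base * (2 ** attempt))     # 1s, 2s, 4s, ... up to cap
    return delay * (0.5 + random.random() / 2)  # jitter into [0.5x, 1.0x)
```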
Tips for better results
Use NDJSON batching to reduce request overhead for high-traffic sites.
Keep batch sizes under 1 MB uncompressed for optimal performance.
Always pass the original, unmodified User-Agent string from the incoming request — Scrunch uses it to classify the bot.
Exclude paths for static assets (CSS, JS, images) if you want cleaner data focused on content pages.
Include paths that serve PDFs — AI bots often request them.
If you manage multiple sites, repeat the process for each site in Scrunch. Never reuse credentials across sites.
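The static-asset and PDF tips above can be applied with a small path filter before events are queued. A sketch (should_send and the extension list are illustrative choices; tune them to your own URL layout):

```python
# File extensions treated as static assets and dropped from reporting.
STATIC_EXTENSIONS = (".css", ".js", ".png", ".jpg", ".jpeg",
                     ".gif", ".svg", ".ico", ".woff", ".woff2")

def should_send(path):
    """Return True for content pages worth reporting to Scrunch.

    Drops static assets for cleaner data, but keeps PDFs, which
    AI bots often request.
    """
    lower = path.lower()
    if lower.endswith(".pdf"):
        return True
    return not lower.endswith(STATIC_EXTENSIONS)
```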
