Overview
The Agent Traffic tool in Scrunch lets you monitor how much traffic your site receives from LLM bots, including ChatGPT, Perplexity, Gemini, Grok, and others.
If your website is hosted on a platform not natively supported by Scrunch (such as a custom server, internal proxy, legacy CDN, or a bespoke infrastructure setup), you can use the Custom API to send traffic logs directly to Scrunch. This gives you full control: you decide what to capture, when to send it, and how to integrate it into your existing stack.
The Custom API accepts traffic events in JSON or NDJSON (newline-delimited JSON) format and automatically classifies bots based on user agent strings—no extra configuration needed on your end.
What You'll See
Once your Custom API integration is connected, the Agent Traffic dashboard will show:
Total Bot Traffic in the last period
Bot traffic over time
Traffic distribution between Retrieval, Indexer, and Training LLM Bots
Comparison between the current period and the last period (%)
Top bot agents and when they were last seen
Top content pages accessed by LLM bots
Recent bot requests
A date filter to see data from the last 24 hours, last 7 days, or last 30 days
Scrunch AI's Agent Traffic feature lets customers track, in granular detail, which AI platforms are consuming their content and for what purpose. This helps them understand how their content:
will be surfaced in AI platforms like ChatGPT
drives AI responses to relevant questions
and ultimately influences how AI describes and recommends their brand, products, and services, and drives click-throughs to their site(s).
Adding Your Website
1. Open the Scrunch app.
2. Navigate to the Agent Traffic menu.
3. You'll see the list of websites already connected to Agent Traffic.
4. Click + Connect Site at the top.
5. Select API as your platform.
6. A dedicated instructions page will appear, showing your Site ID, Webhook URL, and API Key.
ℹ️ Each site has its own endpoint and key. Don't reuse them across different sites or integrations.
Integrating via the Custom API
Step 1: Locate your credentials
From the instructions page in Scrunch, copy your:
Site ID — a unique identifier for your site (ULID format)
API Key — your authentication token (JWT)
You'll use these in every request you send.
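How you store these credentials is up to you. One common pattern (a sketch, not a Scrunch requirement; the SCRUNCH_SITE_ID and SCRUNCH_API_KEY variable names are invented here) is to read them from environment variables so they stay out of source control:

```python
import os

def load_scrunch_credentials():
    """Read the Site ID and API key from environment variables.

    SCRUNCH_SITE_ID and SCRUNCH_API_KEY are example variable names,
    not names mandated by Scrunch.
    """
    site_id = os.environ.get("SCRUNCH_SITE_ID")
    api_key = os.environ.get("SCRUNCH_API_KEY")
    if not site_id or not api_key:
        raise RuntimeError("Set SCRUNCH_SITE_ID and SCRUNCH_API_KEY first")
    return site_id, api_key
```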
Step 2: Instrument your traffic
In your server, proxy, or logging pipeline, capture the following fields for each incoming HTTP request and its response:
Field | Type | Required | Description
domain | string | Yes | The domain of the site (e.g. example.com).
user_agent | string | Yes | The full User-Agent string of the request.
url | string | Yes | The full URL that was requested (e.g. https://example.com/blog/post).
path | string | Yes | The URL path (e.g. /blog/post).
method | string | Yes | The HTTP method (e.g. GET).
status_code | integer | Yes | The HTTP response status code (e.g. 200).
timestamp | integer / float | Yes | Unix epoch timestamp in seconds (e.g. 1700000000).
response_time | integer | No | Response time in milliseconds.
ip | string | No | The IP address of the client making the request.
⚠️ Omitting required fields will cause the event to fail validation (422). Always pass the original User-Agent string—Scrunch uses it to automatically identify and classify the bot.
ℹ️ For the full API reference, including all parameters, response codes, and schema details, see the Custom Web Traffic API reference.
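The fields above can be assembled in whatever language your pipeline uses. As one illustration (a hedged sketch; build_event is a hypothetical helper, not part of any Scrunch SDK), a Python function that mirrors the Step 2 schema and rejects incomplete events before they reach the API:

```python
import time

# Fields the API requires; omitting any of them returns a 422.
REQUIRED_FIELDS = ("domain", "user_agent", "url", "path",
                   "method", "status_code", "timestamp")

def build_event(domain, user_agent, url, path, method, status_code,
                timestamp=None, response_time=None, ip=None):
    """Assemble one traffic event matching the Step 2 schema.

    Optional fields are omitted entirely rather than sent as null.
    """
    event = {
        "domain": domain,
        "user_agent": user_agent,          # pass through unmodified
        "url": url,
        "path": path,
        "method": method,
        "status_code": int(status_code),   # integer, not the string "200"
        # Epoch seconds; default to "now" if the caller has no timestamp.
        "timestamp": timestamp if timestamp is not None else time.time(),
    }
    if response_time is not None:
        event["response_time"] = response_time
    if ip is not None:
        event["ip"] = ip
    missing = [f for f in REQUIRED_FIELDS if event.get(f) in (None, "")]
    if missing:
        # Catch locally what the API would reject with a 422.
        raise ValueError(f"missing required fields: {missing}")
    return event
```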
Step 3: Send traffic events to Scrunch
You can send events one at a time (JSON) or in batches (NDJSON).
Single event
Use Content-Type: application/json and send one JSON object per request.
curl -X POST "https://webhooks.scrunchai.com/v1/sites/{site_id}/platforms/custom/web-traffic" \
  -H "Content-Type: application/json" \
  -H "X-Api-Key: YOUR_API_KEY" \
  -d '{
    "domain": "example.com",
    "user_agent": "Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)",
    "url": "https://example.com/blog/post",
    "path": "/blog/post",
    "method": "GET",
    "status_code": 200,
    "timestamp": 1700000000,
    "response_time": 120,
    "ip": "203.0.113.1"
  }'
Batch events (NDJSON)
Use Content-Type: application/x-ndjson and send multiple events, one JSON object per line. This is the recommended approach for high-traffic environments.
curl -X POST "https://webhooks.scrunchai.com/v1/sites/{site_id}/platforms/custom/web-traffic" \
  -H "Content-Type: application/x-ndjson" \
  -H "X-Api-Key: YOUR_API_KEY" \
  -d '{"domain":"example.com","user_agent":"Mozilla/5.0 (compatible; GPTBot/1.0)","url":"https://example.com/page-1","path":"/page-1","method":"GET","status_code":200,"timestamp":1700000000}
{"domain":"example.com","user_agent":"Mozilla/5.0 (compatible; ClaudeBot/1.0)","url":"https://example.com/page-2","path":"/page-2","method":"GET","status_code":200,"timestamp":1700000060,"response_time":95,"ip":"198.51.100.42"}'
A successful request returns:
{ "status": "ok" }
Step 4: Verify your integration
Wait up to 5 minutes for your site to show as Active in Scrunch. If you don't see traffic, test with a known bot User-Agent string:
curl -X POST "https://webhooks.scrunchai.com/v1/sites/{site_id}/platforms/custom/web-traffic" \
  -H "Content-Type: application/json" \
  -H "X-Api-Key: YOUR_API_KEY" \
  -d '{
    "domain": "yourdomain.com",
    "user_agent": "Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)",
    "url": "https://yourdomain.com/test-page",
    "path": "/test-page",
    "method": "GET",
    "status_code": 200,
    "timestamp": 1700000000
  }'
This will send a sample event to confirm your credentials and pipeline are working.
👉 Once configured, your integration will continuously stream traffic logs to Scrunch, giving you real-time visibility into how LLM bots interact with your content.
Troubleshooting and Tips
Don't see any traffic after integrating?
Ensure the Webhook URL and API Key match exactly what's shown in your Scrunch app.
Check that your Content-Type header matches the body format (application/json for single events, application/x-ndjson for batches).
Confirm your timestamp is a Unix epoch in seconds, not milliseconds.
Confirm you included all required fields (see Step 2).
Wait 5–10 minutes after the first successful request.
Getting a 422 error?
Validate your request body against the schema in Step 2.
Make sure status_code is an integer (not a string like "200").
Make sure timestamp is a number, not an ISO string.
Getting a 429 error?
You're being rate limited. Implement exponential backoff and respect the Retry-After response header.
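A backoff policy might look like the following Python sketch (an assumption about a reasonable client, not Scrunch-mandated behavior; retry_after is the value of the server's Retry-After header when present):

```python
import random

def backoff_delay(attempt, retry_after=None, base=1.0, cap=60.0):
    """Seconds to wait before retrying a rate-limited (429) request.

    Honors the server's Retry-After header when provided; otherwise
    uses capped exponential backoff with jitter to avoid retry storms.
    """
    if retry_after is not None:
        return float(retry_after)
    delay = min(cap, base * (2 ** attempt))     # 1s, 2s, 4s, ... up to cap
    return delay * (0.5 + random.random() / 2)  # jitter into [0.5x, 1.0x)
```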
Tips for better results
Use NDJSON batching to reduce request overhead for high-traffic sites.
Keep batch sizes under 1 MB uncompressed for optimal performance.
Always pass the original, unmodified User-Agent string from the incoming request — Scrunch uses it to classify the bot.
Exclude paths for static assets (CSS, JS, images) if you want cleaner data focused on content pages.
Include paths that serve PDFs — AI bots often request them.
If you manage multiple sites, repeat the process for each site in Scrunch. Never reuse credentials across sites.
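The static-asset and PDF tips above can be applied with a small path filter before events are queued. A sketch (should_send and the extension list are illustrative choices; tune them to your own URL layout):

```python
# File extensions treated as static assets and dropped from reporting.
STATIC_EXTENSIONS = (".css", ".js", ".png", ".jpg", ".jpeg",
                     ".gif", ".svg", ".ico", ".woff", ".woff2")

def should_send(path):
    """Return True for content pages worth reporting to Scrunch.

    Drops static assets for cleaner data, but keeps PDFs, which
    AI bots often request.
    """
    lower = path.lower()
    if lower.endswith(".pdf"):
        return True
    return not lower.endswith(STATIC_EXTENSIONS)
```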
