
Connecting your website to Agent Traffic using the Custom API

How to send traffic data from any platform to Scrunch's Agent Traffic tool using our direct API integration, so you can view all LLM bot accesses and metrics for your domain.


Overview

The Agent Traffic tool in Scrunch lets you monitor how much access your site is getting from LLM bots—including ChatGPT, Perplexity, Gemini, Grok, and others.

If your website is hosted on a platform not natively supported by Scrunch (such as a custom server, internal proxy, legacy CDN, or a bespoke infrastructure setup), you can use the Custom API to send traffic logs directly to Scrunch. This gives you full control: you decide what to capture, when to send it, and how to integrate it into your existing stack.

The Custom API accepts traffic events in JSON or NDJSON (newline-delimited JSON) format and automatically classifies bots based on user agent strings—no extra configuration needed on your end.


What You'll See

Once your Custom API integration is connected, the Agent Traffic dashboard will show:

  • Total Bot Traffic in the last period

  • Bot traffic over time

  • Traffic distribution between Retrieval, Indexer, and Training LLM Bots

  • Comparison between the current period and the last period (%)

  • Top bot agents and when they were last seen

  • Top content pages accessed by LLM bots

  • Recent bot requests

  • A date filter to see data from the last 24 hours, last 7 days, or last 30 days

Scrunch AI's Agent Traffic feature lets customers track, at a granular level, which AI platforms are consuming their content and for what purpose, giving them a clearer picture of how that content:

  • is surfaced in AI platforms like ChatGPT

  • drives AI responses to relevant questions

  • ultimately influences how AI describes and recommends their brand, products, and services, and drives click-throughs to their site(s)


Adding Your Website

1. Open the Scrunch app.

2. Navigate to the Agent Traffic menu.

3. You'll see the list of websites already connected to Agent Traffic.

4. Click + Connect Site at the top.

5. Select API as your platform.

6. A dedicated instructions page will appear, showing your Site ID, Webhook URL, and API Key.

ℹ️ Each site has its own endpoint and key. Don't reuse them across different sites or integrations.


Integrating via the Custom API

Step 1: Locate your credentials

From the instructions page in Scrunch, copy your:

  • Site ID

  • Webhook URL

  • API Key

You'll use these in every request you send.


Step 2: Instrument your traffic

In your server, proxy, or logging pipeline, capture the following fields for each incoming HTTP request and its response:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| domain | string | Yes | The domain of the site (e.g. example.com). |
| user_agent | string | Yes | The full User-Agent string of the request. |
| url | string | Yes | The full URL that was requested (e.g. https://example.com/page). |
| path | string | Yes | The URL path (e.g. /page). |
| method | string | Yes | The HTTP method (e.g. GET, POST). |
| status_code | integer | Yes | The HTTP response status code (e.g. 200, 404). |
| timestamp | integer / float | Yes | Unix epoch timestamp in seconds (e.g. 1700000000). |
| response_time | integer | No | Response time in milliseconds. |
| ip | string | No | The IP address of the client making the request. |

⚠️ Omitting required fields will cause the event to fail validation (422). Always pass the original User-Agent string—Scrunch uses it to automatically identify and classify the bot.

ℹ️ For the full API reference, including all parameters, response codes, and schema details, see the Custom Web Traffic API reference.
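Before sending, it can help to assemble and validate each event against the schema above. Here is a minimal Python sketch; the `build_event` helper and its validation are illustrative, not part of the Scrunch API:

```python
import time

REQUIRED_FIELDS = ("domain", "user_agent", "url", "path",
                   "method", "status_code", "timestamp")

def build_event(domain, user_agent, url, path, method, status_code,
                response_time=None, ip=None):
    """Assemble one traffic event matching the required schema."""
    event = {
        "domain": domain,
        "user_agent": user_agent,          # pass through unmodified
        "url": url,
        "path": path,
        "method": method,
        "status_code": int(status_code),   # must be an integer, not "200"
        "timestamp": time.time(),          # Unix epoch in seconds, not ms
    }
    if response_time is not None:
        event["response_time"] = int(response_time)
    if ip is not None:
        event["ip"] = ip
    # Catch schema problems locally instead of getting a 422 from the API.
    missing = [f for f in REQUIRED_FIELDS if event.get(f) in (None, "")]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    return event
```

Coercing `status_code` to an integer and using seconds (not milliseconds) for the timestamp avoids the two most common validation failures described in Troubleshooting below.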


Step 3: Send traffic events to Scrunch

You can send events one at a time (JSON) or in batches (NDJSON).

Single event

Use Content-Type: application/json and send one JSON object per request.

curl -X POST "https://webhooks.scrunchai.com/v1/sites/{site_id}/platforms/custom/web-traffic" \
  -H "Content-Type: application/json" \
  -H "X-Api-Key: YOUR_API_KEY" \
  -d '{
    "domain": "example.com",
    "user_agent": "Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)",
    "url": "https://example.com/blog/post",
    "path": "/blog/post",
    "method": "GET",
    "status_code": 200,
    "timestamp": 1700000000,
    "response_time": 120,
    "ip": "203.0.113.1"
  }'

Batch events (NDJSON)

Use Content-Type: application/x-ndjson and send multiple events, one JSON object per line. This is the recommended approach for high-traffic environments.

curl -X POST "https://webhooks.scrunchai.com/v1/sites/{site_id}/platforms/custom/web-traffic" \
  -H "Content-Type: application/x-ndjson" \
  -H "X-Api-Key: YOUR_API_KEY" \
  -d '{"domain":"example.com","user_agent":"Mozilla/5.0 (compatible; GPTBot/1.0)","url":"https://example.com/page-1","path":"/page-1","method":"GET","status_code":200,"timestamp":1700000000}
{"domain":"example.com","user_agent":"Mozilla/5.0 (compatible; ClaudeBot/1.0)","url":"https://example.com/page-2","path":"/page-2","method":"GET","status_code":200,"timestamp":1700000060,"response_time":95,"ip":"198.51.100.42"}'
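If you buffer events server-side, the NDJSON batch above can be produced and posted from your pipeline. A minimal Python sketch, assuming you collect events as dicts; `to_ndjson` and `send_batch` are hypothetical helpers, not a Scrunch SDK:

```python
import json
import urllib.request

def to_ndjson(events):
    """Serialize a list of event dicts as NDJSON: one JSON object per line."""
    return "\n".join(json.dumps(e, separators=(",", ":")) for e in events)

def send_batch(events, site_id, api_key):
    """POST a batch to the Custom API endpoint (requires network access)."""
    req = urllib.request.Request(
        f"https://webhooks.scrunchai.com/v1/sites/{site_id}"
        "/platforms/custom/web-traffic",
        data=to_ndjson(events).encode("utf-8"),
        headers={
            "Content-Type": "application/x-ndjson",  # must match the body format
            "X-Api-Key": api_key,
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

Batching this way keeps request overhead low on high-traffic sites; flush the buffer periodically (e.g. every few seconds) and keep each batch under 1 MB uncompressed, per the tips below.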

A successful request returns:

{
  "status": "ok"
}



Step 4: Verify your integration

Wait up to 5 minutes for your site to show as Active in Scrunch. If you don't see traffic, test with a known bot User-Agent string:

curl -X POST "https://webhooks.scrunchai.com/v1/sites/{site_id}/platforms/custom/web-traffic" \
  -H "Content-Type: application/json" \
  -H "X-Api-Key: YOUR_API_KEY" \
  -d '{
    "domain": "yourdomain.com",
    "user_agent": "Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)",
    "url": "https://yourdomain.com/test-page",
    "path": "/test-page",
    "method": "GET",
    "status_code": 200,
    "timestamp": 1700000000
  }'

This will send a sample event to confirm your credentials and pipeline are working.

👉 Once configured, your integration will continuously stream traffic logs to Scrunch, giving you real-time visibility into how LLM bots interact with your content.


Troubleshooting and Tips

Don't see any traffic after integrating?

  • Ensure the Webhook URL and API Key match exactly what's shown in your Scrunch app.

  • Check that your Content-Type header matches the body format (application/json for single events, application/x-ndjson for batches).

  • Confirm your timestamp is a Unix epoch in seconds, not milliseconds.

  • Confirm you included all required fields (see Step 2).

  • Wait 5–10 minutes after the first successful request.

Getting a 422 error?

  • Validate your request body against the schema in Step 2.

  • Make sure status_code is an integer (not a string like "200").

  • Make sure timestamp is a number, not an ISO string.

Getting a 429 error?

  • You're being rate limited. Implement exponential backoff and respect the Retry-After response header.

Tips for better results

  • Use NDJSON batching to reduce request overhead for high-traffic sites.

  • Keep batch sizes under 1 MB uncompressed for optimal performance.

  • Always pass the original, unmodified User-Agent string from the incoming request — Scrunch uses it to classify the bot.

  • Exclude paths for static assets (CSS, JS, images) if you want cleaner data focused on content pages.

  • Include paths that serve PDFs — AI bots often request them.

  • If you manage multiple sites, repeat the process for each site in Scrunch. Never reuse credentials across sites.
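The static-asset and PDF tips above can be combined into a simple path filter before events are queued. A sketch in Python; the extension list and the `should_send` helper are illustrative choices, not Scrunch requirements:

```python
import os

# Common static-asset extensions to drop for cleaner, content-focused data.
STATIC_EXTENSIONS = {".css", ".js", ".png", ".jpg", ".jpeg", ".gif",
                     ".svg", ".ico", ".woff", ".woff2"}

def should_send(path):
    """Return True if this request path is worth reporting to Agent Traffic."""
    ext = os.path.splitext(path.split("?")[0])[1].lower()
    if ext == ".pdf":
        return True  # keep PDFs: AI bots often request them
    return ext not in STATIC_EXTENSIONS
```

Apply the filter as early as possible in your logging pipeline so static-asset requests never consume batch capacity.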
