Modeling tags from the Responses API (and avoiding row explosion)

The Responses API returns several many-to-many fields (like tags, key_topics, citations, and competitors).

If you flatten these arrays the wrong way in your warehouse or BI tool, you can accidentally create one row per combination (for example, every combination of tag × competitor), which inflates counts and breaks your metrics.

This guide walks through how to model tags so you don’t end up with a row for every combination.

The core issue: many-to-many fields

Each response from the /responses API includes arrays like:

{
  "id": 123,
  "prompt_id": 456,
  "tags": ["Evergreen", "Consideration"],
  "competitors": [
    { "name": "Brand A", "present": true },
    { "name": "Brand B", "present": true }
  ],
  "citations": [
    { "url": "https://example.com/one" },
    { "url": "https://example.com/two" }
  ],
  "..."
}

If you explode/UNNEST multiple arrays at once, you end up with:

2 tags × 2 competitors × 2 citations
= 8 rows for a single response

Any metric that just does COUNT(*) or SUM() on those rows will be inflated.

Principle #1: Only explode one array at a time

When you need a view “by tag”, only explode tags:

-- Good: one row per response × tag
WITH response_tags AS (
  SELECT
    r.id AS response_id,
    t AS tag
  FROM responses r
  CROSS JOIN UNNEST(r.tags) AS t
)
SELECT
  tag,
  COUNT(DISTINCT response_id) AS responses
FROM response_tags
GROUP BY tag;

Avoid exploding tags and competitors in the same SELECT:

-- Avoid: this creates one row per (response × tag × competitor)
SELECT
  r.id,
  t AS tag,
  c.name AS competitor_name
FROM responses r
CROSS JOIN UNNEST(r.tags) AS t
CROSS JOIN UNNEST(r.competitors) AS c;

You can still work with both, but do it in separate steps (see next principle).

Principle #2: Build separate “bridge” tables

A clean pattern is to keep one base fact table plus separate “bridge” tables for each many-to-many field:

responses (one row per response)
response_tags (one row per response × tag)
response_competitors (one row per response × competitor)
response_citations (one row per response × citation), if needed

Example modeling:

-- Base fact: one row per response
CREATE TABLE responses AS
SELECT
  id,
  created_at,
  prompt_id,
  platform,
  brand_present,
  brand_sentiment,
  brand_position,
  response_text
FROM raw_responses;

-- Bridge: tags
CREATE TABLE response_tags AS
SELECT
  r.id AS response_id,
  t AS tag
FROM raw_responses r
CROSS JOIN UNNEST(r.tags) AS t;

-- Bridge: competitors
CREATE TABLE response_competitors AS
SELECT
  r.id AS response_id,
  c.name AS competitor_name,
  c.present,
  c.position,
  c.sentiment
FROM raw_responses r
CROSS JOIN UNNEST(r.competitors) AS c;

From there:

Use responses for most metrics (brand presence, sentiment, etc.)
Join to one bridge table at a time depending on the slice you need

Example: brand presence by tag:

WITH base AS (
  SELECT
    r.id,
    r.brand_present
  FROM responses r
),
by_tag AS (
  SELECT
    t.tag,
    COUNT(DISTINCT b.id) AS responses,
    AVG(CASE WHEN b.brand_present THEN 1 ELSE 0 END) AS brand_presence_pct
  FROM base b
  JOIN response_tags t
    ON t.response_id = b.id
  GROUP BY t.tag
)
SELECT *
FROM by_tag;

Notice we:

Aggregate at the response level first, then
Join tags, and
Use COUNT(DISTINCT) to avoid double counting.

Principle #3: Pre-aggregate before joining bridge tables

If you need a view like brand presence by tag and competitor, don’t explode both arrays at once.

Instead:

Aggregate brand metrics at the response level
Join to tags or competitors at the level you care about
If you truly need “tag × competitor”, be explicit and use COUNT(DISTINCT response_id)

Example: brand presence by competitor (ignoring tags):

WITH base AS (
  SELECT
    id,
    brand_present
  FROM responses
),
by_competitor AS (
  SELECT
    c.competitor_name,
    COUNT(DISTINCT b.id) AS responses,
    AVG(CASE WHEN b.brand_present THEN 1 ELSE 0 END) AS brand_presence_pct
  FROM base b
  JOIN response_competitors c
    ON c.response_id = b.id
  GROUP BY c.competitor_name
)
SELECT *
FROM by_competitor;

If you absolutely need tag × competitor:

WITH base AS (
  SELECT
    id,
    brand_present
  FROM responses
),
by_tag AS (
  SELECT
    response_id,
    tag
  FROM response_tags
),
by_competitor AS (
  SELECT
    response_id,
    competitor_name
  FROM response_competitors
)
SELECT
  t.tag,
  c.competitor_name,
  COUNT(DISTINCT b.id) AS responses,
  AVG(CASE WHEN b.brand_present THEN 1 ELSE 0 END) AS brand_presence_pct
FROM base b
JOIN by_tag t
  ON t.response_id = b.id
JOIN by_competitor c
  ON c.response_id = b.id
GROUP BY
  t.tag,
  c.competitor_name;

The key is COUNT(DISTINCT b.id) rather than COUNT(*).

Principle #4: In BI tools, use distinct counts + one exploded field

In tools like Looker Studio, Power BI, or Tableau:

Build one data source that already follows the pattern above (base fact + bridge)
When creating charts:
- Only use one exploded dimension (tags or competitors) in a given dataset
- Use COUNT_DISTINCT(response_id) for counts instead of plain COUNT(*)
- If you see metrics jump when you add a second dimension, you’re probably hitting the combination problem

Quick checklist

When working with /responses:

✅ Keep a base responses table (one row per response)

✅ Create 1 bridge table per array (response_tags, response_competitors, etc.)

✅ Use COUNT(DISTINCT response_id) when joining bridge tables

✅ In BI tools, only explode one many-to-many dimension per dataset

❌ Don’t explode tags, competitors, and citations in a single step

❌ Don’t sum metrics over the fully exploded combination grid

Integrating Scrunch Data into Adobe Analytics

Scrunch Metrics Guide