The Responses API returns several many-to-many fields (like tags, key_topics, citations, and competitors).
If you flatten these arrays the wrong way in your warehouse or BI tool, you can accidentally create one row per combination (for example, every combination of tag × competitor), which inflates counts and breaks your metrics.
This guide walks through how to model tags so you don’t end up with a row for every combination.
The core issue: many-to-many fields
Each response from the /responses API includes arrays like:
{
"id": 123,
"prompt_id": 456,
"tags": ["Evergreen", "Consideration"],
"competitors": [
{ "name": "Brand A", "present": true },
{ "name": "Brand B", "present": true }
],
"citations": [
{ "url": "https://example.com/one" },
{ "url": "https://example.com/two" }
],
"..."
}If you explode/UNNEST multiple arrays at once, you end up with:
2 tags × 2 competitors × 2 citations
= 8 rows for a single response
Any metric that just does COUNT(*) or SUM() on those rows will be inflated.
Principle #1: Only explode one array at a time
When you need a view “by tag”, only explode tags:
-- Good: one row per response × tag
WITH response_tags AS (
SELECT
r.id AS response_id,
t AS tag
FROM responses r
CROSS JOIN UNNEST(r.tags) AS t
)
SELECT
tag,
COUNT(DISTINCT response_id) AS responses
FROM response_tags
GROUP BY tag;
Avoid exploding tags and competitors in the same SELECT:
-- Avoid: this creates one row per (response × tag × competitor)
SELECT
r.id,
t AS tag,
c.name AS competitor_name
FROM responses r
CROSS JOIN UNNEST(r.tags) AS t
CROSS JOIN UNNEST(r.competitors) AS c;
You can still work with both, but do it in separate steps (see next principle).
Principle #2: Build separate “bridge” tables
A clean pattern is to keep one base fact table plus separate “bridge” tables for each many-to-many field:
responses(one row per response)response_tags(one row per response × tag)response_competitors(one row per response × competitor)response_citations(one row per response × citation), if needed
Example modeling:
-- Base fact: one row per response
CREATE TABLE responses AS
SELECT
id,
created_at,
prompt_id,
platform,
brand_present,
brand_sentiment,
brand_position,
response_text
FROM raw_responses;
-- Bridge: tags
CREATE TABLE response_tags AS
SELECT
r.id AS response_id,
t AS tag
FROM raw_responses r
CROSS JOIN UNNEST(r.tags) AS t;
-- Bridge: competitors
CREATE TABLE response_competitors AS
SELECT
r.id AS response_id,
c.name AS competitor_name,
c.present,
c.position,
c.sentiment
FROM raw_responses r
CROSS JOIN UNNEST(r.competitors) AS c;
From there:
Use
responsesfor most metrics (brand presence, sentiment, etc.)Join to one bridge table at a time depending on the slice you need
Example: brand presence by tag:
WITH base AS (
SELECT
r.id,
r.brand_present
FROM responses r
),
by_tag AS (
SELECT
t.tag,
COUNT(DISTINCT b.id) AS responses,
AVG(CASE WHEN b.brand_present THEN 1 ELSE 0 END) AS brand_presence_pct
FROM base b
JOIN response_tags t
ON t.response_id = b.id
GROUP BY t.tag
)
SELECT *
FROM by_tag;
Notice we:
Aggregate at the response level first, then
Join tags, and
Use
COUNT(DISTINCT)to avoid double counting.
Principle #3: Pre-aggregate before joining bridge tables
If you need a view like brand presence by tag and competitor, don’t explode both arrays at once.
Instead:
Aggregate brand metrics at the response level
Join to tags or competitors at the level you care about
If you truly need “tag × competitor”, be explicit and use
COUNT(DISTINCT response_id)
Example: brand presence by competitor (ignoring tags):
WITH base AS (
SELECT
id,
brand_present
FROM responses
),
by_competitor AS (
SELECT
c.competitor_name,
COUNT(DISTINCT b.id) AS responses,
AVG(CASE WHEN b.brand_present THEN 1 ELSE 0 END) AS brand_presence_pct
FROM base b
JOIN response_competitors c
ON c.response_id = b.id
GROUP BY c.competitor_name
)
SELECT *
FROM by_competitor;
If you absolutely need tag × competitor:
WITH base AS (
SELECT
id,
brand_present
FROM responses
),
by_tag AS (
SELECT
response_id,
tag
FROM response_tags
),
by_competitor AS (
SELECT
response_id,
competitor_name
FROM response_competitors
)
SELECT
t.tag,
c.competitor_name,
COUNT(DISTINCT b.id) AS responses,
AVG(CASE WHEN b.brand_present THEN 1 ELSE 0 END) AS brand_presence_pct
FROM base b
JOIN by_tag t
ON t.response_id = b.id
JOIN by_competitor c
ON c.response_id = b.id
GROUP BY
t.tag,
c.competitor_name;
The key is COUNT(DISTINCT b.id) rather than COUNT(*).
Principle #4: In BI tools, use distinct counts + one exploded field
In tools like Looker Studio, Power BI, or Tableau:
Build one data source that already follows the pattern above (base fact + bridge)
When creating charts:
Only use one exploded dimension (tags or competitors) in a given dataset
Use
COUNT_DISTINCT(response_id)for counts instead of plainCOUNT(*)If you see metrics jump when you add a second dimension, you’re probably hitting the combination problem
Quick checklist
When working with /responses:
✅ Keep a base responses table (one row per response)
✅ Create 1 bridge table per array (response_tags, response_competitors, etc.)
✅ Use COUNT(DISTINCT response_id) when joining bridge tables
✅ In BI tools, only explode one many-to-many dimension per dataset
❌ Don’t explode tags, competitors, and citations in a single step
❌ Don’t sum metrics over the fully exploded combination grid
