Home / Blog / Grapes don't have a table
May 11, 2026Β· Wine World Map
Grapes don't have a table
If you visit /grapes you see a list of 80-ish wine grape varieties,
each with a count of how many regions grow it, a list of countries,
and three sample regions. It looks like a join across a grapes
table.
There is no grapes table.
Every row in regions has a grapes column of type text[] β a
Postgres array of strings, free-form, populated during the seed
scripts from whatever Wikipedia / WSET-style sources we had handy:
SELECT name, grapes FROM regions WHERE name = 'Tuscany';
--
-- Tuscany | {"Sangiovese","Vernaccia","Trebbiano","Vermentino","Canaiolo"}
When you load /grapes, the page reads every region's grapes array,
flattens them into a map keyed by slug of the grape name, and counts
distinct regions per slug. That's the whole "grapes table" β built
fresh on each request from the regions table.
Why no table?
We considered it. The relational version would be:
CREATE TABLE grapes (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
slug TEXT NOT NULL UNIQUE
);
CREATE TABLE region_grapes (
region_id INT REFERENCES regions(id),
grape_id INT REFERENCES grapes(id),
PRIMARY KEY (region_id, grape_id)
);
This is correct, normalized, and the wrong tradeoff for the actual problem. Three reasons:
1. The source data is dirty. A region's grape list comes from Wikipedia infoboxes and WSET regional notes. The same grape shows up as Sangiovese, Sangiovese Grosso, Brunello (which is a clone of Sangiovese), and Prugnolo Gentile (another local name for the same grape). With a normalized table we'd need a curated canonical list and a synonyms table to import into. With a string array we shrug, normalize on read, and ship.
2. The query is the join. The /grapes page reads ~600 region
rows, each averaging maybe 5 grapes in its array. That's 3,000
strings to bucket. JavaScript chews through that in roughly 12 ms.
A SQL GROUP BY grape_id would be faster, but neither is the
bottleneck β both finish before the network round-trip to Supabase
does.
3. Editing is easier. Adding a grape to a region is one update
to a text[] column. Adding a grape variety we don't currently
track is zero schema changes β just include the new name in any
region's array and it appears in /grapes.
The aggregation
The page itself is the schema:
const map = new Map<string, {
display: string,
regions: { name: string; country?: string }[],
countries: Set<string>
}>()
for (const r of rows) {
if (!Array.isArray(r.grapes)) continue
for (const raw of r.grapes) {
const slug = nameToSlug(raw)
if (!slug) continue
const entry = map.get(slug) || { display: raw, regions: [], countries: new Set() }
entry.regions.push({ name: r.name, country: r.countries?.name })
if (r.countries?.name) entry.countries.add(r.countries.name)
map.set(slug, entry)
}
}
The key is nameToSlug(raw), the same one-line normalizer from the
dedup post. So Sangiovese and sangiovese
collapse, but Sangiovese and Sangiovese Grosso stay distinct.
That's the trade-off in plain sight: cheap dedup, no semantic
synonyms.
What we give up
Sorting grapes by global vineyard area would require actual
hectares-planted data, which neither Wikipedia nor OSM has in
machine-readable form. With a real grapes table we could at least
store it once someone hand-curated the numbers. As-is, we sort
by region count, which is a reasonable proxy and demands zero data
entry.
We also can't show "grapes that share a parent variety" without a hand-built synonyms table. Which means Brunello and Sangiovese appear as separate grape pages. The wine purists won't love it. Most visitors won't notice.
The trick this post is actually about
The thing worth taking away: when your data is dirty and your
edits are infrequent, a denormalized array beats a normalized table.
The schema cost of a real grapes table isn't just the two
CREATE TABLE statements β it's the curation pipeline, the synonym
table, the import-time validation, and the dashboard for adding new
varieties. None of that is free. A text[] column is free.
The day we need to ship region-by-region hectares-planted data is the day we add the table. Until then, the array is the schema.
#data-model#grapes#supabase