Leveraging GetLocal Product Data for Keyword Intelligence

Goal. GetLocal (getlocal.ie) search demand currently enters the Keyword Intelligence KB as an anonymous gsc_getlocal blob: ~2.7M rows, all tagged to one "subscriber", with nothing recording which seller or category the demand belongs to. This plan documents how to attribute that demand to a seller and the seller's categories by parsing the getlocal.ie URL and mapping product slug → seller → categories.

Status: planning. Verified against live data 2026-05-20.

Inputs (what we feed in)

Fresh slug extract from the GetLocal developer — canonical; this replaces the legacy GL_REFDATA.ED_* / PRODUCT_GPI_MAP tables. Do NOT use the ED data (that was an older, separate workstream — pre‑2021, ~50% of products uncategorised, ~1M slugs unattached to a seller). Expected per product: slug (the product_shortcode), seller_id (the GPI listing id, e.g. IE_13193645_617610_14295), store slug/name, and the product's category(ies). → confirm exact schema + refresh cadence with the GL dev.
GSC for getlocal.ie — Search Console export. We need the page (URL) dimension, not just the query, so every impression/click is tied to a getlocal.ie URL we can parse. → confirm we can pull page‑level GSC.
Seller → categories — from the slug extract if present, otherwise from GL_REFDATA.gl_all_shops (gpi_store_maincategorylongname, ~4.4k stores). → decide source.

The getlocal.ie URL taxonomy (what the URL tells us)

Every getlocal.ie URL is one of five types. This taxonomy is grounded in the GL sitemap crawl (GL_REFDATA.GL_SITEMAPS); the URL shapes are canonical:

Type	URL pattern	Extract	Resolves to
Product (slug)	`/product/{slug}/{pretty-name}`	`{slug}` (= `product_shortcode`)	one product → one seller → categories
Store	`/store/{store-slug}`	`{store-slug}`	one seller directly
Browse category (product)	`/browse-category/{cat}/all/ireland` · `/browse-category/{cat}/near/county-{county}`	`{cat}` (+ county)	a product category (+ area) — many sellers
Browse store category	`/browse-store-category/{cat}/all/ireland`	`{cat}`	a store category — many sellers
Search / Q page	`/q/{term}/all/ireland/in/{category}`	`{term}` (+ `{category}`)	a search query scoped to a category

Real examples:

/product/91upy/inglot-smoulder-flutter-eye-set            → slug 91upy
/store/milltown-painting                                  → store-slug milltown-painting
/browse-category/socks/near/county-donegal               → category socks, county donegal
/browse-store-category/bouncing-castles-inflatables/all/ireland → store-cat bouncing-castles-inflatables
/q/oven/all/ireland/in/kitchen-and-dining                → query "oven" in kitchen-and-dining

/product/{slug}/… is ~95% of indexed URLs (20.9M of ~22M), so the slug path is where most of the attribution work pays off.
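The five URL shapes above can be sketched as one regex per type. This is a minimal illustration in Python, not the planned implementation: the names `ClassifiedUrl` and `classify_url` are hypothetical, the `/in/{category}` suffix on search pages is assumed to be optional, and URL-decoding of the search term is left out.

```python
import re
from typing import NamedTuple, Optional
from urllib.parse import urlparse


class ClassifiedUrl(NamedTuple):
    url_type: str                      # product | store | browse-category | browse-store-category | q
    identifier: str                    # slug / store-slug / category / term
    county: Optional[str] = None       # set only for /near/county-{county}
    in_category: Optional[str] = None  # set only for /q/.../in/{category}


# One pattern per URL type in the taxonomy table; all are anchored at the path start.
_PATTERNS = [
    ("product", re.compile(r"^/product/(?P<identifier>[^/]+)(?:/|$)")),
    ("store", re.compile(r"^/store/(?P<identifier>[^/]+)/?$")),
    ("browse-category", re.compile(
        r"^/browse-category/(?P<identifier>[^/]+)"
        r"/(?:all/ireland|near/county-(?P<county>[^/]+))/?$")),
    ("browse-store-category", re.compile(
        r"^/browse-store-category/(?P<identifier>[^/]+)/all/ireland/?$")),
    # Assumption: the "/in/{category}" scope may be absent on some search pages.
    ("q", re.compile(
        r"^/q/(?P<identifier>[^/]+)/all/ireland(?:/in/(?P<in_category>[^/]+))?/?$")),
]


def classify_url(url: str) -> Optional[ClassifiedUrl]:
    """Return the URL type and extracted identifier, or None if no pattern matches."""
    path = urlparse(url).path  # accepts both full URLs and bare paths
    for url_type, pattern in _PATTERNS:
        match = pattern.match(path)
        if match:
            groups = match.groupdict()
            return ClassifiedUrl(
                url_type,
                groups["identifier"],
                groups.get("county"),
                groups.get("in_category"),
            )
    return None
```

Returning `None` for unmatched paths keeps unknown URL shapes countable rather than silently forcing them into one of the five types.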

The mapping chain

GSC row (getlocal.ie URL + query + impressions/clicks)
  │  1. classify URL → {product | store | browse-category | browse-store-category | q}
  │  2. extract identifier (slug / store-slug / category / term)
  ▼
  ├─ product  : slug ──(slug extract)──▶ seller_id ──▶ seller's categories
  ├─ store    : store-slug ─────────────▶ seller_id ──▶ seller's categories
  └─ category / q : no single seller ──▶ attribute to the CATEGORY (+ county / term)

Classify the URL (regex per pattern above).
Extract the identifier.
Product / store URLs → resolve to a single seller_id, then to that seller's categories (per‑seller demand).
Browse‑category / q URLs → no single seller; attribute to the category (and county / query term). This is category‑level demand — the most directly useful signal for KI category benchmarks and "what customers care about".
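The resolution rules above could look roughly like the following Python sketch. Everything here is illustrative: `attribute`, the `Attribution` shape, and the three lookup dictionaries are hypothetical stand-ins for whatever the slug extract and seller-to-categories source end up providing.

```python
from typing import Dict, List, Optional, TypedDict


class Attribution(TypedDict):
    level: str                # "seller" | "category" | "unresolved"
    seller_id: Optional[str]
    categories: List[str]


def attribute(
    url_type: str,
    identifier: str,
    slug_to_seller: Dict[str, str],          # hypothetical: product slug -> seller_id
    store_slug_to_seller: Dict[str, str],    # hypothetical: store slug -> seller_id
    seller_categories: Dict[str, List[str]], # hypothetical: seller_id -> categories
    in_category: Optional[str] = None,       # only meaningful for q URLs
) -> Attribution:
    """Resolve one classified URL to seller-level or category-level demand."""
    if url_type in ("product", "store"):
        lookup = slug_to_seller if url_type == "product" else store_slug_to_seller
        seller_id = lookup.get(identifier)
        if seller_id is None:
            # Slug missing from the extract: keep the row, but flag it.
            return {"level": "unresolved", "seller_id": None, "categories": []}
        return {
            "level": "seller",
            "seller_id": seller_id,
            "categories": seller_categories.get(seller_id, []),
        }
    if url_type in ("browse-category", "browse-store-category"):
        # No single seller: the identifier is itself the category.
        return {"level": "category", "seller_id": None, "categories": [identifier]}
    if url_type == "q":
        # The identifier is the search term; the category comes from /in/{category}.
        categories = [in_category] if in_category else []
        return {"level": "category", "seller_id": None, "categories": categories}
    raise ValueError(f"unknown url_type: {url_type!r}")
```

An explicit "unresolved" level (rather than dropping the row) makes it possible to measure how much demand the slug extract fails to cover.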

What it unlocks

Per‑seller demand (product/store URLs): "what searches drive this GetLocal seller's visibility" → cross‑sell angles, seller‑side recommendations.
Category‑level demand (browse/q URLs): feeds CATEGORY_* benchmarks and the consumer‑priorities ("what customers care about") inputs.
Replaces the anonymous gsc_getlocal directory blob in KEYWORD_INTELLIGENCE with seller‑ and category‑attributed rows (keep source = 'gsc_getlocal', add a seller_id, populate normalized_category). See data-feeds-and-knowledge.md and the keyword‑KB pipeline (../data/keyword-intelligence/README.md).

Build steps (when greenlit)

Land the GL dev's slug extract into a BQ table, e.g. fcr_operations.getlocal_slug_map (slug, seller_id, store_slug, category/categories), with a refresh job.
URL classifier: SQL REGEXP_EXTRACTs (or a small worker fn) → url_type + identifier + optional county / in_category.
Join GSC(page) → classifier → getlocal_slug_map → seller_id + categories.
Write attributed rows into the KI pipeline (extend data/keyword-intelligence/ step 03, source‑type classification).
(Optional) seller‑scoped keyword views in the dashboard / advisor.
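As a rough illustration of the join step for product URLs, the sketch below rolls GSC impressions and clicks up to seller level and counts the impressions whose slug has no match. It is written in Python for readability; the real join would presumably be a BigQuery LEFT JOIN against the slug map. The function name `rollup_by_seller` and the row shape `(slug, impressions, clicks)` are assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def rollup_by_seller(
    gsc_rows: Iterable[Tuple[str, int, int]],  # assumed shape: (slug, impressions, clicks)
    slug_to_seller: Dict[str, str],            # hypothetical: product slug -> seller_id
) -> Tuple[Dict[str, Dict[str, int]], int]:
    """Sum impressions/clicks per seller; also return unmatched impressions."""
    totals: Dict[str, Dict[str, int]] = defaultdict(
        lambda: {"impressions": 0, "clicks": 0}
    )
    unmatched_impressions = 0
    for slug, impressions, clicks in gsc_rows:
        seller_id = slug_to_seller.get(slug)
        if seller_id is None:
            # Equivalent to a NULL on the right side of a LEFT JOIN.
            unmatched_impressions += impressions
            continue
        totals[seller_id]["impressions"] += impressions
        totals[seller_id]["clicks"] += clicks
    return dict(totals), unmatched_impressions
```

Tracking unmatched impressions alongside the totals gives a verified coverage figure for the slug extract instead of an assumed one.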

Open questions

Exact slug‑extract schema + refresh cadence from the GL dev.
Page‑level GSC for getlocal.ie — available, or query‑level only?
Store‑slug → seller_id — is store‑slug in the extract, or derive from gl_all_shops?
Category bridge — GetLocal browse‑category slugs vs our normalized_category / CATEGORY_MAPPING: do we need a GL‑category → our‑category mapping too?

Related: bigquery-and-sync.md (GL data lives in the separate GL_REFDATA BQ dataset, not in fcr_operations/the repo); data-feeds-and-knowledge.md.

FCR Dashboard documentation · generated from docs/ · keep counts verified, not guessed.