Leveraging GetLocal Product Data for Keyword Intelligence
Goal. GetLocal (getlocal.ie) search demand currently enters the Keyword Intelligence KB as an anonymous gsc_getlocal blob: ~2.7M rows, all tagged to one "subscriber", with no record of which seller or category the demand belongs to. This plan documents how to attribute that demand to a seller and the seller's categories by parsing the getlocal.ie URL and mapping the product slug → seller → categories.
Status: planning. Verified against live data 2026-05-20.
Inputs (what we feed in)
- Fresh slug extract from the GetLocal developer (canonical). This replaces the legacy GL_REFDATA.ED_*/PRODUCT_GPI_MAP tables. Do NOT use the ED data (that was an older, separate workstream: pre‑2021, ~50% of products uncategorised, ~1M slugs unattached to a seller). Expected per product: slug (the product_shortcode), seller_id (the GPI listing id, e.g. IE_13193645_617610_14295), store slug/name, and the product's category(ies). → confirm exact schema + refresh cadence with the GL dev.
- GSC for getlocal.ie: Search Console export. We need the page (URL) dimension, not just the query, so every impression/click is tied to a getlocal.ie URL we can parse. → confirm we can pull page‑level GSC.
- Seller → categories: from the slug extract if present, otherwise from GL_REFDATA.gl_all_shops (gpi_store_maincategorylongname, ~4.4k stores). → decide source.
The getlocal.ie URL taxonomy (what the URL tells us)
Every getlocal.ie URL is one of five types. The patterns below are taken from the GL sitemap crawl (GL_REFDATA.GL_SITEMAPS), so the URL shapes are canonical:
| Type | URL pattern | Extract | Resolves to |
|---|---|---|---|
| Product (slug) | /product/{slug}/{pretty-name} | {slug} (= product_shortcode) | one product → one seller → categories |
| Store | /store/{store-slug} | {store-slug} | one seller directly |
| Browse category (product) | /browse-category/{cat}/all/ireland · /browse-category/{cat}/near/county-{county} | {cat} (+ county) | a product category (+ area); many sellers |
| Browse store category | /browse-store-category/{cat}/all/ireland | {cat} | a store category; many sellers |
| Search / Q page | /q/{term}/all/ireland/in/{category} | {term} (+ {category}) | a search query scoped to a category |
Real examples:
/product/91upy/inglot-smoulder-flutter-eye-set → slug 91upy
/store/milltown-painting → store-slug milltown-painting
/browse-category/socks/near/county-donegal → category socks, county donegal
/browse-store-category/bouncing-castles-inflatables/all/ireland → store-cat bouncing-castles-inflatables
/q/oven/all/ireland/in/kitchen-and-dining → query "oven" in kitchen-and-dining
/product/{slug}/… is ~95% of indexed URLs (20.9M of ~22M), so the slug path is
where most of the attribution work pays off.
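The five URL shapes above can be classified with one regular expression per type. The following is a minimal Python sketch, not the production classifier: the function name `classify_url` and the output keys are hypothetical, and it assumes that paths match the table exactly (an optional trailing slash is tolerated, and a product URL without the pretty-name segment is accepted, which may or may not occur in practice). Percent-encoded search terms are returned as-is.

```python
import re
from urllib.parse import urlparse

# One pattern per URL type from the taxonomy table. The path prefixes are
# disjoint, so the order of the list does not affect the result.
_PATTERNS = [
    ("product", re.compile(r"^/product/(?P<identifier>[^/]+)(?:/[^/]*)?/?$")),
    ("store", re.compile(r"^/store/(?P<identifier>[^/]+)/?$")),
    ("browse-category", re.compile(
        r"^/browse-category/(?P<identifier>[^/]+)/"
        r"(?:all/ireland|near/county-(?P<county>[^/]+))/?$")),
    ("browse-store-category", re.compile(
        r"^/browse-store-category/(?P<identifier>[^/]+)/all/ireland/?$")),
    ("q", re.compile(
        r"^/q/(?P<identifier>[^/]+)/all/ireland/in/(?P<in_category>[^/]+)/?$")),
]


def classify_url(url):
    """Return url_type, identifier, county and in_category, or None if unmatched."""
    path = urlparse(url).path  # works for full URLs and bare paths alike
    for url_type, pattern in _PATTERNS:
        match = pattern.match(path)
        if match:
            groups = match.groupdict()
            return {
                "url_type": url_type,
                "identifier": groups["identifier"],
                "county": groups.get("county"),
                "in_category": groups.get("in_category"),
            }
    return None  # not one of the five documented shapes
```

Returning `None` for unmatched paths, rather than guessing a type, keeps unclassified URLs countable so the five-type assumption can be checked against real GSC pages.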
The mapping chain
GSC row (getlocal.ie URL + query + impressions/clicks)
│ 1. classify URL → {product | store | browse-category | browse-store-category | q}
│ 2. extract identifier (slug / store-slug / category / term)
▼
├─ product : slug ──(slug extract)──▶ seller_id ──▶ seller's categories
├─ store : store-slug ─────────────▶ seller_id ──▶ seller's categories
└─ category / q : no single seller ──▶ attribute to the CATEGORY (+ county / term)
- Classify the URL (regex per pattern above).
- Extract the identifier.
- Product / store URLs → resolve to a single seller_id, then to that seller's categories (per‑seller demand).
- Browse‑category / q URLs → no single seller; attribute to the category (and county / query term). This is category‑level demand, the most directly useful signal for KI category benchmarks and "what customers care about".
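The resolution step of the chain can be sketched as a single function over an already-classified URL. This is an illustrative Python sketch under assumptions: `resolve_attribution`, the `parsed` dict keys, and the three lookup dicts (`slug_map`, `store_map`, `seller_categories`) are hypothetical names standing in for whatever the slug extract ends up providing. Slugs with no match are returned as "unresolved" rather than dropped.

```python
def resolve_attribution(parsed, slug_map, store_map, seller_categories):
    """Map a classified URL to seller-level or category-level attribution.

    parsed            -- dict with url_type, identifier, and optional
                         county / in_category (hypothetical shape)
    slug_map          -- product slug -> seller_id
    store_map         -- store slug   -> seller_id
    seller_categories -- seller_id    -> list of category names
    """
    url_type = parsed["url_type"]
    identifier = parsed["identifier"]
    result = {"level": None, "seller_id": None, "categories": [],
              "term": None, "county": None}

    if url_type in ("product", "store"):
        lookup = slug_map if url_type == "product" else store_map
        seller_id = lookup.get(identifier)
        if seller_id is None:
            result["level"] = "unresolved"  # keep the row so gaps stay visible
            return result
        result["level"] = "seller"
        result["seller_id"] = seller_id
        result["categories"] = list(seller_categories.get(seller_id, []))
        return result

    if url_type in ("browse-category", "browse-store-category"):
        result["level"] = "category"
        result["categories"] = [identifier]
        result["county"] = parsed.get("county")
        return result

    if url_type == "q":
        # For Q pages the identifier is the search term; the category is the
        # {category} segment after /in/.
        result["level"] = "category"
        result["categories"] = [parsed["in_category"]]
        result["term"] = identifier
        return result

    raise ValueError(f"unknown url_type: {url_type!r}")
```

Whether product and store categories should share one output field, as they do here, depends on the category bridge question and is left open.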
What it unlocks
- Per‑seller demand (product/store URLs): "what searches drive this GetLocal seller's visibility" → cross‑sell angles, seller‑side recommendations.
- Category‑level demand (browse/q URLs): feeds CATEGORY_* benchmarks and the consumer‑priorities ("what customers care about") inputs.
- Replaces the anonymous gsc_getlocal directory blob in KEYWORD_INTELLIGENCE with seller‑ and category‑attributed rows (keep source = 'gsc_getlocal', add a seller_id, populate normalized_category). See data-feeds-and-knowledge.md and the keyword‑KB pipeline (../data/keyword-intelligence/README.md).
Build steps (when greenlit)
- Land the GL dev's slug extract into a BQ table, e.g. fcr_operations.getlocal_slug_map (slug, seller_id, store_slug, category/categories), with a refresh job.
- URL classifier: SQL REGEXP_EXTRACTs (or a small worker fn) → url_type, identifier + optional county / in_category.
- Join GSC(page) → classifier → getlocal_slug_map → seller_id + categories.
- Write attributed rows into the KI pipeline (extend data/keyword-intelligence/ step 03, source‑type classification).
- (Optional) seller‑scoped keyword views in the dashboard / advisor.
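Once GSC rows have been joined to a seller or category, the output of the join step can be rolled up per query before it is written to the KI pipeline. The sketch below is one possible shape in Python, not the pipeline's actual step: `rollup_demand` and the row keys (`seller_id`, `category`, `query`, `impressions`, `clicks`) are assumptions. Rows that resolved to neither a seller nor a category are kept in an "unresolved" bucket so that attribution coverage can be counted rather than guessed.

```python
from collections import defaultdict


def rollup_demand(attributed_rows):
    """Sum impressions and clicks per (level, key, query).

    Each input row is a dict with seller_id, category, query, impressions
    and clicks (hypothetical shape). A row with a seller_id counts as
    seller-level demand; otherwise a row with a category counts as
    category-level demand; anything else is 'unresolved'.
    """
    totals = defaultdict(lambda: {"impressions": 0, "clicks": 0})
    for row in attributed_rows:
        if row.get("seller_id") is not None:
            key = ("seller", row["seller_id"], row["query"])
        elif row.get("category") is not None:
            key = ("category", row["category"], row["query"])
        else:
            key = ("unresolved", None, row["query"])
        totals[key]["impressions"] += row["impressions"]
        totals[key]["clicks"] += row["clicks"]
    return dict(totals)
```

In production the same grouping would more likely be a GROUP BY in BigQuery; the Python form is only meant to make the keys and the unresolved bucket explicit.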
Open questions
- Exact slug‑extract schema + refresh cadence from the GL dev.
- Page‑level GSC for getlocal.ie — available, or query‑level only?
- Store‑slug → seller_id: is store‑slug in the extract, or do we derive it from gl_all_shops?
- Category bridge: GetLocal browse‑category slugs vs our normalized_category / CATEGORY_MAPPING. Do we need a GL‑category → our‑category mapping too?
Related: bigquery-and-sync.md (GL data lives in the
separate GL_REFDATA BQ dataset, not in fcr_operations/the repo);
data-feeds-and-knowledge.md.
FCR Dashboard documentation · generated from docs/ · keep counts verified, not guessed.