KEYWORD_INTELLIGENCE — Structure, Maintenance & Downstream Impact
1. Table Overview
listingmanager-1529856313699.fcr_operations.KEYWORD_INTELLIGENCE is the central keyword data store for FCR's digital marketing intelligence. It holds 28.7 million rows of search term data across 8 sources, representing what people search for when looking for services in the categories FCR clients operate in.
Schema (17 columns)
| Column | Type | Description |
|---|---|---|
subscriber_id |
STRING | FCR subscriber ID, or directory listing ID for GP/GetLocal sources |
business_name |
STRING | Business name |
category |
STRING | Raw category (as received from source — often messy) |
city |
STRING | Business city |
county |
STRING | Business county |
keyword |
STRING | The search term |
source |
STRING | Data origin (8 sources — see below) |
month |
DATE | Month partition |
impressions |
INT64 | Search impressions |
clicks |
INT64 | Clicks (null for GMB source) |
avg_cpc |
FLOAT64 | Average cost-per-click (Google Ads only) |
conversions |
FLOAT64 | Conversions (Google Ads only) |
ctr |
FLOAT64 | Click-through rate |
normalized_category |
STRING | NEW — Clean canonical category name |
keyword_intent |
STRING | NEW — commercial, informational, navigational, or brand |
is_fcr_client |
BOOL | NEW — TRUE if subscriber is an active FCR client |
source_type |
STRING | NEW — paid, owned_organic, owned_gbp, directory, third_party |
8 Data Sources
| Source | Source Type | What It Is | Rows | Subscribers |
|---|---|---|---|---|
gsc_organic |
owned_organic | Google Search Console — FCR-built websites | 12.6M | 1,702 |
google_ads |
paid | Google Ads — actual search terms that triggered ads | 7.0M | 703 |
gsc_goldenpages |
directory | GSC for Golden Pages — all GP directory listings | 3.5M | 86,944 |
gsc_getlocal |
directory | GSC for GetLocal directory | 2.7M | 1 |
gmb_search |
owned_gbp | Google Business Profile — discovery keywords | 2.5M | 4,717 |
gsc_phonebook |
directory | GSC for Phonebook directory | 315K | 1 |
ahrefs_organic |
third_party | Ahrefs — organic keyword rankings | 5.7K | 267 |
ahrefs_tracked |
third_party | Ahrefs — manually tracked keywords | 3.0K | 225 |
Important distinctions:
gsc_goldenpagessubscriber_id values are GP listing IDs (86K+), not FCR client IDs. Useis_fcr_clientto distinguish.gmb_searchhas no click data — the GBP Performance API doesn't provide keyword-level clicks.google_adsdata is search terms (what people actually typed), not campaign keywords.- All data is cumulative over time, not monthly snapshots. Use
GROUP BY monthfor monthly breakdowns.
2. New Columns (Added March 2026)
normalized_category
Problem solved: The raw category field had 12,565 distinct values — many were business names with postcodes ("Cunninghams Funeral Directors Dublin D15"), duplicates with different casing ("Funeral director" vs "Funeral Directors"), or synonyms ("Funeral home" = "Funeral director").
How it works: A lookup table CATEGORY_NORMALISATION maps every raw category to a clean canonical name via rules:
- Strip eircode patterns (e.g. D15, A91, T12)
- Strip business names ending with Irish town/city names
- Filter out very long entries (>50 chars) that are clearly business names
- Split comma-separated categories (take first segment)
- Normalise case (Title Case)
- Merge synonyms (Funeral Home → Funeral Director, Tyre Shop → Tire Shop, etc.)
Result: 12,565 raw categories → 9,447 normalised categories. Unknown/unmappable entries tagged as "Unknown".
keyword_intent
Problem solved: Informational queries (death notices, "what to wear to a funeral") were mixed with commercial queries ("funeral directors near me"), making keyword lists misleading for sales and LRC.
Classification rules (in priority order):
- navigational — references rip.ie, goldenpages, google maps, tripadvisor, etc.
- informational — death notices, obituaries, how-to/what-is patterns
- commercial — "near me", cost/price/quote signals, locality patterns, Google Ads clicks
- commercial — all Google Ads impressions (even without clicks)
- commercial — all GMB discovery queries (someone looking for the service)
- informational — everything else (safe default)
Impact: For Funeral Directors, this separated 8.4M informational impressions (death notices) from 3.4M commercial impressions — the data that actually matters for selling.
is_fcr_client
Problem solved: The gsc_goldenpages source showed 418 "subscribers" for funeral directors — but only ~95 are actual FCR clients. The rest are GP directory listing IDs.
How it works: JOIN against active_clients on subscriber_id. TRUE = active FCR client, FALSE = directory listing or non-client.
source_type
Groups the 8 sources into 5 meaningful tiers for reporting: paid, owned_organic, owned_gbp, directory, third_party.
3. Monthly Maintenance Pipeline
When to Run
After the monthly data load — when new GSC, GA, GMB performance, and Ahrefs data has been inserted into KEYWORD_INTELLIGENCE. The underlying data lands monthly, so normalisation runs monthly.
What to Run
Open the BQ console and run data/keyword-intelligence/00_run_all.sql as a single script. On subsequent runs after the first, it only processes new/unclassified rows (WHERE column IS NULL), so it's fast.
The script executes 6 steps:
| Step | What | Touches |
|---|---|---|
| 0 | Add columns (IF NOT EXISTS — no-op after first run) | KEYWORD_INTELLIGENCE schema |
| 1 | Update CATEGORY_NORMALISATION + backfill normalized_category |
CATEGORY_NORMALISATION table + KI rows |
| 2 | Classify keyword_intent on new rows |
KI rows |
| 3 | Set is_fcr_client + source_type on new rows |
KI rows |
| 4 | Recreate KEYWORD_INTELLIGENCE_MONTHLY view |
View (auto-reflects changes) |
| 5 | Rebuild CATEGORY_BENCHMARKS + CATEGORY_TOP_KEYWORDS |
These tables are fully replaced |
Adding New Synonyms
When a new category synonym is discovered (e.g. "Heating Engineer" should map to "Plumber"), add a WHEN clause to the synonym CASE block in step 1 of 00_run_all.sql, then re-run. The MERGE will update existing mappings.
Adding New Intent Rules
When a keyword pattern is misclassified (e.g. "dublin deaths" tagged as commercial because it contains a locality), add the pattern to the appropriate REGEXP_CONTAINS block in step 2. New rules apply to all unclassified rows, but to reclassify existing rows, you'd need to NULL out keyword_intent for the affected rows first.
4. Downstream Impact
4.1 Local Rank Check (LRC) — Keyword Selection
Where keywords are chosen: The /prospect skill selects exactly 5 keywords for LRC submission.
Before normalisation: Keywords were selected from categoryKeywords.topKeywords sorted by raw impressions. This meant death notices, rip.ie queries, and other informational noise dominated the top keyword lists for categories like Funeral Directors, Solicitors, and Medical Clinics. The LRC would check rankings for keywords nobody uses to find a funeral director.
After normalisation: The keyword_intent field allows the prospect skill to filter for commercial intent keywords only. The top keywords by impressions are now genuinely the terms people use when looking for the service:
| Before (raw) | After (intent-filtered) |
|---|---|
| "rip ie recent deaths south county dublin" (10,800 imp) | "funeral directors dublin" (56,965 imp) |
| "death notices dublin" (9,550 imp) | "funeral and cremation price list" (36,462 imp) |
| "latest death notices" (9,000 imp) | "funeral home near me" (23,691 imp) |
Action needed: Update the prospect skill to prefer keyword_intent = 'commercial' when selecting keywords. See Report 2.
4.2 Digital Footprint (InSites) — Keywords Parameter
Where keywords are used: When submitting a DF audit via the n8n webhook, the products field takes 3+ keywords. These determine what InSites evaluates the website against.
Impact: Same as LRC — better keyword selection means InSites evaluates the site against commercially relevant terms, not informational noise.
4.3 Ahrefs — Keyword Volume Lookups
Where keywords are used: After LRC keywords are confirmed, the prospect skill queries Ahrefs for national monthly search volume:
mcp__claude_ai_Ahrefs__keywords-explorer-overview
keywords: {confirmed keywords}
country: ie
Impact: The keywords sent to Ahrefs are now commercially relevant, so the volume and difficulty data returned is meaningful for the sales pitch. Previously, sending "death notices dublin" to Ahrefs would return high volume but completely irrelevant data.
4.4 Category Benchmarks
Before: CATEGORY_BENCHMARKS was built from raw category values. "Funeral director" and "Funeral home" were separate benchmark entries. Peer subscriber counts included GP directory listings.
After: Benchmarks are built from normalized_category. One "Funeral Director" benchmark row with:
total_fcr_subscribers: 95(actual FCR clients)commercial_keywords: 19,516/commercial_impressions: 3,369,704informational_keywords: 24,769/informational_impressions: 8,397,493
This allows the category-keywords skill and prospect-intel endpoint to present accurate peer comparisons.
4.5 Keyword Gap Analysis
Before: Gap analysis compared subscriber keywords against noisy category top keywords. A subscriber might show 90% "capture rate" because they ranked for death notices, but miss the commercial terms that matter.
After: CATEGORY_TOP_KEYWORDS now includes keyword_intent per keyword. Gap analysis can focus on commercial keyword gaps — the ones that actually represent missed business opportunities.
4.6 Google Keyword Planner Seeds
Where used: prospect-intel.js takes top BQ keywords and sends them to Google Keyword Planner for county-level monthly volumes.
Impact: KP seed keywords are now cleaner. Previously, informational queries could leak into the seed list, returning irrelevant local volume data.
5. Data Quality Notes
Known Limitations
- GMB has no click data. Every
gmb_searchrow hasclicks = null. This is a Google API limitation, not a data issue. - Ahrefs coverage is thin. Only 267 subscribers with 1 month of data (Feb 2026). Will improve over time.
- Intent classification is rule-based. Some edge cases will be misclassified (e.g. "kildare deaths" tagged commercial because it contains a locality). Refine rules monthly as patterns are discovered.
- "Unknown" categories. 2.7M rows couldn't be normalised (empty raw category or business-name entries). These are excluded from benchmarks and views but the underlying data is preserved.
Validation Query
Run this monthly after the pipeline to check health:
SELECT
COUNTIF(normalized_category IS NULL) as null_norm,
COUNTIF(keyword_intent IS NULL) as null_intent,
COUNTIF(is_fcr_client IS NULL) as null_fcr,
COUNTIF(source_type IS NULL) as null_source,
COUNT(*) as total
FROM `listingmanager-1529856313699.fcr_operations.KEYWORD_INTELLIGENCE`
All four null counts should be 0 after a complete run.
FCR Dashboard documentation · generated from docs/ · keep counts verified, not guessed.