KEYWORD_INTELLIGENCE — Structure, Maintenance & Downstream Impact

1. Table Overview

listingmanager-1529856313699.fcr_operations.KEYWORD_INTELLIGENCE is the central keyword data store for FCR's digital marketing intelligence. It holds 28.7 million rows of search term data across 8 sources, representing what people search for when looking for services in the categories FCR clients operate in.

Schema (17 columns)

Column Type Description
subscriber_id STRING FCR subscriber ID, or directory listing ID for GP/GetLocal sources
business_name STRING Business name
category STRING Raw category (as received from source — often messy)
city STRING Business city
county STRING Business county
keyword STRING The search term
source STRING Data origin (8 sources — see below)
month DATE Month partition
impressions INT64 Search impressions
clicks INT64 Clicks (null for GMB source)
avg_cpc FLOAT64 Average cost-per-click (Google Ads only)
conversions FLOAT64 Conversions (Google Ads only)
ctr FLOAT64 Click-through rate
normalized_category STRING NEW — Clean canonical category name
keyword_intent STRING NEWcommercial, informational, navigational, or brand
is_fcr_client BOOL NEW — TRUE if subscriber is an active FCR client
source_type STRING NEWpaid, owned_organic, owned_gbp, directory, third_party

8 Data Sources

Source Source Type What It Is Rows Subscribers
gsc_organic owned_organic Google Search Console — FCR-built websites 12.6M 1,702
google_ads paid Google Ads — actual search terms that triggered ads 7.0M 703
gsc_goldenpages directory GSC for Golden Pages — all GP directory listings 3.5M 86,944
gsc_getlocal directory GSC for GetLocal directory 2.7M 1
gmb_search owned_gbp Google Business Profile — discovery keywords 2.5M 4,717
gsc_phonebook directory GSC for Phonebook directory 315K 1
ahrefs_organic third_party Ahrefs — organic keyword rankings 5.7K 267
ahrefs_tracked third_party Ahrefs — manually tracked keywords 3.0K 225

Important distinctions:

  • gsc_goldenpages subscriber_id values are GP listing IDs (86K+), not FCR client IDs. Use is_fcr_client to distinguish.
  • gmb_search has no click data — the GBP Performance API doesn't provide keyword-level clicks.
  • google_ads data is search terms (what people actually typed), not campaign keywords.
  • All data is cumulative over time, not monthly snapshots. Use GROUP BY month for monthly breakdowns.

2. New Columns (Added March 2026)

normalized_category

Problem solved: The raw category field had 12,565 distinct values — many were business names with postcodes ("Cunninghams Funeral Directors Dublin D15"), duplicates with different casing ("Funeral director" vs "Funeral Directors"), or synonyms ("Funeral home" = "Funeral director").

How it works: A lookup table CATEGORY_NORMALISATION maps every raw category to a clean canonical name via rules:

  1. Strip eircode patterns (e.g. D15, A91, T12)
  2. Strip business names ending with Irish town/city names
  3. Filter out very long entries (>50 chars) that are clearly business names
  4. Split comma-separated categories (take first segment)
  5. Normalise case (Title Case)
  6. Merge synonyms (Funeral Home → Funeral Director, Tyre Shop → Tire Shop, etc.)

Result: 12,565 raw categories → 9,447 normalised categories. Unknown/unmappable entries tagged as "Unknown".

keyword_intent

Problem solved: Informational queries (death notices, "what to wear to a funeral") were mixed with commercial queries ("funeral directors near me"), making keyword lists misleading for sales and LRC.

Classification rules (in priority order):

  1. navigational — references rip.ie, goldenpages, google maps, tripadvisor, etc.
  2. informational — death notices, obituaries, how-to/what-is patterns
  3. commercial — "near me", cost/price/quote signals, locality patterns, Google Ads clicks
  4. commercial — all Google Ads impressions (even without clicks)
  5. commercial — all GMB discovery queries (someone looking for the service)
  6. informational — everything else (safe default)

Impact: For Funeral Directors, this separated 8.4M informational impressions (death notices) from 3.4M commercial impressions — the data that actually matters for selling.

is_fcr_client

Problem solved: The gsc_goldenpages source showed 418 "subscribers" for funeral directors — but only ~95 are actual FCR clients. The rest are GP directory listing IDs.

How it works: JOIN against active_clients on subscriber_id. TRUE = active FCR client, FALSE = directory listing or non-client.

source_type

Groups the 8 sources into 5 meaningful tiers for reporting: paid, owned_organic, owned_gbp, directory, third_party.


3. Monthly Maintenance Pipeline

When to Run

After the monthly data load — when new GSC, GA, GMB performance, and Ahrefs data has been inserted into KEYWORD_INTELLIGENCE. The underlying data lands monthly, so normalisation runs monthly.

What to Run

Open the BQ console and run data/keyword-intelligence/00_run_all.sql as a single script. On subsequent runs after the first, it only processes new/unclassified rows (WHERE column IS NULL), so it's fast.

The script executes 6 steps:

Step What Touches
0 Add columns (IF NOT EXISTS — no-op after first run) KEYWORD_INTELLIGENCE schema
1 Update CATEGORY_NORMALISATION + backfill normalized_category CATEGORY_NORMALISATION table + KI rows
2 Classify keyword_intent on new rows KI rows
3 Set is_fcr_client + source_type on new rows KI rows
4 Recreate KEYWORD_INTELLIGENCE_MONTHLY view View (auto-reflects changes)
5 Rebuild CATEGORY_BENCHMARKS + CATEGORY_TOP_KEYWORDS These tables are fully replaced

Adding New Synonyms

When a new category synonym is discovered (e.g. "Heating Engineer" should map to "Plumber"), add a WHEN clause to the synonym CASE block in step 1 of 00_run_all.sql, then re-run. The MERGE will update existing mappings.

Adding New Intent Rules

When a keyword pattern is misclassified (e.g. "dublin deaths" tagged as commercial because it contains a locality), add the pattern to the appropriate REGEXP_CONTAINS block in step 2. New rules apply to all unclassified rows, but to reclassify existing rows, you'd need to NULL out keyword_intent for the affected rows first.


4. Downstream Impact

4.1 Local Rank Check (LRC) — Keyword Selection

Where keywords are chosen: The /prospect skill selects exactly 5 keywords for LRC submission.

Before normalisation: Keywords were selected from categoryKeywords.topKeywords sorted by raw impressions. This meant death notices, rip.ie queries, and other informational noise dominated the top keyword lists for categories like Funeral Directors, Solicitors, and Medical Clinics. The LRC would check rankings for keywords nobody uses to find a funeral director.

After normalisation: The keyword_intent field allows the prospect skill to filter for commercial intent keywords only. The top keywords by impressions are now genuinely the terms people use when looking for the service:

Before (raw) After (intent-filtered)
"rip ie recent deaths south county dublin" (10,800 imp) "funeral directors dublin" (56,965 imp)
"death notices dublin" (9,550 imp) "funeral and cremation price list" (36,462 imp)
"latest death notices" (9,000 imp) "funeral home near me" (23,691 imp)

Action needed: Update the prospect skill to prefer keyword_intent = 'commercial' when selecting keywords. See Report 2.

4.2 Digital Footprint (InSites) — Keywords Parameter

Where keywords are used: When submitting a DF audit via the n8n webhook, the products field takes 3+ keywords. These determine what InSites evaluates the website against.

Impact: Same as LRC — better keyword selection means InSites evaluates the site against commercially relevant terms, not informational noise.

4.3 Ahrefs — Keyword Volume Lookups

Where keywords are used: After LRC keywords are confirmed, the prospect skill queries Ahrefs for national monthly search volume:

mcp__claude_ai_Ahrefs__keywords-explorer-overview
  keywords: {confirmed keywords}
  country: ie

Impact: The keywords sent to Ahrefs are now commercially relevant, so the volume and difficulty data returned is meaningful for the sales pitch. Previously, sending "death notices dublin" to Ahrefs would return high volume but completely irrelevant data.

4.4 Category Benchmarks

Before: CATEGORY_BENCHMARKS was built from raw category values. "Funeral director" and "Funeral home" were separate benchmark entries. Peer subscriber counts included GP directory listings.

After: Benchmarks are built from normalized_category. One "Funeral Director" benchmark row with:

  • total_fcr_subscribers: 95 (actual FCR clients)
  • commercial_keywords: 19,516 / commercial_impressions: 3,369,704
  • informational_keywords: 24,769 / informational_impressions: 8,397,493

This allows the category-keywords skill and prospect-intel endpoint to present accurate peer comparisons.

4.5 Keyword Gap Analysis

Before: Gap analysis compared subscriber keywords against noisy category top keywords. A subscriber might show 90% "capture rate" because they ranked for death notices, but miss the commercial terms that matter.

After: CATEGORY_TOP_KEYWORDS now includes keyword_intent per keyword. Gap analysis can focus on commercial keyword gaps — the ones that actually represent missed business opportunities.

4.6 Google Keyword Planner Seeds

Where used: prospect-intel.js takes top BQ keywords and sends them to Google Keyword Planner for county-level monthly volumes.

Impact: KP seed keywords are now cleaner. Previously, informational queries could leak into the seed list, returning irrelevant local volume data.


5. Data Quality Notes

Known Limitations

  • GMB has no click data. Every gmb_search row has clicks = null. This is a Google API limitation, not a data issue.
  • Ahrefs coverage is thin. Only 267 subscribers with 1 month of data (Feb 2026). Will improve over time.
  • Intent classification is rule-based. Some edge cases will be misclassified (e.g. "kildare deaths" tagged commercial because it contains a locality). Refine rules monthly as patterns are discovered.
  • "Unknown" categories. 2.7M rows couldn't be normalised (empty raw category or business-name entries). These are excluded from benchmarks and views but the underlying data is preserved.

Validation Query

Run this monthly after the pipeline to check health:

SELECT
  COUNTIF(normalized_category IS NULL) as null_norm,
  COUNTIF(keyword_intent IS NULL) as null_intent,
  COUNTIF(is_fcr_client IS NULL) as null_fcr,
  COUNTIF(source_type IS NULL) as null_source,
  COUNT(*) as total
FROM `listingmanager-1529856313699.fcr_operations.KEYWORD_INTELLIGENCE`

All four null counts should be 0 after a complete run.

FCR Dashboard documentation · generated from docs/ · keep counts verified, not guessed.