PROSPECT_LISTINGS Pleper refresh — design notes
Built: 2026-05-17 (see commits 70ecde5, ad2ffdd) Status: Live on FCR Media production via Cloudflare Worker cron.
What it does
Keeps Pleper-derived fields on fcr_operations.PROSPECT_LISTINGS (service_count, photo_count) and the GBP_SERVICES detail rows reasonably fresh, without burning Pleper credits on dormant pids.
Architecture
- Cloudflare Worker cron trigger (
* * * * *, prod only —[env.fcr.triggers]inworker/wrangler.toml) callsscheduled()inworker/src/index.jsevery minute. scheduled()invokesrunPleperRefresh(env, {})fromworker/src/handlers/prospect-pleper-refresh.js.- Each tick processes one stale pid (
MAX_BATCH_SIZE=1) — single-pid budget fits the worker wall-clock cap with Pleper's worst-case ~28s polling. - HTTP endpoint
/dashboard-prospect-pleper-refreshexists for ad-hoc / debug calls (same logic, manual trigger).
Staleness signal
Candidate query (in prospect-pleper-refresh.js):
LEFT JOIN (SELECT place_id, MAX(scraped_date) AS d FROM GBP_SERVICES GROUP BY place_id) s
WHERE STARTS_WITH(p.place_id, "ChIJ") OR STARTS_WITH(p.place_id, "0x") -- skip junk pids
AND (s.d IS NULL OR s.d < CURRENT_DATE() - INTERVAL 30 DAY) -- stale or never scraped
AND (p.pleper_error_at IS NULL OR p.pleper_error_at < NOW() - INTERVAL 7 DAY) -- back-off
A pid is "stale" when:
- It has no row in
GBP_SERVICES, OR - Its newest
GBP_SERVICES.scraped_dateis more than 30 days old
Pids that just failed Pleper get a 7-day back-off (won't be retried until then).
Schema
PROSPECT_LISTINGS gained three columns (2026-05-17):
| Column | Type | Purpose |
|---|---|---|
last_updated |
TIMESTAMP | Bumped on every successful write (SerpAPI batch, Pleper batch/live, competitor-enrich) |
pleper_error |
STRING | Last Pleper services-call error (NULL = no current error) |
pleper_error_at |
TIMESTAMP | When the failure happened — drives 7-day back-off |
Backfilled last_updated = created_at on all 15,343 existing rows.
Per-tick behaviour
| Pleper outcome | last_updated |
pleper_error |
pleper_error_at |
GBP_SERVICES |
|---|---|---|---|---|
| Services OK, N > 0 services | now | NULL (cleared) | NULL (cleared) | N new rows written |
| Services OK, 0 services | now | NULL (cleared) | NULL (cleared) | — (no rows; pid will re-fetch in 30d) |
| Services API failure | unchanged | error msg | now | — |
The third path is the back-off lever: rows that genuinely can't be Pleper-scraped get marked + skipped for a week so we don't burn credits hammering them.
photo_count is updated when Pleper /images succeeds (separate failure mode from /services; an images-only failure is silent — counts as partial success).
Live competitor-enrich writeback
worker/src/handlers/competitor-enrich.js was extended to mirror the same writeback discipline for cached pids that the live (not batch) path Pleper-hits:
- Cached pid + Pleper success → UPDATE PROSPECT_LISTINGS (service_count, photo_count, last_updated, clear error)
- Cached pid + Pleper failure → UPDATE PROSPECT_LISTINGS (pleper_error, pleper_error_at) — do NOT bump last_updated
- New pid → INSERT row with
last_updated=nowIso(was already inserting, just stamps the column now)
UPDATEs fire via ctx.waitUntil so the response doesn't wait on DML.
Existing SerpAPI batch (/dashboard-prospect-enrich)
- The handler still exists and now also bumps
last_updatedon its UPDATE. - The legacy n8n workflow
JR8eoO4Cc1CiThyB("GBP Enrichment - Batch Processor") is inactive and errored on its last 56 executions (2026-04-17). Out of scope for this work — separate decision needed on whether to rebuild or replace.
What's intentionally not done
- Zero-services sentinel rows (option
afrom the design conversation) — accepted that pids returning zero services will be re-checked monthly. Pleper credits are cheap and zero-result pids are common. - Refreshing the dormant ~15k rows on a tighter cadence — every-minute cron + 30-day stale window naturally drains the table over ~10-15 days then idles.
- SerpAPI batch revival — separate decision.
Files touched
worker/wrangler.toml—[env.fcr.triggers]cron block (prod only)worker/src/index.js—scheduled()handler + handler import + route listworker/src/handlers/prospect-pleper-refresh.js— new (runPleperRefreshexported +/dashboard-prospect-pleper-refreshHTTP wrapper)worker/src/handlers/competitor-enrich.js— cached UPDATE path + last_updated stamp + error trackingworker/src/handlers/prospect-enrich.js— UPDATE bumps last_updated
Operating notes
- Watch via
wrangler tail --env fcrfor the[scheduled pleper-refresh] processed=...log lines. - Manual kick:
curl -X POST /dashboard-prospect-pleper-refresh -H 'x-api-key: ...'(returns one pid's worth of work). - To pause: comment out or remove
[env.fcr.triggers]inwrangler.tomland redeploy. - 9 junk place_ids (town names, etc.) are in the table — filtered by
STARTS_WITH(ChIJ|0x)so the cron skips them. Separate cleanup TODO.
FCR Dashboard documentation · generated from docs/ · keep counts verified, not guessed.