Image Normalizer — HEIC/AVIF/CMYK Fallback

A small Modal-hosted Python service that the dashboard worker falls through to when an uploaded image is in a format the Anthropic API does not accept (HEIC, AVIF, TIFF, CMYK JPEG, palette PNG…) or exceeds the worker's 4.5 MB inline cap.


Why this exists

The advisor accepts dropped images via two paths inside worker/src/handlers/ai-advisor.js:

  • hydrateUserImages — images a rep manually drops in chat
  • hydrateHubspotImages — images auto-pulled from the HubSpot card's notes/emails

Both end with a magic-byte check (detectImageMimeFromBytes) that only recognises PNG / JPEG / GIF / WebP, plus a hard 4.5 MB byte cap. Before this service existed, anything else was silently dropped (HubSpot path) or surfaced to the rep as "couldn't load — too large or unsupported format" (drop path).
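To make the failure mode concrete, here is a minimal sketch of a magic-byte sniffer of this kind. The name sniffImageMime is hypothetical and the real detectImageMimeFromBytes may be written differently; the signatures themselves are the standard PNG, JPEG, GIF and WebP file headers.

```javascript
// Hypothetical stand-in for detectImageMimeFromBytes; the real helper in
// ai-advisor.js may differ. Returns a MIME type or null.
function sniffImageMime(bytes) {
  const b = bytes instanceof Uint8Array ? bytes : new Uint8Array(bytes);
  // True when the bytes at `offset` match the signature exactly.
  const startsWith = (sig, offset = 0) =>
    b.length >= offset + sig.length && sig.every((v, i) => b[offset + i] === v);

  if (startsWith([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a])) return "image/png";
  if (startsWith([0xff, 0xd8, 0xff])) return "image/jpeg";
  if (startsWith([0x47, 0x49, 0x46, 0x38])) return "image/gif"; // "GIF8"
  // WebP is a RIFF container: "RIFF", 4 size bytes, then "WEBP".
  if (startsWith([0x52, 0x49, 0x46, 0x46]) && startsWith([0x57, 0x45, 0x42, 0x50], 8)) {
    return "image/webp";
  }
  // HEIC and AVIF start with a size field followed by "ftyp", so they match
  // none of the signatures above and fall out here.
  return null;
}
```

An AVIF file begins with four size bytes followed by "ftyp" and a brand such as "avif", so a sniffer limited to these four signatures returns null for it.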

Two common real-world cases that hit this:

  • AVIF logos exported from modern design tools or downloaded from a phone gallery. The magic-byte check returns null → drop.
  • HEIC photos straight from an iPhone. Same story.

Worse, these failures also starved open_studio — the advisor's "carry your chat-dropped images across to Creator Studio" hand-off only carries what hydrate produces. An AVIF logo that never hydrated had nothing to hand across, so the rep landed in Studio half-empty and had to re-upload.

Cloudflare Workers can't run Pillow themselves (no CPython, no native libheif/libavif). So this lives as a separate service the worker calls only on the failure path.


What it does

POST /normalize

Body        Raw image bytes (any format Pillow + pillow-heif + pillow-avif-plugin can decode)
Auth        Authorization: Bearer <FCR_NORMALIZER_TOKEN>
Optional    X-Max-Width (default 1568), X-Original-Url (logging only)
Returns     Raw normalized bytes: image/png if alpha was present, image/jpeg (q=85) otherwise
Side info   Response headers X-Original-Format, X-Original-Mode, X-Original-Bytes, X-Normalized-Bytes, X-Output-Width, X-Output-Height
Hard cap    16 MB input, 30 s timeout

GET /health → { ok: true, service: "fcr-image-normalizer" }

The output is also pre-downscaled to 1568 px wide (TARGET_IMAGE_WIDTH in the worker — Anthropic's recommended vision dimension). The worker does not need a second resize step.
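As an illustration of the contract above, the following sketch builds a POST /normalize request and reads the side-info headers from a response. The function names, the Content-Type request header and the example URL in the usage note are assumptions made for illustration; only the X-* header names, the bearer scheme and the /normalize path come from this page.

```javascript
// Sketch of a client for POST /normalize. Function names are hypothetical.
function buildNormalizeRequest(baseUrl, token, bytes, opts = {}) {
  const headers = new Headers({
    Authorization: `Bearer ${token}`,
    // Assumption: the service does not document a required Content-Type.
    "Content-Type": "application/octet-stream",
  });
  if (opts.maxWidth) headers.set("X-Max-Width", String(opts.maxWidth));
  if (opts.originalUrl) headers.set("X-Original-Url", opts.originalUrl);
  const url = baseUrl.replace(/\/+$/, "") + "/normalize";
  return new Request(url, { method: "POST", headers, body: bytes });
}

// Collects the documented side-info headers into one object.
// Numeric headers become numbers; absent headers become null.
function readNormalizeInfo(response) {
  const num = (name) => {
    const v = response.headers.get(name);
    return v === null ? null : Number(v);
  };
  return {
    mediaType: response.headers.get("Content-Type"),
    originalFormat: response.headers.get("X-Original-Format"),
    originalMode: response.headers.get("X-Original-Mode"),
    originalBytes: num("X-Original-Bytes"),
    normalizedBytes: num("X-Normalized-Bytes"),
    width: num("X-Output-Width"),
    height: num("X-Output-Height"),
  };
}
```

A caller would pass the request to fetch, for example fetch(buildNormalizeRequest("https://example.modal.run", token, bytes)), then hand the response to readNormalizeInfo for logging.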


When the worker actually calls it

The hot path is unchanged. PNG/JPEG/GIF/WebP under 4.5 MB returns immediately, no round-trip.

The worker only POSTs to the normalizer when:

  1. detectImageMimeFromBytes(buf) returns null (unsupported format), OR
  2. buf.byteLength > MAX_IMAGE_BYTES (too large after any cf.image step)

Wired into three call sites in ai-advisor.js:

  • hydrateUserImages — R2 branch (Prospect Snap captures from /serp-snapshot/)
  • hydrateUserImages — fetch branch (anything else: Roam card CDN, HubSpot user-content CDN, etc.)
  • hydrateHubspotImages — the card auto-pull, which previously had no resize/normalize at all and silently dropped large or unsupported attachments

The helper:

async function normalizeImageBytes(buf, env, sourceUrl = "") {
  if (!env || !env.IMAGE_NORMALIZER_URL || !env.IMAGE_NORMALIZER_TOKEN) return null;
  // POST raw bytes to <URL>/normalize with bearer auth, return { media_type, data } or null
}
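The body is elided above. As a hedged sketch of what it might look like, the version below fills it in under several assumptions: the function name carries a Sketch suffix to mark it as hypothetical, data is taken to be base64 (the page only says { media_type, data }), the media type is read from the response's Content-Type, and the 30 s client timeout simply mirrors the service's documented cap.

```javascript
// Hypothetical fill-in of the helper's body; the real implementation in
// ai-advisor.js may differ in timeouts, logging and encoding.
function toBase64(bytes) {
  let binary = "";
  const CHUNK = 0x8000; // keeps each String.fromCharCode argument list small
  for (let i = 0; i < bytes.length; i += CHUNK) {
    binary += String.fromCharCode(...bytes.subarray(i, i + CHUNK));
  }
  return btoa(binary);
}

async function normalizeImageBytesSketch(buf, env, sourceUrl = "") {
  // Same guard as the real helper: no secrets means no fallback.
  if (!env || !env.IMAGE_NORMALIZER_URL || !env.IMAGE_NORMALIZER_TOKEN) return null;
  try {
    const headers = { Authorization: `Bearer ${env.IMAGE_NORMALIZER_TOKEN}` };
    if (sourceUrl) headers["X-Original-Url"] = sourceUrl; // logging only
    const url = env.IMAGE_NORMALIZER_URL.replace(/\/+$/, "") + "/normalize";
    const res = await fetch(url, {
      method: "POST",
      headers,
      body: buf,
      signal: AbortSignal.timeout(30_000), // assumption: mirror the 30 s service cap
    });
    if (!res.ok) return null;
    const mediaType = (res.headers.get("Content-Type") || "").split(";")[0].trim();
    if (mediaType !== "image/png" && mediaType !== "image/jpeg") return null;
    const out = new Uint8Array(await res.arrayBuffer());
    return { media_type: mediaType, data: toBase64(out) };
  } catch {
    // Network error or timeout: the caller falls through to its existing error.
    return null;
  }
}
```

Returning null on every failure keeps the call sites simple: they only need to distinguish "got a usable image" from "did not".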

Graceful degradation. If IMAGE_NORMALIZER_URL / IMAGE_NORMALIZER_TOKEN aren't set, the helper returns null and the calling site falls through to today's not_a_supported_image_format / too_large error. So the worker is safe to deploy ahead of the Modal service — pre-secrets behaviour is identical to the previous build.
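The fall-through at a call site can be sketched as follows. This is an illustrative outline, not the code in ai-advisor.js: hydrateOrExplain, its argument shape and the injected normalize callback are hypothetical, while the two error codes are the ones named above.

```javascript
// Sketch of one call site. `normalize` is injected so the flow can be
// exercised without the network; in the worker it would wrap
// normalizeImageBytes(buf, env, sourceUrl).
async function hydrateOrExplain({ mime, base64, byteLength }, maxBytes, normalize) {
  const unsupported = mime === null; // magic-byte check found nothing
  const tooLarge = byteLength > maxBytes;

  // Hot path: supported format under the cap, no round-trip.
  if (!unsupported && !tooLarge) {
    return { ok: true, image: { media_type: mime, data: base64 } };
  }

  // Failure path: try the normalizer, which returns null when it is
  // unconfigured or fails.
  const fixed = await normalize();
  if (fixed) return { ok: true, image: fixed };

  // Same errors the previous build produced.
  return { ok: false, error: unsupported ? "not_a_supported_image_format" : "too_large" };
}
```

With a normalize callback that always resolves to null, which is what happens when the secrets are unset, this behaves exactly like a build without the fallback.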


Where it lives

Piece                            Path
Modal app source                 services/image-normalizer/normalize.py
Deploy + secret-setting README   services/image-normalizer/README.md
Worker integration               worker/src/handlers/ai-advisor.js → normalizeImageBytes() + 3 call sites
Worker secrets (both accounts)   IMAGE_NORMALIZER_URL, IMAGE_NORMALIZER_TOKEN
Modal secret                     image-normalizer → FCR_NORMALIZER_TOKEN
Hosted at                        A *.modal.run URL printed by modal deploy

For the actual deploy steps (Modal install, token, secrets, wrangler), see services/image-normalizer/README.md — that's the operational source of truth.


Why Modal, not something else

Cloudflare Workers can't run Pillow. Options considered:

Option                                     Why not (now)
Reuse an existing FCR Python box           No catalogued Python service to host this on. A GetLocal GPU box exists for Whisper, but adding an HTTP service to it crosses a domain boundary.
Fly.io / Render small container            Always-on, $5–10/mo, predictable, but a new vendor and ops burden for a fallback that fires rarely.
WASM libheif / libavif inside the worker   Largest build effort; bigger bundle; ties decoding versions to worker deploys.
Modal (chosen)                             Scale-to-zero (no idle cost for a fallback path), single Python file deploys with modal deploy, managed runtime keeps libheif/libavif up to date. Tradeoff is a managed-vendor dependency and a cold start.

Operational notes

  • Cold start ≈ 5–10 s on the first call after idle. Acceptable for an infrequent fallback. If reps start noticing latency, add keep_warm=1 (renamed min_containers=1 in newer Modal releases) to the @app.function decorator in normalize.py.
  • No worker-side cache yet. Re-dropping the same image re-pays the round-trip. A KV cache by source-URL hash is a cheap follow-up if usage warrants.
  • Secret-setting on Cloudflare uses printf '%s' '…' | wrangler secret put …. Piping a PowerShell $var into wrangler secret put appends \r\n into the secret value and silently breaks header auth — bash printf '%s' is the safe form.
  • No worker redeploy needed after secrets land. The fallback engages on the next request.

What it does NOT handle

  • SVG: Pillow doesn't rasterize SVG. Vector logos still fail; reps need to export to PNG/JPG. cairosvg or svglib would be the natural additions if this starts to bite.
  • PDF: same story.

These are deliberate non-goals for v1. Raise a request if they start blocking real work.


See also

  • docs/external-apis.md: the broader external-services map (Modal is now one of them).
  • docs/worker-endpoints.md: the worker side of the integration.
  • services/image-normalizer/README.md: deploy steps + smoke test.
  • worker/src/handlers/ai-advisor.js → normalizeImageBytes() (~line 800) and the three call sites in hydrateUserImages + hydrateHubspotImages.

FCR Dashboard documentation · generated from docs/ · keep counts verified, not guessed.
