Pipeline Overview
When a bookmark is enqueued for enrichment, it passes through a multi-stage pipeline.
Stage 1: Content Fetching
The domain router selects the best fetching strategy based on the URL’s domain:

| Priority | Strategy | Handles | Method |
|---|---|---|---|
| 1 | Instagram Fetcher | instagram.com | Platform-specific (limited due to restrictions) |
| 2 | oEmbed Fetcher | YouTube, TikTok, Vimeo, X | oEmbed API for rich metadata |
| 3 | Metascraper | Regular websites | HTML meta tags (Open Graph, Twitter Cards) |
| 4 | Fallback | Everything else | Regex-based extraction from raw HTML |
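
The priority table above can be read as an ordered list of strategies to try. The sketch below is one possible shape for that routing step, not the actual router: the names `Strategy`, `OEMBED_HOSTS`, `matchesDomain` and `selectStrategies` are hypothetical, the hostnames behind "YouTube, TikTok, Vimeo, X" are guesses, and it assumes lower-priority strategies are tried when a higher-priority one fails, which the table suggests but does not state.

```typescript
// Strategy names mirror the table; everything else here is an assumption.
type Strategy = "instagram" | "oembed" | "metascraper" | "fallback";

// Hypothetical hostnames for the platforms listed in the oEmbed row.
const OEMBED_HOSTS = ["youtube.com", "youtu.be", "tiktok.com", "vimeo.com", "x.com", "twitter.com"];

// True when `host` equals `domain` or is a subdomain of it.
function matchesDomain(host: string, domain: string): boolean {
  return host === domain || host.endsWith("." + domain);
}

// Returns the strategies to try for a URL, highest priority first.
function selectStrategies(rawUrl: string): Strategy[] {
  const host = new URL(rawUrl).hostname.toLowerCase();
  if (matchesDomain(host, "instagram.com")) {
    return ["instagram", "fallback"];
  }
  if (OEMBED_HOSTS.some((domain) => matchesDomain(host, domain))) {
    return ["oembed", "metascraper", "fallback"];
  }
  // Regular websites: meta tags first, regex extraction as a last resort.
  return ["metascraper", "fallback"];
}
```

Matching on the domain suffix rather than the full hostname keeps subdomains such as `www.` or `m.` on the same strategy as the bare domain.
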
PageContext:
Stage 2: AI Analysis
The page context is sent to an AI model for enrichment:

Primary: Gemini AI (Vertex AI)
- Model: gemini-2.5-flash
- Temperature: 0.2 (factual, low creativity)
- Max output: 256 tokens
- System prompt: “Enrich saved bookmarks for a personal library. Return compact, factual metadata only.”
| Field | Heuristic |
|---|---|
| summary | Truncated description (140 chars) or title |
| saveWhy | Keyword matching (e.g., “recipe” → “Cooking inspiration”) |
| tags | Extracted from domain + title + description words |
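
The three heuristics in the table above are simple enough to sketch directly. The following is an illustrative version only: `PageInfo`, `HeuristicResult`, `SAVE_WHY_KEYWORDS`, `STOP_WORDS`, `heuristicEnrich` and the `maxTags` limit are hypothetical names and values, and only the "recipe" keyword mapping comes from the table.

```typescript
interface PageInfo {
  url: string;
  title: string;
  description?: string;
}

interface HeuristicResult {
  summary: string;
  saveWhy: string | null;
  tags: string[];
}

// Hypothetical keyword table; only the "recipe" entry appears in the documentation.
const SAVE_WHY_KEYWORDS: Array<[string, string]> = [["recipe", "Cooking inspiration"]];

// Assumed stop words, filtered out so tags stay meaningful.
const STOP_WORDS = new Set(["the", "and", "for", "with", "from", "this", "that"]);

function heuristicEnrich(page: PageInfo, maxTags: number = 5): HeuristicResult {
  const description = (page.description ?? "").trim();

  // summary: description truncated to 140 characters, or the title if there is none.
  const summary = description ? description.slice(0, 140) : page.title;

  // saveWhy: first keyword found in the title or description.
  const text = `${page.title} ${description}`.toLowerCase();
  const hit = SAVE_WHY_KEYWORDS.find(([keyword]) => text.includes(keyword));
  const saveWhy = hit ? hit[1] : null;

  // tags: first label of the domain, then words from the title and description.
  const host = new URL(page.url).hostname.replace(/^www\./, "");
  const words = text.match(/[a-z0-9]+/g) ?? [];
  const candidates = [host.split(".")[0], ...words].filter(
    (word) => word.length > 2 && !STOP_WORDS.has(word),
  );
  const tags = Array.from(new Set(candidates)).slice(0, maxTags);

  return { summary, saveWhy, tags };
}
```

Returning `null` for `saveWhy` when no keyword matches is a choice made for this sketch; the table does not say what happens in that case.
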
Stage 3: Write Results
On completion, the worker:
- Updates the bookmark (sets enrichmentStatus: completed, writes summary/tags/saveWhy, sets enrichedAt)
- Updates the job (marks status: completed)
- Writes to enrichment cache (keyed by hash(normalizedUrl) for global dedup)
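
As a rough illustration of the three writes listed above, the sketch below builds them as plain data before anything touches storage. The field names enrichmentStatus, summary, tags, saveWhy, enrichedAt and status come from the text; the types `EnrichmentResult` and `WritePlan`, the function `planCompletionWrites`, and the ISO-string timestamp format are assumptions made for the example.

```typescript
interface EnrichmentResult {
  summary: string;
  tags: string[];
  saveWhy: string | null;
}

interface WritePlan {
  bookmarkPatch: EnrichmentResult & { enrichmentStatus: "completed"; enrichedAt: string };
  jobPatch: { status: "completed" };
  cacheEntry: { key: string; value: EnrichmentResult };
}

// Builds the three writes as data; `cacheKey` is expected to be hash(normalizedUrl).
function planCompletionWrites(
  result: EnrichmentResult,
  cacheKey: string,
  now: Date = new Date(),
): WritePlan {
  return {
    bookmarkPatch: { ...result, enrichmentStatus: "completed", enrichedAt: now.toISOString() },
    jobPatch: { status: "completed" },
    cacheEntry: { key: cacheKey, value: result },
  };
}
```

Keeping the plan separate from the storage calls makes it easy to apply all three writes together, or to test the worker without a database.
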
Global Cache
The enrichment cache is shared across all users:
- Key: Hash of the normalized URL
- Value: Full enrichment result (title, summary, tags, etc.)
- Effect: A second save of the same URL skips the entire pipeline; the result is applied instantly from the cache
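
A minimal in-memory sketch of this cache-first behaviour follows. The documentation does not specify the normalization rules or the hash function, so both are stand-ins here: `normalizeUrl` only drops the fragment and trailing slashes, and `fnv1a` is a small non-cryptographic hash used purely so the example has no dependencies. All names (`CachedEnrichment`, `cacheKey`, `enrichWithCache`) are hypothetical.

```typescript
interface CachedEnrichment {
  summary: string;
  tags: string[];
}

// Assumed normalization: drop the fragment and trailing slashes.
// The URL parser already lowercases the scheme and host.
function normalizeUrl(rawUrl: string): string {
  const url = new URL(rawUrl);
  const path = url.pathname.replace(/\/+$/, "");
  return `${url.protocol}//${url.host}${path}${url.search}`;
}

// Stand-in hash (32-bit FNV-1a over UTF-16 code units); a real system
// would more likely use a cryptographic hash such as SHA-256.
function fnv1a(input: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

function cacheKey(rawUrl: string): string {
  return fnv1a(normalizeUrl(rawUrl));
}

// Shared across all users: the key depends only on the URL.
const enrichmentCache = new Map<string, CachedEnrichment>();

function enrichWithCache(
  rawUrl: string,
  runPipeline: (url: string) => CachedEnrichment,
): { result: CachedEnrichment; fromCache: boolean } {
  const key = cacheKey(rawUrl);
  const cached = enrichmentCache.get(key);
  if (cached) {
    return { result: cached, fromCache: true }; // pipeline skipped entirely
  }
  const result = runPipeline(rawUrl);
  enrichmentCache.set(key, result);
  return { result, fromCache: false };
}
```

With these rules, two URLs that differ only by fragment or trailing slash map to the same key, so the second save is served from the cache.
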
Retry Policy
| Attempt | Backoff | Action |
|---|---|---|
| 1st failure | 1 minute | Retry if error is retriable |
| 2nd failure | 5 minutes | Retry if error is retriable |
| 3rd failure | — | Permanent failure |
Retriable errors:
AbortError, connect-failed, dns-failed, enrichment-failed, fetch-timeout, http-5xx, provider-rate-limited, stale-job-timeout
Non-retriable errors (immediate fail):
blocked-host, http-4xx, invalid-body, quota-exhausted
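
The backoff table and the two error lists combine into a small decision function. The sketch below is one way to express that; `RETRIABLE_ERRORS`, `BACKOFF_MS`, `RetryDecision` and `decideRetry` are hypothetical names, and treating unknown error codes as non-retriable is an assumption the documentation does not confirm.

```typescript
// Error codes copied from the retriable list above.
const RETRIABLE_ERRORS = new Set([
  "AbortError",
  "connect-failed",
  "dns-failed",
  "enrichment-failed",
  "fetch-timeout",
  "http-5xx",
  "provider-rate-limited",
  "stale-job-timeout",
]);

// Backoff after the 1st and 2nd failure; a 3rd failure is permanent.
const BACKOFF_MS = [60_000, 300_000];

type RetryDecision = { action: "retry"; delayMs: number } | { action: "fail" };

// `failureCount` is 1 for the first failure, 2 for the second, and so on.
function decideRetry(errorCode: string, failureCount: number): RetryDecision {
  // Assumption: anything not explicitly retriable fails immediately.
  if (!RETRIABLE_ERRORS.has(errorCode)) {
    return { action: "fail" };
  }
  const delayMs = BACKOFF_MS[failureCount - 1];
  if (delayMs === undefined) {
    return { action: "fail" }; // retries exhausted
  }
  return { action: "retry", delayMs };
}
```

For example, `decideRetry("fetch-timeout", 2)` asks for a retry in five minutes, while `decideRetry("http-4xx", 1)` fails immediately.
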
Job Status Lifecycle
Quota Enforcement
Quotas are checked at enqueue time (not during processing):

| Check | Free Limit | Error |
|---|---|---|
| Monthly enrichments | 200 | quota-exhausted (429) |
| Per-minute rate | 5 | rate-limited (429) |
| Pro users | Unlimited | — |
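
The enqueue-time check in the table above might look roughly like the following. The limits (200 per month, 5 per minute), the error codes and the 429 status come from the table; the `Usage` shape, the `checkQuota` name, and the order in which the two limits are tested are assumptions for this sketch.

```typescript
interface Usage {
  plan: "free" | "pro";
  enrichmentsThisMonth: number; // enrichments already used this month
  enqueuesLastMinute: number;   // enqueues in the last 60 seconds
}

type QuotaCheck =
  | { ok: true }
  | { ok: false; status: 429; error: "quota-exhausted" | "rate-limited" };

const FREE_MONTHLY_LIMIT = 200;
const FREE_PER_MINUTE_LIMIT = 5;

// Runs at enqueue time, before a job is created.
function checkQuota(usage: Usage): QuotaCheck {
  if (usage.plan === "pro") {
    return { ok: true }; // Pro users are unlimited
  }
  if (usage.enrichmentsThisMonth >= FREE_MONTHLY_LIMIT) {
    return { ok: false, status: 429, error: "quota-exhausted" };
  }
  if (usage.enqueuesLastMinute >= FREE_PER_MINUTE_LIMIT) {
    return { ok: false, status: 429, error: "rate-limited" };
  }
  return { ok: true };
}
```

Because the check runs before a job exists, a rejected request never enters the retry policy described earlier, which is consistent with quota-exhausted being listed as non-retriable.
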