Scrapbook-core
A searchable archive of everything read, starred, and shared online - and why 9,485 scraps is both too much and not enough.

The Problem

The frustration that drove this project:

  • Losing articles read 6 months ago
  • Can't remember where "that thing about X" came from
  • Bookmarks accumulate but never get searched
  • Browser history becomes useless after a certain point

If I star something, bookmark something, or post about it - it mattered to me. It should be as easy to search my own history as it is to Google something new.

What It Actually Does

Data Sources

Pulls from multiple platforms where content gets saved or shared (see the fetch sketch after this list):

  • GitHub stars - repositories that caught my attention
  • Pinboard bookmarks - saved links and articles
  • Mastodon posts - shared content and commentary
  • Are.na saves - collected ideas and inspiration
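
As a concrete example of one connector, here is a minimal sketch of pulling GitHub stars through the REST API. The GITHUB_TOKEN variable and the fetch_starred helper are illustrative names, not the project's actual code:

```python
import os
import requests

GITHUB_TOKEN = os.environ["GITHUB_TOKEN"]  # hypothetical env var name

def fetch_starred(per_page: int = 100):
    """Yield every repo the authenticated user has starred, page by page."""
    url = "https://api.github.com/user/starred"
    headers = {
        "Authorization": f"Bearer {GITHUB_TOKEN}",
        "Accept": "application/vnd.github+json",
    }
    page = 1
    while True:
        resp = requests.get(url, headers=headers,
                            params={"per_page": per_page, "page": page})
        resp.raise_for_status()
        batch = resp.json()
        if not batch:
            return
        for repo in batch:
            # Keep only the fields a scrap needs: name, URL, description.
            yield {
                "source": "github",
                "title": repo["full_name"],
                "url": repo["html_url"],
                "description": repo.get("description") or "",
            }
        page += 1
```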

Processing Pipeline

  • AI summarizes each piece (OpenRouter, falling back to OpenAI - see the sketch after this list)
  • Generates embeddings for semantic search
  • Makes it searchable via Alfred (⌘Space, type "sc [query]", instant results)
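
OpenRouter exposes an OpenAI-compatible chat completions endpoint, so the fallback can reuse the same request shape against both providers. A hedged sketch - the model names, environment variable names, and summarize helper are assumptions, not the project's actual code:

```python
import os
import requests

PROVIDERS = [
    # (endpoint, API key env var, model) - the models here are assumptions
    ("https://openrouter.ai/api/v1/chat/completions",
     "OPENROUTER_API_KEY", "anthropic/claude-3-haiku"),
    ("https://api.openai.com/v1/chat/completions",
     "OPENAI_API_KEY", "gpt-4o-mini"),
]

def summarize(text: str) -> str:
    """Try OpenRouter first; fall back to OpenAI if the call fails."""
    last_error = None
    for endpoint, key_var, model in PROVIDERS:
        try:
            resp = requests.post(
                endpoint,
                headers={"Authorization": f"Bearer {os.environ[key_var]}"},
                json={
                    "model": model,
                    "messages": [
                        {"role": "system",
                         "content": "Summarize this scrap in two sentences."},
                        {"role": "user", "content": text[:8000]},
                    ],
                },
                timeout=60,
            )
            resp.raise_for_status()
            return resp.json()["choices"][0]["message"]["content"]
        except Exception as err:  # network error, 429, missing key, etc.
            last_error = err
    raise RuntimeError(f"All providers failed: {last_error}")
```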

Architecture

Storage

  • Supabase for cloud storage and sync
  • SQLite mirror for instant local search (sketched below)
  • Docker deployment with health checks
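
A minimal sketch of how the SQLite mirror could work, assuming embeddings are stored as float32 blobs and search is brute-force cosine similarity (the schema and helper names are assumptions):

```python
import sqlite3
import numpy as np

def open_mirror(path: str = "scrapbook.db") -> sqlite3.Connection:
    """Open (or create) the local mirror; this schema is an assumption."""
    db = sqlite3.connect(path)
    db.execute("""CREATE TABLE IF NOT EXISTS scraps (
        id TEXT PRIMARY KEY, title TEXT, url TEXT,
        summary TEXT, embedding BLOB)""")
    return db

def search(db: sqlite3.Connection, query_vec: np.ndarray, k: int = 10):
    """Brute-force cosine similarity over every embedded scrap."""
    rows = db.execute(
        "SELECT id, title, url, embedding FROM scraps "
        "WHERE embedding IS NOT NULL").fetchall()
    q = query_vec / np.linalg.norm(query_vec)
    scored = []
    for id_, title, url, blob in rows:
        v = np.frombuffer(blob, dtype=np.float32)
        scored.append((float(v @ q / np.linalg.norm(v)), id_, title, url))
    return sorted(scored, reverse=True)[:k]
```

At roughly 10,000 rows a full scan finishes in milliseconds, which is why the local mirror beats a network round-trip for everyday queries.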

Rate Limiting

Smart six-level backoff system (sketched after this list):

  • Respects API limits
  • Graceful degradation under load
  • Saves hundreds of dollars in API costs
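
The write-up doesn't spell out the six levels, but here is one plausible shape: each rate-limit response escalates one delay level, and each success steps back down. The specific delay values are assumptions:

```python
import time

# Six escalating delays in seconds - the exact values are assumptions.
BACKOFF_LEVELS = [0, 1, 5, 15, 60, 300]

class Backoff:
    """Escalate on rate-limit errors, recover one level per success."""
    def __init__(self):
        self.level = 0

    def wait(self):
        time.sleep(BACKOFF_LEVELS[self.level])

    def on_rate_limit(self):
        self.level = min(self.level + 1, len(BACKOFF_LEVELS) - 1)

    def on_success(self):
        self.level = max(self.level - 1, 0)
```

Wrapping every API call in wait() plus on_success()/on_rate_limit() keeps the pipeline inside provider limits while still draining the queue, rather than hammering an endpoint with retries that would otherwise pile up costs.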

The 16.7% Problem

Current status reveals an interesting tension:

  • 9,485 total scraps captured
  • Only 1,584 have AI summaries (16.7%)
  • Massive backlog due to early design decisions

Even incomplete, the archive is still incredibly useful. The backlog is itself interesting data - a record of what accumulated faster than it could be processed.

What I Learned

Design Lessons

  • Perfect is the enemy of done - summaries should have been generated as items came in
  • Local search > cloud search for daily use
  • Alfred integration is the killer feature (see the sketch after this list)
  • Smart rate limiting prevented runaway costs
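
Alfred workflows read Script Filter JSON from stdout, so the integration can be a small script that runs the local search and prints results in that format. This sketch stubs the search call; in the real workflow it would query the local index sketched under Storage:

```python
import json
import sys

def alfred_items(results):
    """Format (score, id, title, url) tuples as Alfred Script Filter JSON."""
    return json.dumps({"items": [
        {"title": title, "subtitle": url, "arg": url, "quicklookurl": url}
        for _score, _id, title, url in results
    ]})

if __name__ == "__main__":
    query = " ".join(sys.argv[1:])
    # The real workflow would embed the query and call the local search,
    # e.g. search(open_mirror(), embed(query)); here, a stand-in result:
    results = [(1.0, "demo", f"No index loaded - searched for: {query}",
                "https://example.com")]
    print(alfred_items(results))
```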

Philosophical Lessons

Building tools to understand yourself, not impress others. Information accumulation is pointless without retrieval. Your digital exhaust is valuable if you capture it right.

Semantic search reveals patterns you didn't know existed.

Connection to Other Systems

Scrapbook-core is part of a larger Quantified Self approach:

  • Capture automatically (no friction)
  • Store long-term (years of data)
  • Analyze occasionally (when curious)
  • Reveal patterns (invisible in daily use)

Works alongside Personal APIs for unified access to personal data, and feeds context to AI systems.

Future Directions

  • Process the 8,000-scrap backlog
  • Build "theme weaver" to group related scraps
  • Mobile interface for on-the-go exploration
  • Integration with other personal systems

