= Scrapbook-core =
'''A searchable archive of everything read, starred, and shared online - and why 9,485 scraps is both too much and not enough.'''
== The Problem ==
The frustrations that drove this project:
* Articles read six months ago become unfindable
* No memory of where "that thing about X" came from
* Bookmarks accumulate but never get searched
* Browser history becomes useless past a certain point
''If I star something, bookmark something, or post about it - it mattered to me. It should be as easy to search my own history as it is to Google something new.''
== What It Actually Does ==
=== Data Sources ===
Pulls from multiple platforms where content gets saved or shared (a minimal fetch sketch follows the list):
* '''GitHub stars''' - repositories that caught my attention
* '''Pinboard bookmarks''' - saved links and articles
* '''Mastodon posts''' - shared content and commentary
* '''Are.na saves''' - collected ideas and inspiration
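Each source boils down to an authenticated, paginated HTTP pull. Here is a minimal sketch for the GitHub side, against the real <code>/users/{user}/starred</code> endpoint - the username is an example, the token handling is simplified, and the actual collector may be structured differently:

<syntaxhighlight lang="python">
# Sketch: pull starred repos from the GitHub API, following pagination.
# The endpoint and Link-header pagination are real GitHub API behavior;
# the username and token handling are placeholders.
import os
import requests

def fetch_github_stars(user: str):
    """Yield every starred repository for a user."""
    url = f"https://api.github.com/users/{user}/starred?per_page=100"
    headers = {"Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}"}
    while url:
        resp = requests.get(url, headers=headers, timeout=30)
        resp.raise_for_status()
        yield from resp.json()
        url = resp.links.get("next", {}).get("url")  # follow the Link header

for repo in fetch_github_stars("example-user"):
    print(repo["full_name"], repo["html_url"])
</syntaxhighlight>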
=== Processing Pipeline ===
* AI summarizes each piece (OpenRouter, with fallback to OpenAI; sketch below)
* Generates embeddings for semantic search
* Makes everything searchable via Alfred (⌘Space, type "sc [query]", instant results)
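A sketch of the summarize-and-embed step, under two assumptions: OpenRouter exposes an OpenAI-compatible API (it does, so one SDK covers both providers), while the model names and prompt are illustrative rather than what scrapbook-core necessarily uses:

<syntaxhighlight lang="python">
# Sketch: summarize via OpenRouter, fall back to OpenAI, then embed.
# OpenRouter speaks the OpenAI-compatible API, so the same SDK works
# for both. Model names and the prompt are illustrative placeholders.
import os
from openai import OpenAI

openrouter = OpenAI(base_url="https://openrouter.ai/api/v1",
                    api_key=os.environ["OPENROUTER_API_KEY"])
fallback = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

def summarize(text: str) -> str:
    """Try OpenRouter first; fall back to OpenAI if it fails."""
    prompt = f"Summarize this saved item in two sentences:\n\n{text}"
    for client, model in ((openrouter, "anthropic/claude-3.5-haiku"),
                          (fallback, "gpt-4o-mini")):
        try:
            resp = client.chat.completions.create(
                model=model,
                messages=[{"role": "user", "content": prompt}])
            return resp.choices[0].message.content
        except Exception:
            continue  # provider errored or rate-limited; try the next one
    raise RuntimeError("all summarization providers failed")

def embed(text: str) -> list[float]:
    """Embedding vector used for semantic search."""
    resp = fallback.embeddings.create(model="text-embedding-3-small",
                                      input=text)
    return resp.data[0].embedding
</syntaxhighlight>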
== Architecture ==
=== Storage ===
* '''Supabase''' for cloud storage and sync
* '''SQLite mirror''' for instant local search (sync sketch below)
* '''Docker deployment''' with health checks
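How the SQLite mirror might stay in sync, assuming the official supabase-py client; the <code>scraps</code> table name and column set here are guesses at the real schema:

<syntaxhighlight lang="python">
# Sketch: mirror the cloud table into local SQLite so search never
# touches the network. Assumes supabase-py; the "scraps" table name
# and its columns are guesses at the real schema.
import json
import os
import sqlite3
from supabase import create_client

cloud = create_client(os.environ["SUPABASE_URL"], os.environ["SUPABASE_KEY"])
local = sqlite3.connect("scrapbook.db")
local.execute("""CREATE TABLE IF NOT EXISTS scraps
                 (id TEXT PRIMARY KEY, source TEXT, url TEXT,
                  summary TEXT, embedding TEXT, updated_at TEXT)""")

def sync(last_sync: str) -> None:
    """Pull rows changed since last_sync and upsert them locally."""
    rows = (cloud.table("scraps").select("*")
            .gt("updated_at", last_sync).execute().data)
    local.executemany(
        "INSERT OR REPLACE INTO scraps VALUES (?, ?, ?, ?, ?, ?)",
        [(r["id"], r["source"], r["url"], r["summary"],
          json.dumps(r.get("embedding")), r["updated_at"]) for r in rows])
    local.commit()
</syntaxhighlight>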
=== Rate Limiting ===
A smart six-level backoff system (sketched below):
* Respects API limits
* Degrades gracefully under load
* Has saved hundreds of dollars in API costs
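The six levels and their triggers aren't documented here, so this is only the shape of the idea - escalating sleep intervals before giving up - with illustrative delays:

<syntaxhighlight lang="python">
# Sketch: a six-level backoff wrapper. The actual levels, delays, and
# trigger conditions in scrapbook-core are not documented here; these
# values are illustrative.
import time

class RateLimitError(Exception):
    """Stand-in for whatever the real provider-specific error is."""

BACKOFF_LEVELS = [1, 5, 15, 60, 300, 900]  # seconds, levels 0-5

def with_backoff(call, *args, **kwargs):
    """Retry a rate-limited call, escalating through six delay levels."""
    for level, delay in enumerate(BACKOFF_LEVELS):
        try:
            return call(*args, **kwargs)
        except RateLimitError:
            print(f"rate limited; backing off at level {level} ({delay}s)")
            time.sleep(delay)
    raise RuntimeError("still rate limited after six backoff levels")
</syntaxhighlight>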
== The 16.7% Problem ==
The current status reveals an interesting tension:
* '''9,485 total scraps''' captured
* '''Only 1,584 have AI summaries''' (16.7%)
* A massive backlog, the result of early design decisions
''Still incredibly useful even incomplete. The backlog is itself interesting data - a record of what accumulated faster than it could be processed.''
== What I Learned ==
=== Design Lessons ===
* '''Perfect is the enemy of done''' - I should have processed items as they came in
* '''Local search > cloud search''' for daily use
* '''Alfred integration''' is the killer feature
* '''Smart rate limiting''' prevented runaway costs
=== Philosophical Lessons ===
Build tools to understand yourself, not to impress others. Information accumulation is pointless without retrieval. Your digital exhaust is valuable if you capture it right.
''Semantic search reveals patterns you didn't know existed.''
== Connection to Other Systems ==
Scrapbook-core is part of a larger [[Quantified Self]] approach:
* Capture automatically (no friction)
* Store long-term (years of data)
* Analyze occasionally (when curious)
* Reveal patterns (invisible in daily use)
Works alongside [[Personal APIs]] for unified access to personal data, and feeds context to AI systems.
== Future Directions ==
* Process the ~8,000-scrap backlog
* Build a "theme weaver" to group related scraps (speculative sketch below)
* Mobile interface for on-the-go exploration
* Integration with other personal systems
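One speculative shape for the theme weaver - nothing here is built yet - is clustering the existing embeddings so related scraps surface together, with scikit-learn's KMeans as a stand-in for whatever grouping the real feature would use:

<syntaxhighlight lang="python">
# Purely speculative sketch of a "theme weaver": cluster stored scrap
# embeddings so related items group together. KMeans is a stand-in for
# whatever grouping the real feature would use.
import numpy as np
from sklearn.cluster import KMeans

def weave_themes(embeddings: np.ndarray, titles: list[str], k: int = 20):
    """Group scraps into k rough themes by clustering their embeddings."""
    labels = KMeans(n_clusters=k, n_init="auto").fit_predict(embeddings)
    themes: dict[int, list[str]] = {}
    for label, title in zip(labels, titles):
        themes.setdefault(int(label), []).append(title)
    return themes
</syntaxhighlight>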
[[Category:Projects]]
[[Category:Personal Data]]
[[Category:Knowledge Management]]
{{Navbox Projects}}