ArchiveBox: Difference between revisions
Appearance
Created page with "{{AI-Generated|Claude generated the initial structure and examples for this documentation|tool=Claude}} = ArchiveBox Installation = '''ArchiveBox''' is a self-hosted web archive service that preserves websites, articles, and online content for offline viewing and long-term storage. Our instance runs at [https://snap.ejfox.com snap.ejfox.com]. == Overview == ArchiveBox creates multiple backup formats for each archived URL: * HTML snapshots * PDF captures * Screensho..." |
Add Navbox Journalism |
||
| (2 intermediate revisions by 2 users not shown) | |||
| Line 15: | Line 15: | ||
* Archive.org submissions | * Archive.org submissions | ||
=== Archive a URL === | === Archive a URL === | ||
| Line 171: | Line 134: | ||
[[Category:Documentation]] | [[Category:Documentation]] | ||
[[Category:Technical]] | [[Category:Technical]] | ||
{{Navbox Technical}} | |||
{{Navbox Journalism}} | |||
Latest revision as of 20:22, 7 December 2025
🤖 AI-Generated Content
Claude generated the initial structure and examples for this documentation
Claude
ArchiveBox Installation
ArchiveBox is a self-hosted web archive service that preserves websites, articles, and online content for offline viewing and long-term storage. Our instance runs at snap.ejfox.com.
Overview
ArchiveBox creates multiple backup formats for each archived URL:
- HTML snapshots
- PDF captures
- Screenshot images
- Video recordings
- Raw HTML/CSS/JS files
- Archive.org submissions
Archive a URL
curl -X POST https://snap.ejfox.com/api/v1/core/snapshot/ \
-H "Authorization: Token your_api_token_here" \
-H "Content-Type: application/json" \
-d '{"url": "https://ejfox.com", "tags": "personal,website"}'
Check if URL is Archived
# Method 1: API query
curl -s "https://snap.ejfox.com/api/v1/core/snapshot/?url=https://ejfox.com" \
-H "Authorization: Token your_api_token" | jq '.count > 0'
# Method 2: Simple HTTP check
curl -s -o /dev/null -w "%{http_code}" \
"https://snap.ejfox.com/archive/https://ejfox.com"
# Returns: 200 if archived, 404 if not
Search Archives
# Search by URL curl "https://snap.ejfox.com/api/v1/core/snapshot/?search=ejfox.com" \ -H "Authorization: Token your_api_token" # Search by tags curl "https://snap.ejfox.com/api/v1/core/snapshot/?tag=dataviz" \ -H "Authorization: Token your_api_token"
List Recent Archives
curl "https://snap.ejfox.com/api/v1/core/snapshot/?limit=10&ordering=-created" \
-H "Authorization: Token your_api_token" | jq '.results[] | {url, created, tags}'
Browser Integration
Bookmarklet
Create a bookmark with this JavaScript for one-click archiving:
javascript:(function(){
const url = window.location.href;
const title = document.title;
fetch('https://snap.ejfox.com/api/v1/core/snapshot/', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': 'Token your_api_token_here'
},
body: JSON.stringify({
url: url,
tags: 'manual,bookmarklet',
title: title
})
})
.then(resp => resp.json())
.then(data => {
alert(`Archived! View at: ${data.archive_url}`);
})
.catch(err => alert('Archive failed: ' + err));
})();
MediaWiki Integration
Archive Template
Create Template:Archive:
[https://snap.ejfox.com/search?q={{{1}}} 📸 Search Archives] | [https://snap.ejfox.com/add?url={{{1}}} ➕ Add to Archive]
Usage in articles:
Check out this article {{Archive|https://example.com}} about data visualization.
Maintenance
Health Check
# Check if service is running curl -s https://snap.ejfox.com/health/ || echo "ArchiveBox down!" # Check recent archives count curl -s "https://snap.ejfox.com/api/v1/core/snapshot/?limit=1" \ -H "Authorization: Token $API_TOKEN" | jq '.count'
Backup Archives
# Backup ArchiveBox data tar -czf archivebox-backup-$(date +%Y%m%d).tar.gz ~/archivebox-data/ # Sync to remote storage rsync -av ~/archivebox-data/ user@backup-server:/backups/archivebox/
Related
- Personal APIs - Integration with other self-hosted services
- Template:Quote - Reference archived sources in quotes
- Data Sovereignty - Philosophy behind self-hosted archiving
| ⚡ Technical | |
|---|---|
| Core | Technical · CLI · Dotfiles · Nvim · SSH · VPS |
| Tools | Sketchybar · ArchiveBox · ThinkPad Linux |
| Systems | Automation · Personal APIs · Quantified Self |
| Reference | Runbooks · New Computer Runbook · Syntax guide |
| Journalism & Investigations | |
|---|---|
| Core | Journalism · Investigations · Source Handling |
| Methods | FOIA · Data Journalism · Dataviz · Documentation Discipline |
| Tools | ArchiveBox · Scrapbook-core · Personal APIs |
| Culture | Hacker Culture · PGP Communication Guide |