clipship
A self-hosted web clipper that drops clean Markdown into a folder you own.
GitHub repo: clipship
Last update: 2026/06/11
Description
Clipship is a browser extension + tiny Flask server
for capturing the readable content of any web page (or any PDF) as a
Markdown file in a folder on your own machine. The extension runs
Mozilla’s Readability on the page, converts the result to Markdown with
YAML frontmatter, signs the payload with HMAC-SHA256, and POSTs it to
your endpoint. The server verifies the signature, downloads every
referenced image into an assets/ folder, and writes a
single .md file ready to be opened by Obsidian, grep, or
anything else that reads plain text.
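As a rough illustration of the file the pipeline above ends with, here is a minimal Python sketch that renders YAML frontmatter followed by a Markdown body. The field names are the ones Clipship lists for its frontmatter. The function names (`render_clip`, `yaml_quote`), the quoting rules, and the choice of Python are assumptions made for illustration, not Clipship's actual code; in Clipship itself the conversion runs in the extension.

```python
import re

# Frontmatter fields in the order Clipship lists them (tags handled separately).
FIELDS = ("title", "source", "author", "site", "date")

# Assumption: tags made of plain word characters can be written unquoted.
_PLAIN_TAG = re.compile(r"[A-Za-z][A-Za-z0-9_/-]*")
# Words a YAML parser may read as booleans or null if left bare.
_RESERVED = {"true", "false", "null", "yes", "no", "on", "off"}


def yaml_quote(value: str) -> str:
    """Double-quote a scalar so colons, hashes, and quotes stay valid YAML."""
    escaped = (
        value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    )
    return f'"{escaped}"'


def render_clip(meta: dict, tags: list, body: str) -> str:
    """Return one Markdown document: YAML frontmatter, blank line, body."""
    lines = ["---"]
    for key in FIELDS:
        lines.append(f"{key}: {yaml_quote(str(meta.get(key, '')))}")
    if tags:
        # Block-style sequence: one "- item" line per tag.
        lines.append("tags:")
        for tag in tags:
            plain = _PLAIN_TAG.fullmatch(tag) and tag.lower() not in _RESERVED
            lines.append(f"  - {tag if plain else yaml_quote(tag)}")
    else:
        lines.append("tags: []")
    lines.append("---")
    return "\n".join(lines) + "\n\n" + body.rstrip() + "\n"
```

Calling `render_clip({"title": "Example", "source": "https://example.com/a", "date": "2026-06-11"}, ["reading"], "# Example\n")` returns a string that can be written directly to a `.md` file. Scalars are always quoted here because page titles often contain colons, which would otherwise break the YAML.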
Features
- Clean Markdown clips: Readability strips the chrome; a small HTML→Markdown converter handles the body; YAML frontmatter carries title, source, author, site, date, and tags.
- Self-contained: every image referenced in a clip is downloaded into assets/ (with SSRF guards, size caps, and timeouts), so notes survive going offline or moving folders. No broken images later.
- PDF clipping: click the extension on a .pdf tab and the server stores the file alongside the Markdown. Pick the extractor that fits: pypdf (fast, pure Python) or opendataloader-pdf (Java-backed, produces structured Markdown with tables and headings preserved).
- Tags that match your vault: tags are emitted as block-style YAML, the same form Obsidian writes in its Properties panel, so hand-edited notes and clipped notes are visually identical.
- Optional web UI: a read-only browser for the inbox with full-text search, tag filtering, and rendered Markdown. HTTP-Basic-auth-gated, off by default, reverse-proxy-friendly.
- Signed payloads: HMAC-SHA256 over timestamp + body, ±5 minute replay window, constant-time signature comparison server-side.
- Interactive setup: one command (python3 clipship_setup.py) walks you through local-vs-remote install, picks the PDF backend, generates the secret, writes config.py, and installs only the dependencies your configuration needs.
- No third parties: the extension talks only to your endpoint; the server only fetches assets you’ve referenced. No analytics, no telemetry, no account.
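The signed-payload scheme in the list above can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not Clipship's actual code: the function names `sign` and `verify` are hypothetical, the timestamp is assumed to be Unix seconds sent as a decimal string, and "timestamp + body" is read as plain byte concatenation with a hex-encoded digest. The real wire format may differ.

```python
import hashlib
import hmac
import time

MAX_SKEW_SECONDS = 5 * 60  # the ±5 minute replay window


def sign(secret: bytes, timestamp: str, body: bytes) -> str:
    """Return the hex HMAC-SHA256 of timestamp + body (assumed concatenation)."""
    message = timestamp.encode("utf-8") + body
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify(secret: bytes, timestamp: str, body: bytes,
           signature: str, now: float = None) -> bool:
    """Accept a request only if it is fresh and its signature matches."""
    now = time.time() if now is None else now
    try:
        sent_at = int(timestamp)
    except ValueError:
        return False  # malformed timestamp
    # Reject anything outside the window, whether too old or too far ahead.
    if abs(now - sent_at) > MAX_SKEW_SECONDS:
        return False
    expected = sign(secret, timestamp, body)
    # compare_digest runs in constant time, so the comparison does not leak
    # how many leading characters of the signature were correct.
    return hmac.compare_digest(expected.encode("utf-8"),
                               signature.encode("utf-8"))
```

Because the timestamp is part of the signed message, an attacker cannot replay an old body with a fresh timestamp. The window only bounds how long a captured request stays valid, so the secret and a TLS-protected endpoint still matter.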
Technologies
- JavaScript (MV3): Chrome- and Firefox-compatible extension, vendored Mozilla Readability, custom HTML→Markdown converter.
- Python + Flask: minimal signed-payload receiver, optional read-only web UI, no database — just files.
- HMAC-SHA256 + ProxyFix: signature verification on every request, correct URL generation behind nginx/Caddy.
- pypdf / opendataloader-pdf: pluggable PDF text extraction.
- systemd: units for both receiver and web UI included.
Why It Matters
I built Clipship because every “save this article” service eventually decides it owns your reading list — paywalls, ads, sync limits, sunset emails. I wanted the same one-click capture, but the file lands in a folder I own, that I can grep, back up, drop into Obsidian, or move between machines without anyone’s permission. Clipship is the smallest thing that does that: one extension, one Flask file, one folder, one shared secret. No accounts, no cloud, no vendor — and because it’s open source, you can audit exactly what leaves your browser.