How PaperMind works, under the hood.
A walk-through of the local AI pipeline, the cryptographic vault, the Solana attestation layer, and the tradeoffs we made building it.
Last updated · v0.2.1 · May 11, 2026
What is PaperMind?
PaperMind is a private AI app for the documents you can't put in a cloud chatbot — contracts, prescriptions, wills, insurance policies, foreign-language certificates, financial disclosures. It reads them, translates them, explains them in plain language, and reads the result aloud. Everything happens on your own computer.
It also lets you do something most apps can't: leave a cryptographic Legacy Vault behind — a sealed bundle of documents and personal letters that only the people you choose can open, together, in the future.
Explain
Drop in any PDF or image. PaperMind extracts the text and rewrites it as a short, plain-language summary.
Translate
Six languages, bidirectional, offline. Sensitive foreign documents never touch a cloud server.
Preserve
Seal documents into an encrypted vault that opens only when a chosen quorum of loved ones come together.
Built for people who care that “private” means private: no account, no upload, no telemetry, no fine print. The whole application is open source under Apache 2.0 — every claim on this page is verifiable in the codebase.
TL;DR for engineers
PaperMind is an Electron desktop app. Every model — OCR, neural machine translation, the LLM, and TTS — runs locally through the Tether QVAC SDK. Documents are processed in-memory; outputs are written only to a workspace folder you choose. An optional Legacy Vault encrypts a document bundle with AES-256-GCM, splits the key via Shamir's Secret Sharing, and anchors a content hash to the Solana devnet for tamper-evidence.
Architecture at a glance
The app is split across the standard Electron processes: a Node main process owns model loading and pipeline orchestration; a React + Tailwind renderer drives the UI; and a contextBridge preload script narrows the IPC surface. There is no PaperMind server, no remote inference, and no background telemetry.
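The narrowing a preload script performs can be illustrated without Electron itself. The sketch below is not PaperMind's actual preload code: the `ai:`, `vault:`, and `solana:` namespaces come from the architecture diagram on this page, while the individual channel names and the `createBridge` helper are hypothetical.

```typescript
// Hypothetical allow-list. Only the ai:/vault:/solana: prefixes come from the
// page; the specific channel names are invented for illustration.
const ALLOWED_CHANNELS = [
  "ai:ocr",
  "ai:translate",
  "ai:explain",
  "ai:speak",
  "vault:seal",
  "vault:unlock",
  "solana:attest",
] as const;

type Channel = (typeof ALLOWED_CHANNELS)[number];

// Type guard: true only for channels on the allow-list.
function isAllowedChannel(channel: string): channel is Channel {
  return (ALLOWED_CHANNELS as readonly string[]).indexOf(channel) !== -1;
}

type Invoke = (channel: string, ...args: unknown[]) => Promise<unknown>;

// Wraps a raw invoke function (ipcRenderer.invoke in a real preload script)
// so the renderer can reach allow-listed channels and nothing else.
function createBridge(rawInvoke: Invoke) {
  return {
    invoke(channel: string, ...args: unknown[]): Promise<unknown> {
      if (!isAllowedChannel(channel)) {
        return Promise.reject(new Error(`Blocked IPC channel: ${channel}`));
      }
      return rawInvoke(channel, ...args);
    },
  };
}
```

In a real preload script the returned object would typically be handed to Electron's `contextBridge.exposeInMainWorld`, so the renderer never touches `ipcRenderer` directly.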
[Architecture diagram. Labels: qvac.ts, ocr.ts, translate.ts, explain.ts, living-letter.ts, speak.ts, vault.ts, solana.ts, pdfToImages.ts; IPC channels ai:*, vault:*, solana:*; screens Home, Result, Vault, Unlock.]
The document pipeline
Every document — PNG, JPG, WebP, or multi-page PDF — flows through the same four stages. PDFs are rasterized at 2× scale on the renderer canvas before reaching the OCR step.
OCR
QVAC ocr() with the OCR Latin Recognizer 1 model. Pages are processed individually and concatenated. Output is raw text with paragraph breaks preserved.
Translation
QVAC translate() using Bergamot bidirectional pairs across EN/FR/ES/DE/IT/PT. Skipped when source equals target. Runs in a Web Worker — never on the main thread.
Explanation
QVAC completion() against Llama 3.2 1B Q4. The system prompt constrains the model to plain spoken language in the target locale, ≤5 sentences, no markdown.
Voice
QVAC textToSpeech() using Supertonic2 in multilingual mode with English fallback. Output is WAV-wrapped 44.1 kHz mono ready for the in-app player.
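For readers unfamiliar with what "WAV-wrapped" involves, here is a minimal sketch of prefixing raw PCM samples with a 44-byte RIFF/WAVE header. It assumes 16-bit signed samples, which the page does not state, and `wrapPcmAsWav` is a hypothetical helper, not a QVAC or PaperMind function.

```typescript
// Wraps raw PCM samples in a minimal 44-byte RIFF/WAVE header.
// Assumption: 16-bit signed little-endian samples. The page only says
// "44.1 kHz mono", so the bit depth here is a guess for illustration.
function wrapPcmAsWav(pcm: Int16Array, sampleRate = 44100, channels = 1): Uint8Array {
  const bytesPerSample = 2;
  const dataSize = pcm.length * bytesPerSample;
  const buffer = new ArrayBuffer(44 + dataSize);
  const view = new DataView(buffer);

  const writeAscii = (offset: number, text: string): void => {
    for (let i = 0; i < text.length; i++) {
      view.setUint8(offset + i, text.charCodeAt(i));
    }
  };

  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + dataSize, true); // file size minus the first 8 bytes
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true); // fmt chunk size for plain PCM
  view.setUint16(20, 1, true); // audio format 1 = uncompressed PCM
  view.setUint16(22, channels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * channels * bytesPerSample, true); // byte rate
  view.setUint16(32, channels * bytesPerSample, true); // block align
  view.setUint16(34, 8 * bytesPerSample, true); // bits per sample
  writeAscii(36, "data");
  view.setUint32(40, dataSize, true);

  // Copy the samples after the header, little-endian.
  for (let i = 0; i < pcm.length; i++) {
    view.setInt16(44 + i * bytesPerSample, pcm[i], true);
  }
  return new Uint8Array(buffer);
}
```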
Each stage degrades gracefully. OCR errors surface a human-readable message; TTS failure is non-fatal and the text result still renders; translation is skipped when the language pair is unavailable rather than blocking the rest of the pipeline.
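The degradation rules above can be sketched as a small orchestrator. This is an illustration under stated assumptions rather than PaperMind's code: the `Stages` interface, `runPipeline`, and the warning strings are hypothetical stand-ins for the QVAC-backed stages.

```typescript
// Hypothetical stage interface standing in for the QVAC-backed modules.
interface Stages {
  ocr(page: Uint8Array): Promise<string>;
  translate(text: string, from: string, to: string): Promise<string>;
  explain(text: string, locale: string): Promise<string>;
  speak(text: string, locale: string): Promise<Uint8Array>;
  hasPair(from: string, to: string): boolean;
}

interface PipelineResult {
  text: string;
  explanation: string;
  audio: Uint8Array | null;
  warnings: string[];
}

async function runPipeline(
  stages: Stages,
  pages: Uint8Array[],
  from: string,
  to: string
): Promise<PipelineResult> {
  const warnings: string[] = [];

  // OCR: pages are processed one by one and joined with paragraph breaks.
  // A failure here stops the run, but with a readable message.
  let sourceText: string;
  try {
    const pageTexts: string[] = [];
    for (const page of pages) pageTexts.push(await stages.ocr(page));
    sourceText = pageTexts.join("\n\n");
  } catch {
    throw new Error("Could not read text from this document.");
  }

  // Translation: skipped when source equals target or the pair is missing.
  let text = sourceText;
  if (from !== to) {
    if (stages.hasPair(from, to)) {
      text = await stages.translate(sourceText, from, to);
    } else {
      warnings.push(`No ${from}->${to} model available; showing the original text.`);
    }
  }

  const explanation = await stages.explain(text, to);

  // TTS: non-fatal, the text result still renders without audio.
  let audio: Uint8Array | null = null;
  try {
    audio = await stages.speak(explanation, to);
  } catch {
    warnings.push("Voice playback is unavailable; the text result is unaffected.");
  }

  return { text, explanation, audio, warnings };
}
```

Collecting warnings instead of throwing keeps the optional stages from masking a successful explanation.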
The Legacy Vault
The Vault is an opt-in cryptographic container that bundles one or more processed documents into a single portable file. Sealing is deliberately multi-step.
- Living Letter generation. A second LLM pass produces a first-person, ≤180-word letter per document. The system prompt is selected by document type — will, insurance, medication, property, financial, emergency, other — so the voice stays consistent with what the document is.
- TTS rendering. Each Living Letter is synthesized and embedded as base64 audio inside the vault bundle. Audio is optional — large vaults can opt out.
- Encryption. The bundle (JSON) is encrypted with AES-256-GCM using a freshly generated 32-byte key and 12-byte IV. The auth tag is stored alongside the ciphertext.
- Key splitting. The 32-byte AES key is fed into Shamir's Secret Sharing with a configurable K-of-N threshold (default 2-of-3). Each share is presented as a hex string in 4-character groups and as a QR code.
- Distribution. The owner exports the encrypted vault file and physically delivers shares to beneficiaries. PaperMind never sees any of it.
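The encryption step can be sketched with Node's built-in `crypto` module. This is a minimal illustration, not the vault's actual file format: the `SealedVault` shape, the base64 encoding, and the function names are assumptions.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Hypothetical on-disk shape; the real vault file layout may differ.
interface SealedVault {
  iv: string; // base64, 12 bytes
  authTag: string; // base64, 16 bytes
  ciphertext: string; // base64
}

// Encrypts a JSON-serializable bundle with a fresh key and IV.
// The key is returned separately so it can be split into shares.
function sealBundle(bundle: unknown): { key: Buffer; sealed: SealedVault } {
  const key = randomBytes(32);
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([
    cipher.update(JSON.stringify(bundle), "utf8"),
    cipher.final(),
  ]);
  return {
    key,
    sealed: {
      iv: iv.toString("base64"),
      authTag: cipher.getAuthTag().toString("base64"),
      ciphertext: ciphertext.toString("base64"),
    },
  };
}

// Decrypts and authenticates. final() throws if the ciphertext or the
// auth tag was modified, or if the key is wrong.
function openBundle(sealed: SealedVault, key: Buffer): unknown {
  const decipher = createDecipheriv("aes-256-gcm", key, Buffer.from(sealed.iv, "base64"));
  decipher.setAuthTag(Buffer.from(sealed.authTag, "base64"));
  const plaintext = Buffer.concat([
    decipher.update(Buffer.from(sealed.ciphertext, "base64")),
    decipher.final(),
  ]);
  return JSON.parse(plaintext.toString("utf8"));
}
```

Because GCM is authenticated, a tampered vault file fails loudly at decrypt time instead of yielding corrupted letters.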
Unlock is the reverse: import the vault file, have at least K beneficiaries enter their shares (two under the default 2-of-3 threshold), reconstruct the AES key with shamirs-secret-sharing's combine(), and decrypt the bundle. The Living Letters then fade in one card at a time.
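To show what splitting and recombining a key involves, here is a from-scratch sketch of Shamir's scheme over GF(2⁸). It is not the shamirs-secret-sharing library's implementation and its shares are not compatible with that library: the reduction polynomial (0x11b, the one AES uses), the `Share` layout, and all function names are assumptions made for illustration.

```typescript
import { randomBytes } from "node:crypto";

// Multiply in GF(2^8), reducing by x^8 + x^4 + x^3 + x + 1 (0x11b).
function gfMul(a: number, b: number): number {
  let product = 0;
  while (b > 0) {
    if (b & 1) product ^= a;
    a <<= 1;
    if (a & 0x100) a ^= 0x11b;
    b >>= 1;
  }
  return product;
}

// Inverse via a^254, since a^255 = 1 for every non-zero element.
function gfInv(a: number): number {
  if (a === 0) throw new Error("zero has no inverse (duplicate share?)");
  let result = 1;
  for (let i = 0; i < 254; i++) result = gfMul(result, a);
  return result;
}

interface Share {
  x: number; // evaluation point, 1..255
  y: Uint8Array; // one polynomial value per secret byte
}

// Splits each secret byte with its own random polynomial of degree k - 1.
function split(secret: Uint8Array, n: number, k: number): Share[] {
  if (k < 2 || k > n || n > 255) throw new Error("need 2 <= k <= n <= 255");
  const shares: Share[] = [];
  for (let x = 1; x <= n; x++) shares.push({ x, y: new Uint8Array(secret.length) });
  for (let i = 0; i < secret.length; i++) {
    const random = randomBytes(k - 1);
    const coeffs: number[] = [secret[i]]; // constant term is the secret byte
    for (let j = 0; j < random.length; j++) coeffs.push(random[j]);
    for (const share of shares) {
      let y = 0; // Horner evaluation at x = share.x
      for (let c = coeffs.length - 1; c >= 0; c--) y = gfMul(y, share.x) ^ coeffs[c];
      share.y[i] = y;
    }
  }
  return shares;
}

// Lagrange interpolation at x = 0 recovers the constant terms.
function combine(shares: Share[]): Uint8Array {
  const length = shares[0].y.length;
  const secret = new Uint8Array(length);
  for (let j = 0; j < shares.length; j++) {
    let basis = 1;
    for (let m = 0; m < shares.length; m++) {
      if (m === j) continue;
      basis = gfMul(basis, gfMul(shares[m].x, gfInv(shares[m].x ^ shares[j].x)));
    }
    for (let i = 0; i < length; i++) secret[i] ^= gfMul(shares[j].y[i], basis);
  }
  return secret;
}

// Renders a share as hex in 4-character groups, as described for the UI.
function formatShare(share: Share): string {
  const hex = Buffer.concat([Buffer.from([share.x]), Buffer.from(share.y)]).toString("hex");
  return (hex.match(/.{1,4}/g) || []).join(" ");
}
```

With a 2-of-3 split, any two shares define the line through each secret byte, while a single share is consistent with every possible byte value.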
Solana attestation (optional)
Vaults can be anchored to the Solana devnet for tamper-evidence. We use the Memo program (MemoSq4gqABAXKb96qnH8TysNcWxMyWCqXgDLGmfcHr) rather than a custom Anchor program. It does nothing beyond validating a UTF-8 memo and recording it in the transaction log, which is exactly the attestation surface we need.
A locally-generated Ed25519 keypair lives in your workspace and is auto-funded by the devnet faucet when its balance drops below 0.001 SOL. Three message types are emitted:
- vault_seal. Recorded at seal time. Contains the vault ID, threshold metadata, and a SHA-256 fingerprint of the encrypted bundle.
- vault_checkin. An “I'm alive” ping. Resets the dead-man's-switch clock for beneficiaries who are watching.
- vault_active. Recorded when a beneficiary group reaches threshold and unlocks the vault.
Memos contain only hashes, IDs, and counts. No document content, no personal data, no key material ever touches the chain. Solana failures never block sealing or unlocking; on-chain attestation is best-effort by design.
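A seal memo of the kind described above could be built and checked as follows. This is a sketch only: the JSON field names, `buildSealMemo`, and `verifySealMemo` are hypothetical, and submitting the string to the Memo program (for example through a Solana client library) is left out.

```typescript
import { createHash } from "node:crypto";

// Hypothetical payload shape: hashes, IDs, and counts only.
interface SealMemo {
  type: "vault_seal";
  vaultId: string;
  threshold: number; // K
  shares: number; // N
  sha256: string; // hex fingerprint of the encrypted bundle
}

// SHA-256 fingerprint of the encrypted vault bytes, as lowercase hex.
function fingerprint(encryptedBundle: Uint8Array): string {
  return createHash("sha256").update(encryptedBundle).digest("hex");
}

// Serializes the memo as compact JSON text, suitable for a UTF-8 memo.
function buildSealMemo(
  vaultId: string,
  threshold: number,
  shares: number,
  encryptedBundle: Uint8Array
): string {
  const memo: SealMemo = {
    type: "vault_seal",
    vaultId,
    threshold,
    shares,
    sha256: fingerprint(encryptedBundle),
  };
  return JSON.stringify(memo);
}

// Tamper check: recompute the fingerprint of the vault file in hand and
// compare it with the one recorded in the memo.
function verifySealMemo(memoText: string, encryptedBundle: Uint8Array): boolean {
  const memo = JSON.parse(memoText) as Partial<SealMemo>;
  return memo.type === "vault_seal" && memo.sha256 === fingerprint(encryptedBundle);
}
```

A beneficiary holding the vault file can run the same check against the on-chain memo without any key material, since only the ciphertext is hashed.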
Security model
| Domain | Guarantee |
|---|---|
| Inference | Zero outbound network calls during document processing. Verifiable with any OS-level network monitor. |
| On-chain privacy | Solana payloads carry only hashes, vault IDs, and threshold counts. |
| Symmetric crypto | AES-256-GCM via Node's built-in crypto. Random key + IV per seal. |
| Secret splitting | Shamir's Secret Sharing over GF(2⁸) with 128-bit padding. |
| Auditability | Seal emits a SHA-256 fingerprint that matches the Solana Memo payload. |
| Determinism | Unlock is deterministic given correct shares. LLM outputs are non-deterministic (temp 0.4) — accepted product behavior. |
Stack & key dependencies
Build from source
PaperMind is Apache 2.0. You can verify everything on this page by cloning, auditing, and running the code yourself.
    # Clone
    git clone https://github.com/Temitope15/papermind.git
    cd papermind

    # Install
    npm install

    # Run in dev (Electron + Vite HMR)
    npm run dev

    # Package a distributable binary
    npm run build
    npm run package
On first launch the app will prompt you to download the model weights from their original public hosts (Hugging Face, GitHub Releases). Total cold install footprint is under 1 GB.
Glossary
- QVAC
- QuantumVerse Automatic Computer: Tether's local-first AI SDK, shipping LLM, OCR, NMT, and TTS in a single API surface.
- Living Letter
- A warm first-person LLM output written as if the document owner is speaking, generated at vault-seal time.
- Shamir SSS
- A cryptographic scheme that splits a secret into N shares such that any K reconstruct it; fewer than K reveal nothing.
- AES-256-GCM
- Authenticated symmetric encryption used to encrypt the vault bundle with integrity protection.
- Memo program
- A minimal Solana program that records a short UTF-8 memo in a transaction's log, which makes it a good fit for attestation.
- Dead Man's Switch
- A periodic check-in mechanism that signals beneficiaries when the owner has stopped responding for a configurable period.