Coherence is a local-first RAG pipeline that maximises retrieval quality through intelligent multi-stage filtering, while ensuring your documents never leave your machine.
Coherence progressively narrows your document corpus through three filtering stages, ensuring only the most relevant content reaches the generation model.
Raw PDFs are parsed and converted into clean markdown. Your query passes through a Safeguard check and a Parser LLM to produce a refined, cleaned query.
A two-stage filter: the Doc Filter discards irrelevant files using metadata thresholds. Then the Section Filter applies cosine similarity with configurable window size to isolate relevant sections.
A switch routes to either Retrieval mode (rerank → generate an answer) or Coherence mode (extract propositions → validate facts against the query).
Switch between two distinct operational modes. Retrieval generates direct answers from your documents. Coherence decomposes text into atomic propositions and validates them against your query, producing an affirm/negate analysis.
Every stage of the pipeline runs entirely on your hardware. No API calls, no cloud sync, no telemetry. Your confidential documents stay where they belong.
Raw queries pass through a safeguard layer and a Parser LLM that structures, cleans, and optimises them before they hit the retrieval engine.
Automatically skips document-level filtering for single-doc queries. For multi-doc corpora, it applies metadata-based thresholds before section-level cosine similarity.
Candidate sections are re‑scored with a cross‑encoder model, pushing the most semantically relevant passages to the top before generation.
An interactive overview of Coherence's dual-mode RAG pipeline — from document ingestion to final output.
Coherence was built from the ground up with a single security axiom: your data never leaves your machine.
All language model inference — from query parsing to answer generation — runs locally. No tokens are sent to external APIs, eliminating data exfiltration vectors.
Coherence ships with zero analytics, zero usage tracking, and zero crash reporting that might leak document metadata or content fragments.
Every query passes through a dedicated safeguard layer that screens for prompt injection, sensitive content leakage, and malformed requests before processing.
Once models are downloaded, Coherence operates entirely offline. Deploy it in air-gapped environments for the most sensitive document processing workflows.
Install Coherence and start querying your documents locally. No accounts, no API keys, no cloud dependencies.
$ cargo install coherence