Your Documents. Your Machine. Your Answers.

Coherence is a local-first RAG pipeline that maximises retrieval quality through intelligent multi-stage filtering, while ensuring your documents never leave your machine.

100%
Local Processing
2
RAG Modes
2
Filter Stages
0
Data Leaves Device
// How It Works

A Multi-Stage Intelligence Funnel

Coherence progressively narrows your document corpus through three pipeline stages, ensuring only the most relevant content reaches the generation model.

01 Ingestion

PDF → Structured Text

Raw PDFs are parsed and converted into clean markdown. Your query passes through a Safeguard check and a Parser LLM to produce a refined, cleaned query.

02 Filtering

The Document Funnel

A two-stage filter: the Doc Filter discards irrelevant files using metadata thresholds. Then the Section Filter applies cosine similarity with configurable window size to isolate relevant sections.
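As a rough sketch, the Section Filter's windowed cosine pass might look like the following. The function and parameter names (`filter_sections`, the raw embedding inputs) are illustrative assumptions, not Coherence's actual API; the window lets context around a strong hit survive:

```rust
/// Cosine similarity between two embedding vectors.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 { 0.0 } else { dot / (norm_a * norm_b) }
}

/// Keep the indices of sections where any score inside the surrounding
/// window clears the threshold, so neighbours of a strong hit are retained.
fn filter_sections(
    query_emb: &[f32],
    section_embs: &[Vec<f32>],
    window_size: usize,
    threshold: f32,
) -> Vec<usize> {
    let scores: Vec<f32> = section_embs.iter().map(|s| cosine(query_emb, s)).collect();
    (0..scores.len())
        .filter(|&i| {
            let lo = i.saturating_sub(window_size);
            let hi = (i + window_size + 1).min(scores.len());
            scores[lo..hi].iter().any(|&s| s >= threshold)
        })
        .collect()
}
```

With `window_size = 0` only sections that themselves clear the threshold pass; widening the window pulls in adjacent context.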

03 Generation

Answer or Analyse

A switch routes to either Retrieval mode (rerank → generate an answer) or Coherence mode (extract propositions → validate facts against the query).
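The switch can be pictured as a plain match over the two modes. Everything below is a stubbed sketch: `Mode`, `Output`, and the substring check standing in for proposition validation are assumptions for illustration, not Coherence's real types:

```rust
/// The two operational modes behind the switch.
enum Mode {
    Retrieval,
    Coherence,
}

/// Output of either branch: a generated answer, or an affirm/negate
/// (A/N) list pairing each proposition with its verdict.
#[derive(Debug, PartialEq)]
enum Output {
    Answer(String),
    AnList(Vec<(String, bool)>),
}

fn run(mode: Mode, sections: &[&str], query: &str) -> Output {
    match mode {
        // Retrieval: rerank → generate (the LLM call is stubbed here).
        Mode::Retrieval => Output::Answer(format!(
            "answer to '{}' from {} sections", query, sections.len()
        )),
        // Coherence: extract propositions → validate against the query
        // (a naive substring check stands in for the Corelevance stages).
        Mode::Coherence => Output::AnList(
            sections.iter().map(|s| (s.to_string(), s.contains(query))).collect(),
        ),
    }
}
```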

// Capabilities

Engineered for Precision

Dual Mode

Retrieval & Coherence Modes

Switch between two distinct operational modes. Retrieval generates direct answers from your documents. Coherence decomposes text into atomic propositions and validates them against your query, producing an affirm/negate analysis.

RETRIEVAL
Candidate Sections → Reranker → LLM Generation → Answer
COHERENCE
Propositions → Corelevance ×2 → A/N List
Privacy

Zero Network Exposure

Every stage of the pipeline runs entirely on your hardware. No API calls, no cloud sync, no telemetry. Your confidential documents stay where they belong.

✓ AIRGAP COMPATIBLE
Query

Intelligent Query Refinement

Raw queries pass through a safeguard layer and a Parser LLM that structures, cleans, and optimises them before they hit the retrieval engine.

Filter

Adaptive Document Funnel

Automatically skips document-level filtering for single-doc queries. For multi-doc corpora, it applies metadata-based thresholds before section-level cosine similarity.
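In sketch form, the adaptive skip is a single branch before the metadata threshold is applied. The name `doc_funnel` and the score inputs are illustrative, not Coherence's real interface:

```rust
/// Adaptive document-level pass: with a single document there is
/// nothing to discard, so the metadata threshold is skipped entirely.
fn doc_funnel(doc_scores: &[f32], threshold: f32) -> Vec<usize> {
    if doc_scores.len() <= 1 {
        // Single-doc query: pass straight through to section filtering.
        return (0..doc_scores.len()).collect();
    }
    doc_scores
        .iter()
        .enumerate()
        .filter(|(_, &s)| s >= threshold)
        .map(|(i, _)| i)
        .collect()
}
```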

Reranker

Cross-Encoder Reranking

Candidate sections are re‑scored with a cross‑encoder model, pushing the most semantically relevant passages to the top before generation.
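The reranking step reduces to sorting candidates by a relevance score. In Coherence that score comes from a cross-encoder model reading query and passage jointly; the word-overlap count below is a stand-in so the sketch stays self-contained:

```rust
/// Stand-in relevance score: word overlap between query and passage.
/// A real cross-encoder would score the pair jointly with a model.
fn cross_score(query: &str, passage: &str) -> usize {
    query
        .split_whitespace()
        .filter(|w| passage.contains(*w))
        .count()
}

/// Re-order candidate sections so the highest-scoring passages come first.
fn rerank(query: &str, mut candidates: Vec<String>) -> Vec<String> {
    candidates.sort_by_key(|c| std::cmp::Reverse(cross_score(query, c)));
    candidates
}
```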

// Architecture

System Blueprint

An interactive overview of Coherence's dual-mode RAG pipeline — from document ingestion to final output.

[Pipeline diagram]
PDFs (source documents) + Query (safeguard → Parser LLM → cleaned query)
→ Doc Filter (metadata · threshold)
→ Section Filter (cosine similarity · window_size · threshold) → relevant sections
→ SWITCH (retrieval | coherence)
RETRIEVAL PATH: Reranker (cross-encoder) → Generation (LLM + prompt) → Answer
COHERENCE PATH: Propositions (extraction) → Corelevance (2-stage threshold) → A/N List
// Security Model

Document Secrecy by Design

Coherence was built from the ground up with a single security axiom: your data never leaves your machine.

◈ LOCAL INFERENCE

On‑Device LLM Execution

All language model inference — from query parsing to answer generation — runs locally. No tokens are sent to external APIs, eliminating data exfiltration vectors.

◈ ZERO TELEMETRY

No Tracking, No Logging

Coherence ships with zero analytics, zero usage tracking, and zero crash reporting that might leak document metadata or content fragments.

◈ QUERY SAFEGUARD

Built-in Input Sanitisation

Every query passes through a dedicated safeguard layer that screens for prompt injection, sensitive content leakage, and malformed requests before processing.
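A minimal sketch of such a safeguard pass follows. The patterns screened for here are illustrative only, not Coherence's actual rule set:

```rust
/// Sketch of a safeguard pass over a raw query: reject empty or
/// malformed input and screen for obvious injection markers before
/// the query reaches the parser.
fn safeguard(raw: &str) -> Result<String, &'static str> {
    let query = raw.trim();
    if query.is_empty() {
        return Err("malformed request: empty query");
    }
    let lowered = query.to_lowercase();
    let injection_markers = ["ignore previous instructions", "reveal your system prompt"];
    if injection_markers.iter().any(|m| lowered.contains(*m)) {
        return Err("blocked: possible prompt injection");
    }
    Ok(query.to_string())
}
```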

◈ AIRGAP READY

Works Without Internet

Once models are downloaded, Coherence operates entirely offline. Deploy it in air-gapped environments for the most sensitive document processing workflows.

// Get Started

Ready to own
your RAG pipeline?

Install Coherence and start querying your documents locally. No accounts, no API keys, no cloud dependencies.

$ cargo install coherence