Trust, but verify.

Epistemic uncertainty detection for LLMs

pip install kateryna

Three states. One breakthrough.

+1

Grounded

Confident with evidence. RAG retrieval supports the response.

0

Uncertain

Gives your LLM permission to say "I don't know" instead of guessing.

-1

Ungrounded

Confident without evidence. Hallucination danger zone.

Most uncertainty systems only distinguish "confident" from "not confident." Kateryna adds the critical third state: confident without evidence. When your RAG returns nothing but your LLM sounds certain, that's the danger zone. That's what we catch.
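The third state falls out of treating confidence and evidence as two independent signals. A minimal sketch of that mapping (the function name and signature here are illustrative, not kateryna's actual API):

```python
def epistemic_state(confident: bool, has_evidence: bool) -> int:
    """Map two independent signals onto the ternary scale: +1, 0, or -1."""
    if not confident:
        return 0              # uncertain: the model hedged, which is allowed
    # Confident responses split on whether retrieval backs them up.
    return 1 if has_evidence else -1

# Confident answer, empty retrieval: the case binary detectors miss.
print(epistemic_state(confident=True, has_evidence=False))  # -1
```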

Built for RAG pipelines

Kateryna validates LLM confidence against retrieval evidence. No RAG, no baseline. With RAG, we catch the lies.

1. User Query: "What is the Blueridge Protocol?"

2. RAG Search: knowledge base returns 0 relevant chunks

3. LLM Response: "The Blueridge Protocol is an innovative framework designed to enhance..."

→ Kateryna: -1 UNGROUNDED (confident without evidence)

Weak RAG + Confident LLM = Hallucination caught

Simple integration

example.py
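The page's example.py isn't reproduced here; as a stand-in, here is a self-contained sketch of where a ternary check slots into a RAG pipeline. All names and thresholds are illustrative assumptions, not kateryna's published API:

```python
def is_confident(response: str) -> bool:
    """Crude stand-in for linguistic analysis: no hedging means confident."""
    hedges = ("i don't know", "i'm not sure", "might", "possibly", "unclear")
    return not any(h in response.lower() for h in hedges)

def ternary_check(response: str, evidence_scores: list[float],
                  min_evidence: float = 0.5) -> int:
    """Return +1 (grounded), 0 (uncertain), or -1 (ungrounded)."""
    if not is_confident(response):
        return 0  # the model admitted uncertainty; nothing to flag
    grounded = bool(evidence_scores) and max(evidence_scores) >= min_evidence
    return 1 if grounded else -1

# Weak RAG (0 relevant chunks) + confident-sounding response: the danger zone.
response = "The Blueridge Protocol is an innovative framework designed to enhance..."
print(ternary_check(response, evidence_scores=[]))  # -1
```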

Real results

We tested an LLM with 7 hallucination-prone queries. It confidently fabricated answers for 5 of them.

| Query | LLM said | Reality |
| --- | --- | --- |
| "What is the Blueridge Protocol?" | "An innovative framework designed to enhance..." | Doesn't exist |
| "Explain pandas.smart_merge()" | "An internal utility within pandas..." | Doesn't exist |
| "Summarize Smith et al. (2023)" | "The paper revealed significant insights..." | Fabricated citation |

With RAG context, Kateryna flagged all 5 as -1 UNGROUNDED. Confident language, zero evidence.

78%
Detection accuracy in testing
Measured across strong-RAG, weak-RAG, and no-RAG scenarios. An honest number, improving with each release.

What's included

Ternary State Detection (Open Source)

The core -1/0/+1 epistemic classification. Detect grounded, uncertain, and ungrounded responses.

LLM Adapters (Open Source)

Built-in support for OpenAI, Anthropic, and Ollama. Drop-in integration with your existing stack.

RAG Confidence Scoring (Open Source)

Calculate retrieval confidence based on chunk relevance and coverage. Works with any vector store.
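One plausible way to blend chunk relevance and coverage into a single retrieval-confidence score. The weights, function name, and default chunk count are illustrative assumptions; kateryna's actual scoring may differ:

```python
def retrieval_confidence(scores: list[float], expected_chunks: int = 5) -> float:
    """Blend best-chunk relevance with coverage of the expected chunk count."""
    if not scores:
        return 0.0                       # nothing retrieved: no evidence at all
    relevance = max(scores)              # strength of the best supporting chunk
    coverage = min(len(scores) / expected_chunks, 1.0)
    return 0.7 * relevance + 0.3 * coverage  # weights are illustrative

print(retrieval_confidence([]))                  # 0.0 (the Blueridge case)
print(retrieval_confidence([0.9, 0.8, 0.7], 5))  # strong evidence, partial coverage
```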

Linguistic Analysis (Open Source)

Detect hedging language, uncertainty markers, and confidence patterns in LLM outputs.
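At its simplest, hedging detection is a scan for uncertainty markers. A toy version with a small, non-exhaustive marker list (kateryna's linguistic analysis is presumably richer):

```python
import re

# Small illustrative marker list; a real analyzer would use a broader lexicon.
HEDGES = re.compile(
    r"\b(i don't know|i'm not sure|possibly|perhaps|might|may be|"
    r"it seems|likely|unclear|cannot confirm)\b",
    re.IGNORECASE,
)

def hedge_count(text: str) -> int:
    """Count uncertainty markers in an LLM response."""
    return len(HEDGES.findall(text))

print(hedge_count("The Blueridge Protocol is an innovative framework."))  # 0
print(hedge_count("I'm not sure; it might be an internal tool."))         # 2
```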

Audit Logging (Pro)

Compliance-grade logging of every epistemic assessment. Tamper-proof storage for regulated industries.

Analytics Dashboard (Pro)

Visualize hallucination rates over time. Track which queries trigger ungrounded responses.

Domain Packs (Pro)

Fine-tuned detection for legal, medical, financial, and trade compliance domains.

Threshold Calibration (Pro)

Custom threshold tuning for your specific accuracy/coverage tradeoffs. Per-client calibration service.
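Conceptually, threshold tuning here is a sweep over a labeled evaluation set, picking the cutoff that best balances precision against recall. A sketch with toy data (not the Pro calibration service's actual method):

```python
# (evidence_score, is_hallucination) pairs from a labeled eval set -- toy data.
labeled = [(0.2, True), (0.3, True), (0.5, False), (0.7, False), (0.9, False)]

def evaluate(threshold: float):
    """Flag responses whose evidence score falls below the threshold."""
    flagged = [h for s, h in labeled if s < threshold]
    caught = sum(flagged)                        # hallucinations correctly flagged
    precision = caught / len(flagged) if flagged else 1.0
    recall = caught / sum(h for _, h in labeled)
    return precision, recall

# Pick the threshold with the best worst-case of precision and recall.
best = max((t / 10 for t in range(1, 10)), key=lambda t: min(evaluate(t)))
print(best, evaluate(best))  # 0.4 (1.0, 1.0)
```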

Contradiction Detection (Pro)

Detect when RAG chunks conflict with each other. Flag uncertainty when sources disagree.

Documentation Audit (Pro)

Scan your corpus before production. Find gaps, contradictions, and ambiguity where LLMs will hallucinate.

Kateryna Pro

Coming Soon

For teams that need compliance, visibility, and domain-specific accuracy.

SOC2-compliant logging
Priority support
Custom integrations
SLA guarantees

Be first to know when Pro launches. No spam.

Need enterprise features now? Get in touch

Built on ternary logic principles from the Setun computer (1958). Named after Kateryna Yushchenko, pioneer of address programming.
