QGI
Active Delivery · Integrations NVIDIA-AI-Blueprints/rag

Documents QAG

NVIDIA's RAG blueprint, remade deterministic

NVIDIA RAG fork: Q-Prime encoding + QAG Engine reasoning. Probabilistic retrieval → deterministic QAG.

The migration

Same pipeline surface. Deterministic core.

Re-spine the core — keep ingestion, API, and tooling. Swap encoding to Q-Prime and reasoning to QAG.

Classical NVIDIA RAG

Retrieve similar chunks → LLM

  1. Chunk corpus. SBERT embeddings.
  2. Top-k cosine similarity.
  3. Stuff chunks into context.
  4. Generate. Hope for the best.
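
A minimal sketch of the retrieval step in the classical flow, to show why contradictions can slip through. The three-dimensional vectors are hand-written stand-ins for sentence embeddings (an assumption for illustration, not real model output), and `cosine` and `top_k` are hypothetical helper names. In this toy data, two chunks that disagree with each other score almost identically and both land in the context window.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec, chunk_vecs, k=2):
    """Return the indices of the k chunks most similar to the query."""
    scored = [(cosine(query_vec, v), i) for i, v in enumerate(chunk_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scored[:k]]

# Toy corpus: the first two chunks contradict each other.
chunks = [
    "Loans above 80% LTV require mortgage insurance.",
    "Loans above 80% LTV do not require mortgage insurance.",
    "Branch opening hours are 9 to 5.",
]
# Hand-made vectors (assumption): near-identical wording gives near-identical vectors.
vectors = [[0.90, 0.10, 0.0], [0.88, 0.12, 0.0], [0.0, 0.10, 0.9]]
query = [1.0, 0.1, 0.0]

# Step 3 of the classical flow: stuff the top-k chunks into the context.
context = [chunks[i] for i in top_k(query, vectors, k=2)]
```

Nothing in the similarity score distinguishes the two opposing statements, so the generator is left to resolve the disagreement on its own.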

Risk

Conflicts and gaps surface after generation — if at all.

Documents QAG

Q-Prime → HSC signals → QAG Engine

  1. Q-Prime encodes polarity, scope, dependencies.
  2. Emits Relevance plus Conflict, Overlap, and Coverage signals.
  3. QAG Engine evaluates against scope and time frame.
  4. Generator receives named structure and responds deterministically.
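
The idea of surfacing signals before generation can be illustrated with a small sketch. This is not the Q-Prime or QAG Engine implementation: the `Statement` fields, the `emit_signals` function, and the rules used to derive each signal are all assumptions made for illustration. Only the signal names Conflict, Overlap, and Coverage come from the text above, and "Overlap" is assumed here to mean several sources agreeing on the same scope.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Statement:
    source: str    # where the statement came from
    scope: str     # what it applies to (hypothetical label)
    polarity: int  # +1 asserts a requirement, -1 negates it

def emit_signals(statements, required_scopes):
    """Derive named signals from structured statements, in a fixed order."""
    by_scope = {}
    for s in statements:
        by_scope.setdefault(s.scope, []).append(s)

    signals = []
    for scope in sorted(by_scope):
        group = by_scope[scope]
        sources = sorted(s.source for s in group)
        if len({s.polarity for s in group}) > 1:
            # Same scope, opposite polarity: name the disagreement.
            signals.append(("Conflict", scope, sources))
        elif len(group) > 1:
            # Same scope, same polarity: sources agree (assumed meaning).
            signals.append(("Overlap", scope, sources))

    # Scopes the question needs but no statement addresses.
    for scope in sorted(set(required_scopes) - set(by_scope)):
        signals.append(("Coverage", scope, []))
    return signals

statements = [
    Statement("policy-A", "ltv>80", +1),
    Statement("policy-B", "ltv>80", -1),
    Statement("policy-C", "self-employed", +1),
]
signals = emit_signals(statements, ["ltv>80", "self-employed", "jumbo"])
# signals == [("Conflict", "ltv>80", ["policy-A", "policy-B"]),
#             ("Coverage", "jumbo", [])]
```

Because the grouping and ordering are fixed, the same inputs always yield the same signal list, which is what makes the result reviewable before any text is generated.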

Outcome

Signals surface before generation — visible to auditors.

What the fork adds

Six capabilities the base blueprint does not ship

Encoding

Q-Prime encoding

Replaces SBERT — polarity, scope, and dependencies survive encoding.

Reasoning

HSC signals

Seven signals replace cosine-similarity top-k retrieval.

Deterministic

Conflict before generation

Policy disagreements emit Conflict signals — not silent averaging.

Audit

Coverage gaps named

Missing overlays flagged as Coverage — route to curation, not hallucination.

Provenance

Replay-grade provenance

Source, version, validity window on every reasoning input.
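
One way to picture a provenance record carrying source, version, and validity window is the sketch below. The `Provenance` class, its field names, and the `inputs_valid_on` helper are hypothetical, chosen for illustration; the actual record format used by the fork is not described here.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class Provenance:
    source: str       # document or system of origin
    version: str      # version of that source
    valid_from: date  # first day the input applies
    valid_to: date    # last day the input applies (inclusive)

    def covers(self, as_of: date) -> bool:
        """True if this input was in force on the given date."""
        return self.valid_from <= as_of <= self.valid_to

def inputs_valid_on(records, as_of):
    """Keep only the reasoning inputs whose validity window covers as_of."""
    return [r for r in records if r.covers(as_of)]

records = [
    Provenance("underwriting-guide", "v3", date(2023, 1, 1), date(2023, 12, 31)),
    Provenance("underwriting-guide", "v4", date(2024, 1, 1), date(2024, 12, 31)),
]

# Replaying a decision made on 1 June 2024 selects only the v4 input.
usable = inputs_valid_on(records, date(2024, 6, 1))
```

Filtering by an explicit as-of date is what lets an auditor replay a past decision against the inputs that were valid at the time rather than today's versions.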

Migration

Drop-in migration

Swap encoding and reasoning; keep NVIDIA ingestion and API surface.

Frequently asked

Questions teams ask before they migrate from NVIDIA RAG.

What is Documents QAG?
Enterprise Factory fork of NVIDIA RAG. Q-Prime encoding + QAG Engine reasoning — deterministic upgrade for regulated retrieval.
How is this different from classical NVIDIA RAG?
Classical: SBERT chunks + cosine top-k. Documents QAG: Q-Prime structure + seven HSC signals before generation.
Full rewrite or migration?
Migration. Swap encoding and reasoning; keep ingestion, API, and observability.
Which industries?
Financial services live (mortgage, credit, AML). Insurance and healthcare on roadmap.
How do I evaluate?
Clone the repo or contact us for a scoped demo against your NVIDIA RAG baseline.

Already running the NVIDIA RAG blueprint?

Migrate encoding and reasoning in place. Keep your ingestion and observability.

Partner with QGI