Internal Document Q&A Engine
Internal Document Question-Answering System
Private RAG: finds the right passage across 50,000+ pages of SOPs, manuals, and contracts, with responses in under 2 seconds.
Problem
Legal and customer-support teams searched 50,000+ pages of internal documents scattered across NAS folders. Each lookup took 12 minutes on average, and staff frequently made mistakes by reading outdated versions. Support SLAs suffered as a result.
Architecture
PDF/DOCX → unstructured parser → 200-token overlapping chunks → bge-m3 embeddings → Qdrant → BGE Reranker (top-20 → top-3) → GPT-4o mini with a cite-source system prompt. The UI is built on Next.js 15, with streaming responses and deep links to the source passages.
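The two-stage retrieval step (vector search for 20 candidates, then a reranker that keeps 3) can be sketched roughly as follows. This is a minimal illustration in plain Python, not the production code: `retrieve_then_rerank`, `cosine`, and `word_overlap` are hypothetical names, the in-memory list stands in for Qdrant, and the word-overlap scorer stands in for the BGE Reranker.

```python
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def word_overlap(query: str, passage: str) -> float:
    """Toy stand-in for a cross-encoder: share of query words found in the passage."""
    q = set(query.lower().split())
    p = set(passage.lower().split())
    return len(q & p) / len(q) if q else 0.0


def retrieve_then_rerank(
    query: str,
    query_vec: Vector,
    index: List[Tuple[str, Vector]],            # (passage text, embedding) pairs
    rerank_score: Callable[[str, str], float],  # assumed scorer: (query, passage) -> score
    recall_k: int = 20,
    final_k: int = 3,
) -> List[str]:
    # Stage 1: cheap vector similarity keeps the recall_k nearest passages.
    candidates = sorted(
        index, key=lambda item: cosine(query_vec, item[1]), reverse=True
    )[:recall_k]
    # Stage 2: the costlier reranker scores each (query, passage) pair directly.
    reranked = sorted(
        candidates, key=lambda item: rerank_score(query, item[0]), reverse=True
    )
    return [text for text, _ in reranked[:final_k]]
```

The point of the split is cost: the first stage only compares precomputed vectors, so it can scan the whole index, while the second stage reads query and passage together and is only affordable on a short candidate list.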
Stack & rationale
- bge-m3 (multilingual): better Vietnamese legal-domain recall than ada-002.
- Self-hosted Qdrant: PII control, no data leaves premises.
- BGE Reranker: precision 0.71 → 0.89 on a 200-question eval set.
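For readers who want to reproduce a precision figure like the one above, here is one possible way to score an eval set. The source does not say which precision variant was measured, so this sketch assumes precision@k over the final passages; `precision_at_k`, `mean_precision_at_k`, and the ID-based data layout are illustrative, not the project's actual harness.

```python
from typing import Dict, List, Set


def precision_at_k(retrieved: List[str], relevant: Set[str], k: int = 3) -> float:
    """Fraction of the top-k retrieved passage IDs that are labelled relevant."""
    top_k = retrieved[:k]
    if not top_k:
        return 0.0
    # Divide by the number of passages actually returned, which may be below k.
    return sum(1 for pid in top_k if pid in relevant) / len(top_k)


def mean_precision_at_k(
    runs: Dict[str, List[str]],   # question ID -> ranked passage IDs from the system
    gold: Dict[str, Set[str]],    # question ID -> passage IDs judged relevant
    k: int = 3,
) -> float:
    """Average precision@k over every question in the eval set."""
    scores = [precision_at_k(runs[q], gold.get(q, set()), k) for q in runs]
    return sum(scores) / len(scores) if scores else 0.0
```

Running the same function over the pipeline output with and without the reranker, on the same labelled questions, gives a before/after pair that can be compared directly.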
Results
| Metric | Before | After |
|---|---|---|
| Avg lookup time | 12 min | 40 sec |
| Right-version answer rate | 64% | 96% |
| Support tickets about errors | — | −43% |
Lessons
A reranker matters more than a "bigger" embedding model. Heading-based chunking beats fixed-size for legal docs.
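To make the chunking lesson concrete, the sketch below contrasts the two strategies. It is illustrative only: the heading pattern (`Article`, `Section`, or `Chapter` followed by a number) and the 50-token overlap are assumptions, since the source specifies neither, and whitespace-split words would stand in for real tokenizer output.

```python
import re
from typing import List

# Assumed heading pattern; real legal documents will need their own rules.
HEADING = re.compile(r"^(Article|Section|Chapter)\s+\d+", re.IGNORECASE)


def fixed_size_chunks(tokens: List[str], size: int = 200, overlap: int = 50) -> List[List[str]]:
    """Sliding window of `size` tokens; consecutive windows share `overlap` tokens."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be in [0, size)")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # the last window already reaches the end of the document
    return chunks


def heading_chunks(text: str) -> List[str]:
    """Start a new chunk at every line that looks like a heading."""
    chunks, current = [], []
    for line in text.splitlines():
        if HEADING.match(line.strip()) and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return [c for c in chunks if c]
```

A fixed window can cut a clause in half and separate it from the article number it belongs to, whereas splitting on headings keeps each clause together with its title, which is one plausible reason it worked better on legal text.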