Customer Complaint Classifier
Phân loại Khiếu nại Khách hàng
Auto-routes 8,000 monthly tickets into 18 business branches and on to the correct departments, with a macro F1 of 0.91 on a Vietnamese validation set.
Problem
A support center received 8,000 tickets per month, each manually triaged into one of 18 business branches. Misrouting caused SLA breaches and bounced customers between teams. The 4-person triage team couldn't keep up at peak hours.
Architecture
Ticketing webhook → FastAPI inference → fine-tuned PhoBERT-base → confidence gate → rule fallback (regex for rare labels) → assignee push-back. Mislabels surface to a review queue rather than silently routing.
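The gate-and-fallback step can be sketched roughly as below. This is a minimal illustration, not the production code: it assumes the model returns a probability per label, that the regex fallback is tried only when the top probability is below the threshold, and that anything still unresolved goes to the review queue. The `route` function, the `Decision` type, the label names, and the regex patterns are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Optional

CONFIDENCE_THRESHOLD = 0.78  # threshold quoted in the Stack section

# Hypothetical rules for rare labels; patterns and label names are illustrative only.
RARE_LABEL_RULES = [
    (re.compile(r"refund", re.IGNORECASE), "refunds"),
    (re.compile(r"warranty", re.IGNORECASE), "warranty"),
]


@dataclass
class Decision:
    label: Optional[str]  # None when the ticket needs a human
    source: str           # "model", "rule", or "review"


def route(text: str, probs: dict) -> Decision:
    """Pick a destination from per-label probabilities, with a regex fallback."""
    label, confidence = max(probs.items(), key=lambda kv: kv[1])
    if confidence >= CONFIDENCE_THRESHOLD:
        return Decision(label, "model")
    # Low confidence: try the rule fallback before giving up.
    for pattern, rule_label in RARE_LABEL_RULES:
        if pattern.search(text):
            return Decision(rule_label, "rule")
    # Nothing matched: send to the review queue instead of routing silently.
    return Decision(None, "review")


if __name__ == "__main__":
    print(route("Please refund my order", {"billing": 0.51, "refunds": 0.49}))
    print(route("Card charged twice", {"billing": 0.93, "refunds": 0.07}))
```

Keeping the decision's `source` alongside the label makes it possible to audit later how many tickets were routed by the model, by a rule, or by a person.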
Stack & rationale
- PhoBERT-base (Vietnamese RoBERTa): outperforms mBERT/XLM-R on pure-Vietnamese tickets.
- Dataset of 24k anonymized tickets, augmented via EN↔VI back-translation.
- Confidence threshold 0.78: predictions below it go to human review (preserves recall).
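One way to reason about a threshold such as 0.78 is to sweep candidate values over validation predictions and compare how many tickets would be auto-routed against how accurate those auto-routed tickets are. The sketch below assumes each validation prediction is reduced to a `(confidence, is_correct)` pair; the `gate_stats` helper and the toy numbers are invented for illustration and are not the project's data.

```python
def gate_stats(preds, threshold):
    """Return (coverage, accuracy) for predictions at or above the threshold.

    preds: list of (confidence, is_correct) pairs from a validation set.
    coverage: share of tickets that would be auto-routed.
    accuracy: share of auto-routed tickets that were labeled correctly.
    """
    auto = [ok for conf, ok in preds if conf >= threshold]
    coverage = len(auto) / len(preds)
    accuracy = sum(auto) / len(auto) if auto else 0.0
    return coverage, accuracy


if __name__ == "__main__":
    # Toy values, made up for this example.
    validation = [
        (0.97, True), (0.91, True), (0.85, True), (0.80, False),
        (0.74, True), (0.66, False), (0.59, False), (0.52, True),
    ]
    for t in (0.50, 0.78, 0.90):
        coverage, accuracy = gate_stats(validation, t)
        print(f"threshold={t:.2f} coverage={coverage:.2f} accuracy={accuracy:.2f}")
```

Raising the threshold lowers coverage and sends more tickets to people, so the chosen value is a trade-off between triage workload and misrouting risk.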
Results
| Metric | Before | After |
|---|---|---|
| Macro F1 (18 classes) | — | 0.91 |
| Auto-route hit rate | 0% | 87% |
| Triage headcount | 4 | 1.5 |
| Avg time to assign | 22 min | 2 min |
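For reference, macro F1 is the unweighted mean of the per-class F1 scores, so each of the 18 classes counts equally regardless of how many tickets it has. The sketch below is a plain-Python illustration of that definition on toy labels; the `macro_f1` helper is written for this example and is not the project's evaluation code.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN)
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy labels, invented for illustration.
    y_true = ["billing", "billing", "shipping", "shipping", "warranty"]
    y_pred = ["billing", "shipping", "shipping", "shipping", "warranty"]
    print(round(macro_f1(y_true, y_pred), 4))
```

Because rare classes weigh as much as common ones, a macro score is a stricter summary than accuracy when ticket volume is uneven across branches.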
Lessons
For closed-set classification, domain fine-tuning beats LLM zero-shot. A mislabel-review queue turns routing errors into corrected examples, which lets the model improve continuously.