$8 Edge AI: running YOLOv8 on a 0.5-TOPS Rockchip RV1106 NPU

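Quantization-aware training (QAT) matters more than post-training quantization (PTQ). INT8 RKNN with QAT cuts model size roughly 3× (6.1 MB to 2.1 MB) while losing less than 2 points of accuracy (0.882 to 0.864). An offline-first design saves the day when 4G drops.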

Author: Nhật Anh · Published: Apr 22, 2026 · 2 min read
Tags: AIoT, Edge AI, YOLOv8, Rockchip, Embedded

Why RV1106

This started with a farm-monitoring station deployed 80 km from town, where the 4G link drops 3-4 times a day. A cloud-based pipeline goes blind during those outages, but the owners need pest detections within minutes, so inference has to run on the device.

The Rockchip RV1106 is an $8 chip with a 0.5-TOPS NPU and 256 MB of RAM. That is enough for YOLOv8-nano (YOLOv8n) if the model is compressed properly.
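
To make "compressed" concrete, here is a minimal sketch of symmetric per-tensor INT8 quantization in plain Python. It is an illustration of the general idea, not the scheme rknn-toolkit2 uses internally. The weight values and the function names `quantize_int8` and `dequantize` are made up for the example.

```python
# Illustrative only: symmetric per-tensor INT8 quantization of a small weight list.
# Real toolchains also calibrate activations and often quantize per channel.

def quantize_int8(weights):
    """Map floats to integer codes in [-127, 127] using one shared scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float values from the INT8 codes."""
    return [c * scale for c in codes]


weights = [0.42, -1.27, 0.003, 0.88, -0.5]  # made-up example values
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print("INT8 codes:", codes)
print(f"scale = {scale:.4f}, max round-trip error = {max_error:.4f}")
print("bytes per weight: FP32 = 4, INT8 = 1")
```

Each weight shrinks from 4 bytes to 1, and the rounding error per weight is bounded by half the scale. Small weights such as 0.003 collapse to zero, which is one reason naive quantization can cost accuracy.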

Optimal pipeline

1. Train YOLOv8n on an 18k-image dataset covering 7 pest classes (FP32, on GPU).
2. Run quantization-aware training (QAT) for the last 50 epochs, using real images from the station cameras as calibration data.
3. Export to ONNX, then convert to RKNN INT8 with rknn-toolkit2.
4. Test on real hardware, not just the simulator.
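
The export and conversion steps could look roughly like the sketch below, which uses the Ultralytics exporter and the rknn-toolkit2 Python API. The file names, the 640-pixel input size, the opset, and the 0..1 input normalization are all assumptions, not settings taken from this project. The post also does not say how the ranges learned during QAT are carried into the toolkit, so this shows only the generic calibration-based INT8 path.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def write_calibration_list(image_dir, list_path):
    """Write one image path per line: the text list rknn-toolkit2 reads for calibration."""
    images = sorted(p for p in Path(image_dir).iterdir()
                    if p.suffix.lower() in IMAGE_SUFFIXES)
    Path(list_path).write_text("\n".join(str(p) for p in images) + "\n")
    return len(images)


def export_onnx(weights_path, imgsz=640):
    """Export trained YOLOv8 weights to ONNX; returns the exported file path."""
    from ultralytics import YOLO  # lazy import: only needed on the training machine
    return YOLO(weights_path).export(format="onnx", imgsz=imgsz, opset=12)


def convert_to_rknn(onnx_path, calibration_list, rknn_path):
    """Convert an ONNX model to an INT8 RKNN model targeting the RV1106."""
    from rknn.api import RKNN  # lazy import: rknn-toolkit2 runs on the host PC
    rknn = RKNN()
    try:
        # Assumed preprocessing: RGB input scaled from 0..255 down to 0..1.
        rknn.config(mean_values=[[0, 0, 0]],
                    std_values=[[255, 255, 255]],
                    target_platform="rv1106")
        if rknn.load_onnx(model=onnx_path) != 0:
            raise RuntimeError("load_onnx failed")
        if rknn.build(do_quantization=True, dataset=calibration_list) != 0:
            raise RuntimeError("build failed")
        if rknn.export_rknn(rknn_path) != 0:
            raise RuntimeError("export_rknn failed")
    finally:
        rknn.release()


# Hypothetical usage (paths are placeholders):
#   write_calibration_list("station_frames", "dataset.txt")
#   onnx_path = export_onnx("best.pt")
#   convert_to_rknn(onnx_path, "dataset.txt", "pest_yolov8n.rknn")
```

Building the calibration list from station camera frames rather than training images keeps the quantization ranges close to what the deployed model will actually see.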

Compression results

Model	Size	Accuracy (mAP@0.5)	FPS
FP32 PT	6.1 MB	0.882	1.8
FP16 ONNX	3.2 MB	0.879	4.1
INT8 PTQ	1.8 MB	0.812	12
INT8 QAT	2.1 MB	0.864	12
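
To read the trade-offs off the table, the short script below computes size reduction, accuracy drop, and speedup against the FP32 baseline. The numbers are copied from the rows above. The dictionary layout and the `compare` helper are just one convenient way to arrange them.

```python
# Values copied from the compression results table.
results = {
    "FP32 PT":   {"size_mb": 6.1, "acc": 0.882, "fps": 1.8},
    "FP16 ONNX": {"size_mb": 3.2, "acc": 0.879, "fps": 4.1},
    "INT8 PTQ":  {"size_mb": 1.8, "acc": 0.812, "fps": 12},
    "INT8 QAT":  {"size_mb": 2.1, "acc": 0.864, "fps": 12},
}

baseline = results["FP32 PT"]


def compare(name):
    """Return size ratio, absolute accuracy drop, and speedup versus FP32."""
    row = results[name]
    return {
        "size_ratio": baseline["size_mb"] / row["size_mb"],
        "acc_drop": baseline["acc"] - row["acc"],
        "speedup": row["fps"] / baseline["fps"],
    }


for name in results:
    c = compare(name)
    print(f"{name:10s} size {c['size_ratio']:.2f}x smaller, "
          f"accuracy -{c['acc_drop']:.3f}, {c['speedup']:.1f}x faster")
```

Both INT8 variants run at the same 12 FPS, about 6.7× the FP32 baseline. PTQ gives up 0.070 of accuracy, while QAT gives up only 0.018 at the cost of an extra 0.3 MB, which is the basis for preferring QAT here.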

