Naver Labs Europe has unveiled DIVINE, a universal encoder designed to consolidate multiple specialized AI perception encoders into a single model, enabling high-performance autonomous robot operation on compact hardware without large dedicated computing resources. The announcement was made on June 23 by Naver Labs, the research and development arm of South Korean technology company Naver.
Autonomous robots typically depend on multiple separate encoders to process sensor data from cameras and LiDAR into formats that AI models can use for tasks including image understanding, spatial mapping, and human recognition. Each encoder processes the same incoming data independently, creating significant computational redundancy and memory overhead that limit performance on the edge hardware most deployed robots carry.
How DIVINE Works
DIVINE addresses this through a multi-teacher distillation architecture. Rather than training a single large model from scratch, the approach takes multiple specialized “teacher” models – each an expert in a particular perceptual domain such as image analysis, spatial understanding, or human recognition – and distills their combined knowledge into a single “student” model. The resulting encoder handles the full range of visual AI functions without requiring the specialist models to run simultaneously.
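To make the idea concrete, the following is a minimal sketch of a multi-teacher distillation objective, not Naver Labs' actual training code. It assumes, purely for illustration, that the student produces one shared feature vector, that a small linear head per teacher maps that vector into the teacher's output space, and that the loss is a weighted sum of mean squared errors. The names `multi_teacher_loss`, `project`, and the teacher labels `image` and `spatial` are hypothetical.

```python
# Illustrative sketch only: the names, shapes, and loss below are assumptions,
# not details confirmed by the DIVINE announcement.

def mse(a, b):
    """Mean squared error between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def project(features, weights):
    """Apply a linear head (a list of weight rows) to the shared student features."""
    return [sum(w * f for w, f in zip(row, features)) for row in weights]

def multi_teacher_loss(student_features, heads, teacher_outputs, teacher_weights=None):
    """Weighted sum of per-teacher MSE terms between projected student features
    and each teacher's output for the same input."""
    teacher_weights = teacher_weights or {name: 1.0 for name in teacher_outputs}
    total = 0.0
    for name, target in teacher_outputs.items():
        prediction = project(student_features, heads[name])
        total += teacher_weights[name] * mse(prediction, target)
    return total

# One shared student feature vector for a single input (toy values).
student_features = [1.0, 2.0]

# One hypothetical projection head per teacher.
heads = {
    "image": [[1.0, 0.0], [0.0, 1.0]],
    "spatial": [[0.5, 0.5]],
}

# What each frozen teacher produced for the same input (toy values).
teacher_outputs = {
    "image": [1.0, 2.0],
    "spatial": [2.5],
}

loss = multi_teacher_loss(student_features, heads, teacher_outputs)
print(f"distillation loss: {loss:.3f}")
```

In this toy setup the student already matches the image teacher exactly, so the entire loss comes from the spatial teacher. During training, a loss of this kind would be minimized over both the student encoder and the heads; at deployment only the student needs to run, which is where the savings described in this article come from.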
The performance gains from eliminating redundant processing are substantial. Compared with systems using multiple encoders, DIVINE reduces encoder memory consumption by approximately 90% and raises encoding throughput up to 12-fold. Overall robot memory usage falls by roughly 62%, and total system processing speed increases up to fourfold.
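One way to read these figures is to convert each into the share of the original cost that remains. The short calculation below does only that; the 10 GB baseline is a hypothetical number chosen for illustration and is not from the announcement, and the speed figures are reported as upper bounds ("up to").

```python
# Converts the reported relative gains into "what is left" fractions.
# The baseline value further down is hypothetical, not a reported measurement.

def remaining_fraction(reduction_percent):
    """Share of the original resource still used after a percentage reduction."""
    return 1.0 - reduction_percent / 100.0

def time_fraction(speedup):
    """Share of the original processing time needed after an N-fold speedup."""
    return 1.0 / speedup

encoder_memory_left = remaining_fraction(90)  # about one tenth of encoder memory
robot_memory_left = remaining_fraction(62)    # about 38% of overall robot memory
encoder_time_left = time_fraction(12)         # best case: one twelfth of encoding time
system_time_left = time_fraction(4)           # best case: one quarter of total time

# Hypothetical example: a multi-encoder stack that needed 10 GB for its encoders.
baseline_encoder_memory_gb = 10.0
consolidated_gb = baseline_encoder_memory_gb * encoder_memory_left

print(f"encoder memory remaining: {encoder_memory_left:.0%}")
print(f"robot memory remaining:   {robot_memory_left:.0%}")
print(f"encoding time remaining:  {encoder_time_left:.1%}")
print(f"system time remaining:    {system_time_left:.0%}")
print(f"hypothetical 10 GB encoder stack after consolidation: {consolidated_gb:.1f} GB")
```

The gap between the encoder-level and system-level figures is expected: encoders are only one part of a robot's software stack, so shrinking them by about 90% reduces total memory by a smaller share.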
Why Efficiency Matters for Deployed Robots
The efficiency gains are more than a laboratory metric: they address a practical constraint on what autonomous robots can do in the field. Edge compute hardware on compact robots is expensive, power-constrained, and physically limited in size. A robot that needs to run multiple large specialized models simultaneously either requires a large, costly compute module or must compromise on perception capability. DIVINE’s consolidation lets compact platforms run sophisticated multi-domain perception within their own resource envelope, without the expensive hardware that would otherwise be required.
Naver emphasized that rapid perception and response are particularly important in environments where humans and robots coexist – precisely the settings where humanoid and service robots are being deployed at increasing scale. The ability to recognize and react to human presence, obstacles, and environmental changes quickly, on limited compute resources, is a prerequisite for safe human-robot collaboration in warehouses, factories, and public spaces.
Upgradability Through Software
The architecture is also designed for forward compatibility. Rather than requiring new robot hardware when AI models improve, operators can introduce new perceptual capabilities through software updates to the distillation process – replacing or adding teacher models without changing the physical robot. This reduces the long-term cost of keeping deployed fleets current with AI improvements, a significant consideration for enterprise customers managing large robot fleets over multi-year deployment cycles.