Abstract
We introduce a novel dataset and evaluation approach for long-range depth prediction of small objects that enables consistent comparison across direct time-of-flight (ToF) sensors and learned depth estimation methods. In autonomous driving, accurate depth perception is essential for identifying and locating surrounding elements and determining safe driving paths. Traditional depth metrics focus on distance accuracy but fail to evaluate a key factor at long ranges: distinguishing small, slightly elevated structures from the ground — crucial for anticipating obstacles and making safe driving decisions. At far distances, image-based systems suffer from resolution limitations that tend to oversmooth the ground plane, causing elevated objects to be mistaken for texture patterns on the surface. Conversely, scanning LiDAR systems may return only a single point from an elevated object due to steep incident angles and sparse returns, preventing accurate differentiation from the ground. This hampers a fair comparison of object presence and shape. To address this, we propose a framework that evaluates how well the estimated point clouds preserve semantic content relative to ground-truth data. We leverage graph neural network-based feature extraction to assess structural similarity, enabling a modality-agnostic evaluation of object-level fidelity. Our method also supports analysis of the trade-off between resolution and accuracy, investigating performance across sensor types — such as high-resolution cameras versus LiDAR — and conditions, including day and night scenarios. This enables a more comprehensive understanding of the capabilities and limitations of current depth prediction approaches in real-world settings.
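The abstract does not specify the GNN architecture or similarity measure used. As a rough, hypothetical illustration of the general idea — comparing point clouds through permutation-invariant graph features rather than per-point distances — one could build a k-NN graph over each cloud, aggregate local edge features (in the spirit of EdgeConv, but hand-crafted here instead of learned), and compare the resulting global descriptors. All function names and the aggregation scheme below are placeholders, not the authors' method:

```python
import numpy as np

def knn_indices(points, k=4):
    """Indices of each point's k nearest neighbors (self excluded)."""
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    return np.argsort(d, axis=1)[:, :k]

def graph_descriptor(points, k=4):
    """Permutation-invariant descriptor from k-NN edge offsets.

    EdgeConv-style: per point, max- and mean-pool the relative offsets
    to its neighbors, then mean-pool over all points.
    """
    idx = knn_indices(points, k)
    offsets = points[idx] - points[:, None, :]                 # (N, k, 3)
    local = np.concatenate([offsets.max(1), offsets.mean(1)], axis=-1)  # (N, 6)
    return local.mean(0)                                       # (6,)

def structural_similarity(pc_a, pc_b, k=4):
    """Cosine similarity between graph descriptors of two point clouds.

    Modality-agnostic: both clouds are reduced to the same fixed-size
    descriptor regardless of point count or sensor origin.
    """
    fa, fb = graph_descriptor(pc_a, k), graph_descriptor(pc_b, k)
    return float(fa @ fb / (np.linalg.norm(fa) * np.linalg.norm(fb) + 1e-12))
```

In the paper's setting the hand-crafted pooling would be replaced by learned GNN layers, but the evaluation pattern — embed both the estimated and ground-truth clouds, then score descriptor similarity — stays the same.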
| Original language | English |
|---|---|
| Publication status | Accepted/In press - 5 Nov 2025 |
| MoE publication type | Not Eligible |
| Event | International Conference on 3D Vision 2026, Vancouver, Canada (20 Mar 2026 → 23 Mar 2026) |
Conference
| Conference | International Conference on 3D Vision 2026 |
|---|---|
| Country/Territory | Canada |
| City | Vancouver |
| Period | 20/03/26 → 23/03/26 |
Keywords
- Depth evaluation
- Autonomous driving
- Depth estimation
- Multi-modal sensing