Uncertainty-Aware Perception for Reinforcement Learning Agents

This work investigates post-hoc calibration as a means of providing reinforcement learning (RL) agents with lightweight uncertainty estimates that can shape decision-making behavior. Calibration techniques were applied to a YOLO11n object detection model. While global calibration showed limited improvement in a 3D grid world, per-class methods reduced the Expected Calibration Error (ECE) without degrading detection performance. The real-world utility of calibration was further evaluated on a crack-segmentation network, where calibration error decreased without a loss in Dice score or IoU. Furthermore, a Proximal Policy Optimization agent was trained in a 3D Godot environment with calibrated confidence scores appended to its observations as uncertainty estimates. This agent outperformed the baseline policy, achieving a success rate of 0.56 compared to 0.33. Notably, the uncertainty-aware agent learned to approach low-confidence obstacles to verify their nature, exhibiting increased exploration behavior. Overall, this work shows that post-hoc calibration offers lightweight uncertainty estimation, enabling uncertainty-aware RL agents to exploit this knowledge for verification strategies.
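As background for the calibration pipeline summarized above, the following is a minimal sketch of post-hoc temperature scaling and of the ECE metric. It is illustrative only: the temperature fit here is a simple NLL grid search over synthetic logits, not the paper's exact procedure, and the function names are assumptions.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax with max-subtraction for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: bin predictions by confidence and sum the weighted
    |accuracy - mean confidence| gap over the bins."""
    ece = 0.0
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            ece += mask.mean() * abs(correct[mask].mean() - confidences[mask].mean())
    return ece

def fit_temperature(logits, labels, temps=np.linspace(0.5, 5.0, 91)):
    """Post-hoc calibration: pick the temperature T that minimizes the
    negative log-likelihood of softmax(logits / T) on a held-out set."""
    best_t, best_nll = 1.0, np.inf
    for t in temps:
        z = logits / t
        z = z - z.max(axis=1, keepdims=True)
        log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
        nll = -log_probs[np.arange(len(labels)), labels].mean()
        if nll < best_nll:
            best_t, best_nll = t, nll
    return best_t
```

A per-class variant, as evaluated in this work, would fit one temperature per predicted class instead of a single global one; the agent's uncertainty input is then the calibrated confidence `softmax(logits / T).max()` of each detection.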