Focal Loss in 3D Object Detection
Peng Yun1
Lei Tai1
Yuan Wang1
Chengju Liu2
Ming Liu1
1The Hong Kong University of Science and Technology
2Tongji University

IEEE Robotics and Automation Letters, 2019
International Conference on Robotics and Automation (ICRA), Montreal, Canada, 2019

The upper two rows show projected 3D object detection results from the detector trained with binary cross entropy. The lower two rows show the corresponding results from the detector trained with the focal loss. Purple and blue bounding boxes are the ground truth and the estimated results, respectively.

3D object detection is still an open problem in autonomous driving scenes. When recognizing and localizing key objects from sparse 3D inputs, autonomous vehicles face a larger continuous search space and a higher foreground-background imbalance than in image-based object detection. In this paper, we aim to solve this foreground-background imbalance in 3D object detection. Inspired by the recent use of focal loss in image-based object detection, we extend this hard-mining improvement of binary cross entropy to point-cloud-based object detection and conduct experiments to show its performance on two different 3D detectors: 3D-FCN and VoxelNet. The evaluation results show gains of up to 11.2 AP from the focal loss across a wide range of hyperparameters for 3D object detection.
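
To make the relation between the two losses concrete, the following is a minimal sketch of the standard binary focal loss, FL(p_t) = -α_t (1 - p_t)^γ log(p_t), next to plain binary cross entropy. It is not the released training code: the function names, the per-sample scalar interface, and the default values of `gamma` and `alpha` are assumptions made for illustration only.

```python
import math


def binary_cross_entropy(p, y, eps=1e-7):
    """BCE for one prediction p in (0, 1) and a label y in {0, 1}."""
    p = min(max(p, eps), 1.0 - eps)      # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p       # probability assigned to the true class
    return -math.log(p_t)


def focal_loss(p, y, gamma=2.0, alpha=None, eps=1e-7):
    """Binary focal loss for one prediction.

    gamma: focusing parameter; gamma = 0 (with alpha=None) gives plain BCE.
    alpha: optional class-balancing weight for the positive class
           (illustrative default: no balancing).
    """
    p = min(max(p, eps), 1.0 - eps)
    p_t = p if y == 1 else 1.0 - p
    # The modulating factor (1 - p_t)^gamma shrinks the loss of
    # well-classified (easy) examples, such as abundant background.
    loss = -((1.0 - p_t) ** gamma) * math.log(p_t)
    if alpha is not None:
        alpha_t = alpha if y == 1 else 1.0 - alpha
        loss *= alpha_t
    return loss
```

With γ = 0 the modulating factor equals 1 and the focal loss coincides with binary cross entropy. For γ > 0, an easy background sample (for example, p = 0.1 with label 0) is down-weighted far more strongly than a hard one (p = 0.9 with label 0), which is the hard-mining behavior referred to above.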




In the paper, in order to control the single variable γ, we first compare the last models, which are trained for the same number of steps. Additionally, we compare the best models to make the conclusion more concrete. The best models are selected according to the mean of the easy, moderate and hard 3D detection APs (3D detection mAP). Because of the page limit, the intermediate results of the best-model search are not included in the paper. They are shown in Table A and Table B on this web page, where each row represents the best model for a specific γ and the bolded numbers are the results in which the focal loss case outperforms the BCE loss case. The results in Table A and Table B also show that the models trained with the focal loss perform better (>1 AP) or comparably (within ±1 AP). Both our code and all the intermediate weights are published online, so researchers can easily replicate our results.
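
The best-model selection rule described above can be sketched as follows. This is only an illustration of the criterion (the mean of the easy, moderate and hard 3D detection APs), not the released evaluation script: the function names, the dictionary layout, and the checkpoint names are hypothetical, and the numbers in the example are placeholders rather than results from the paper.

```python
def detection_map(aps):
    """Mean of the easy, moderate and hard 3D detection APs (3D detection mAP)."""
    return (aps["easy"] + aps["moderate"] + aps["hard"]) / 3.0


def select_best_checkpoint(checkpoints):
    """Return the name of the checkpoint with the highest 3D detection mAP.

    checkpoints: dict mapping a checkpoint name to a dict with the keys
                 "easy", "moderate" and "hard" (AP values for one gamma).
    """
    return max(checkpoints, key=lambda name: detection_map(checkpoints[name]))


if __name__ == "__main__":
    # Placeholder values for illustration only; NOT results from the paper.
    example = {
        "ckpt_a": {"easy": 3.0, "moderate": 2.0, "hard": 1.0},
        "ckpt_b": {"easy": 4.0, "moderate": 3.0, "hard": 2.0},
    }
    print(select_best_checkpoint(example))
```

Applying this selection once per value of γ would yield one best model per row, in the same layout as Table A and Table B.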