Triplet Neural Networks for the Visual Localization of Mobile Robots
M. Alfaro, J.J. Cabrera, L.M. Jiménez, O. Reinoso, L. Payá
21st International Conference on Informatics in Control, Automation and Robotics  (Porto, Portugal. 18-20 November, 2024)
Ed. ScitePress  ISBN:978-989-758-717-7  ISSN:2184-2809  DOI:10.5220/0000193700003822  - Vol. 2, pp. 125-132

Abstract:


Triplet networks are composed of three identical convolutional neural networks that run in parallel and share their weights. These architectures receive three inputs simultaneously, produce three different outputs, and have demonstrated great potential for visual localization. Therefore, this paper presents an exhaustive study of the main factors that influence the training of a triplet network: the choice of the triplet loss function, the selection of samples to include in the training triplets, and the batch size. To that end, we have adapted and retrained a network with omnidirectional images, which were captured in an indoor environment with a catadioptric camera and converted into a panoramic format. The experiments conducted demonstrate that triplet networks substantially improve performance in the visual localization task. However, the right choice of the studied factors is of great importance to fully exploit the potential of such architectures.
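As a minimal illustration of the training objective the abstract refers to, the sketch below implements the standard triplet margin loss with Euclidean distance in NumPy. This is a generic formulation, not the specific variant studied in the paper; the embeddings and the margin value are hypothetical toy choices.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss: encourages the anchor embedding to lie
    closer to the positive than to the negative by at least `margin`."""
    d_pos = np.linalg.norm(anchor - positive)  # anchor-positive distance
    d_neg = np.linalg.norm(anchor - negative)  # anchor-negative distance
    return max(d_pos - d_neg + margin, 0.0)

# Toy 2-D embeddings: the positive is near the anchor, the negative far away,
# so the margin constraint is already satisfied and the loss is zero.
a = np.array([0.0, 0.0])
p = np.array([0.1, 0.0])
n = np.array([2.0, 0.0])
print(triplet_loss(a, p, n))  # → 0.0
```

In practice the three embeddings would come from the three weight-sharing branches of the network applied to an anchor image, an image of the same place, and an image of a different place; the loss is zero only for "easy" triplets, which is why the triplet selection strategy studied in the paper matters.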