Abstract: Cross-modal retrieval (CMR) enables flexible retrieval across different modalities (e.g., text versus images), allowing users to benefit fully from the abundance of multimedia data.
Abstract: To achieve more accurate perception, LiDAR and cameras are increasingly used together to improve 3D object detection. However, it remains a non-trivial task to build an ...