In this paper, we discuss and review how combined multiview imagery from satellite to street level can benefit scene analysis. Numerous works exist that merge information from remote sensing and images acquired from the ground for tasks such as object detection, robots guidance, or scene understanding. What makes the combination of overhead and street-level images challenging are the strongly varying viewpoints, the different scales of the images, their illuminations and sensor modality, and time of acquisition. Direct (dense) matching of images on a per-pixel basis is thus often impossible, and one has to resort to alternative strategies that will be discussed in this paper. For such purpose, we review recent works that attempt to combine images taken from the ground and overhead views for purposes like scene registration, reconstruction, or classification. After the theoretical review, we present three recent methods to showcase the interest and potential impact of such fusion on real applications (change detection, image orientation, and tree cataloging), whose logic can then be reused to extend the use of ground-based images in remote sensing and vice versa. Through this review, we advocate that cross fertilization between remote sensing, computer vision, and machine learning is very valuable to make the best of geographic data available from Earth observation sensors and ground imagery. Despite its challenges, we believe that integrating these complementary data sources will lead to major breakthroughs in Big GeoData. It will open new perspectives for this exciting and emerging field.