PHOTOGRAMMETRIC APPLICATIONS OF IMMERSIVE VIDEO CAMERAS

The paper investigates immersive videography and its application in close-range photogrammetry. Immersive video involves the capture of a live-action scene that presents a 360° field of view. It is recorded simultaneously by multiple cameras or microlenses, where the principal point of each camera is offset from the rotating axis of the device. This offset causes problems when stitching together individual video frames from particular cameras; however, there are ways to overcome it, and applying immersive cameras in photogrammetry provides new potential. The paper presents two applications of immersive video in photogrammetry. First, the creation of a low-cost mobile mapping system based on Ladybug®3 and a GPS device is discussed. The number of panoramas is much too high for photogrammetric purposes, as the baseline between consecutive spherical panoramas is around 1 metre. More than 92 000 panoramas were recorded in the Polish district of Czarny Dunajec, and measurements from the panoramas enable the user to measure the area of outdoor advertising structures and billboards. A new law is being created in order to limit the number of illegal advertising structures in the Polish landscape, and immersive video recorded in a short period of time is a candidate for economical and flexible off-site measurements. The second approach is the generation of 3D video-based reconstructions of heritage sites based on immersive video (structure from immersive video). A mobile camera mounted on a tripod dolly was used to record the interior scene, and the immersive video, separated into thousands of still panoramas, was converted into 3D objects using Agisoft Photoscan Professional. The findings from these experiments demonstrate that immersive photogrammetry seems to be a flexible and prompt method of 3D modelling and provides promising features for mobile mapping systems.


INTRODUCTION
The article explores the potential of immersive videography in photogrammetry. Immersive video involves the capture of a live-action scene recorded using video and covering a 360° field of view. All videos must be taken simultaneously, each recording a separate fragment of the scene. Individual video recordings have to be stitched in order to create one 360° video. Immersive video is still being explored by multiple disciplines such as the film industry (Shaw & Weibel, 2003), new media art (Griffiths, 2008; Haslem, 2009; Seo & Gromala, 2007), cultural heritage (Kenderdine, 2007; Kwiatek, 2011), education (Ozkeskin & Tunc, 2010) or computer vision (Benosman & Kang, 2001; Sarmiento & Quintero, 2009), but there is still little research on the use of immersive video in photogrammetry. Spherical photogrammetry, introduced by Fangi (2007), refers mainly to individual high-resolution spherical images. Schneider and Maas (2003; 2005) and Parian and Gruen (2004) address panoramic photogrammetry; their focus was on 3D reconstructions from multiple panoramas and on the calibration of panoramic cameras (digital rotating line cameras), but their approaches were explored only on a few high-resolution panoramas.
The motivation for this paper is to outline the potential of immersive systems in close-range photogrammetry, as immersive video is a prompt and economical all-around-view technique for recording the interior of cultural heritage sites at a particular moment, and the recordings could be used when needed in the future. Additionally, immersive videography has not been considered a scientific tool so far, mainly due to the low resolution of individual frames and the high number of individual images. The paper presents two applications of immersive video in photogrammetry. The first one focuses on experiments in the creation of a low-cost mobile mapping system and its application in Poland, whereas the second part introduces an approach to 3D reconstruction. The aim of this article is to discuss the potential new applications of immersive videography in photogrammetry.

Immersive video
Immersive video is created from multiple cameras arranged together, with each camera looking in a specific direction. Immersive video cameras, mounted together by the producer (Ladybug cameras) or by the user (GoPro cameras) (Figure 1), create approximately 10-30 frames (panoramas) per second. 360° video is not widely used in photogrammetry due to its redundantly high frame rate. However, this paper presents concepts based on a mobile immersive camera where this high number of images is necessary for 3D modelling.
The first experiments with immersive projects started in 1900, just 5 years after the birth of cinema. Raoul Grimoin-Sanson projected 10 movies on a 360° cylindrical screen during the Paris Exhibition, presenting a hot air balloon ride (Yelin, 2000). These experiments were further developed by Disney in 1955 (Circle Vision 360), where immersive video was applied to theme parks. The seam between individual videos was often a problem and limited the experience of immersion (Michaux, 1999). Digital photography and videography allowed immersive film creators to avoid visible seams by creating overlapping fields of view and by using image stitching techniques. Although there are single-camera devices that use fish-eye lenses or mirror lenses to capture immersive video, the resolution of their output 360° video is low and they are not considered in this paper. The focus in this paper is on systems that use multiple cameras/microlenses and create medium- or high-resolution immersive video.

Immersive cameras
An immersive camera is a device or a set of cameras that enables capturing a 360° field of view simultaneously in video format.
Figure 2 illustrates a possible arrangement of multiple cameras in immersive recording devices. The vertical field of view of such devices depends on the cameras used, whereas the horizontal field of view in immersive video is 360° (Jacobs, 2004). Pintaric (2000) noted the advantages of immersive video in relation to traditional video. Immersive video overcomes the passive limitations of how video is perceived and presented, and also provides each viewer with individual control of the viewing direction. This control over the video occurs in a panoramic video player (Lucid Player, KrPano or Oculus Player). Sample applications for immersive videography include immersive films for planetariums (Yu, 2005), 360° screens (Kwiatek & Woolner, 2010; Piccolin, 2006) or interactive presentations (Larson, 2009). In order to create seamless immersive films, individual video outputs are separated into frames, the immersive rig is calibrated and then particular images are stitched.

Calibration
Calibration is the process of determining the interior orientation of cameras; however, this process is not straightforward in the case of offset cameras. Calibration of immersive cameras is a process that allows correct stitching of videos. As indicated in Figure 2, each camera's principal point is offset and there is a need to create one virtual cylinder (for a panoramic device) or one virtual sphere (for a spherical device). This paper proposes obtaining the interior orientation by stitching images in Hugin and PTGui, two stitching software packages. The software calculates a single field of view for all images, creating one virtual spherical camera. The results were compared between the programs, and the results of calibration in Hugin and PTGui are presented in Table 1.
The lens distortion parameters (a, b and c) refer to a third-degree polynomial expressing radial lens distortion. The lens shift parameters d (horizontal shift) and e (vertical shift) compensate for the offset of the optical axis, which does not fall on the image centre (Panotools NG, 2013). They are calculated on the basis of equations (1). The values calculated by both programs are of almost the same range and magnitude, which means that the virtual sphere could be constructed using images with these parameters. A virtual sphere is a single sphere onto which all images are projected (Figure 3); it can then be presented in equirectangular form (Figure 4).
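To illustrate the role of these parameters, the following minimal sketch implements the radial correction model documented for PanoTools-based stitchers such as Hugin and PTGui. It is not a reconstruction of equation (1) of this paper. The function name, the pixel-centre convention and the sign of the shift are assumptions made for illustration, and no values from Table 1 are used. To avoid a clash with the shift parameters d and e, the constant term of the polynomial is called `k` here.

```python
import math

def source_pixel(x_dst, y_dst, width, height, a, b, c, shift_x=0.0, shift_y=0.0):
    """Map a pixel of the corrected image to its position in the distorted source image.

    a, b, c          : radial distortion coefficients (PanoTools convention)
    shift_x, shift_y : lens shift in pixels (the paper's d and e); sign is an assumption
    """
    # Optical centre: image centre moved by the lens shift.
    cx = (width - 1) / 2.0 + shift_x
    cy = (height - 1) / 2.0 + shift_y
    # Radii are normalised so that r = 1 at half of the smaller image side.
    norm = min(width, height) / 2.0
    dx = (x_dst - cx) / norm
    dy = (y_dst - cy) / norm
    r_dst = math.hypot(dx, dy)
    # Constant term chosen so that r = 1 maps onto itself (image scale is preserved).
    k = 1.0 - (a + b + c)
    # r_src = (a*r^3 + b*r^2 + c*r + k) * r, applied here as a scale factor on (dx, dy).
    factor = a * r_dst**3 + b * r_dst**2 + c * r_dst + k
    return cx + dx * factor * norm, cy + dy * factor * norm
```

With a = b = c = 0 the mapping reduces to the identity, which is a convenient sanity check before applying coefficients exported from a stitching project.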
The comparison was made on one scene presented in Figure 4, which is a real-life scene, not a calibration test. In order to achieve more certainty, these calibrations should be applied to all frames in the recording; however, this was not performed in this experiment, as the mobile mapping of one recording consists of around 30,000 × 6 images to stitch. Each Ladybug camera is calibrated individually by the producer. A correct understanding of the calibration details and the right positioning of coordinate systems are key for measurements from spherical video. The 3D calibration enables computing an optimal parallax-free transform for a particular scene, but it is necessary to provide a distance. This means that there will be almost no parallax at a particular distance from the camera, and the transform will also be applied to non-overlapping areas. Software provided by Point Grey creates a 3D polygon mesh (Figure 3), which is based on the calibration data. The calibration data specify how to rectify, rotate and translate the individual images that form one panorama. Ladybug stitching is based on geometry, and there are some problems, because the cameras do not all share the same viewpoint (they are slightly offset). The other issue is that the distance to objects is unknown, and it must be assumed that all points in the scene are located at the same distance from the camera. It means that the sphere surrounding the camera has one particular radius. For the purpose of further calculations it is assumed that this sphere is a virtual spherical camera with a focal length of 13.2 mm for Ladybug®3.
Once the calibration details are known, they are used as a template for stitching other images extracted from immersive video.
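The relation between the virtual sphere and its equirectangular presentation can be sketched as a pair of conversions between pixel coordinates and unit directions on the sphere. This is a minimal sketch and not the mapping used by any particular software: the axis convention (x to the right, y forward, z up) and the function names are assumptions made for illustration.

```python
import math

def pixel_to_ray(u, v, width, height):
    """Unit direction on the virtual sphere for an equirectangular pixel (u, v)."""
    lon = (u / width) * 2.0 * math.pi - math.pi   # -pi .. +pi, 0 at the image centre
    lat = math.pi / 2.0 - (v / height) * math.pi  # +pi/2 at the top, -pi/2 at the bottom
    return (math.cos(lat) * math.sin(lon),        # x: right
            math.cos(lat) * math.cos(lon),        # y: forward
            math.sin(lat))                        # z: up

def ray_to_pixel(x, y, z, width, height):
    """Equirectangular pixel for a direction (x, y, z); the vector need not be normalised."""
    length = math.sqrt(x * x + y * y + z * z)
    lon = math.atan2(x, y)
    lat = math.asin(z / length)
    u = (lon + math.pi) / (2.0 * math.pi) * width
    v = (math.pi / 2.0 - lat) / math.pi * height
    return u, v
```

In this convention the centre of a 3500x1700 panorama, `pixel_to_ray(1750, 850, 3500, 1700)`, points straight ahead along the y axis.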

Stitching immersive video
Stitching of panoramic photographs taken with a standard camera has been known for more than a decade, and a number of freeware and commercial applications have been developed, such as PTGui, Hugin and Autopano. These applications use state-of-the-art stitching methods (Szeliski, 2005). Stitching of digital images has been widely researched in the context of blending (Levin et al., 2004) or seam processing (Summa, Tierny & Pascucci, 2012).
Recent research by Xu (2012) introduced stitching of videos, and particularly stitching of immersive video. When all cameras record a video, there are a few issues which need to be resolved in order to achieve a seamless immersive product. A no-parallax point is difficult to achieve for individual lenses; the optimal point is the centre of rotation. Stitching software such as PTGui or Hugin can usually cope well enough to avoid serious stitching errors. Additionally, the recording of immersive video using multiple devices (e.g. GoPro cameras) needs to be started at the same time and, if the cameras are not all synchronised at the beginning of the recording, this issue is most often corrected via audio synchronisation. In the case of multiple devices recording immersive video, individual frames need to be extracted, and frames presenting the same moment in time are stitched using stitching software. The parameters taken from calibration are imported into software for stitching video (e.g. VideoStitch). Panoramic and spherical videography is built from frames in a similar way to every film. Typically 25 frames per second are used by television (PAL system). Silent cinema used 16 frames per second. The human vision system can perceive up to 10-12 images per second individually. As more frames are displayed, they are perceived as a film. Immersive video cameras create medium-resolution panoramas, but there are approximately 10-30 frames (panoramas) per second, which is too many for photogrammetry, as this science prefers fewer images at the highest possible resolution.
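The audio synchronisation mentioned above can be sketched as a search for the time lag that maximises the cross-correlation of two audio tracks. This is a generic illustration and not the procedure of any particular stitching package. The function names are hypothetical, the tracks are assumed to be plain lists of samples at the same sampling rate, and the brute-force search is only practical for short excerpts.

```python
def estimate_offset(reference, other, max_lag):
    """Return the lag (in samples) by which `other` is delayed relative to `reference`."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # Cross-correlation of the two tracks at this lag.
        score = 0.0
        for i, r in enumerate(reference):
            j = i + lag
            if 0 <= j < len(other):
                score += r * other[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def lag_to_frames(lag_samples, sample_rate, fps):
    """Convert an audio lag into a whole number of video frames."""
    return round(lag_samples * fps / sample_rate)
```

Once the lag is known, the corresponding number of frames is skipped in the delayed recording so that frames with the same index present the same moment in time.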
Figure 5 illustrates the approach to applying measurements from immersive video and includes the two approaches presented in this paper: a mobile mapping system based on immersive video (section 2) and 3D reconstruction (section 3).
Figure 5. The process of applying photogrammetric measurements in immersive video.

IMMERSIVE VIDEOGRAPHY IN MOBILE MAPPING
Advertising chaos is becoming more and more visible in Poland, as there are no proper legal regulations that could limit this phenomenon. Outdoor advertising structures (referred to here as outdoors), billboards and banners are located close to roads, especially those leading to tourist destinations. The road presented in Figure 6 is close to the Tatra Mountains, and a number of illegal advertising structures promote various hotels, shops or products and hide the mountain views. The area analysed in this paper (Czarny Dunajec district, gmina Czarny Dunajec) is very close to Zakopane district (gmina Zakopane), which has a number of problems with outdoors. The described district has fewer billboards, and it was chosen to investigate the proposed approaches.
Figure 6. Advertising chaos close to the Tatra Mountains in the south of Poland.

Mobile mapping
Mobile mapping is the method of collecting geospatial data using a mobile vehicle. Various sensors are used in mobile mapping systems: radar, terrestrial laser scanners, LiDAR or photographic sensors. Thanks to their integration with navigation sensors, these systems produce georeferenced images and video. A number of mobile mapping companies are using photographic sensors, and the application of immersive video is increasing. Iwane Lab (2009) and Horus View and Explore B.V. (2014) have employed immersive video and enable users of their software to perform measurements on panoramic videography. Horus View and Explore has developed software for managing 360° video, integrating it with various maps and performing measurements. In order to start working with Horus Movie Player, calibration details from stitching software (Hugin) have to be imported to create a setup file. The concept of measurement from panoramas, and here from immersive video, is explored in this part of the paper. Every frame has coordinates from GPS. Figure 7 illustrates a short fragment of immersive video, where one second includes 16 frames in the case of a recording performed with Ladybug®3.
Figure 7. A fragment of immersive video divided into 360° frames (16 frames correspond to 1 second on the time axis).
At a speed of 60 km/h (16.67 m/s), the vehicle equipped with an immersive camera travels approx. 1 metre between consecutive frames. This means that during one second of driving 16 spherical panoramas are recorded. This baseline is very short for performing calculations, so, to limit the processing time of immersive videography, only the first frame in every second (marked bold in Figure 7) was chosen for further calculations. Figure 8 illustrates the concept of choosing only selected panoramic frames.
Figure 8. Separated frames from immersive videography (one frame per second on the time axis).
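The relation between driving speed, frame rate and baseline, together with the selection of the first frame in every second, can be sketched as follows. This is a minimal sketch; the function names are hypothetical and a constant speed is assumed.

```python
def baseline_m(speed_kmh, fps, step=1):
    """Distance in metres travelled between two frames that are `step` frames apart."""
    speed_ms = speed_kmh / 3.6
    return speed_ms / fps * step

def first_frame_of_each_second(frame_count, fps):
    """Indices of the first frame in every second of the recording."""
    return list(range(0, frame_count, fps))
```

For a 16 fps recording at 60 km/h, `baseline_m(60, 16)` gives roughly 1.04 m between consecutive frames, while `baseline_m(60, 16, step=16)` gives roughly 16.7 m between the frames selected once per second.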
Performing measurements on immersive videography starts with selecting two points on the panorama, after which the distance between them is displayed. Finding the corresponding points on three panoramas allows the user to calculate the area of a billboard in a mobile mapping system, which is presented as a rectangle in Figure 9. Figure 10 illustrates the measurement of two banners on a fence. The area of each banner is displayed in the centre of the measured rectangle.
Figure 9. Multi-click approach. Finding the same corners of banners or billboards on three panoramas.
Figure 10. The calculation of the area of two banners located on the fence.
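The geometry behind such a measurement can be sketched with textbook formulas. The paper does not state the algorithm used by the measuring software, so the following is a generic illustration only: a corner is located as the midpoint of the shortest segment between two viewing rays (the multi-click approach uses three panoramas, which adds redundancy), and the area is computed from the reconstructed corners. Panorama positions are assumed to be given in a local metric coordinate system, and all names are hypothetical.

```python
import math

def _sub(p, q):
    return (p[0] - q[0], p[1] - q[1], p[2] - q[2])

def _dot(p, q):
    return p[0] * q[0] + p[1] * q[1] + p[2] * q[2]

def _cross(p, q):
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])

def triangulate_midpoint(c1, d1, c2, d2):
    """Midpoint of the shortest segment between rays c1 + s*d1 and c2 + t*d2."""
    w = _sub(c1, c2)
    a, b, c = _dot(d1, d1), _dot(d1, d2), _dot(d2, d2)
    d, e = _dot(d1, w), _dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are (nearly) parallel; the baseline is too short")
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    p1 = tuple(c1[i] + s * d1[i] for i in range(3))
    p2 = tuple(c2[i] + t * d2[i] for i in range(3))
    return tuple((p1[i] + p2[i]) / 2.0 for i in range(3))

def polygon_area(corners):
    """Area of a planar polygon given by its 3D corners in order."""
    total = (0.0, 0.0, 0.0)
    origin = corners[0]
    for p, q in zip(corners[1:], corners[2:]):
        n = _cross(_sub(p, origin), _sub(q, origin))
        total = (total[0] + n[0], total[1] + n[1], total[2] + n[2])
    return 0.5 * math.sqrt(_dot(total, total))
```

The parallel-ray check reflects why a very short baseline is unsuitable for this kind of measurement: the rays from neighbouring panoramas become almost parallel and the intersection is poorly determined.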

Immersive video and mobile mapping
Multiple cameras, or one device with multiple lenses connected with other sensors, record trajectories which can be inspected off-site. Immersive video, when integrated with GIS and connected to GPS data, allows measurements of 3D points, lines or areas. Immersive video enables the viewer to find visually the right place to pause the video and to perform analysis from the driver's or pedestrian's perspective. This article explores the application of immersive video in the first tests of creating a mobile mapping system equipped with a Ladybug®3 camera and a GPS unit, but without an IMU system. It explores whether the application of only a GPS device allows measurements from panoramas to be performed. According to the authors of the Horus software, the relative error achieved is around 10 cm, whereas the use of an IMU improves the quality of measurements (if the calibration details are known) to around 1 cm, but this approach still needs to be tested in further research.

New law: a new potential for immersive videography
A new law related to the protection of the landscape is currently being created in Poland (Ustawa o ochronie krajobrazu), as there is a need to remove a number of illegal advertising structures located close to main roads because they often cover most of the landscape. The draft of this act establishes that the owner of the plot on which the advertising structure is located will have to pay according to the area of the advert. There is a need to find a solution for local governments, which could gather information about outdoors and also measure their areas (Suławko-Karetka & Romański, 2010). What is more, a tool based on immersive video provides the potential for a local government representative to measure, collect data from video and create a database which will be the basis for a new tax. This database will contain data such as the plot number, the owner's details and the area of the outdoor advertisement.
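A register with the three attributes listed above could be sketched as a single database table. This is a hypothetical schema for illustration only; the table name, the column names and the use of SQLite are assumptions and are not taken from the draft act or from the described system.

```python
import sqlite3

def create_register(conn):
    """Create a hypothetical table holding one row per measured advertisement."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS advert (
               id INTEGER PRIMARY KEY,
               plot_number TEXT NOT NULL,
               owner TEXT NOT NULL,
               area_m2 REAL NOT NULL CHECK (area_m2 > 0)
           )"""
    )

def add_advert(conn, plot_number, owner, area_m2):
    """Store one advertisement measured on the immersive video."""
    conn.execute(
        "INSERT INTO advert (plot_number, owner, area_m2) VALUES (?, ?, ?)",
        (plot_number, owner, area_m2),
    )

def area_per_plot(conn):
    """Total advertising area in square metres for every plot."""
    rows = conn.execute(
        "SELECT plot_number, SUM(area_m2) FROM advert "
        "GROUP BY plot_number ORDER BY plot_number"
    )
    return dict(rows.fetchall())
```

Keeping one row per advertisement rather than one per plot allows the register to be updated from later recordings without losing earlier measurements.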

Recording details
The recording of immersive videography is prompt and takes almost the same time as driving through a particular area by car; however, some distances must be driven twice in order to record some roads. The accuracy of measurements depends on multiple factors such as: accuracy of the GPS device; availability of an IMU system; resolution of the immersive video; calibration of the immersive cameras. The immersive video in the district of Czarny Dunajec was recorded in September 2013. The recording was performed with Ladybug®3, which was placed on top of a hatchback car (Figure 11). A GPS receiver was also integrated with the laptop that collected data from the camera. Here, immersive videography allows local governments to quickly record all roads in their district and process the spherical video on their own computers. The processing allows the video to be paused at a particular moment, and the user can perform a 'multi-click measurement', check the position on a map, find particular plots, add a tag with information about the area of the billboard and store it in a GIS system. This approach is a potential source of income for the local government. The other advantage is that a visual system based on immersive video could be updated every year and tags can be updated according to up-to-date recordings. For the first test of the low-cost mobile mapping system, 83 kilometres were recorded within 92 minutes (total time). The recordings were made at a speed of approx. 40-70 km/h and the speed was not constant; however, the next attempts will be recorded at a constant speed of approx. 40 km/h in order to keep the baseline between the selected frames (one per second of the 16 fps recording) shorter than 16 metres.
Figure 12 illustrates the map of the district (left) and the routes (right) which were driven (some of them in both directions). Figure 13 presents the main town (Czarny Dunajec) and the routes marked as circles.
Figure 12. A map of the district of Czarny Dunajec (left) and routes recorded with the immersive camera (right).
Figure 13. The map of Czarny Dunajec with overlaid routes of immersive video.
The recordings generated 108 GB of data in PGR format (Ladybug cameras store their recordings in this format).

Further research
Further research on the creation of a mobile mapping system will include the following steps: applying an IMU device, controlling the accuracy of measurements both with and without IMU, creation of a database of outdoors in a particular district and applying rules of panoramic photogrammetry for measurements in immersive video.
The following section focuses on the application of immersive video in 3d modelling.

Videography and Videogrammetry
A number of research groups have been working on 3D reconstructions of heritage sites or urban environments. Remondino and El-Hakim (2006) review close-range photogrammetry techniques for precise 3D modelling and state that image-based modelling is the "most complete, economical, portable, flexible and widely used approach". In recent years, 3D reconstruction from video has become more common, in particular automatic approaches that attempt to obtain a 3D model of a particular scene from uncalibrated images. This method is called "shape from video", "VHS to VRML" or "Video-To-3D" (Remondino & El-Hakim, 2006). Clipp et al. (2008) describe a process of automatic reconstruction of large-scale urban scenes where a panoramic camera is used; however, they only use a part of the spherical images, and the application of the panoramic camera has not been investigated in their research (Singh, Jain & Mand, 2014). The process of using video techniques and applying measurements is called videogrammetry, and it refers mainly to video images taken using a camcorder or the movie function of a digital still camera (Gruen, 1997). However, there is little research on the application of immersive videography in videogrammetry. 360° video has the potential to become one of the most economical and flexible approaches to creating 3D reconstructions of sites, and this paper provides an example of 3D modelling of cloisters, which is an indoor scene without moving objects.

3D interactive navigable environment
Although the resolution of individual panoramas taken from immersive videography is medium (e.g. equirectangular size: 3500x1700), the approach presented in this paper explores whether an interactive navigable 3D environment can be promptly created using immersive videography on a mobile setup.
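What a medium resolution means for measurements can be estimated from the panorama width alone. The following minimal sketch computes the angular size of one pixel and its approximate footprint on an object; the function names and the 5 m object distance in the usage note are assumptions for illustration, and the approximation holds near the horizon line of the equirectangular image.

```python
import math

def angular_resolution_deg(width_px):
    """Horizontal angle in degrees covered by one pixel of a 360° panorama."""
    return 360.0 / width_px

def pixel_footprint_m(width_px, distance_m):
    """Approximate object-space size of one pixel at the given distance (small-angle arc length)."""
    return distance_m * math.radians(angular_resolution_deg(width_px))
```

For a panorama 3500 pixels wide, one pixel covers roughly 0.1°, which corresponds to a footprint of about 9 mm on a wall 5 m from the camera.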
The project investigated in this section was created in Agisoft Photoscan Professional, which enables spherical images to be imported. The approach described in this paper differs slightly from the concept of spherical photogrammetry (D'Annibale & Fangi, 2009; Fangi, 2007; Fangi, 2009; Pisa, Zeppa & Fangi, 2010; Barazzetti et al., 2010). Table 3 compares the approach of immersive photogrammetry proposed in this paper with Fangi's approaches of spherical photogrammetry for creating 3D reconstructions from spherical panoramas (or spherical panoramas taken from immersive video as individual frames).
Further research will answer the question about the accuracy of immersive photogrammetry.

3d reconstruction
The recording within the cloisters of the Church of St. Francis of Assisi in Krakow, Poland, was performed with Ladybug®2, a spherical video camera, on a mobile setup, and individual frames were generated from the immersive video. Particular frames were then imported into Agisoft Photoscan Professional. The approach relies on spherical panoramas exported as individual frames from the PGR stream file recorded with the Ladybug system. The camera was mounted on a tripod dolly and moved at a constant speed across the recorded area. The recording took 18 minutes in total and created about 9.8 GB of data (PGR files). The scene did not contain moving objects or people who could cause errors in automatic 3D modelling of the scene. A look inside the 3D reconstructed object presents the positions of individual panoramas within the object (Figure 15). As in the first example described in the previous section, the number of panoramas was limited and only one panorama per second was chosen for further analysis. The output from Agisoft was saved as a PDF (3D PDF) and as a 3D navigable virtual environment that was then inserted into a Flare3D environment for importing 3DS scenes from 3ds Max into Flash (Figures 16 and 17), and the virtual tour through the site was available shortly after the recording period.

Further research
Further research will include the analysis of photogrammetric features of the 3D reconstruction. In addition, applying high-resolution immersive video should improve the process of 3D reconstruction, but it will increase processing time. Thus, accuracy needs to be explored.

Figure 1. Immersive videography cameras (Ladybug®3, Panono camera and 360Heros built from seven GoPro cameras).
Figure 2. A general idea of positioning individual cameras in an immersive system: a) a top view of a panoramic rig; b) a front view of a spherical device with one lens looking up.

Figure 4. Stitching one of the frames from Ladybug®3 in PTGui.

Figure 11. Ladybug®3 mounted on top of the car in the district of Czarny Dunajec.

Figure 14. The geometry of the cloisters created in Agisoft Photoscan from imported panoramas extracted from immersive video.

Figure 15. A view inside the 3D reconstructed cloisters where the positions of the panoramas are presented.

Figure 16. The Flare3D viewer enables a 3D virtual tour through the object.

Figure 17. The Flare3D viewer presents the 3D model exported to 3ds Max; it presents only a short fragment of the cloisters, and the central part is of good quality, but the border regions are not of good quality.

Table 2 includes information about the data that was processed and does not include recordings which were incorrectly captured during the one-day session of driving through the district. The recordings contain 92,798 frames in total.
Table 2. Statistics of the recordings in the district of Czarny Dunajec.

Table 3. Comparison between spherical photogrammetry and immersive photogrammetry proposed in this paper.