
R. Koch et al.

without accompanying color/texture, is referred to by various names, such as a 3D model,1 a 3D scan2 or a 3D image.3

The output of a 3D imaging process can be analyzed and processed to extract information that supports a wide range of applications, such as object recognition, shape search on the web, face recognition for security and surveillance, robot navigation, mapping of the Earth’s surface, forests or urban regions, and clinical procedures in medicine.

Chapter Outline Firstly, in Sect. 1.2, we present a historical perspective on 3D imaging. Since this subject is most widely studied in the context of the modern field of computer vision, Sect. 1.3 briefly outlines the development of computer vision and recommends a number of general texts in this area. In Sect. 1.4, we outline acquisition techniques for 3D imaging. This is followed by a set of twelve relatively modern (post-1970) research papers that we regard as significant milestones in 3D imaging and shape analysis and, finally, in Sect. 1.6, we outline some applications of 3D imaging. The chapter concludes with a ‘road map’ for the remaining chapters of this book.

1.2 A Historical Perspective on 3D Imaging

To understand the roots of 3D imaging, we first need to consider the history of the more general concepts of image formation and image capture. After this, the remainder of this section discusses binocular depth perception and stereoscopic displays.

1.2.1 Image Formation and Image Capture

Since ancient times, humans have tried to capture their surrounding 3D environment and important aspects of social life in wall paintings. Early drawings, mostly animal paintings, are thought to date back 32,000 years, such as the early works in the Chauvet Cave, France. Drawings in the famous Lascaux Caves near Montignac, France, are also very old, dating back around 17,000 years [12]. These drawings were not correct in terms of perspective, but did capture the essence of the objects in an artistic way.

A rigorous mathematical treatment of vision was postulated by Euclid4 in his book Optics [10]. Thus, even early in history, some aspects of perspective

1Typically, this term is used when the 3D data is acquired from multiple viewpoint 2D images.

2Typically, this term is used when the 3D data is acquired by a scanner, such as a laser stripe scanner.

3Typically, this term is used when the data is ordered in a regular grid, such as the 2D array of depth values in a range image, or a 3D array of data in volumetric medical imaging.

4Euclid of Alexandria, Greek mathematician, also referred to as the Father of Geometry, lived in Alexandria during the reign of Ptolemy I (323–283 BC).

1 Introduction


were known. Another very influential mathematical text was the Kitab al-Manazir (Book of Optics) by Alhazen5 [47].

In parallel with the mathematical concepts of vision and optics, physical optics developed through the use of lenses and mirrors, forming the basis of modern optical instruments. The earliest lenses were found as polished crystals, like the famous Nimrud lens discovered by Austen Henry Layard.6 Its quality is far from perfect, but it is able to focus light at a focal distance of 110 mm. Lenses were used as burning glasses to focus sunlight and as magnifiers. An early written record of such use is found in the works of Seneca the Younger,7 who noted:

Letters, however small and indistinct, are seen enlarged and more clearly through a globe or glass filled with water [33].

Thus, he describes the effect of a spherical convex lens. The use of such magnification for observing distant objects was recognized early on, and optical instruments were devised, such as corrective lenses for poor eyesight in the 13th to 15th century CE and the telescope at the beginning of the 17th century. It is unclear who invented the telescope, as several lens makers observed the magnification effects independently. The German-born Dutch lens maker Hans Lippershey (1570–1619) from Middelburg, in the province of Zeeland, is often credited as the inventor of the telescope, since he applied for a patent, which was denied. Other lens makers, such as his fellow Middelburg lens maker Zacharias Janssen, also claimed the invention [28].

Lenses, combined with the camera obscura, optically a pinhole camera, form the basic concept of modern cameras. The camera obscura, Latin for dark room, has long been used to capture images of scenes: light reflected from a scene enters a dark room through a very small hole and is projected as an image onto the back wall of the room. Alhazen had already experimented with a camera obscura, and it was later used as a drawing aid by artists and as a visual attraction. The name camera is derived from the camera obscura.

The pinhole camera generates an inverted image of the scene with a scale factor m = i/o, where i is the image distance between the pinhole and the image, and o is the object distance between the object and the pinhole. However, the opening aperture of the pinhole has to be very small to avoid blurring. A light-collecting and focusing lens is then used to enlarge the opening aperture, and brighter, yet still sharp, images can be obtained with thin convex lenses.8 Such lenses follow the Gaussian thin lens equation: 1/f = 1/i + 1/o, where f is the focal length of the lens. The drawback, as with all modern cameras, is the limited depth of field within which the image of the scene is in focus.
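The thin lens relations above can be made concrete with a small numerical sketch. The symbols f, i, o and the scale factor m = i/o follow the text; the function names and the example values (a 50 mm lens, an object 2 m away) are ours, chosen purely for illustration.

```python
def image_distance(f: float, o: float) -> float:
    """Solve the Gaussian thin lens equation 1/f = 1/i + 1/o for the
    image distance i (same units as f and o)."""
    if o == f:
        raise ValueError("object at the focal plane: image forms at infinity")
    return 1.0 / (1.0 / f - 1.0 / o)

def magnification(i: float, o: float) -> float:
    """Scale factor m = i/o of the (inverted) pinhole/lens image."""
    return i / o

# Example: a lens with f = 50 mm imaging an object 2000 mm away.
i = image_distance(50.0, 2000.0)   # ~51.28 mm, slightly beyond the focal plane
m = magnification(i, 2000.0)       # ~0.0256, i.e. the image is ~1/39 scale
```

Note that as the object distance o grows large, 1/o vanishes and i approaches the focal length f, which is why distant scenes are focused near the focal plane.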

5Alhazen (Ibn al-Haytham), born 965 CE in Basra, Iraq, died in 1040. Introduced the concept of physical optics and experimented with lenses, mirrors, camera obscura, refraction and reflection.

6Sir Austen Henry Layard (1817–1894), British archaeologist, found a polished rock crystal during the excavation of ancient Nimrud, Iraq. The lens has a diameter of 38 mm and a presumed creation date of 750–710 BC; it is now on display at the British Museum, London.

7Lucius Annaeus Seneca, around 4 BC–65 CE, was a Roman philosopher, statesman, dramatist, tutor and adviser of Nero.

8Small and thin bi-convex lenses look like lentils, hence the name lens, which is Latin for lentil.


Until the mid-19th century, the only way to capture an image was to manually paint it onto canvas or other suitable background. With the advent of photography,9 images of the real world could be taken and stored for future use. This invention was soon expanded from monochromatic to color images, from monoscopic to stereoscopic10 and from still images to film sequences. In our digital age, electronic sensor devices have taken the role of chemical film and a variety of electronic display technologies have taken over the role of painted pictures.

It is interesting to note, though, that some of the most recent developments in digital photography and image displays have their inspiration in technologies developed over 100 years ago. In 1908, Gabriel Lippmann11 developed the concept of integral photography, a camera composed of a large number of tiny lenses placed side by side in front of a photographic film [34]. These lenses collect view-dependent light rays from all directions onto the film, effectively capturing a three-dimensional field of light rays, the light field [1]. The newly established research field of computational photography has taken up his ideas and is actively developing novel multi-lens camera systems for capturing 3D scenes, enhancing the depth of field, or computing novel image transfer functions. In addition, the reverse process of projecting an integral image into space has led to the development of lenticular-sheet 3D printing and to auto-stereoscopic (glasses-free) multiview displays that let the observer see the captured 3D scene with full depth parallax without wearing special-purpose spectacles. These 3D projection techniques have generated huge interest in the display community, both for high-quality auto-stereoscopic displays with full 3D parallax, as used in advertising (3D signage), and for novel 3D-TV display systems that might eventually conquer the 3D-TV home market. This is discussed further in Sect. 1.2.3.

1.2.2 Binocular Perception of Depth

It is important to note that many visual cues give the perception of depth, some of which are monocular cues (occlusion, shading, texture gradients) and some of which are binocular cues (retinal disparity, parallax, eye convergence). Of course, humans, and most predator animals, are equipped with a very sophisticated binocular vision system and it is the binocular cues that provide us with accurate short range depth

9Nicéphore Niépce, 1765–1833, is credited as one of the inventors of photography by solar light etching (Heliograph) in 1826. He later worked with Louis-Jacques-Mandé Daguerre, 1787–1851, who acquired a patent for his Daguerreotype, the first practical photography process based on silver iodide, in 1839. In parallel, William Henry Fox Talbot, 1800–1877, developed the calotype process, which uses paper coated with silver iodide. The calotype produced a negative image from which a positive could be printed using silver chloride coated paper [19].

10The Greek word stereos for solid is used to indicate a spatial 3D extension of vision, hence stereoscopic stands for a 3D form of visual information.

11Gabriel Lippmann, 1845–1921, French scientist, received the 1908 Nobel Prize in Physics for his method of reproducing color pictures by interferometry.


Fig. 1.1 Left: Human binocular perception of 3D scene. Right: the perceived images of the left and right eye, showing how the depth-dependent disparity results in a parallax shift between foreground and background objects. Both observed images are fused into a 3D sensation by the human eye-brain visual system

perception. Clearly it is advantageous for us to have good depth perception to a distance at least as large as the length of our arms. The principles of binocular vision were already recognized in 1838 by Sir Charles Wheatstone,12 who described the process of binocular perception:

. . . the mind perceives an object of three dimensions by means of the two dissimilar pictures projected by it on the two retinae. . . [54]

The important observation was that the binocular perception of two correctly displaced 2D-images of a scene is equivalent to the perception of the 3D scene itself.

Figure 1.1 illustrates human binocular perception of a 3D scene, comprised of a cone in front of a torus. At the right of this figure are the images perceived by the left and the right eye. If we take a scene point, for example the tip of the cone, it projects to different positions on the left and right retina. The difference between these two positions (retinal correspondences) is known as disparity, and the disparity associated with nearby surface points (on the cone) is larger than the disparity associated with more distant points (on the torus). As a result of this difference between foreground and background disparity, the position (or alignment) of the foreground relative to the background changes as we shift the viewpoint from the left eye to the right eye. This effect is known as parallax.13
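The inverse relation between disparity and depth can be sketched numerically using the standard rectified-stereo formula Z = f·b/d (depth equals focal length times baseline over disparity). This formula is not stated in the chapter; the function name and the example values (focal length in pixels, a human-like 65 mm baseline) are ours, chosen only to illustrate that the nearby cone has a larger disparity than the distant torus.

```python
def depth_from_disparity(f_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth (meters) of a scene point from its disparity (pixels),
    assuming a rectified stereo pair with parallel optical axes."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point at finite depth")
    return f_px * baseline_m / disparity_px

# Illustrative values: f = 800 px, baseline b = 0.065 m.
near = depth_from_disparity(800.0, 0.065, 40.0)  # larger disparity -> ~1.3 m (cone)
far  = depth_from_disparity(800.0, 0.065, 10.0)  # smaller disparity -> ~5.2 m (torus)
```

As disparity tends to zero the computed depth tends to infinity, which matches the footnote's remark that background objects at effectively infinite distance (e.g. distant stars) appear stationary in the image.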

Imagine now that the 3D scene of the cone in front of the torus is observed by a binocular camera with two lenses that are separated horizontally by the inter-eye distance of a human observer. If these images are later presented to the left and right eyes of a human observer, he or she cannot distinguish the observed real scene from the binocular images of the scene. The images are fused by the observer's binocular visual system into a 3D impression. This observation led

12Sir Charles Wheatstone, 1802–1875, English physicist and inventor.

13The terms disparity and parallax are sometimes used interchangeably in the literature and this misuse of terminology is a source of confusion. One way to think about parallax is that it is induced by the difference in disparity between foreground and background objects over a pair of views displaced by a translation. The end result is that the foreground is in alignment with different parts of the background. Disparity of foreground objects and parallax then only become equivalent when the distance of background objects can be treated as infinity (e.g. distant stars), in this case the background objects are stationary in the image.