9 3D Digital Elevation Model Generation
H. Wei and M. Bartels
9.4.3 LIDAR Interpolation
In general, there are two principal ways to prepare LIDAR data for filtering: gridding the data or working on the original point cloud. On the one hand, some researchers advocate using the original data for accuracy reasons, since gridding and interpolation may omit original measurements and introduce data that were never observed. On the other hand, gridding and interpolation allow the use and further advancement of well-known image processing techniques [47], such as filtering in the spatial and frequency domains. For applications requiring data fusion, the different bands have to be co-registered to obtain a consistent format. Typical spatial resolutions for gridding are in the order of 0.25–2 m [104]. Commonly used interpolation techniques include nearest neighbor and bilinear interpolation [34, 111, 146, 167, 181].
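As a sketch (with illustrative function and parameter names), gridding an irregular point cloud by nearest-neighbor assignment might look as follows:

```python
import numpy as np

def grid_nearest(points, res):
    """Resample an irregular point cloud (N x 3 array: x, y, z) onto a
    regular grid of spacing `res`, assigning each cell the height of its
    nearest point in the horizontal plane."""
    xy, z = points[:, :2], points[:, 2]
    x0, y0 = xy.min(axis=0)
    nx = int(np.ceil((xy[:, 0].max() - x0) / res)) + 1
    ny = int(np.ceil((xy[:, 1].max() - y0) / res)) + 1
    gx, gy = np.meshgrid(x0 + res * np.arange(nx), y0 + res * np.arange(ny))
    cells = np.c_[gx.ravel(), gy.ravel()]
    # brute-force nearest neighbor: fine for a sketch, use a k-d tree at scale
    d2 = ((cells[:, None, :] - xy[None, :, :]) ** 2).sum(axis=2)
    return z[d2.argmin(axis=1)].reshape(ny, nx)
```

Bilinear interpolation would instead weight the four surrounding samples; the nearest-neighbor variant is shown because it preserves the original height values, which matters for the accuracy argument above.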
Bater et al. [20] compared seven interpolation methods applied to LIDAR data at different resolutions, measuring the interpolation error as an RMSE. They concluded that, although the interpolation error is largely independent of the interpolation method, it is a function of the resolution: every 0.5 m increase in grid spacing raised the RMSE by 1 cm on average. The authors therefore suggested the natural neighbor algorithm, which combines minimal interpolation error with low computational cost. Natural and nearest neighbor interpolation synthesize missing data from adjacent data points (e.g. as their mean or median), based on the assumption that nearby points have similar heights.
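The held-out-point evaluation used in such comparisons can be sketched as follows; the nearest-neighbor predictor and the 10 % hold-out fraction are illustrative choices, not those of [20]:

```python
import numpy as np

def interp_rmse(points, holdout_frac=0.1, seed=0):
    """Leave-out validation of an interpolator: withhold a fraction of the
    LIDAR points, predict their heights from the remaining points (here by
    nearest neighbor), and report the RMSE of the prediction error."""
    rng = np.random.default_rng(seed)
    n = len(points)
    test = rng.choice(n, max(1, int(holdout_frac * n)), replace=False)
    train = np.setdiff1d(np.arange(n), test)
    te, tr = points[test], points[train]
    d2 = ((te[:, None, :2] - tr[None, :, :2]) ** 2).sum(axis=2)
    pred = tr[d2.argmin(axis=1), 2]
    return float(np.sqrt(np.mean((pred - te[:, 2]) ** 2)))
```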
9.4.4 LIDAR Filtering
The main interest in airborne LIDAR data is to generate accurate elevation maps and to identify objects within the point cloud. The main products are Digital Surface Models (DSM), Digital Terrain Models (DTM) and normalized DSMs (nDSM) [143]. A DSM includes all sampled top surfaces of objects and ground [14, 136, 138], including tree canopies, roof tops, chimneys and above-ground power transmission lines, as illustrated in Fig. 9.12, and can be produced directly from the LIDAR point cloud. Since many applications require accurate ground data, one major goal in LIDAR filtering is to separate object points from ground points. If object points are of interest, an nDSM can then be generated conveniently by subtracting the DTM from the DSM [172]. Again, an interpolation step is required to create a DEM from the filtered data, to fill the patches where objects have been removed.
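The nDSM computation is a simple grid subtraction; a minimal sketch follows (clamping small negative differences, which arise from interpolation noise, is an assumption, not from the text):

```python
import numpy as np

def normalized_dsm(dsm, dtm, min_height=0.0):
    """nDSM = DSM - DTM: heights of objects above the ground surface.
    Differences at or below `min_height` are clamped to zero."""
    ndsm = dsm - dtm
    return np.where(ndsm > min_height, ndsm, 0.0)
```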
Fig. 9.12 From top to bottom: Original scene, DSM, DTM, nDSM

LIDAR filtering is the bottleneck in the processing chain from acquisition to real applications. Friess [50] reported that the ratio of post-processing to acquisition time for correcting the points' geometry is up to 14:1, and that DEM generation demands an even higher ratio of 20:1. Filtering algorithms constitute the major contribution to a final product derived from LIDAR [167], taking three times longer than the acquisition. Flood [43] reported that at least 60 % of the production time to obtain DEMs is allocated to filtering and other post-processing steps. Although LIDAR systems are considered to be mature, much work remains to be done to develop filtering algorithms that are robust, accurate and efficient [10, 13, 162]. In the following, a brief overview of the different categories of LIDAR filtering techniques is given.
Morphological filtering simultaneously filters out object points and interpolates the missing terrain. Based on set theory, it consists of two basic operators, erosion and dilation [65, 66], usually applied in a sliding window; further derived operators are cleaning, filling, bridging and the watershed algorithm. When applied to LIDAR data, the major assumption is that there is a distinct difference between the slopes of objects and those of the ground [172]. The first step is to find the minimum LIDAR point in a window [90]; the difficulty here is to select the right window size according to the shape, size and orientation of the objects [8]. Windows that are small relative to large objects cause those objects to be treated as ground, whereas large windows may erode terrain irreversibly. Therefore, the latest developments in morphological filtering make use of progressively increasing window sizes [90, 181]. In general, morphological filtering is considered robust and works well for isolated objects. A drawback, however, is its indifference towards object and terrain detail, which can be eroded irreversibly [32].
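A progressive morphological filter along these lines can be sketched as follows; the window sequence and the slope-scaled height threshold are illustrative, not published parameters:

```python
import numpy as np

def grey_erode(img, k):
    """Minimum filter over a k x k sliding window (morphological erosion)."""
    pad = k // 2
    p = np.pad(img, pad, mode="edge")
    out = np.full(img.shape, np.inf)
    for dy in range(k):
        for dx in range(k):
            out = np.minimum(out, p[dy:dy + img.shape[0], dx:dx + img.shape[1]])
    return out

def grey_dilate(img, k):
    """Maximum filter over a k x k sliding window (morphological dilation)."""
    pad = k // 2
    p = np.pad(img, pad, mode="edge")
    out = np.full(img.shape, -np.inf)
    for dy in range(k):
        for dx in range(k):
            out = np.maximum(out, p[dy:dy + img.shape[0], dx:dx + img.shape[1]])
    return out

def progressive_morph_filter(dsm, windows=(3, 5, 9), slope=0.5):
    """Open the gridded DSM (erosion then dilation) with progressively
    larger windows; cells rising above the opened surface by more than a
    height threshold that grows with the window are flagged as objects."""
    ground = dsm.astype(float).copy()
    objects = np.zeros(dsm.shape, dtype=bool)
    for k in windows:
        opened = grey_dilate(grey_erode(ground, k), k)
        objects |= (ground - opened) > slope * k   # slope-scaled threshold
        ground = np.where(objects, opened, ground)
    return ground, objects
```

Growing the threshold with the window size reflects the slope assumption: a large height jump over a short horizontal distance is more likely an object than terrain.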
Slope-based filtering exploits slopes and discontinuities between adjacent single or grouped points (e.g. buildings), since LIDAR data is vertically accurate to within decimeters. Vosselman [155] employed a predetermined cut-off plane to measure and classify the slopes of lines between pairs of points, with parameters to adapt the plane to highly sloped terrain [142]. Slope-based filtering is non-trivial, since extreme slopes and flat terrain can cause the algorithm to fail; therefore, the Laplacian of Gaussian (LoG) is often used to differentiate objects from the ground [55]. A grid-based DTM generation approach was developed by Wack et al. [163, 164], who first downsampled the data coarsely to a 9 m spacing, thereby discarding most of the object points. Roggero [128] derived a DTM by applying weighted linear regression to the original, irregular LIDAR data, then employed a local inverted cone, adapted to the terrain slope, to estimate ground points with regard to the maximum height difference between two points; objects were then detected using a LoG. In general, slope-based algorithms pick up edges, lines, corners and other discontinuities precisely. If spatial masks are used, gridding is required. Slope-based filtering algorithms fail when their major assumption, that steep slopes are caused only by objects, is violated; for example, in mountainous terrain with discontinuities such as trenches, sewer manhole covers, holes, caves or outliers.
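The slope criterion can be sketched as a cone test around each point; the slope threshold and search radius below are illustrative:

```python
import numpy as np

def slope_filter(points, slope=0.3, radius=2.0):
    """Label a point as an object point if it rises above any neighbor
    within `radius` by more than the height a cone of the given slope
    allows at that horizontal distance (brute force; use a k-d tree at
    scale)."""
    xy, z = points[:, :2], points[:, 2]
    d = np.sqrt(((xy[:, None] - xy[None]) ** 2).sum(axis=-1))
    dz = z[:, None] - z[None]          # height of point i above point j
    within = (d > 0) & (d <= radius)
    return ((dz > slope * d) & within).any(axis=1)
```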
Geometry-based filtering is a popular approach to describing man-made and natural objects in LIDAR data, assuming that object properties such as shape, length, width, height, position and orientation are known. From an nDSM obtained by means of morphological filtering, Weidner and Förstner [172] extracted buildings using a parametric and prismatic building model [169] based on the Minimum Description Length (MDL) principle [170]. First, a bounding box for each building is estimated using a defined threshold with respect to known building heights. Second, the segmentation is refined with a height threshold calculated from the medians of the 10 % lowest and 10 % highest points of the preselected data. Sparse Triangulated Irregular Network (TIN) densification for DTM generation is also very popular [145], since neither gridding and interpolation nor dense point clouds are required [96]. Axelsson [10] classified vegetation, buildings and power transmission lines in original LIDAR data, based on their surfaces in a TIN, using MDL criteria. The underlying assumption was that objects consist of planar faces; neighboring TIN facets therefore have a similar orientation with a second derivative of zero, whereas the second derivatives at breaklines and vegetation points are non-zero. One of the biggest advantages of geometry-based filtering is that structures in LIDAR data can be described directly from the point cloud; buildings can be recovered very efficiently, and initial gridding and interpolation of the data are not necessary. However, geometry-based filtering requires many predefined parameters and becomes more difficult as object complexity increases.
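The refined height threshold of Weidner and Förstner's second step can be sketched as follows (the `frac` parameter is the 10 % fraction mentioned above; taking the midpoint of the two medians is an illustrative choice):

```python
import numpy as np

def building_height_threshold(heights, frac=0.10):
    """Segmentation threshold from the heights inside a building's bounding
    box: the midpoint between the median of the lowest and the median of
    the highest `frac` of the height values."""
    h = np.sort(np.asarray(heights, float))
    k = max(1, int(frac * len(h)))
    low, high = np.median(h[:k]), np.median(h[-k:])
    return 0.5 * (low + high)
```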
Curvature-based filters have been developed to detect classes of curved areas (i.e. convex, concave and planar) within the point cloud [6, 153]. Vosselman [154] assumed that buildings consist of planar faces, which can be recognized by applying the Hough transform to LIDAR point clouds. The data had been segmented using existing 2D ground plans [156, 157]. The spatial structure chosen for the geometric description of the planar faces was a 2D Delaunay triangulation [154]: if a connected component was greater than a specified threshold, a planar face was found, and the buildings' outlines were estimated by projecting the planar faces into 2D space [154]. Working on non-gridded data, Rottensteiner and Briese [131] detected non-ground building regions in a first stage and then detected roofs using the surface normals of a DSM [134]. The biggest advantage of curvature-based LIDAR filters is that they are directly applicable to the original data for detecting structures in a LIDAR point cloud. Drawbacks, however, are that only man-made objects with planar, convex or concave faces can be detected, using pre-defined thresholds, and that vegetation is difficult to recognize due to its random character.
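Vosselman used a Hough transform for plane detection; the closely related RANSAC idea conveys the same principle in a few lines and is shown here instead (all names and thresholds are illustrative):

```python
import numpy as np

def dominant_plane(points, iters=200, tol=0.05, seed=0):
    """Find the dominant planar face in a point cloud: repeatedly fit a
    plane through a random point triple and keep the plane with the most
    inliers (points within `tol` of the plane)."""
    rng = np.random.default_rng(seed)
    best = np.zeros(len(points), dtype=bool)
    for _ in range(iters):
        p0, p1, p2 = points[rng.choice(len(points), 3, replace=False)]
        n = np.cross(p1 - p0, p2 - p0)
        norm = np.linalg.norm(n)
        if norm < 1e-12:                 # degenerate (collinear) sample
            continue
        inliers = np.abs((points - p0) @ (n / norm)) < tol
        if inliers.sum() > best.sum():
            best = inliers
    return best
```

A Hough transform would instead vote for plane parameters in an accumulator; both approaches rest on the same planar-faces assumption.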
Linear prediction can be used for DTM generation and gross error removal [26], as developed by Kraus and Pfeifer [93]. Their supervised algorithm combines filtering (object point classification) with interpolation of the ground. First, a rough approximation of a DTM is calculated using linear prediction on overlapping patches. The difference between this DTM and the original LIDAR point cloud yields residuals, which form the basis for assigning weights to the LIDAR points via a parametric weighting function; points above a certain threshold are classified as object points. The whole algorithm is executed iteratively until either a stable situation or a predefined number of iterations is reached [93]. Building on linear prediction, Kobler et al. [91] addressed the challenge of filtering LIDAR data in steep wooded terrain with a repetitive interpolation algorithm, setting empirically estimated parameters, thresholds and a buffer zone [94]. A major advantage of linear prediction models is their applicability to different terrain types, so that DTMs can be derived even from sloped terrain. The compromise, however, is that prior knowledge of the terrain is required to set the weighting factors and thresholds, which can be adjusted depending on the application and terrain. Also, some approaches lack a clear termination criterion, as either a stable situation has to be reached or the number of iterations has to be specified in advance.
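The iterative weighting scheme can be sketched as follows; the local weighted mean stands in for linear prediction, and the weighting function with parameters `a`, `b` and threshold `g` is an illustrative stand-in for the published parametric function:

```python
import numpy as np

def robust_ground_estimate(points, radius=2.0, iters=5, a=1.0, b=4.0, g=1.0):
    """Iteratively estimate the ground as a weighted local mean of heights,
    then down-weight points lying above it, so that object points lose
    influence on the next ground estimate."""
    xy, z = points[:, :2], points[:, 2]
    d2 = ((xy[:, None] - xy[None]) ** 2).sum(axis=-1)
    near = d2 <= radius ** 2           # each point is its own neighbor too
    w = np.ones(len(points))
    for _ in range(iters):
        ground = (near * w).dot(z) / (near * w).sum(axis=1)
        r = z - ground                 # residual above the ground estimate
        # one-sided weight: full weight at or below ground, decaying above
        w = np.where(r <= 0, 1.0, 1.0 / (1.0 + (a * r) ** b))
    return ground, r > g               # object points: residual above g
```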
Coarse-to-fine and multi-resolution filters benefit from both detailed and coarse views of the LIDAR data [93]. Pfeifer et al. [120] presented a series of LIDAR post-processing steps aimed at flood modeling: data calibration, strip adjustment, 'robust linear prediction' and terrain structure recovery (i.e. breakline modeling). The new element contributing to the robustness of their filtering approach was the use of hierarchical pyramids: DTMs of different resolutions were compared to each other, although some thresholds were still required [120]. References [139–141] presented a semi-automatic, multi-resolution approach for filtering LIDAR data based on the Hermite transform, employing Gaussian pyramids to transform the data in a first step. The assumption was that any change from ground to off-terrain points is a linear combination of shifted, rotated and scaled error functions, similar to a kernel function in wavelet analysis. A multi-resolution algorithm for DTM generation from LIDAR data was proposed by Vu et al. [159–162], who compared successive median-filtered resolutions of gridded LIDAR data to detect boundaries; the final segmentation of the LIDAR data used both the boundaries and the actual heights as features. A wavelet approach to separating ground and object points on gridded LIDAR data was proposed by Vu and Tokunaga [158], who applied K-means clustering on height to segment buildings, motorways, boundaries and trees. The advantage of multi-resolution filtering algorithms is the separation of the low and high frequencies (i.e. approximations and details) of a LIDAR scene. However, it is not always clear which resolution to choose, since the energy (i.e. information content) becomes smaller the further the signal is decomposed [18], and using only the approximations (i.e. discarding the high-frequency components) incurs a loss of information. Higher computational costs and memory requirements also have to be considered.
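The idea of comparing successive median-filtered resolutions can be sketched as follows; the window size, number of levels and change threshold are illustrative:

```python
import numpy as np

def median_filter(img, k=3):
    """Median filter over a k x k sliding window."""
    pad = k // 2
    p = np.pad(img, pad, mode="edge")
    stack = [p[dy:dy + img.shape[0], dx:dx + img.shape[1]]
             for dy in range(k) for dx in range(k)]
    return np.median(np.stack(stack), axis=0)

def multires_boundaries(dsm, levels=3, thresh=0.5):
    """Median-filter the grid repeatedly and flag cells that change
    strongly between successive smoothing levels as object boundaries."""
    current = dsm.astype(float)
    boundary = np.zeros(dsm.shape, dtype=bool)
    for _ in range(levels):
        smoothed = median_filter(current)
        boundary |= np.abs(current - smoothed) > thresh
        current = smoothed
    return boundary
```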
Knowledge-based filters make use of predefined models of different heights and shapes [22]. Haala and Brenner [60] presented a complete solution for deriving a realistic 3D city model from LIDAR-derived DSMs. To obtain the buildings' ground planes, the authors segmented the DSM using a 2D GIS map, whose incompleteness and inaccuracy were compensated with a cadastral map. Their algorithm was based on region growing using straight lines of pixels [59] and on histogram analysis of surface normal vectors, which yielded planar surfaces [62]. Having obtained the ground planes, the actual buildings were modeled by fitting a limited number of predefined 3D building primitives to the DSM. The location and rough type of vegetation were estimated from the three bands of aerial Color Infra-Red (CIR) imagery (i.e. NIR, red and green) and the geometric information in the nDSM [63]. A hierarchical rule-based filtering solution was presented by Nardinocchi et al. [114]: LIDAR data was gridded, and three classes (terrain, building and vegetation) were estimated by exploiting their geometric properties (height differences), their topological properties (spatial distribution) and their relationships to each other, using region growing and local slope analysis. Employing hierarchical rules and fitting 3D geometric primitives is very efficient for identifying objects in LIDAR data [114], provided that an urban area is present. However, this approach requires prior knowledge of the buildings and other man-made structures and becomes extremely difficult for complex objects. Furthermore, it is almost impossible to model vegetation geometrically, so additional information such as CIR imagery or a Normalized Difference Vegetation Index (NDVI) has to be integrated.
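A minimal rule-based classifier in this spirit, combining an nDSM with an NDVI band, might look as follows (both thresholds and the class encoding are illustrative):

```python
import numpy as np

def classify_cells(ndsm, ndvi, h_min=2.0, ndvi_veg=0.3):
    """Hierarchical rules per grid cell: low cells are terrain; elevated
    cells are vegetation if the NDVI indicates plant cover, else building.
    Class codes: 0 = terrain, 1 = building, 2 = vegetation."""
    classes = np.zeros(ndsm.shape, dtype=np.uint8)
    elevated = ndsm > h_min
    classes[elevated & (ndvi >= ndvi_veg)] = 2   # vegetation
    classes[elevated & (ndvi < ndvi_veg)] = 1    # building
    return classes
```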
Data fusion for land cover classification exploits the complementary properties of LIDAR and photogrammetry. It requires registration and orientation of the gridded and interpolated LIDAR data with all spectral bands [136, 137]. Popular spectral data are the NIR, red and green bands of CIR imagery [61], combined with the height information from the LIDAR data. Processing co-registered remotely sensed data involves solving a multivariate statistical problem, with typical classifiers such as Maximum Likelihood (ML), distance classifiers, Support Vector Machines (SVM), Principal Component Analysis (PCA), Independent Component Analysis (ICA) and Artificial Neural Networks (ANN). CIR imagery combined with LIDAR data can be used to detect sealed and non-sealed surfaces for waste water management and council taxation [138], or for tree crown volume estimation. Further applications of fused LIDAR data and aerial images include building reconstruction [131]; building outline extraction with NDVIs fused from IKONOS multispectral and panchromatic imagery [144]; roof segmentation [23, 24]; building detection based on Dempster-Shafer theory [132, 133]; classification of roofs, trees and grass using a Gaussian Mixture Model (GMM) with Expectation Maximization (EM) [30]; and per-pixel minimum-distance classification of streets, grass, trees, buildings and shadows [61], including empirical ground truth collection [118]. For data fusion applications, it is popular to use open source software such as the Geographic Resources Analysis Support System (GRASS) [27] or commercial packages such as eCognition, Arc/Info, TerraSolid and TerraModeler [76, 152, 171]. In general, fusing LIDAR with additional spaceborne and airborne remotely sensed data has great potential to improve land cover classification accuracy, as it combines the advantages of complementary bands. The gain, however, is limited by the curse of dimensionality: extensively adding further bands may reduce accuracy [83]. Furthermore, it is non-trivial to find contemporary 2D GIS maps, imagery and ground truth; moreover, the data have to be co-registered and, if necessary, downsampled, gridded and interpolated because of differing resolutions and orientations, a process which omits data and adds non-existent information. The use of commercial software packages involves experimental tuning of the settings, most algorithms are hidden from the user, and expensive licenses cannot always be purchased due to funding limitations.
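A per-pixel minimum-distance classifier over fused bands is the simplest of the classifiers listed above; a sketch, where the feature vector of each pixel stacks the LIDAR height with the spectral bands:

```python
import numpy as np

def min_distance_classify(features, class_means):
    """Assign each pixel to the class whose mean feature vector is closest
    in Euclidean distance. `features` has shape (rows, cols, nbands);
    `class_means` has shape (nclasses, nbands)."""
    means = np.asarray(class_means, float)
    f = features.reshape(-1, features.shape[-1])
    d = np.linalg.norm(f[:, None, :] - means[None, :, :], axis=2)
    return d.argmin(axis=1).reshape(features.shape[:-1])
```

In practice the class means would be estimated from labeled training pixels; here they are supplied directly.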
Statistical classification filters are a means to segment and classify objects and terrain in LIDAR data in an unsupervised manner [168]. Cobby et al. [34] generated a DTM for flood simulation from LIDAR data recorded by the Environment Agency, UK. The authors segmented the rural area close to the River Severn and classified the objects into three vegetation height classes: short (crops and grasses), intermediate (hedges and shrubs) and tall (trees). Using a 5 × 5 window, the semi-automatic segmenter employed the standard deviation of local height as a feature, because tall vegetation was assumed to constitute the highest objects [35]. For simplicity, the authors declared the limited number of houses in this area to be high vegetation. Since the authors were interested in vegetation height, they separated the slightly hilly terrain from the actual objects using detrending [39]. By subtracting a bilinearly interpolated surface from the LIDAR data, an nDSM for a hydraulic model was obtained, as developed by [33]. Skewness balancing [19] and its adaptation to hilly terrain [17] are an alternative to object-based filtering. Yao et al. [175] adapted skewness balancing to the challenging task of detecting cars in LIDAR data. Bao et al. [15] further improved the algorithm by incorporating kurtosis, exploiting the different behavior of the two statistical measures on vegetation and ground points. The advantage of statistical filtering algorithms is that they can work unsupervised, directly on the original, non-gridded data. They fail, however, if the statistical model boundaries are insufficiently described.
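The core idea of skewness balancing [19] can be sketched as follows. This is a minimal illustration rather than the published implementation (the function name and tolerance parameter are hypothetical): the skewness of natural terrain elevations is assumed to be approximately zero, while objects on the terrain skew the distribution towards positive values, so the highest points are peeled off one at a time until the skewness of the remainder balances out. Removed points are labeled as objects, the rest as ground.

```python
import numpy as np

def skewness_balancing(z, tol=0.0):
    """Separate ground from object points in a set of LIDAR elevations.

    Iteratively removes the highest elevation until the skewness of the
    remaining points drops to <= tol. Works directly on the original,
    non-gridded point cloud; no interpolation is required.
    Returns a boolean mask: True for ground points, False for objects.
    """
    z = np.asarray(z, dtype=float)
    order = np.argsort(z)[::-1]          # point indices, highest first
    is_ground = np.ones(z.size, dtype=bool)
    for idx in order:
        remaining = z[is_ground]
        mean, std = remaining.mean(), remaining.std()
        if std == 0.0:
            break                        # degenerate: all points equal
        skew = np.mean(((remaining - mean) / std) ** 3)
        if skew <= tol:
            break                        # distribution is balanced
        is_ground[idx] = False           # peel off the highest point
    return is_ground
```

On hilly terrain the elevations must first be detrended (e.g. by subtracting an interpolated trend surface, as in [17, 39]) so that the zero-skewness assumption for bare ground holds.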
A comparison of LIDAR filtering methods is given in Table 9.5.
Table 9.5 Comparison of LIDAR filtering techniques

| Filtering technique | Details | Pros | Cons | Application |
|---|---|---|---|---|
| Morphology | Erosion, dilation, cleaning, filling, watershed algorithm | Robust, works well for isolated objects, DTM directly derived from point cloud | Imprecise towards little details, knowledge of minimum structure required | Forestry (canopy modeling), flood modeling |
| Slope-based | Derivatives of slope (directional), gradients, edge, corner, line detectors, Laplacian, LoG | Precise for discontinuities | Threshold required, fails at mountainous, highly sloped or completely flat terrain | Object detection |
| Curvature-based | Convex, concave, plane hulls, Hough transform, TIN densification | Direct recognition of structure in point cloud | Thresholds, surfaces of man-made objects only | Building detection |
| Geometry-based | MDL, shape, length, width, height, position, orientation | Direct recognition of structure in point cloud | Many prior parameters required, fails at complex objects | Building detection |
| Linear prediction | Detrending, robust linear prediction | Robust against sloped terrain | Threshold, weighting factors | DTM generation |
| Multi-resolution | Gaussian, median pyramids, wavelets, hierarchical robust linear prediction | Robust, separation of high and low frequencies | Computational costs and memory requirements | DTM generation |
| Knowledge-based | 3D primitives | High quality models | Huge database required due to complexity, vegetation difficult to model | Building detection |
| Data fusion | ML, distance classifiers, neural networks, PCA, ICA, SVM | Combination of complementary advantages | Co-registration, need for contemporary maps, curse of dimensionality | Land cover estimation, forestry |
| Statistical classification algorithms | Detrending, Gaussian models, skewness balancing | Unsupervised, works on original point clouds | Fails if model boundaries are invalid | Object and ground point separation |