

Then, when we wish to classify a 3D face scan (a probe), it must be projected into the new space. Therefore, we combine the two projections by multiplying them together, and each probe feature vector is mapped directly into the smaller subspace (of maximum dimension $K - 1$) as:

$$\tilde{x}_p = W^T V_k^T (x_p - \bar{x}). \qquad (8.38)$$
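Since the two projections compose into a single matrix, the mapping of Eq. (8.38) reduces to one matrix–vector product at test time. A minimal NumPy sketch, assuming the PCA basis `V_k` (D × k), the LDA basis `W` (k × (K−1)) and the training mean `x_bar` have already been computed (the names are illustrative):

```python
import numpy as np

def combine_projections(V_k, W):
    # Compose the PCA and LDA bases offline into a single
    # (K-1) x D projection matrix.
    return W.T @ V_k.T

def project_probe(P, x_p, x_bar):
    # Eq. (8.38): map a raw probe vector into the (at most K-1)-dim subspace.
    return P @ (x_p - x_bar)
```

Matching then reduces to nearest-neighbor search among gallery vectors projected with the same combined matrix.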

Although this approach can often give better results than PCA alone when there is enough training data within each class, a criticism of this two-stage approach is that the initial PCA stage could still discard dimensions that carry useful discriminative information, as discussed earlier. Therefore, more recent approaches to applying LDA to high-dimensional data have tried to avoid this; such techniques go under various names, such as direct LDA. It is worth noting, however, that for some of these approaches there have been differing viewpoints in the literature (e.g. in [94] and [36]) and we encourage the reader to investigate direct approaches after becoming comfortable with this more established two-stage approach.

8.8.4 LDA Performance

The work of Heseltine et al. [43] shows that LDA can give significantly better performance than PCA when multiple scans of the same subject are available in the training data, although this work pre-dates the widespread use of the benchmark FRGC 3D face dataset. As with PCA, the most computationally expensive process is usually pose normalization. Again, projection into a subspace is a fast operation (linear in the dimension of the feature vector) and, in a nearest neighbor matching scheme, matching time is linear in the size of the gallery.

8.9 Normals and Curvature in 3D Face Recognition

When PCA, LDA and other techniques are applied to 3D face recognition problems, surface features are often extracted. The simplest of these are related to the differential properties of the surface, namely the surface normals and the surface curvature. In normal maps, each pixel value is represented by the surface normal. Gökberk et al. [39] used the normal vectors in Cartesian form (nx, ny, nz) and concatenated them to perform PCA-based 3D face recognition. Note that this form is redundant; a more compact representation uses the spherical coordinates (θ, φ), which are the elevation and azimuth angles respectively. Normals can be computed using the cross product on the mesh data, as described in Chap. 4, or we can fit a planar surface using orthogonal least squares to the spherical neighborhood of a 3D point or range pixel. This is implemented via SVD and the eigenvector with the smallest eigenvalue is the surface normal. Figure 8.9 shows sample images of normal maps of a 3D face.
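A minimal sketch of this plane-fitting normal estimate, assuming `points` is an N × 3 array holding the spherical neighborhood of one 3D point; the spherical conversion follows the (θ, φ) elevation/azimuth convention above:

```python
import numpy as np

def estimate_normal(points):
    # Zero-mean the neighborhood and take the SVD: the right singular
    # vector with the smallest singular value is the direction of least
    # variance, i.e. the estimated surface normal (up to sign).
    centered = points - points.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[-1]

def to_spherical(n):
    # Compact two-angle form: elevation theta and azimuth phi.
    nx, ny, nz = n
    theta = np.arcsin(np.clip(nz, -1.0, 1.0))
    phi = np.arctan2(ny, nx)
    return theta, phi
```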


Fig. 8.9 Top row: a range image (Z values) and its normal maps of elevation φ and azimuth θ angles. The three (Z, φ, θ) values rendered as an RGB image. Bottom row: normal maps of x, y and z normal components. The three (X, Y, Z) values rendered as an RGB image

Surface normals can capture minor variations in the facial surface; however, being first-order derivatives, they are more sensitive to noise than depth maps. Often, to overcome this problem, the 3D face is smoothed before computing the normals, or the normals are computed over a larger neighborhood. In either case, the ability of surface normals to capture subtle features is somewhat attenuated.
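As a sketch of the smoothing variant just mentioned, applied to a range image Z (the `sigma` value is an illustrative assumption; larger values suppress more noise but attenuate more detail):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def range_image_normals(Z, sigma=1.5):
    # Smooth first: normals are first-order derivatives and amplify noise.
    Zs = gaussian_filter(Z, sigma=sigma)
    dzdy, dzdx = np.gradient(Zs)          # per-pixel finite differences
    # For a surface z = Z(x, y), an (unnormalized) normal is (-Zx, -Zy, 1).
    n = np.dstack((-dzdx, -dzdy, np.ones_like(Zs)))
    return n / np.linalg.norm(n, axis=2, keepdims=True)
```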

The surface normals of a shape can be represented by points on a unit sphere, often called the Gaussian sphere. By associating a weight with each normal, based on the area of surface sharing that normal, an extended Gaussian image (EGI) is formed [47]. The EGI cannot differentiate between similar objects at different scales, which is not a major problem in 3D face recognition. Another limitation, which can impact face recognition, is that the EGI is unique only for convex objects; many non-convex objects can have the same EGI. To work around this limitation, Lee and Milios [55] represent only the convex regions of the face by EGIs and use a graph matching algorithm for face recognition.
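A minimal sketch of an EGI-style orientation histogram (the binning scheme and bin counts are illustrative assumptions, not the exact formulation of [47]):

```python
import numpy as np

def extended_gaussian_image(normals, areas, n_theta=16, n_phi=32):
    # normals: (N, 3) unit facet normals; areas: (N,) facet areas.
    # Returns an (n_theta, n_phi) area-weighted histogram on the sphere.
    theta = np.arccos(np.clip(normals[:, 2], -1.0, 1.0))    # polar angle
    phi = np.arctan2(normals[:, 1], normals[:, 0])          # azimuth
    ti = np.minimum((theta / np.pi * n_theta).astype(int), n_theta - 1)
    pj = np.minimum(((phi + np.pi) / (2 * np.pi) * n_phi).astype(int),
                    n_phi - 1)
    egi = np.zeros((n_theta, n_phi))
    np.add.at(egi, (ti, pj), areas)   # accumulate surface area per bin
    return egi
```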

Curvature-based measures, which are related to second-order derivatives of the raw depth measurements, have also been used to extract features from 3D face images, and these measures are pose invariant. Several representations are prominent in this context, most of which are based on the principal curvatures of a point on a three dimensional surface. To understand principal curvatures, imagine the normal on a surface and an infinite set of planes (a pencil of planes), each of which contains this normal. Each of these planes intersects the surface in a plane curve, and the principal curvatures are defined as the maximum curvature, κ1, and minimum curvature, κ2, of this infinite set of plane curves. The directions that correspond to maximum and minimum curvatures are always perpendicular and are called the principal directions of the surface. Principal curvatures are in fact the eigenvalues of the Weingarten matrix, which is a 2 × 2 matrix containing the parameters of a quadratic local surface patch, fitted in a local frame that is aligned to the surface tangent plane.


Fig. 8.10 Maximum (left) and minimum (right) principal curvature images of a 3D face

Figure 8.10 shows images of the maximum and minimum curvatures of a 3D face. Tanaka et al. [85] constructed a variant of the EGI by mapping the principal curvatures and their directions onto two unit spheres, representing ridges and valleys respectively. Similarity between faces was calculated by Fisher's spherical correlation [33] of these EGIs.
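A small sketch of this eigenvalue relationship, assuming the patch parameters A, B, C (defined in Sect. 8.9.1 below) have already been fitted in the tangent-plane frame:

```python
import numpy as np

def principal_curvatures(A, B, C):
    # Weingarten matrix of the patch z = (A/2)x^2 + Bxy + (C/2)y^2,
    # evaluated at the origin of the tangent-plane frame.
    Wg = np.array([[A, B],
                   [B, C]])
    evals, evecs = np.linalg.eigh(Wg)        # eigenvalues in ascending order
    k2, k1 = evals                           # minimum and maximum curvature
    return k1, k2, evecs[:, 1], evecs[:, 0]  # plus the principal directions
```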

Gaussian curvature, K, is defined as the product of these principal curvatures, while mean curvature, H, is defined as the average, i.e.

$$K = \kappa_1 \kappa_2, \qquad H = \frac{\kappa_1 + \kappa_2}{2}. \qquad (8.39)$$

Both of these are invariant to rigid transformations (and hence pose), but only Gaussian curvature is invariant to the surface bending that may occur during changes of facial expression. Lee and Shim [56] approximated 3 × 3 windows of the range image by quadratic patches and calculated the minimum, maximum and Gaussian curvatures. Using thresholds, edge maps were extracted from these curvatures and a depth-weighted Hausdorff distance was used to calculate the similarity between faces. Using depth values as weights in fact combines the range image with the curvatures, giving it more discriminating power. The advantages of combining depth with curvature for face recognition have been known since the early 1990s [40].
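A sketch of these two quantities, together with thresholded edge maps in the spirit of Lee and Shim [56] (the threshold value is an illustrative assumption):

```python
import numpy as np

def curvature_maps(k1, k2, edge_thresh=0.1):
    # k1, k2: per-pixel maps of maximum and minimum principal curvature.
    K = k1 * k2             # Gaussian curvature, Eq. (8.39): bending invariant
    H = (k1 + k2) / 2.0     # mean curvature: rigid-transform invariant only
    edges = np.abs(K) > edge_thresh   # simple curvature edge map
    return K, H, edges
```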

The shape index was proposed by Koenderink and van Doorn [54] as a surface shape descriptor. It is based on both principal curvatures and is derived as:

$$s = \frac{2}{\pi} \arctan\left(\frac{\kappa_2 + \kappa_1}{\kappa_2 - \kappa_1}\right), \qquad (-1 \le s \le +1). \qquad (8.40)$$

It can be thought of as a polar description of shape in the (κ1, κ2) plane, where different values distinguish between caps, cups, ridges, valleys and saddle points. Since a ratio of curvatures is used in Eq. (8.40), the size of the curvature is factored out and hence the descriptor is scale invariant. Koenderink and van Doorn combine the principal curvatures in a different measure to capture the magnitude of the curvature, which they called curvedness, c, where $c = \sqrt{(\kappa_1^2 + \kappa_2^2)/2}$. Since principal curvatures are pose invariant, the shape index is also pose invariant. Lu et al. [59] used the shape index to find a rough estimate of registration, which was then refined with a variant of the ICP algorithm [9]. In their earlier work, they also used the shape index map of the registered faces for recognition, along with the texture and the Cartesian coordinates of the 3D face. This is an example where curvatures are combined with the point cloud, instead of the range image, for face recognition.


Fig. 8.11 Facial curves mapped on the range image of a face [78]. These are the intersection of the 3D face with planes orthogonal to the camera, at different depths

Thus we can conclude that curvatures offer viewpoint-invariant and localized features that are useful for face alignment and matching. Moreover, face recognition performance generally improves when curvature-based features are combined with the range image or point cloud.
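A direct transcription of Eq. (8.40) and the curvedness measure (the small `eps` guard at umbilic points, where κ1 = κ2 and the shape index is undefined, is an implementation assumption):

```python
import numpy as np

def shape_index(k1, k2, eps=1e-12):
    # Eq. (8.40): scale-invariant descriptor in (-1, +1), with k1 >= k2.
    # eps keeps the denominator nonzero (and negative) at umbilic points.
    return (2.0 / np.pi) * np.arctan((k2 + k1) / (k2 - k1 - eps))

def curvedness(k1, k2):
    # Koenderink and van Doorn's curvature magnitude measure.
    return np.sqrt((k1**2 + k2**2) / 2.0)
```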

Samir et al. [78] represented a 3D face with continuous facial curves, which were extracted using a depth function. We can think of these curves as the intersections of the 3D face with planes orthogonal to the camera at different depths. Face recognition was performed by matching corresponding facial curves using geodesic distance criteria [53]. Although the facial curves change with the differing surface curvature of different identities, they are not completely invariant to pose [78]. Figure 8.11 shows a sample 3D face with facial curves.
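Such iso-depth curves can be extracted from a range image as level sets of the depth function. A minimal sketch using scikit-image's contour finder (the choice of depth levels is an illustrative assumption):

```python
import numpy as np
from skimage import measure

def facial_curves(Z, depths):
    # Each curve is the intersection of the facial surface with a plane
    # orthogonal to the camera's viewing direction at depth d.
    curves = []
    for d in depths:
        curves.extend(measure.find_contours(Z, level=d))  # (row, col) paths
    return curves
```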

8.9.1 Computing Curvature on a 3D Face Scan

Here we present a standard technique for computing curvatures on a 3D face scan. We assume that we start with a preprocessed mesh, which has spikes filtered and holes filled. Then, for each surface point, we implement the following procedure (drawn together in the sketch after the list):

1. Find the neighbors within a local neighborhood; the neighbor set includes the point itself. The size of this neighborhood is a tradeoff that depends on the scan resolution and the noise level. We can use connectivity information in the mesh, or the structure in a range image, to compute neighborhoods quickly. Otherwise, some form of data structuring of a point cloud is required to do a fast cuboidal region search, usually refined to a spherical region. Typically k-d trees and octrees are employed, and standard implementations of these can be found online.

2. Zero-mean the neighbors and either use an eigendecomposition or SVD to fit a plane to those neighbors. The eigenvector with the smallest eigenvalue is the estimated surface normal. The other two eigenvectors lie in the estimated tangent plane and can be used as a local basis.

3. Project all neighbors into this local basis. (This is the same procedure as was outlined for a full face scan in Sect. 8.7.1.)

4. Recenter the data on the surface point.

5. Using least-squares fitting, fit a local quadratic surface patch, $z = \frac{A}{2}x^2 + Bxy + \frac{C}{2}y^2$, where $[x, y, z]^T$ are the neighboring points expressed in the recentered local basis and $[A, B, C]^T$ are the surface parameters to be found by least-squares.
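A sketch that draws the five steps together for a single surface point, with the natural final step of reading the principal curvatures off the fitted patch as the eigenvalues of the Weingarten matrix. The neighborhood `radius` is an illustrative assumption; in practice the k-d tree would be built once and reused for every point:

```python
import numpy as np
from scipy.spatial import cKDTree

def point_curvatures(points, tree, index, radius=10.0):
    # Step 1: spherical neighborhood via a k-d tree region query.
    nbrs = points[tree.query_ball_point(points[index], r=radius)]

    # Step 2: zero-mean the neighbors and fit a plane by SVD; the singular
    # vector with the smallest singular value is the estimated normal and
    # the other two span the tangent plane, giving a local basis.
    _, _, vt = np.linalg.svd(nbrs - nbrs.mean(axis=0), full_matrices=False)

    # Steps 3-4: express the neighbors in the local basis, recentered on
    # the surface point itself (projection is linear, so the order is safe).
    local = (nbrs - points[index]) @ vt.T
    x, y, z = local[:, 0], local[:, 1], local[:, 2]

    # Step 5: least-squares fit of z = (A/2)x^2 + Bxy + (C/2)y^2.
    M = np.column_stack((0.5 * x**2, x * y, 0.5 * y**2))
    (A, B, C), *_ = np.linalg.lstsq(M, z, rcond=None)

    # Principal curvatures: eigenvalues of the Weingarten matrix.
    k2, k1 = np.linalg.eigvalsh(np.array([[A, B], [B, C]]))
    return k1, k2

# Example usage on a preprocessed scan (points: N x 3 array, e.g. in mm):
# tree = cKDTree(points)
# k1, k2 = point_curvatures(points, tree, index=0)
```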