Добавил:
Опубликованный материал нарушает ваши авторские права? Сообщите нам.
Вуз: Предмет: Файл:
[2.1] 3D Imaging, Analysis and Applications-Springer-Verlag London (2012).pdf
Скачиваний:
12
Добавлен:
11.12.2021
Размер:
12.61 Mб
Скачать

Chapter 7

3D Shape Matching for Retrieval and Recognition

Benjamin Bustos and Ivan Sipiran

Abstract Nowadays, multimedia information, such as images and videos, are present in many aspects of our lives. Three-dimensional information is also becoming important in different applications, for instance entertainment, medicine, security, art, just to name a few. Therefore, it is necessary to study how to process 3D information in order to take advantage of the properties that it provides. This chapter gives an overview of 3D shape matching and its applications to 3D shape retrieval and 3D shape recognition. In order to present the subject, we describe in detail four approaches with a good balance of maturity and novelty across the different methods. The selected approaches are: the depth buffer descriptor, spin images, salient spectral geometric features and heat kernel signatures.

7.1 Introduction

The ability to store and manipulate large amounts of information has enabled the emergence of a number of applications. Generally, the information is given as text because it is easy to produce and understand by a computer. However, there are situations where information cannot be suitably represented by text, such as a tourist’s photo album.

Thus, the use and availability of multimedia information has increased considerably, on the one hand to support new applications and on the other hand due to the proliferation and accessibility of inexpensive capture devices such as digital cameras. A number of applications have benefited from this availability of images, videos and three-dimensional models, such as security, entertainment, and engineering processes. Unlike text, multimedia information is difficult to be compared directly and therefore we require techniques to manipulate it effectively.

This work has been supported by Fondecyt (Chile) Project 1110111.

B. Bustos ( ) · I. Sipiran

Department of Computer Science, University of Chile, Santiago, Chile e-mail: bebustos@dcc.uchile.cl

I. Sipiran

e-mail: isipiran@dcc.uchile.cl

N. Pears et al. (eds.), 3D Imaging, Analysis and Applications,

265

DOI 10.1007/978-1-4471-4063-4_7, © Springer-Verlag London 2012

 

266

B. Bustos and I. Sipiran

Our goal in this chapter is to address the problem of comparing two 3D objects based on their shapes. Shape matching refers to the process of finding a correspondence between two shapes based on some similarity criterion. The criterion which has received most attention by the research community is the visual similarity, that is to say, two shapes should be matched if they share visually common features. In this chapter, we are interested in two processes involving 3D shapes matching, namely retrieval and recognition.

Although these processes are related, they are slightly different. On the one hand, shape recognition consists of identifying a given 3D object within a set of possible alternatives. The outcome of this process is generally the class to which the object belongs or the recognized object as stored by the recognition system. The latter is useful when the queried object represents a partial or occluded portion such as a range scan. On the other hand, shape retrieval consists of defining a measure in order to quantify the similarity between shapes. A typical outcome of this process is an ordered list of objects according to their associated measure.

In recent years, we have witnessed increasing interest in the computer graphics and computer vision communities for 3D shape retrieval and recognition. As a result, a number of techniques and approaches have been proposed for shape representation, object and feature description, feature selection, and matching algorithms. It is also important to note the large number of applications that have emerged, such as:

Geometric 3D comparison for the shoe industry [79].

3D hippocampi retrieval [61].

Classification of pollen in 3D volume datasets [92].

3D retrieval for museums [49].

Human ear recognition in 3D [31].

3D object classification for craniofacial research [9].

3D protein retrieval and classification [87, 88, 112].

CAD/CAM [113].

Archeology [54].

3D video sequences [53].

To improve performance in each of these applications, it is important to evaluate and compare new approaches, as they appear, with existing ones. Due to the large amount of research carried out and the large dissemination of 3D information obtained through increasingly better capture devices, several benchmarks have been presented, such as the Princeton Shape Benchmark [96], the Konstanz 3D database [26], the TOSCA dataset [18], and the Shape Retrieval Contest (SHREC) datasets (see Fig. 7.1 with objects from different datasets). Such is the interest in 3D shape retrieval that, for several years now, there have been annual competitions with more tracks and more competitors year after year.

7 3D Shape Matching for Retrieval and Recognition

267

Fig. 7.1 Datasets provide shapes with several characteristics such as size, level of detail and shape classes. This figure shows some examples from

(a) TOSCA dataset, (b) Konstanz database, (c) SHREC dataset, and (d) Princeton shape benchmark

7.1.1 Retrieval and Recognition Evaluation

A very important consideration is the methodology of evaluation for shape retrieval and recognition techniques. In this sense, the evaluation relies on the solid background of mature fields, such as pattern recognition and information retrieval. In the case of shape retrieval, the common evaluation methods are Precision-Recall curves, cumulated gain-based measurements, nearest neighbor, first-tier and secondtier [10]. For shape recognition, Receiver Operating Characteristic (ROC) curves and error rates have commonly been used [38]. In order to keep this chapter relatively self-contained, we briefly describe the evaluation measures commonly used in the literature.

In retrieval, the evaluation measures mostly consider the ranked lists retrieved when query objects are presented to the system. The main measures are as follows:

Precision: the extent to which the retrieved objects are relevant to a query. Formally, it is the ratio of correctly retrieved objects with respect to the total number of retrieved objects.

Recall: the extent to which the relevant objects are correctly retrieved. Formally, it is the ratio of correctly retrieved objects with respect to the total number of relevant objects for the query.

R-Precision: precision when the number of retrieved objects equals the number of relevant objects.

Cumulated gain-based measurements: the idea is to use the ranked position of relevant objects. First, we need to convert the retrieved list into a binary list G with 1’s in position where relevant objects appear. Second, the discounted cumulated

268

B. Bustos and I. Sipiran

gain vector DCG is defined as follows:

 

 

G[1]

if i = 1

 

DCG[i] = DCG[i 1] + G[i]/ log2 i

otherwise

(7.1)

where the log function penalizes the ranking of a relevant object.

Nearest neighbor, first tier, and second tier: these evaluation measures check the ratio of relevant objects in the top K objects of the ranked list. Specifically, K = 1 for nearest neighbor, K = |R| − 1 for first tier and K = 2 × (|R| − 1) for second tier, where |R| is the number of relevant objects to a query.

In recognition, the common measures are related to error rates after recognition. Some interesting terms are defined as follows:

True positive (TP): False positive (FP): False negative (FN):

ratio of similar shapes identified as similar. ratio of dissimilar shapes identified as similar. ratio of similar shapes identified as dissimilar.

Using the aforementioned terms, the measures commonly used in recognition are:

Equal error rate (EER): the value of false positive at which it equals the false negative rate. Similarly, a typical measure is the value of false positive at some percentage of false negatives.

ROC curves: is a graphical plot representing the trade-off between the true positives and the false positives.

7.1.2 Chapter Outline

The organization of this chapter is as follows. Section 7.2 provides a brief overview of the proposed approaches and main ideas to date. Section 7.3 presents in detail a selection of techniques for shape retrieval and recognition. The organization of this section is as follows:

Section 7.3.1 describes a pioneering image-based technique in 3D shape retrieval known as the depth-buffer descriptor.

Section 7.3.2 presents spin-images, an effective descriptor for 3D shape recognition.

Section 7.3.3 presents an interesting approach for shape retrieval which uses the Laplace-Beltrami operator to define local descriptors for matching.

Finally, Sect. 7.3.4 addresses an approach that combines heat kernel signatures with the bag-of-features technique for 3D shape matching.

Section 7.4 is devoted to pose the main challenges in these fields and convenient directions for further research. Section 7.5 presents our concluding remarks. Section 7.6 suggests further readings and finally, in Sects. 7.7 and 7.8 we propose a selection of questions and exercises related to the presented material.