[2.1] 3D Imaging, Analysis and Applications-Springer-Verlag London (2012).pdf

7 3D Shape Matching for Retrieval and Recognition


Table 7.7 Performance using space-sensitive bag of features with vocabulary size 48. Table reproduced from Ovsjanikov et al. [83]

Transformation       EER       FPR @ FNR = 1 %    FPR @ FNR = 0.1 %
Null                 0.58 %    0.33 %             1.98 %
Isometry             1.00 %    1.07 %             6.16 %
Topology             1.12 %    1.67 %             4.77 %
Isometry+Topology    1.41 %    2.14 %             6.80 %
Triangulation        2.11 %    3.43 %             8.57 %
Partiality           3.70 %    6.19 %             8.52 %
All                  1.44 %    1.79 %             11.09 %

On the other hand, using the space-sensitive approach, the EER was 0.58 % for null shapes, 1.00 % for isometry, 1.12 % for topology, 2.11 % for triangulation, and 3.70 % for partiality. A recent track of the Shape Retrieval Contest (SHREC 2010) evaluated large-scale retrieval [23], comparing the presented method and its variations with other state-of-the-art techniques. In that report, heat kernel signatures obtained the best results, with a mean average precision close to 100 % for almost all transformations except partiality.
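The error measures reported in Table 7.7 can be illustrated with a minimal sketch. It assumes that a pair of shapes is accepted as a match when its descriptor distance falls below a threshold, and that the distances of true matching pairs and of non-matching pairs are available as two arrays. The function names and the toy distances are hypothetical and are not taken from Ovsjanikov et al. [83]; the threshold scan is one simple way of approximating the EER, not necessarily the procedure used to produce the table.

```python
import numpy as np

def error_rates(pos_dist, neg_dist, threshold):
    """Return (FPR, FNR) when pairs with distance <= threshold are accepted."""
    pos_dist = np.asarray(pos_dist, dtype=float)
    neg_dist = np.asarray(neg_dist, dtype=float)
    fnr = np.mean(pos_dist > threshold)   # matching pairs that are rejected
    fpr = np.mean(neg_dist <= threshold)  # non-matching pairs that are accepted
    return fpr, fnr

def equal_error_rate(pos_dist, neg_dist):
    """Approximate the EER by trying every observed distance as a threshold."""
    thresholds = np.unique(np.concatenate([pos_dist, neg_dist]))
    rates = [error_rates(pos_dist, neg_dist, t) for t in thresholds]
    # Pick the threshold where FPR and FNR are closest and average the two.
    fpr, fnr = min(rates, key=lambda r: abs(r[0] - r[1]))
    return 0.5 * (fpr + fnr)

def fpr_at_fnr(pos_dist, neg_dist, target_fnr):
    """Lowest FPR among thresholds whose FNR does not exceed target_fnr."""
    thresholds = np.unique(np.concatenate([pos_dist, neg_dist]))
    rates = [error_rates(pos_dist, neg_dist, t) for t in thresholds]
    # The largest threshold always gives FNR = 0, so this list is never empty.
    return min(fpr for fpr, fnr in rates if fnr <= target_fnr)

# Toy distances (hypothetical): small values for true matches, larger otherwise.
matching = [0.1, 0.2, 0.3, 0.4]
non_matching = [0.35, 0.5, 0.6, 0.7]
print("EER:", equal_error_rate(matching, non_matching))
print("FPR at FNR <= 1 %:", fpr_at_fnr(matching, non_matching, 0.01))
```

With real data the two arrays would hold distances between bag-of-features representations, and a lower value in any column of the table indicates better discrimination.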

7.3.4.2 Complexity Analysis

Let S be a 3D object with n vertices. The computational complexity of each stage of the above method is:

Computation of the Laplacian matrix: O(n^2).

Computation of the eigenvalues and eigenvectors: O(n^3).

Computation of the HKS: O(nm), where m is the dimension of each HKS.

K-means clustering: O(IKNm), where I is the number of iterations until convergence, K is the number of clusters to be found, N is the number of descriptors of the entire collection, and m is the descriptor dimension.

Bag of features: O(NKm).

The total complexity of this method is dominated by the clustering process: the number of descriptors N can be extremely large, so the k-means clustering is expensive. Therefore, the total computational complexity of this method is O(IKNm).
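The stages listed above can be sketched in a few lines of NumPy to show where each cost comes from. This is an illustrative sketch under stated assumptions, not the implementation evaluated in this chapter: a dense combinatorial graph Laplacian stands in for the mesh Laplace-Beltrami discretization, all eigenpairs are used, k-means is plain Lloyd iteration with a fixed number of iterations, and the bag of features uses hard nearest-word assignment. All function names are hypothetical.

```python
import numpy as np

def graph_laplacian(n_vertices, edges):
    """Dense Laplacian L = D - A (assumed stand-in for the mesh Laplacian)."""
    L = np.zeros((n_vertices, n_vertices))  # O(n^2) storage
    for i, j in edges:
        L[i, j] -= 1.0
        L[j, i] -= 1.0
        L[i, i] += 1.0
        L[j, j] += 1.0
    return L

def heat_kernel_signatures(L, times):
    """HKS per vertex: h_t(x) = sum_i exp(-lambda_i * t) * phi_i(x)^2."""
    eigvals, eigvecs = np.linalg.eigh(L)       # dense eigendecomposition, O(n^3)
    decay = np.exp(-np.outer(eigvals, times))  # shape (number of eigenpairs, m)
    # One m-dimensional signature per vertex; O(nm) for a fixed number of eigenpairs.
    return (eigvecs ** 2) @ decay

def nearest_center(descriptors, centers):
    """Index of the closest center for each descriptor: N*K distances of cost O(m)."""
    diffs = descriptors[:, None, :] - centers[None, :, :]
    return np.linalg.norm(diffs, axis=2).argmin(axis=1)

def kmeans(descriptors, n_clusters, n_iters=20, seed=0):
    """Lloyd's algorithm; n_iters plays the role of I, so the cost is O(IKNm)."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(descriptors), n_clusters, replace=False)
    centers = descriptors[start].copy()
    for _ in range(n_iters):
        labels = nearest_center(descriptors, centers)
        for k in range(n_clusters):
            members = descriptors[labels == k]
            if len(members) > 0:               # keep the old center if a cluster is empty
                centers[k] = members.mean(axis=0)
    return centers

def bag_of_features(descriptors, centers):
    """Normalized histogram of nearest visual words (hard assignment), O(NKm)."""
    labels = nearest_center(descriptors, centers)
    hist = np.bincount(labels, minlength=len(centers)).astype(float)
    return hist / hist.sum()

# Toy example: a path graph with 10 vertices instead of a real mesh.
edges = [(i, i + 1) for i in range(9)]
hks = heat_kernel_signatures(graph_laplacian(10, edges), np.array([0.1, 1.0, 10.0]))
vocabulary = kmeans(hks, n_clusters=3)
print(bag_of_features(hks, vocabulary))
```

In a collection-wide setting, `kmeans` would be run once over the N descriptors of all objects, which is why the clustering term dominates the total cost.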

7.4 Research Challenges

A review of the literature on shape retrieval and recognition, as briefly surveyed in Sect. 7.2, shows that this is a relatively young field and therefore presents a number of areas which require further work and progress. This section presents trends in future research and the challenges that concern the community.

B. Bustos and I. Sipiran

Query specification. Research commonly focuses on testing the effectiveness and efficiency of the proposed methods; however, an important factor is left out: the users. As a result, little work has been done on query specification. It is generally assumed that a query object is available in the representation required by the application. Nevertheless, since we are usually interested in retrieving objects similar to the query, a natural question arises: if we already have an object (the query) that is visually similar to what we need, why search at all? A more interesting approach is to provide the query as images, video, sketches, or text. However, this will often require human interaction to periodically feed advisory information back to the search. For example, in content-based image retrieval, much research has turned to sketches as a more natural way of querying for an image. This trend has raised new challenges and research interests, which are also expected to emerge in the shape retrieval and recognition community.

Efficiency and large scale retrieval. Although a certain level of effectiveness has recently been achieved in both shape retrieval and recognition, important efficiency issues still require attention, even though approaches such as local features and the Laplace-Beltrami operator have begun to be used extensively. Most techniques report results on publicly available datasets of no more than 2000 objects, and efficiency results are often not provided at all. Moreover, Laplace-Beltrami based approaches rely on extensive computations of eigenvalues and eigenvectors of huge matrices, so it is often necessary to simplify the meshes before processing, at the expense of losing detail. In this sense, efficient variants and alternatives are expected to be studied.

Object representation. As can be noted from previous sections of this chapter, many approaches rely on a boundary representation for 3D shapes. Perhaps this representation is preferred to others because of its simplicity and suitability for rendering tasks. In addition, triangle meshes are widely available for processing, and the vast majority of 3D objects on the Internet are found in this form. Nevertheless, some potential applications use different representations, such as parametric surfaces in CAD and volumetric information in medicine. Each representation has intrinsic advantages which should be considered in order to exploit the information as it is.

Partial matching. A lot of work has been done on 3D objects when the required matching model is global visual similarity. By global, we mean that, given a 3D object, an algorithm retrieves those objects in the database that look visually similar, and the whole shape structure is used for comparison. However, many of the presented methods do not allow partial matching because of the restrictive global model that they assume. So, given a part of a shape as a query, an interesting problem is to retrieve those objects in the database that contain parts visually similar to that query. Difficulties can arise from the need to represent a model in a compact way, for instance, with local information whose extent is unknown a priori. In addition, the matching becomes an expensive task because of the exponential number of possible memberships of the query. Moreover, an even harder problem is to quantify similarity and partiality, since the similarity strongly depends on the level of partiality allowed while searching.

Domain applications. With the increasing interest of the computer vision community in 3D shape retrieval and recognition, a current trend is to investigate the support that these can give to high-level vision tasks. Computer vision aims at recognizing and understanding the real composition of a scene viewed through a camera, where the scene is part of a three-dimensional world. In the future, we could consider combining shape retrieval and recognition with 3D reconstructions of scenes from images, as an attempt to bridge the semantic gap between a three-dimensional scene and the image which represents it. In the same way, the field of medicine could benefit from automated systems for the analysis of 3D image data such as magnetic resonance images (MRI) and computed tomography scans. It is easy to obtain three-dimensional representations from this kind of information, and further processing can be beneficial. Another interesting application is modeling support, as required in videogames, 3D films and special effects, all of which require a large amount of modeling work. These applications could benefit from shape retrieval and recognition to reduce the time spent modeling.

Automatic 3D object annotation. In order to increase effectiveness, we may require more semantic information to complement the geometric information extracted from the shape. Information about composition is a good choice, so it is necessary to maintain textual information which represents rich semantic information to be used in retrieval tasks. Nevertheless, having humans attach tags to shapes is an expensive task, given the number of objects in a database. By using shape retrieval and recognition, we can instead assign textual tags based on visual similarity or functionality. The semantic information added in this way can, in turn, be used to improve the effectiveness of visual search.

7.5 Concluding Remarks

This chapter introduces the 3D shape matching process from the point of view of representative approaches in the field, potential applications and the main challenges which need to be addressed. The wide variety of available 3D data allows us to choose between different characteristics such as level of detail, shape classes, and so forth. In addition to the standard datasets, many shape recognition applications use custom-acquired data in order to test their proposals with respect to domain-oriented information. However, it is important to use datasets widely employed and accepted by the community in order to obtain consistent research results and valuable performance comparisons.

Just as the amount of available 3D data has grown considerably in recent times, there is also an increasing interest among researchers in proposing new approaches for shape matching and studying the potential applications in several fields. We have witnessed the benefits that fields such as medicine, engineering, art, entertainment, and security have gained from the development of shape retrieval and recognition techniques. What is more, the interest in computer vision applications based on shape matching is becoming increasingly evident. It is easy to see the great potential of 3D information and how it can be used to complement 2D image and video processing in order to improve the effectiveness of high-level vision tasks. We believe that 3D information will be used commonly in the future and that processes such as retrieval and recognition will be the basis for cutting-edge applications.

Likewise, it is beneficial to have a large catalog of techniques, because we can select an appropriate technique depending on the application context. Often we can combine techniques to improve the performance in general. In this chapter, we selected four techniques, which were explained in detail in Sect. 7.3. The depth buffer descriptor is an effective method based on extracting information from projections. Interestingly, it is one of the most effective methods while remaining simple, although it only supports a global similarity model. One way of supporting a certain level of partial similarity is to use local features extracted from shapes. Although the amount of information to be extracted increases, this is the cost to be paid for supporting non-global similarity models.

The other three presented techniques assume a non-global similarity model by extracting local descriptors which can be used to do the matching. The first of these is the spin image approach, which has proven to be effective in 3D shape recognition. Its versatility for describing shapes from different aspects has made it a standard technique for recognition tasks and new approaches often compare their results against results using spin images. Nevertheless, its dependency on uniform meshes and normals computation is restrictive. A small difference in calculating normals can produce different spin images, limiting its effectiveness.

Both the salient spectral geometric features and the heat kernel signature approaches make extensive use of a mathematical tool which has proven to be powerful for shape analysis, namely the Laplace-Beltrami operator. This operator has desirable properties which make it a valuable tool for shape matching, in addition to the high effectiveness achieved in shape retrieval. Nevertheless, a weak point of this tool is its high computational cost, which makes it an interesting challenge to be tackled in the future.

As can be noted, there is a lot of work to be done in proposing new approaches to improve the effectiveness and the efficiency of 3D shape matching, and in studying new paradigms, some of which we mention in Sect. 7.4. We are convinced that the future of this research field is promising and that the growth in scientific and technological productivity will continue, thanks to the enormous efforts of the research communities in various fields.

7.6 Further Reading

As expected, the increasing interest of research communities in shape retrieval and recognition has allowed a rapid advance, both in theory (new approaches) and applications. Obviously, due to space limitations, not all of the material could be addressed in this chapter, so this section is devoted to presenting additional material for interested readers.

Good starting points for introducing the reader further to the subject of shape retrieval and recognition are the surveys [26, 28, 55, 100]. Early evaluations of algorithms were also presented in the reports [15, 25, 27, 42]. For recent experimentation with state-of-the-art techniques, we recommend the reports of the SHREC contest [1]. For instance, recent SHREC tracks are: robust correspondence benchmark [22], robust large-scale shape retrieval benchmark [23], robust feature detection and description benchmark [17, 21], non-rigid 3D shape retrieval [73, 74], and generic 3D warehouse [105]. These reports represent a good reference for reviewing recent approaches and their performance evaluation.

More advanced and recent approaches have been presented, such as retrieval of 3D articulated objects [4], retrieval by similarity score fusion [5], discriminative spherical wavelet features [65], spherical parameterizations [66], compact hybrid shape descriptor [85], matching of point set surfaces [93], probability density-based shape descriptor [6], and spin images for shape retrieval [7, 8, 37], just to name a few. Another interesting approach is to refine the retrieval results using user information about how relevant the results were to a given query. This approach is commonly called relevance feedback, and it was applied to shape retrieval tasks by Leng and Qin [70], and Giorgi et al. [48].

As stated in Sect. 7.4, partial matching is a challenging and still open problem. Nevertheless, several approaches have been proposed to tackle it. Among the main ones are objects as metric spaces [20], priority-driven search [43], shape topics [76], Reeb pattern unfolding [102], partial matching for real textured 3D objects [58], regularized partial matching of rigid shapes [19], and matching of 3D shape subparts [78]. Additionally, a good reference for non-rigid shape retrieval is due to Bronstein et al. [24].

Machine learning techniques have also been applied to shape retrieval and recognition. Examples include the boosting approach [63], supervised learning of similarity measures [64], an unsupervised learning approach [80], learning of semantic categories [81], learning of 3D face models [101], face recognition by SVM classification [13], and the neurofuzzy approach [68]. These approaches require some background in pattern recognition and machine learning theory.

For readers interested in the Laplace-Beltrami operator and its applications in shape retrieval and recognition, we recommend the papers by Belkin et al. [11, 12], Bobenko [16], Chuang et al. [34], Ghaderpanah et al. [46], Levy [71], Rustamov [95], Wu et al. [109], and Xu [110, 111]. These papers have highly mathematical content, so they are recommended for a more advanced level of research.

On the other hand, in addition to the applications listed in Sect. 7.1, applications to face recognition can be found in the papers by Perakis et al. [89], Zhou et al. [116] and Giorgi et al. [47], and Wessel et al. [107] presented a benchmark for retrieval of architectural data.