A key issue in developing a shape-based retrieval
and analysis system is to find a computational representation of shape
(a shape descriptor) for which an index can be built and similarity
queries can be answered efficiently. We are studying a spectrum of shape
descriptors, ranging from ones that are simple to compute (but perhaps
not very discriminating) to ones that require expensive computations (but
provide sophisticated shape analysis):
On the low end of the spectrum, we have
been investigating shape distributions that represent the shape
of a 3D model as a probability distribution sampled from a shape function
measuring geometric properties of the model. The motivation for this approach
is that samples can be computed quickly and easily, and the shapes of the
resulting distributions are invariant to similarity transformations, noise,
tessellation, cracks, etc. (with a normalization step for scale invariance).
For example, one such shape distribution, which we call D2, represents
the global shape of an object by the probability distribution of Euclidean
distances between pairs of randomly selected points on its surface.
The figure below shows the D2 distributions for 5 tanks (gray curves) and
6 cars (black curves). Note how the classes are distinguishable.
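The D2 computation can be sketched as follows. This is a minimal illustration, not the project's actual implementation: it assumes a triangle-mesh input, and the function names and histogram parameters are ours.

```python
import numpy as np

def sample_surface_points(vertices, triangles, n_points, rng):
    """Sample points uniformly on a triangle mesh surface.

    Triangles are chosen with probability proportional to their area,
    then a point is drawn uniformly inside each chosen triangle using
    barycentric coordinates.
    """
    v0 = vertices[triangles[:, 0]]
    v1 = vertices[triangles[:, 1]]
    v2 = vertices[triangles[:, 2]]
    # Triangle areas from the cross product of two edge vectors.
    areas = 0.5 * np.linalg.norm(np.cross(v1 - v0, v2 - v0), axis=1)
    idx = rng.choice(len(triangles), size=n_points, p=areas / areas.sum())
    # Uniform barycentric sampling: fold (r1, r2) back into the triangle.
    r1, r2 = rng.random(n_points), rng.random(n_points)
    flip = r1 + r2 > 1.0
    r1[flip], r2[flip] = 1.0 - r1[flip], 1.0 - r2[flip]
    return v0[idx] + r1[:, None] * (v1[idx] - v0[idx]) + r2[:, None] * (v2[idx] - v0[idx])

def d2_distribution(vertices, triangles, n_pairs=10000, n_bins=64, rng=None):
    """D2 shape distribution: histogram of Euclidean distances between
    pairs of random surface points."""
    rng = rng or np.random.default_rng(0)
    a = sample_surface_points(vertices, triangles, n_pairs, rng)
    b = sample_surface_points(vertices, triangles, n_pairs, rng)
    d = np.linalg.norm(a - b, axis=1)
    # Normalize by the mean distance for scale invariance.
    hist, _ = np.histogram(d / d.mean(), bins=n_bins, range=(0.0, 3.0), density=True)
    return hist
```

Because the samples are drawn from the surface itself, the resulting histogram is unaffected by tessellation or small cracks, and the mean-distance normalization removes scale.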
Reflective Symmetry Descriptors:
Computing reflective symmetries of 2D
and 3D shapes is a classical problem in computer vision and computational
geometry. Most prior work has focused on finding the main axes of symmetry,
or determining that none exists. We have studied a new shape descriptor
that represents a measure of reflective symmetry for an arbitrary 3D voxel
model for all planes through the model’s center of mass (even if they are
not planes of symmetry). For example, the figure to the right shows a car,
cube and chair (top) and their corresponding reflective symmetry descriptors
(bottom). The descriptors are drawn by scaling unit vectors on the sphere
in proportion to the measure of reflective symmetry about the plane through
the center of mass and normal to the vector. The main benefits of this new
shape descriptor are that it is defined over a canonical parameterization
(the sphere), which makes comparison simple, and it describes global symmetry
properties of a 3D shape, which capture significant semantic features of many objects.
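The per-plane measure can be sketched for the special case of axis-aligned planes through the center of a voxel grid. The full descriptor evaluates every plane through the center of mass, which requires resampling the grid; this simplified version is only illustrative, and the function name is ours.

```python
import numpy as np

def reflective_symmetry_measure(voxels, axis):
    """Symmetry measure of a voxel grid about the axis-aligned plane
    through the grid center, perpendicular to `axis`.

    The measure is the norm of the projection of the voxel function onto
    the space of functions symmetric about the plane, normalized by the
    norm of the function itself, so a perfectly symmetric model scores 1
    and the score stays meaningful even for planes that are not true
    planes of symmetry.
    """
    f = voxels.astype(float)
    g = np.flip(f, axis=axis)   # reflection about the plane
    sym = 0.5 * (f + g)         # closest symmetric function to f
    return np.linalg.norm(sym) / np.linalg.norm(f)
```

Evaluating such a measure for a dense set of plane normals on the sphere yields the descriptor drawn in the figure.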
One of the main challenges in matching
and indexing shapes is accounting for arbitrary rotations. Previous methods
that require alignment into a common coordinate system (e.g., with principal
axes) are not robust. Other methods that rely upon rotationally
invariant shape descriptors are usually not very discriminating.
We propose a novel rotation invariant shape-descriptor based on spherical
harmonics. The main idea is to decompose a 3D model into a collection
of functions defined on concentric spheres and to use spherical harmonics
to discard orientation information (phase) for each one. This yields a
shape descriptor that is both orientation invariant and descriptive.
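The phase-discarding idea is easiest to see in a 2D analogue, where circular harmonics (a Fourier series) play the role of spherical harmonics: keeping only the magnitude of each frequency on each concentric ring removes orientation. This sketch is our simplification, not the actual spherical-harmonic implementation.

```python
import numpy as np

def rotation_invariant_signature(samples_per_ring):
    """Rotation-invariant signature for a shape described by samples on
    concentric rings (the 2D analogue of concentric spheres).

    Each ring is decomposed into circular harmonics with an FFT; keeping
    only the magnitude of each frequency discards the phase, and with it
    the orientation, just as the spherical-harmonic descriptor does per
    frequency on each sphere.
    """
    return np.abs(np.fft.rfft(samples_per_ring, axis=-1))

# A ring function and a rotated copy produce identical signatures.
n = 64
theta = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
f = np.cos(3 * theta) + 0.5 * np.sin(theta)                     # shape function on one ring
f_rot = np.cos(3 * (theta + 0.7)) + 0.5 * np.sin(theta + 0.7)   # same ring, rotated by 0.7 rad
sig, sig_rot = rotation_invariant_signature(f), rotation_invariant_signature(f_rot)
```

Because the magnitudes still separate the energy by frequency, the signature remains far more discriminating than a single rotation-invariant scalar.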
On the high end of the spectrum, we are
investigating skeletons as a descriptor of shape. The general idea is to
derive 1D skeletal curves from a 3D object such that each curve represents
a significant part of the object. These curves are then converted
to an attributed graph representation (a
skeletal graph), which
can be used for indexing, matching, segmentation, correspondence finding,
etc. As an example, the figure below shows a plane model, its medial axis
(as computed using the “Powercrust” algorithm), our voxelized centerline
representation, and the resulting skeletal graph. Note that the medial
axis comprises a multitude of noisy curves and sheets, while our centerline
representation nicely segments the plane into its salient parts.
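A skeletal graph can be represented minimally as an attributed adjacency structure. The class below is a hypothetical sketch of the representation only, not of the matching algorithm; the attribute choice (curve length) and the degree-sequence summary are our simplifications.

```python
from collections import defaultdict

class SkeletalGraph:
    """Attributed skeletal graph: nodes are junctions and endpoints of
    centerline curves; each edge carries attributes of the curve joining
    its endpoints (here, just the curve length)."""

    def __init__(self):
        self.adj = defaultdict(dict)

    def add_curve(self, a, b, length):
        # Undirected edge with a length attribute.
        self.adj[a][b] = length
        self.adj[b][a] = length

    def degree_sequence(self):
        # A crude pose-invariant summary usable for coarse indexing.
        return sorted(len(nbrs) for nbrs in self.adj.values())

# Example: a crude airplane-like skeleton (nose, hub, two wings, tail).
plane = SkeletalGraph()
plane.add_curve('nose', 'hub', 2.0)
plane.add_curve('hub', 'left_wing', 1.5)
plane.add_curve('hub', 'right_wing', 1.5)
plane.add_curve('hub', 'tail', 1.0)
```

Each node of such a graph corresponds to a salient part boundary, which is what makes the representation useful for segmentation and correspondence finding as well as matching.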
Another issue is how people should specify
shape-based queries. For instance, consider a person who wants to
build a 3D virtual world representing a city scene. She will need
cars, street lamps, stop signs, etc. What input should be given to
a search engine to retrieve objects of these types from the World Wide
Web? We have investigated several options:
The simplest query interface is to search
for 3D models based on textual keywords. To support this feature, we construct
for each 3D model a representative document containing the model filename,
the anchor and nearby text parsed from its referring Web page, and ASCII
labels parsed from inside the model file. For each document, stop words
are removed, words are stemmed to remove inflectional endings, and synonyms
are added using WordNet [Miller95]. We match documents to user-specified
keywords using the TF-IDF/Rocchio method [Rocchio71], a popular weighting
and classification scheme for text documents.
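The weighting and matching step can be sketched as plain TF-IDF with cosine ranking. This is a simplification of the TF-IDF/Rocchio scheme cited above, and the corpus below is hypothetical.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """TF-IDF weight vectors for a list of tokenized documents."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))                      # document frequency per term
    idf = {t: math.log(n / df[t]) for t in df}   # rarer terms weigh more
    return [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]

def cosine(u, v):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def rank(query, docs):
    """Rank documents against a query; the query is weighted with the
    same corpus statistics by treating it as one more document."""
    vecs = tfidf_vectors(docs + [query])
    qvec, dvecs = vecs[-1], vecs[:-1]
    scores = [cosine(qvec, v) for v in dvecs]
    return sorted(range(len(docs)), key=lambda i: -scores[i])

# Hypothetical per-model documents, already tokenized as described above.
docs = [["car", "sedan", "wheel"],
        ["lamp", "street", "light"],
        ["sign", "stop", "street"]]
```

The IDF term is what keeps ubiquitous words (after stop-word removal) from dominating the match scores.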
We also support shape-based queries.
For instance, the user can provide an existing 3D model and ask our search
engine to retrieve similar ones. Alternately, the user may search
for models with shapes like one returned in a previous search by clicking
on the ``Find Similar Shape'' link under its image on a results page (blue
text in the figure below).
Of course, shape similarity queries are
only possible when the user already has a representative 3D model.
In some cases, she will be able to find one by using a text search.
However, in other cases, she will have to create it from scratch (at least
to seed the search). To investigate this option, we have developed
a query interface in which the user creates a simple 3D model with Teddy
[Igarashi99], and then the system returns objects with similar shapes.
For example, the figure below shows a cup sketched in Teddy and its five best matches.
An alternative approach is to allow the
user to draw one or more 2D shapes with a pixel paint program and then
have the system match the resulting image(s) to 2D projections of 3D objects.
For example, in the figures below, the user has drawn outline contours
specifying a shape, and the system has returned a set of matching objects.
Princeton Shape Benchmark
The Princeton Shape Benchmark provides a repository of 3D models and software tools
for evaluating shape-based retrieval and analysis algorithms. The motivation is to
promote the use of standardized data sets and evaluation methods for research in matching,
classification, clustering, and recognition of 3D models. Researchers are encouraged
to use these resources to produce comparisons of competing algorithms in future publications.
The benchmark contains a database of 3D polygonal models collected from the World Wide
Web. For each 3D model, there is an Object File Format (.off) file
with the polygonal geometry of the model, an ASCII text file
with information about the model (e.g., the URL from which it came), and a
JPEG image file
with a thumbnail view of the model. Version 1 of the
benchmark contains 1,814 models. Source code for parsing the
files, viewing the models, and creating overview web pages is provided.
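The ASCII .off layout is simple enough to sketch a reader for it. This is an illustrative minimal parser of our own, not the benchmark's provided source code; it handles the common layout (header, counts line, vertices, faces) and ignores comments and per-face colors.

```python
def parse_off(text):
    """Parse a minimal ASCII Object File Format (.off) model.

    Returns (vertices, faces): vertices as (x, y, z) tuples, faces as
    tuples of vertex indices. Each face record begins with its vertex
    count. Per-face colors and binary variants are not handled.
    """
    tokens = []
    for line in text.splitlines():
        line = line.split('#', 1)[0].strip()   # strip comments and blanks
        if line:
            tokens.extend(line.split())
    if tokens[0] != 'OFF':
        raise ValueError('not an OFF file')
    nv, nf = int(tokens[1]), int(tokens[2])    # tokens[3] is the edge count, unused
    pos = 4
    vertices = []
    for _ in range(nv):
        vertices.append(tuple(float(t) for t in tokens[pos:pos + 3]))
        pos += 3
    faces = []
    for _ in range(nf):
        k = int(tokens[pos])                   # number of vertices in this face
        faces.append(tuple(int(t) for t in tokens[pos + 1:pos + 1 + k]))
        pos += 1 + k
    return vertices, faces
```

Tokenizing the whole file first also accepts the variant where the counts share a line with the `OFF` header.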
Princeton Segmentation Benchmark
The Princeton Segmentation Benchmark
provides data for quantitative analysis of how people decompose
objects into parts and for comparison of automatic mesh segmentation
algorithms. To build the benchmark, we recruited eighty people to
manually segment surface meshes into functional parts, yielding an
average of 11 human-generated segmentations for each of 380 meshes
across 19 object categories (shown in the figure above). This data set
provides a sampled distribution over ``how humans decompose each mesh
into functional parts,'' which we treat as a probabilistic ``ground
truth'' (darker lines in the image above show places where more people
placed a segmentation boundary). Given this data set, it is possible
to analyze properties of the human-generated segmentations to learn
about what they have in common with each other (and with
computer-generated segmentations) and to compute evaluation metrics
that measure how well the human-generated segmentations match
computer-generated ones for the same mesh.
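One common way to score agreement between two segmentations of the same mesh is the Rand index over pairs of faces. The sketch below is our illustration of that general metric, not the benchmark's own evaluation code.

```python
from itertools import combinations

def rand_index(seg_a, seg_b):
    """Rand index between two segmentations of the same mesh, each given
    as a per-face part label. It is the fraction of face pairs on which
    the segmentations agree: both put the pair in the same part, or both
    put it in different parts. Invariant to permuting part labels."""
    assert len(seg_a) == len(seg_b)
    agree = total = 0
    for i, j in combinations(range(len(seg_a)), 2):
        same_a = seg_a[i] == seg_a[j]
        same_b = seg_b[i] == seg_b[j]
        agree += same_a == same_b
        total += 1
    return agree / total
```

Because the score depends only on the partition and not on the label names, a human segmentation and a computer-generated one can be compared directly.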
Thomas Funkhouser, Michael Kazhdan, Philip Shilane, Patrick Min,
William Kiefer, Ayellet Tal, Szymon Rusinkiewicz, and David Dobkin,
``Modeling by Example,''
ACM Transactions on Graphics (SIGGRAPH 2004), Los Angeles, CA, August 2004
Michael Kazhdan, Thomas Funkhouser, and Szymon Rusinkiewicz,
``Shape Matching and Anisotropy,''
ACM Transactions on Graphics (SIGGRAPH 2004), Los Angeles, CA, August 2004
Thomas Funkhouser, Patrick Min, Michael Kazhdan,
Joyce Chen, Alex Halderman, David Dobkin, and David Jacobs,
``A Search Engine for 3D Models,''
ACM Transactions on Graphics, 22(1), pp. 83-105, January 2003
Robert Osada, Thomas Funkhouser,
Bernard Chazelle, and David Dobkin,
``Shape Distributions,''
ACM Transactions on Graphics, 21(4), pp. 807-832, October 2002
Michael Kazhdan, Bernard Chazelle,
David Dobkin, Adam Finkelstein, and Thomas Funkhouser,
``A Reflective Symmetry Descriptor,''
European Conference on Computer Vision (ECCV), May, 2002