Data-Driven Appearance Models for Computer Graphics
Over the last decade, the computer graphics community has witnessed a significant increase in the availability of measured (or non-parametric) appearance data. Although directly using these measurements to shade a surface can provide a level of realism that is difficult (if not impossible) to achieve with traditional analytic models, fully incorporating non-parametric appearance data into a traditional computer graphics pipeline still presents several research challenges.

The goal of our research is to develop new representations for non-parametric appearance data that address some of these open research challenges. Specifically, we have designed representations that support both interactive rendering and efficient sampling in the context of physically-based rendering. More recently, we have been investigating more general representations that allow a user to edit the variation in a high-dimensional function while retaining its fidelity to the original measurements.

Importance sampling is a common and effective technique for reducing the rendering time of standard Monte Carlo global-illumination algorithms. For measured (or tabular) surface reflectance functions (e.g., BRDFs), however, it is unclear how to efficiently generate samples of the incident direction consistent with the distribution of energy in these functions. In a SIGGRAPH 2004 paper, we propose a new factored model of the BRDF designed to support efficient importance sampling within physically-based rendering systems. This is achieved by reparameterizing the BRDF domain before factoring the high-dimensional function with the Non-negative Matrix Factorization (NMF) algorithm. Our final representation reduces rendering times by a factor of 4 to 10 for scenes that contain a variety of measured materials.
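The key property a non-negative factored representation gives you is that the tabulated function becomes a mixture of separable rank-1 terms, each of which can be sampled exactly from 1D distributions. The sketch below illustrates only that sampling idea on an arbitrary small non-negative matrix; the matrix `F`, grid sizes, and the helper `sample_factored` are invented for the example and do not reproduce the paper's reparameterized BRDF pipeline.

```python
import numpy as np

def sample_factored(U, V, n_samples, rng):
    """Draw index pairs (i, j) distributed proportionally to F = U @ V,
    where U (m x K) and V (K x n) are non-negative factors.
    F = sum_k u_k v_k^T is a mixture of K separable terms, so we first
    choose a term k by its total energy, then sample each axis
    independently from a tabulated 1D distribution."""
    term_energy = U.sum(axis=0) * V.sum(axis=1)   # energy of each rank-1 term
    term_p = term_energy / term_energy.sum()
    ks = rng.choice(len(term_p), size=n_samples, p=term_p)
    samples = np.empty((n_samples, 2), dtype=int)
    for k in np.unique(ks):
        mask = ks == k
        pi = U[:, k] / U[:, k].sum()              # 1D marginal along rows
        pj = V[k, :] / V[k, :].sum()              # 1D marginal along columns
        samples[mask, 0] = rng.choice(len(pi), size=mask.sum(), p=pi)
        samples[mask, 1] = rng.choice(len(pj), size=mask.sum(), p=pj)
    return samples

# Verify that the empirical sample distribution matches F = U @ V.
rng = np.random.default_rng(0)
U = rng.random((8, 2))
V = rng.random((2, 6))
s = sample_factored(U, V, 200_000, rng)
F = U @ V
empirical = np.zeros_like(F)
np.add.at(empirical, (s[:, 0], s[:, 1]), 1.0)
empirical /= empirical.sum()
print(np.abs(empirical - F / F.sum()).max())  # small: samples follow F
```

Because each axis is sampled from a 1D table, the inverse-CDF lookups are cheap, which is what makes this kind of representation attractive inside a Monte Carlo renderer.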

Although matrix factorization is appropriate for compressing BRDFs, not all functions are separable. We introduce a more general representation of high-dimensional measured data (again, in the context of physically-based rendering) optimized to provide both compression and efficient importance sampling. Our approach is based on the Douglas-Peucker polyline approximation algorithm and achieves significant compression of multi-dimensional datasets while yielding a representation that can be sampled directly. In our experiments, this representation provides a roughly 4x reduction in the time required to render scenes with both complex materials and complex illumination.
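The Douglas-Peucker algorithm itself is simple to state: keep the curve's endpoints, find the intermediate point farthest from the segment joining them, and recurse on both halves whenever that distance exceeds a tolerance. A minimal 2D sketch follows (a toy illustration only; our representation applies the idea to multi-dimensional tabulated data, which this version does not attempt):

```python
import numpy as np

def point_segment_distance(p, a, b):
    """Distance from point p to the segment from a to b."""
    ab = b - a
    denom = float(np.dot(ab, ab))
    if denom == 0.0:
        return float(np.linalg.norm(p - a))
    t = np.clip(np.dot(p - a, ab) / denom, 0.0, 1.0)
    return float(np.linalg.norm(p - (a + t * ab)))

def douglas_peucker(points, eps):
    """Simplify a polyline: keep endpoints, recurse on the farthest
    intermediate point if it deviates by more than eps."""
    points = np.asarray(points, dtype=float)
    if len(points) < 3:
        return points
    a, b = points[0], points[-1]
    d = np.array([point_segment_distance(p, a, b) for p in points[1:-1]])
    i = int(np.argmax(d)) + 1          # index of the farthest point
    if d[i - 1] > eps:
        left = douglas_peucker(points[:i + 1], eps)
        right = douglas_peucker(points[i:], eps)
        return np.vstack([left[:-1], right])   # drop the duplicated split point
    return np.vstack([a, b])

# A nearly collinear middle point is dropped at this tolerance.
print(douglas_peucker([[0, 0], [1, 0.001], [2, 0]], eps=0.01))
```

The per-segment error bound `eps` is what controls the compression ratio: a looser tolerance keeps fewer knots, and the surviving knots can be turned directly into a piecewise-linear CDF for sampling.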

Existing data-driven representations of appearance functions are compact, accurate, and easy to use for rendering. Another crucial goal, which has so far received little attention, is editability: for practical use, we must be able to change both the directional and spatial behavior of surface reflectance (e.g., making one material shinier, making another more anisotropic, and changing the spatial ``texture maps'' that indicate where each material appears). We introduce the Inverse Shade Tree framework, which provides a general approach to estimating the ``leaves'' of a user-specified shade tree from high-dimensional measured datasets of appearance. These leaves are sampled 1- and 2-dimensional functions that capture both the directional behavior of individual materials and their spatial mixing patterns. To compute these shade trees automatically, we map the problem to matrix factorization and introduce a flexible new algorithm that allows for constraints such as non-negativity, sparsity, and energy conservation.
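The non-negativity constraint alone can be illustrated with the classic multiplicative-update rules for NMF, which stay in the non-negative orthant by construction. This is a hedged sketch: the actual Inverse Shade Tree algorithm is a more general constrained factorization (adding sparsity and energy-conservation terms), and the matrix sizes and rank `K` below are arbitrary.

```python
import numpy as np

def nmf(X, K, n_iters=2000, seed=0, eps=1e-9):
    """Factor a non-negative matrix X (m x n) as W @ H with W, H >= 0,
    using Lee-Seung multiplicative updates. Because the updates only
    multiply by non-negative ratios, the factors remain non-negative,
    which keeps the recovered 'leaves' physically interpretable."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, K)) + 0.1
    H = rng.random((K, n)) + 0.1
    for _ in range(n_iters):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Factor an exactly rank-2 non-negative matrix and check the fit.
rng = np.random.default_rng(1)
X = rng.random((10, 2)) @ rng.random((2, 8))
W, H = nmf(X, K=2)
rel_err = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
print(rel_err)  # small: the low-rank non-negative structure is recovered
```

In the shade-tree setting, the rows of `X` would hold per-pixel reflectance samples, `H` the basis materials, and `W` the spatial blending weights; additional constraints are what keep that decomposition meaningful rather than merely low-error.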

Unlike (quasi-)homogeneous materials, the spatial component of heterogeneous subsurface scattering can be arbitrarily complex, and storing it outright results in impractically large datasets. We address the problem of acquiring and compactly representing the spatial component of heterogeneous subsurface scattering functions. We propose a material model based on matrix factorization that can be mapped onto arbitrary geometry and, owing to its compact form, can be incorporated into most visualization systems with little overhead. We use a projector and a digital video camera to acquire several real-world datasets, and we evaluate our representation in terms of both its qualitative and numerical accuracy.