Papers:Texture:What-are-textons

What are Textons?

S-C. Zhu, C-E. Guo, Y-Z. Wang, and Z-J. Xu
IJCV (2005)
<Download from Songchun Zhu's Home page>

Summary

This paper presents a generative model of textured images with three levels, from low to high: image, bases, and textons.

In this paper, textons are defined as the micro-structures in natural images. To model the formation of textured images, a three-level generative model is designed: an image is generated by superposing bases selected from an overcomplete dictionary, and the bases are in turn generated by a smaller number of textons. Textons are the geometric structural units of an image; the bases serve as an intermediate vocabulary through which the textons are turned into image parts.

Let T, B, and I denote the texton map, base map, and image, respectively, and let \Pi and \Phi denote the texton dictionary and base set. The probabilistic formulation is then given by

P(I, B, T; \Theta) = p(I|B; \Phi)p(B|T; \Pi)p(T; \kappa) \,

Here,

p(I|B;\Phi)\, reflects the generation of image I from base map B as
I(u, v) \sim \mathcal{N}\left\{\sum_{i=1}^{n_B} \alpha_i \phi_{l_i}(u, v; B(i)), \sigma^2_0\right\},
in which the image is modeled as a superposition of transformed bases plus a noise term.
Here, a transformed base is characterized by base mapping parameters
B(i) = \{l_i, \alpha_i, x_i, y_i, \tau_i, \sigma_i\}\,,
where the parameters are respectively the index of base prototype, superposition coefficient, x-translation, y-translation, rotation, and scaling.
The base map B is the set of all these mapping parameters. The base prototypes are chosen to be Laplacian-of-Gaussian, Gabor cosine, and Gabor sine.
p(B|T; \Pi)\, reflects the generation of the base map B from the texton map T, in which the mapping is more complex. The mapping parameters involve texton index, photometric contrast, translation, rotation, scaling, and deformation.
p(T; \kappa)\, reflects the prior distribution of the texton map.
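The image-level term p(I|B; \Phi) can be illustrated with a minimal sketch. It renders each transformed base from its parameters B(i) = {l_i, \alpha_i, x_i, y_i, \tau_i, \sigma_i}, superposes them, and evaluates the Gaussian log-likelihood. The function names, the unit-variance Gaussian envelope, and the Gabor frequency are assumptions made for illustration; the paper's exact prototype parameterization is not reproduced here.

```python
import numpy as np

# Hypothetical base prototypes on continuous coordinates (unit scale).
def log_proto(x, y):
    r2 = x**2 + y**2
    return (r2 - 2.0) * np.exp(-r2 / 2.0)          # Laplacian-of-Gaussian, up to a constant

def gabor_cos(x, y, freq=1.0):                      # freq is an assumed value
    return np.exp(-(x**2 + y**2) / 2.0) * np.cos(freq * x)

def gabor_sin(x, y, freq=1.0):
    return np.exp(-(x**2 + y**2) / 2.0) * np.sin(freq * x)

PROTOTYPES = [log_proto, gabor_cos, gabor_sin]      # indexed by l_i

def render_base(shape, l, alpha, x0, y0, tau, sigma):
    """Return alpha * phi_l after translation (x0, y0), rotation tau, scaling sigma."""
    v, u = np.mgrid[0:shape[0], 0:shape[1]].astype(float)
    dx, dy = u - x0, v - y0
    xr = (np.cos(tau) * dx + np.sin(tau) * dy) / sigma
    yr = (-np.sin(tau) * dx + np.cos(tau) * dy) / sigma
    return alpha * PROTOTYPES[l](xr, yr)

def synthesize(shape, base_map, sigma0=0.0, rng=None):
    """Superpose all bases in base_map; add N(0, sigma0^2) noise if sigma0 > 0."""
    mean = np.zeros(shape)
    for b in base_map:
        mean += render_base(shape, *b)
    if sigma0 > 0:
        rng = np.random.default_rng() if rng is None else rng
        return mean + rng.normal(0.0, sigma0, size=shape)
    return mean

def log_likelihood(image, base_map, sigma0):
    """log p(I | B; Phi) under the i.i.d. Gaussian pixel model."""
    resid = image - synthesize(image.shape, base_map)
    n = image.size
    return -0.5 * np.sum(resid**2) / sigma0**2 - 0.5 * n * np.log(2 * np.pi * sigma0**2)

# Each tuple is (l, alpha, x, y, tau, sigma), matching B(i).
base_map = [(0, 1.0, 8.0, 8.0, 0.0, 2.0), (1, 0.5, 20.0, 12.0, np.pi / 4, 3.0)]
image = synthesize((32, 32), base_map, sigma0=0.05, rng=np.random.default_rng(0))
print(log_likelihood(image, base_map, 0.05))
```

Evaluating the prototypes at transformed coordinates avoids resampling a discrete kernel, which keeps the sketch short; a practical implementation would likely restrict each base to a local window.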

To learn the parameters efficiently from image samples, the Data-Driven Markov Chain Monte Carlo (DDMCMC) method is employed, which has been shown to converge almost surely to a global optimum.

The paper further extends the texton model to more sophisticated ones by taking dynamics and lighting variation into account:

  1. A moton model to describe the dynamic structures embedded in a sequence of images or video. The moton model extends the texton model by adding a moton level on top of the texton level. The textons tracked over image frames, in combination with their trajectories, are treated as moton elements. Markov chains are used to describe the state transitions of a moton element during its life span. A moton dictionary is built from the extracted moton elements.
  2. A lighton model to describe the photometric structures under various lighting conditions. In this model, a texton represents a three-dimensional surface element under varying illumination conditions in the form of a triplet of 2D textons. For a given light source, a lighton image can then be produced by a weighted combination of the three textons, as in the illumination-cone representation.
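The weighted combination in the lighton model can be sketched as follows. The sketch assumes that the three weights are the components of the light-source vector and that negative values are clamped to zero (attached shadows), as in a Lambertian illumination-cone setting. The toy triplet built from the surface normals of a hemispherical bump and all function names are hypothetical, not taken from the paper.

```python
import numpy as np

def render_lighton(triplet, light, clamp=True):
    """Combine a triplet of 2D textons with weights given by the light vector."""
    triplet = np.asarray(triplet, dtype=float)      # shape (3, H, W)
    light = np.asarray(light, dtype=float)          # shape (3,)
    image = np.tensordot(light, triplet, axes=1)    # sum_k light[k] * triplet[k]
    return np.maximum(image, 0.0) if clamp else image

def bump_triplet(size=9):
    """Toy triplet: normal components (nx, ny, nz) of a unit hemispherical bump."""
    r = np.linspace(-1.0, 1.0, size)
    x, y = np.meshgrid(r, r)
    inside = x**2 + y**2 < 1.0
    nz = np.where(inside, np.sqrt(np.clip(1.0 - x**2 - y**2, 0.0, None)), 1.0)
    nx = np.where(inside, x, 0.0)
    ny = np.where(inside, y, 0.0)
    return np.stack([nx, ny, nz])

triplet = bump_triplet()
frontal = render_lighton(triplet, [0.0, 0.0, 1.0])  # light along the viewing axis
oblique = render_lighton(triplet, [0.6, 0.0, 0.8])  # light tilted toward +x
print(frontal.shape, oblique.max())
```

Moving the light vector changes only the three weights, so one stored triplet covers a whole family of appearances of the same surface element.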

My Comments

  • I really appreciate this paper. The model established in this paper hits the fundamentals of textons.
  • The experimental results clearly show that this model discovers meaningful units of textures, which is significant progress over traditional work.
  • The weakness lies in the complexity of the algorithm. Although existing methods such as DDMCMC may lead to a more efficient implementation, the approach is still too complicated compared to classic methods such as filtering-clustering. In fact, I think the latter can be regarded as a simplified case of this model in some sense.
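For comparison, the classic filtering-clustering approach mentioned in the last comment can be sketched in a few lines: convolve the image with a small filter bank, then cluster the per-pixel response vectors with k-means, treating the cluster centers as textons. The particular filter bank, the plain k-means loop, and all names below are illustrative assumptions rather than any specific published method.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def filter_bank(size=5):
    """Small illustrative bank: Gaussian, two first derivatives, and a LoG-like filter."""
    r = np.arange(size) - size // 2
    x, y = np.meshgrid(r, r)
    g = np.exp(-(x**2 + y**2) / 2.0)
    return np.stack([g / g.sum(), x * g, y * g, (x**2 + y**2 - 2.0) * g])

def filter_responses(image, bank):
    """Correlate the image with every filter; returns an (H, W, F) response array."""
    size = bank.shape[-1]
    padded = np.pad(image, size // 2, mode="reflect")
    windows = sliding_window_view(padded, (size, size))   # (H, W, size, size)
    return np.einsum("hwij,fij->hwf", windows, bank)

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd iterations; returns cluster centers and final labels."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members):                 # leave empty clusters unchanged
                centers[j] = members.mean(axis=0)
    d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return centers, d2.argmin(axis=1)

# Toy image: vertical stripes on the left half, flat gray on the right half.
image = np.full((16, 16), 0.5)
image[:, :8] = np.tile([0.0, 1.0], 4)
responses = filter_responses(image, filter_bank())
centers, labels = kmeans(responses.reshape(-1, responses.shape[-1]), k=2)
print(labels.reshape(image.shape))
```

In this simplified view each pixel is assigned to exactly one cluster center, with no explicit geometric structure or generative superposition of bases.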



