IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. XX, NO. XX, DECEMBER 2007 1
Texture Analysis and Segmentation UsingModulation Features, Generative Models andWeighted Curve Evolution
Iasonas Kokkinos,
Member, IEEE,
Georgios Evangelopoulos,
Member, IEEE,
and Petros Maragos,
Fellow, IEEE
Abstract
—In this work we approach the analysis and segmentation of natural textured images by combining ideas from imageanalysis and probabilistic modeling. We rely on AMFM texturemodels and speciﬁcally on the Dominant Component Analysis(DCA) paradigm for feature extraction. This method provides alowdimensional, dense and smooth descriptor, capturing essential aspects of texture, namely scale, orientation, and contrast.Our contributions are at three levels of the texture analysis andsegmentation problems: First, at the feature extraction stagewe propose a
Regularized Demodulation Algorithm
that providesmore robust texture features and explore the merits of
modifying the channel selection criterion
of DCA. Second, we proposea probabilistic interpretation of DCA and Gabor ﬁltering ingeneral, in terms of
Local Generative Models
. Extending thispoint of view to edge detection facilitates the estimation of
posterior probabilities for the edge and texture classes
. Third, wepropose the
Weighted Curve Evolution
scheme that enhances curveevolutionbased segmentation methods by allowing for the locallyadaptive combination of heterogenous cues. Our segmentationresults are evaluated on the Berkeley Segmentation Benchmark,and compare favorably to current stateoftheart methods.
Index Terms
—Texture analysis, image segmentation, AMFMmodels, demodulation, generative models, curve evolution, cuecombination.
I. I
NTRODUCTION
T
EXTURE is ubiquitous in natural images and constitutesa powerful cue for a variety of image analysis andcomputer vision applications, like segmentation, shape fromtexture, and image retrieval. The advances of the last twodecades in image analysis and biological and computer visionhave deepened our understanding of this ﬁeld, yet it remainsopen and challenging.The problem of texture analysis has been addressed usingprimarily feature and modelbased methods; featurebasedmethods [2], [22], [30], [44], [47], [57] analyze texture usingan informative description that lends itself more easily tosubsequent tasks, typically using linear ﬁlterbanks as frontend systems. Members of the second category, like MarkovRandom Fields [8], [56] use tractable models for texture
Manuscript received May 25 2006; revised Feb. 28, 2007; accepted Nov.14, 2007. Recommended for acceptance by H. Shum.This research was supported by the Greek Ministry of Education underprogram ‘HRAKLEITOS’, the Greek Secretariat for Research & Technologyunder program ‘
Π
ENE
∆
2001’, and the European Network of Excellence‘MUSCLE’.I. Kokkinos was with the National Technical University of Athens when thispaper was ﬁrst submitted. He is currently with the University of Californiaat Los Angeles. G. Evangelopoulos and P. Maragos are with the NationalTechnical University of Athens.
patterns and formulate texture analysis as a parameter estimation task; the gap between these two approaches has beenbridged in [17], [56], yielding a powerful yet intricate commonframework. A different path has pursued the use of textons[23]; an operational deﬁnition of textons as cluster centers ina ﬁlter response space is advocated in [31], [37], while in[16], [17] a texton dictionary is proposed as a medium for theoptimal representation of images.These are powerful models for texture analysis, but theirappropriateness for unsupervised texture segmentation is limited in some respects. In conjunction with both boundarybased [31], [32], [37] and regionbased [30], [31], [44], [50],[57] approaches, the high dimensionality of ﬁlterbank featurescan lead to poor segmentations and requires dimensionalityreduction, which is a problem in itself. MRFbased approachessuffer from a computational aspect, since their ﬁtting is coupled with segmentation, resulting in a timeconsuming iterativeprocedure. Textonbased approaches ﬁt naturally with pairwiseclustering techniques [31], [50], where the proximity betweentwo pixels is estimated by comparing the distributions of texton indexes in their neighborhoods. However, such descriptorscannot be used by variational and generative segmentationmethods alike [1], [30], [44], [52], [57] that rely on havingsmooth features within homogeneous regions.Our approach builds on the class of Amplitude ModulationFrequency Modulation (AMFM) image models [18], [19],[34], and speciﬁcally the Dominant Component Analysismethod [20]. In short, DCA represents texture locally in termsof a single AMFM signal, whose parameters are estimatedand used as a texture descriptor. This yields a feature setthat encompasses information about texture contrast, scale, andorientation, while lending itself naturally to tasks like densityestimation used in image segmentation.In our work, whose preliminary versions have been presented in [11], [27]–[29], we pursue the construction of aconcise texture analysis and segmentation system for genericnatural images by extending the potential of the DCA method.Speciﬁcally, our contributions to texture analysis, feature interpretation and texture segmentation are as follows:
1) Feature Extraction:
In Sec. IIB a regularized algorithmfor demodulation is introduced, that avoids discrete imagedifferentiations using combinations of Gabor ﬁltering and the2D TeagerKaiser energy operator [34], [35]. The potentialof alternative criteria for channel selection based on the 2Doperator is explored in Sec. IIC, yielding features that aremore appropriate for segmentation.
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. XX, NO. XX, DECEMBER 2007 2
2) Probabilistic Analysis:
A probabilistic formulation of the AMFM channel selection procedure is presented in Sec.III, by modeling observations in terms of sinusoids and introducing locality in the likelihood expressions. This facilitatesthe interpretation of Gabor ﬁltering in terms of model ﬁtting,which is a formulation we also use in Sec. IIIC to phrase edgedetection in common terms with texture analysis. This laysthe ground for the probabilistic discrimination between edges,textured and smooth areas, which is a practically importantproblem for image segmentation.
3) Image Segmentation:
In Sec. IV we present an unsupervised segmentation scheme based on DCA features that usescurve evolution implemented with level set methods. Usingour probabilistic analysis results, we propose a method for thecombination of heterogenous cues that enhances the srcinalRegion Competition  Geodesic Active Regions evolution rule[44], [57]. Speciﬁcally, we introduce the Weighted CurveEvolution method that incorporates the posterior probabilitiesof the texture and edge classes in the evolution law. We reportsystematic experiments on the Berkeley benchmark, whereconsistent improvements in performance are attained whencompared to simpler or different segmentation methods.Since our contributions span different levels of the overallanalysis and segmentation system, each section is written in amodular manner, with introductory subsections on prior work and necessary background information.II. AMFM T
EXTURE
M
ODELS(a) (b) (c)(d) (e) (f)
Fig. 1: Textures of the locally narrowband type; Top row: (a)results of evolutionary processes, (b) surface deformations,(c) biological patterns. Bottom row: (d)(f) periodic manstructured objects.Locally narrowband signals can model a variety of textured images like patterns formed by surface deformations,orientationdiffusion biological markings as well as manmadeobjects exhibiting periodic structure, like those in Fig. 1. Modulation, or AMFM models, have been successfully applied tospeech signal analysis [4], [35], and are ideally suited for thedescription of such image signals [3], [34]. Modeling signalsin terms of nonstationary sinusoids,
f
(
x,y
) =
a
(
x,y
)cos(
φ
(
x,y
))
(1)AMFM models locally capture image contrast in terms of the
amplitude modulating signal
a
(
x,y
)
and image structure(scale and orientation) in terms of the
instantaneous frequencyvector:
ω
(
x,y
) =
φ
(
x,y
) =
∂φ∂x,∂φ∂y
(
x,y
)
.
(2)Even though many natural textures can be modeled interms of a monocomponent AMFM signal, images with 2Dstructure containing patterns like corners, crosses and junctionsnecessitate more than one components being simultaneouslypresent in the local image spectrum. The multicomponent AMFM model [19], [20] models an image
I
as the superposition of locally narrowband sinusoidal components
f
k
(
x,y
)
corruptedby a white Gaussian noise ﬁeld
w
(
x,y
)
:
I
(
x,y
) =
K
k
=1
a
k
(
x,y
)cos(
φ
k
(
x,y
))
f
k
(
x,y
)
+
w
(
x,y
)
.
(3)The fundamental problem of image demodulation aims atestimating for each of the
K
components the instantaneousamplitudes
a
k
(
x,y
)
and frequencies
ω
k
(
x,y
) =
φ
k
(
x,y
)
.The decomposition of an image in terms of this expressionis an illposed problem, due to the existence of an inﬁnityof modulating signal pairs and component superpositionssatisfying (3). Even if a separation of
I
in narrowbandcomponents
f
k
(
x,y
)
is known in advance, unavoidable modeling errors of any demodulation algorithm, the presence of noise, interference from neighbor spectral components, anddiscretization of the signal derivatives are possible sourcesof error in component estimation. Robustness in the AMFM demodulation problem can be achieved by consideringthe following problems:(P1) Reduction of the error in modeling each narrowbandcomponent
f
k
(
x,y
)
by a 2D AMFM signal while maintainingsmoothness in the estimated modulation signals.(P2) Suppression of noise.(P3) Suppression of neighbor spectral components whileestimating one component.(P4) Regularization of derivatives.Simultaneously achieving all the above goals is a complexoptimization task, which remains an unsolved problem. Inthe following subsections well established solutions to theproblems (P1)(P3) are presented, followed in subsection Bby a novel algorithm that jointly considers all problems. Insubsection C the DCA method is presented, together with amodiﬁed channel selection criterion that yields betterlocalizedfeatures.
A. AMFM Demodulation1) Energy Operators and Demodulation:
At the heart of problem P1 lies the fact there is an inﬁnite number of combinations that satisfy (1) for a given
f
. An efﬁcient scheme forthe demodulation of the narrowband components into smooth
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. XX, NO. XX, DECEMBER 2007 3
modulating functions is provided by the multidimensional
Energy Separation Algorithm
(ESA) [34], that is based on ageneralization to higher dimensions of the 1D TeagerKaiserenergy operator [35]:
Ψ(
f
)(
x,y
)
f
(
x,y
)
2
−
f
(
x,y
)
2
f
(
x,y
)
.
(4)Let now
f
be a 2D spatial AMFM signal as in (1). Underrealistic assumptions [34], applying
Ψ
to
f
yields the energyproduct of the squared instantaneous amplitude and frequencymagnitude
Ψ[
a
cos(
φ
)]
≈
a
2

ω

2
,
(5)with an approximation error bounded within negligible range.This quantity may be interpreted as the component
modulationenergy
. Applying
Ψ
to the partial derivatives
f
x
=
∂f/∂x
,
f
y
=
∂f/∂y
, and combining all energies yields the 2Dcontinuous Energy Separation Algorithm [34]:
Ψ(
f
)
Ψ(
f
x
) + Ψ(
f
y
)
≈
a
(
x,y
)

(6)
Ψ(
f
x
)Ψ(
f
)
≈
ω
1
(
x,y
)

,
Ψ(
f
y
)Ψ(
f
)
≈
ω
2
(
x,y
)

,
(7)that can estimate at each location
(
x,y
)
the amplitude envelopeand the magnitudes of the instantaneous frequencies of thenonstationary AMFM signal. The signs of the frequencysignals can be implicitly obtained by the signs of the carrier,approximated by the ﬁlter central frequencies.
2) Multiband Gabor ﬁltering and Demodulation:
A simultaneous solution to problems (P2) and (P3) has been given in[3], [4] using a bank of bandpass ﬁlters densely covering thefrequency plane. The ﬁlterbanks used for this task are typically2D Gabor ﬁlters, favored due to their optimal joint spatialand spectral localization [14], [9]. Apart from componentdecoupling and robustness to noise, this approach speciﬁes inadvance the number and spectral localization of the differentcomponents, thereby constraining the decomposition of anygiven 2D signal to a ﬁxed component conﬁguration. In Fig. 2we show visually the ﬁlterbank used in our experiments, whiledetails are given in App. I.
Horizontal Frequency
V e r t i c a l F r e q u e n c y
Fig. 2: Filterbank grid on the 2D frequency domain. Contourscorrespond to halfpeak bandwidth magnitude.Demodulation via the ESA can be extended to the complex signals derived from convolution with complex Gaborﬁlters; the energy for a complexvalued signal
f
(
x,y
) =
a
(
x,y
)exp(
jφ
(
x,y
))
is deﬁned as
C
(
f
) = Ψ[Re
{
f
}
] + Ψ[Im
{
f
}
]
(8)and based on the approximation (5) the operator response is
C
[
f
]
≈
2
a
2

ω

2
. The averaging of operator responses resultsin smoother estimates of the modulating functions. Applying
C
to
f
=
I
∗
g
and its partial derivatives
f
x
,f
y
, results in ademodulation scheme where the frequencies are given by (7),and the amplitude by a slight modiﬁcation of (6):

a
(
x,y
)
≈
C
(
f
)
√
2
C
(
f
x
) +
C
(
f
y
)
.
(9)Another point is that Gabor ﬁltering imposes a speciﬁcdecomposition of an arbitrary signal of the form (3) into asum of narrowband components, with the frequency contentof each component localized around the corresponding Gaborﬁlter’s central frequency. However, the frequency content of the actual component may not be centered at the ﬁxed centralfrequency of the Gabor ﬁlter, thereby resulting in a suppressedestimate
a
k
of its amplitude,
A
k
. This can be compensated forby using the component’s estimated instantaneous frequency
ω
k
; speciﬁcally, if
G
k
(
·
)
is the frequency response of theGabor ﬁlter, the approximation
A
k
=
a
k

G
k
(
ω
k
)

(10)yields an amplitude estimate that is insensitive to deviationsfrom the corresponding ﬁlter central frequency [20].
B. Regularized Demodulation
A problem that emerges with ESA demodulation is thatthe signal derivatives can only be approximated using discretedifferentiation operations. As a result, the two differentialoperators entailed in the Energy Operator responses mayfurnish inaccurate amplitude and frequency estimates. In whatfollows we present a theoretically sound approach to alleviatethis problem, introducing a regularized 2D energy operatorand a related regularized 2D ESA.As analyzed in [46] for edge detection two
regularized solutions
to the derivative estimation problem, which minimizethe sum of the data approximation error and the energy of thesecond derivative of the approximating function, are (i) splineinterpolation and (ii) convolution of the image data by afunction that can be closely modeled by a Gaussian. In ourproblem which deals with narrowband but not necessarily lowpass signals, the Gaussian ﬁlter response must be modulatedby a sine with carrier equal to the spectral mean location of the signal. This yields a Gabor ﬁlter. In [10], the spline andthe Gabor regularization of the energy operator and the ESAwere compared for 1D signals, yielding a slight superiority of the Gabor ESA.Motivated by the above, we propose a
2D Gabor ESA
algorithm for simultaneous ﬁltering and demodulation. Let
I
(
x,y
)
be the continuous image,
g
(
x,y
)
the impulse of areal 2D Gabor ﬁlter and
f
(
x,y
) =
I
(
x,y
)
∗
g
(
x,y
)
itsoutput. Since convolution commutes with differentiation, the
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. XX, NO. XX, DECEMBER 2007 4
continuous 2D energy operator combined with Gabor bandpassﬁltering becomes
Ψ(
f
) = Ψ(
I
∗
g
) =
I
∗
g
2
−
(
I
∗
g
)(
I
∗
2
g
)
.
(11)Thus, the differential operators have been replaced by ﬁlterderivatives which can be analytically estimated, thereby avoiding discretization errors.Similarly, for the estimation of the instantaneous amplitudeand frequency, the 2D Gabor ESA for demodulating
f
=
I
∗
g
consists of the following two steps:(1) Use the Gabor Energy Operator to compute the instantaneous energies of three image functions:
Ψ(
f
)
,
Ψ(
f
x
)
,
Ψ(
f
y
)
, where
Ψ(
f
x
) =
I
∗
g
x
2
−
(
I
∗
g
x
)(
I
∗
2
g
x
)
(12)(2) Use the evaluated energies in the formula of the 2Dcontinuous ESA.For all three energies we need seven Gabor differentialformulae:
g
x
,g
y
,g
xx
,g
yy
,g
xy
,
2
g
x
,
2
g
y
; the Gabor ESAis thus computationally more intensive since it requires moreconvolutions but adds robustness and improved performance.For efﬁciency we use an FFTbased frequencydomain implementation of the Gabor ESA, using the equation
F
∂
k
+
g∂x
k
∂y
=
F{
g
}
(
jω
x
)
k
(
jω
y
)
(13)relating the Fourier transforms
F{·}
of a signal and itsderivatives.
0 0.5 1 1.500.511.5
Frequency Magnitudes
ω
F
(
ω
)
GaborDerivativeDifference
0.60.810.30.6
Derivative− vs. Difference
F
(
ω
)
ω
DerivativeDifference
(a) (b) (c) (d)Fig. 3: Regularized Demodulation: (a) Representative AMFMsignal of the family (14) obtained for modulation index
α
=
.
5
and
log(
SNR
) = 6
. (b) Gabor ﬁlter used for demodulation(c) Fourier transform magnitudes for the ﬁlters involved in thealternative demodulation schemes, demonstrating the deviationof the central difference ﬁlter
∆
x
from the derivative operation
∂ ∂x
. (d) Deviation of
∆
x
∗
g
from
∂g∂x
in the frequency domain.In Table I the performance of the discrete ESA is comparedto the GaborESA scheme at varying degrees of noise andnonstationarity. Signals of the form
f
(
x,y
) = [1 +
αA
(
x,y
)]cos(
u
c
x
+
v
c
y
+
αθ
(
x,y
))
(14)
θ
(
x,y
) =14
2cos(
u
c
30
x
) + cos(
v
c
30
y
)
,
(15)
A
(
x,y
) = exp
−
x
2
+
y
2
10
(16)are used, where
u
c
,v
c
are the central frequencies of the Gaborﬁlter used for demodulation, shown in Fig. 3 (b). The signal isimmersed in white Gaussian noise at various Signal to NoiseTABLE I: Demodulation comparisons between GaborESAand discrete ESA.
<
(ˆ
A
−
A
)
2
>
for Gabor / discrete ESA (bold/plain)
log
SNR α
=0
α
=110
4
.
2 10
−
4
1
.
2 10
−
2
2
.
1 10
−
2
3
.
9 10
−
2
6
6
.
9 10
−
4
1
.
3 10
−
3
2
.
2 10
−
4
3
.
9 10
−
2
<
(ˆ
ω
x
−
ω
x
)
2
>
for Gabor / discrete ESA (bold/plain)
log
SNR α
=0
α
=110
6
.
4 10
−
5
1
.
9 10
−
2
4
.
9 10
−
3
2
.
2 10
−
2
6
1
.
1 10
−
4
1
.
9 10
−
2
4
.
9 10
−
3
2
.
3 10
−
2
Ratios (SNRs), while the index
α
is varied to produce differentdegrees of nonstationarity.For
α
= 0
, i.e. a stationary sinusoid, the approximationin (5) becomes exact, so for GaborESA the only source of error is noise. On the contrary, the differentiation scheme usedin the discreteESA introduces systematic errors as shownin Fig. 3(d) and results in inferior frequency and amplitudeestimates. For higher degrees of nonstationarity Gabor ESAsystematically yields better estimates, with the errors beingsolely due to the noise signal and the approximations of ESA.
C. Texture Features
The demodulation procedure furnishes a three dimensionalvector
(
A
k
,
φ
k
)(
x,y
)
for each of the components in (3), sodemodulating the ﬁlterbank channel outputs yields a
3
×
K
dimensional texture feature vector at each pixel. This multidimensional feature extraction scheme, termed ChannelizedComponent Analysis in [20], provides a rich image representation and can achieve accurate reconstructions of multicomponent signals; however, the high dimensionality of the featurevector may result in poor segmentations.A compact texture description can be extracted using theDominant Component Analysis method (
DCA
) [18], [20] thatretains the most prominent structure of the texture signal.Assuming that a single narrowband component dominates theﬁlter responses at pixel
(
x,y
)
, DCA selects pixelwise thechannel
i
(
x,y
)
that is closest to the component, demodulatesits output and uses the resulting AMFM features for texturerepresentation. The channel
i
(
x,y
)
is chosen among the
K
ﬁlter responses by maximizing a criterion
Γ
k
(
x,y
)
:
i
(
x,y
) =
arg
max
1
≤
k
≤
K
{
Γ
k
(
x,y
)
}
,
(17)
A
DCA
(
x,y
) =
A
i
(
x,y
)
(
x,y
)
, ω
DCA
(
x,y
) =
ω
i
(
x,y
)
(
x,y
)
.
(18)The choice of the dominant channel in the srcinal work onDCA has been based on the maximization of the estimatedamplitude envelopes:
Γ
k
(
x,y
) =

a
k
(
x,y
)

.
(19)In Fig. 4 a locally narrowband signal is used to demonstratethe structurecapturing properties of this procedure. A textondictionarybased method would break the image into piecesindicating which of the textons best match the input signal,yielding a discrete, textonindex tessellation of the image,
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. XX, NO. XX, DECEMBER 2007 5Analysis
=
⇒
A
DCA
cos(
φ
DCA
)
DCA Synthesis
⇐
=
Multiband Demodulation
g
i
I
∗
g
i
A
i
cos(
φ
i
)
Fig. 4: Dominant Components Analysis method for a locally narrowband signal: a set of bandpass Gabor ﬁlters is initiallyused to isolate and demodulate the individual components of (3). The dominant channel is subsequently chosen at each imagelocation, and its AMFM parameters are used as a local texture descriptor. The principal structure of the textured signal is thuscaptured by the DCA parameters.while a ﬁlterbankbased feature descriptor would retain allﬁlter responses, even though most offer no complementaryinformation to that of the most active ﬁlter. On the otherhand, using the DCA method, a single ﬁlter is automaticallyselected and a low dimensional, smoothly varying featurevector is derived from it. Note that instead of the instantaneousfrequency measurements in Fig. 4 we use the phase estimatedelivered by the complex Gabor ﬁlter since it is better suitedfor visual display.The reﬁned frequency and amplitude estimates (7,10) furnished by the demodulation algorithm allow us to transcendfrom the quantized set of orientations and scales used by thefrontend ﬁlterbank to a continuous representation.
1) Energybased Dominant Component Analysis (EDCA):
As an alternative to amplitudebased dominant componentextraction, termed ADCA henceforth, we have considered anenergy channel selection criterion, based on the modulationproduct (5), leading to the Energybased Dominant ComponentAnalysis (EDCA) scheme. Intuitively, if we think of texturesignals as produced by physical oscillating sources in different scales and orientations, the selection of the dominantcomponent could be based on the maximumenergy sourcethat accounts for producing the local texture modulations.According to this scheme, modulation features are chosen fromthe ﬁlter output of dominating energy:
Γ
k
(
x,y
) = Ψ[(
I
∗
g
k
)(
x,y
)]
,
(20)where the complex energy operator (8) is used for a complexﬁlter
g
k
.Using the modulation energy for DCA results in improvedlocalization in texture and object boundaries: since the 2Denergy operator jointly captures contrast and frequency information in the modulation product (5), the scheme caneffectively consider channels with low amplitude (i.e. contrast)variations but high instantaneous frequency magnitude.To illustrate their differences, in Fig. 5 we compare thefeatures extracted using the srcinal and the alternative energybased method. Comparing the second and third columns,we see the EDCA measurements are sharper around objectboundaries, with improved localization and detail preservation.We observe for example that the diffusion effects around theborders of the tiger and the zebra are alleviated using EDCA.The reconstructions delivered by the two schemes reveal thepreservation of ﬁner structure in the energybased scheme; asan indicative example, notice that ADCA interprets the feetof the zebra as a slowly varying horizontal oscillation, whileEDCA focuses on the smaller scale structure of the verticalzebra skin pattern.We note here that the DCA model is designed primarily for1D features, like sinudoidal signals and requires additionalAMFM components to model 0D and 2D features like blobsand crosses respectively. It would be beneﬁcial to account forsuch patterns in our front end system, but we have practicallyobserved that, as seen also in Fig. 5, for images exhibiting suchpatterns a perceptually meaningful part of the image structureis captured by the DCA features.III. L
OCAL
G
ENERATIVE
M
ODELS FOR
T
EXTURE AND
E
DGES
In this section we justify probabilistically the channel selection of DCA, introducing a generative model that accountsfor the locality of the decision process. Based on this model