Texture analysis and segmentation using modulation features, generative models, and weighted curve evolution

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. XX, NO. XX, DECEMBER 2007

Texture Analysis and Segmentation Using Modulation Features, Generative Models and Weighted Curve Evolution

Iasonas Kokkinos, Member, IEEE, Georgios Evangelopoulos, Member, IEEE, and Petros Maragos, Fellow, IEEE

Abstract: In this work we approach the analysis and segmentation of natural textured images by combining ideas from image analysis and probabilistic modeling. We rely on AM-FM texture models and specifically on the Dominant Component Analysis (DCA) paradigm for feature extraction. This method provides a low-dimensional, dense and smooth descriptor, capturing essential aspects of texture, namely scale, orientation, and contrast. Our contributions are at three levels of the texture analysis and segmentation problems: First, at the feature extraction stage we propose a Regularized Demodulation Algorithm that provides more robust texture features and explore the merits of modifying the channel selection criterion of DCA. Second, we propose a probabilistic interpretation of DCA and Gabor filtering in general, in terms of Local Generative Models. Extending this point of view to edge detection facilitates the estimation of posterior probabilities for the edge and texture classes. Third, we propose the Weighted Curve Evolution scheme that enhances curve evolution-based segmentation methods by allowing for the locally adaptive combination of heterogeneous cues. Our segmentation results are evaluated on the Berkeley Segmentation Benchmark, and compare favorably to current state-of-the-art methods.

Index Terms: Texture analysis, image segmentation, AM-FM models, demodulation, generative models, curve evolution, cue combination.

I. INTRODUCTION

Texture is ubiquitous in natural images and constitutes a powerful cue for a variety of image analysis and computer vision applications, like segmentation, shape from texture, and image retrieval.
The advances of the last two decades in image analysis and biological and computer vision have deepened our understanding of this field, yet it remains open and challenging.

Manuscript received May 25, 2006; revised Feb. 28, 2007; accepted Nov. 14, 2007. Recommended for acceptance by H. Shum. This research was supported by the Greek Ministry of Education under program 'HRAKLEITOS', the Greek Secretariat for Research & Technology under program 'ΠENEΔ-2001', and the European Network of Excellence 'MUSCLE'. I. Kokkinos was with the National Technical University of Athens when this paper was first submitted. He is currently with the University of California at Los Angeles. G. Evangelopoulos and P. Maragos are with the National Technical University of Athens.

The problem of texture analysis has been addressed using primarily feature- and model-based methods; feature-based methods [2], [22], [30], [44], [47], [57] analyze texture using an informative description that lends itself more easily to subsequent tasks, typically using linear filterbanks as front-end systems. Members of the second category, like Markov Random Fields [8], [56], use tractable models for texture patterns and formulate texture analysis as a parameter estimation task; the gap between these two approaches has been bridged in [17], [56], yielding a powerful yet intricate common framework. A different path has pursued the use of textons [23]; an operational definition of textons as cluster centers in a filter response space is advocated in [31], [37], while in [16], [17] a texton dictionary is proposed as a medium for the optimal representation of images.

These are powerful models for texture analysis, but their appropriateness for unsupervised texture segmentation is limited in some respects.
In conjunction with both boundary-based [31], [32], [37] and region-based [30], [31], [44], [50], [57] approaches, the high dimensionality of filterbank features can lead to poor segmentations and requires dimensionality reduction, which is a problem in itself. MRF-based approaches are computationally expensive, since their fitting is coupled with segmentation, resulting in a time-consuming iterative procedure. Texton-based approaches fit naturally with pairwise clustering techniques [31], [50], where the proximity between two pixels is estimated by comparing the distributions of texton indexes in their neighborhoods. However, such descriptors cannot be used by either variational or generative segmentation methods [1], [30], [44], [52], [57], which rely on having smooth features within homogeneous regions.

Our approach builds on the class of Amplitude Modulation-Frequency Modulation (AM-FM) image models [18], [19], [34], and specifically the Dominant Component Analysis method [20]. In short, DCA represents texture locally in terms of a single AM-FM signal, whose parameters are estimated and used as a texture descriptor. This yields a feature set that encompasses information about texture contrast, scale, and orientation, while lending itself naturally to tasks like density estimation used in image segmentation.

In our work, whose preliminary versions have been presented in [11], [27]-[29], we pursue the construction of a concise texture analysis and segmentation system for generic natural images by extending the potential of the DCA method. Specifically, our contributions to texture analysis, feature interpretation and texture segmentation are as follows:

1) Feature Extraction: In Sec. II-B a regularized algorithm for demodulation is introduced that avoids discrete image differentiations using combinations of Gabor filtering and the 2D Teager-Kaiser energy operator [34], [35]. The potential of alternative criteria for channel selection based on the 2D operator is explored in Sec.
II-C, yielding features that are more appropriate for segmentation.

2) Probabilistic Analysis: A probabilistic formulation of the AM-FM channel selection procedure is presented in Sec. III, by modeling observations in terms of sinusoids and introducing locality in the likelihood expressions. This facilitates the interpretation of Gabor filtering in terms of model fitting, which is a formulation we also use in Sec. III-C to phrase edge detection in common terms with texture analysis. This lays the ground for the probabilistic discrimination between edges, textured and smooth areas, which is a practically important problem for image segmentation.

3) Image Segmentation: In Sec. IV we present an unsupervised segmentation scheme based on DCA features that uses curve evolution implemented with level set methods. Using our probabilistic analysis results, we propose a method for the combination of heterogeneous cues that enhances the original Region Competition - Geodesic Active Regions evolution rule [44], [57]. Specifically, we introduce the Weighted Curve Evolution method that incorporates the posterior probabilities of the texture and edge classes in the evolution law. We report systematic experiments on the Berkeley benchmark, where consistent improvements in performance are attained when compared to simpler or different segmentation methods.

Since our contributions span different levels of the overall analysis and segmentation system, each section is written in a modular manner, with introductory subsections on prior work and necessary background information.

II. AM-FM TEXTURE MODELS

Fig. 1: Textures of the locally narrowband type. Top row: (a) results of evolutionary processes, (b) surface deformations, (c) biological patterns.
Bottom row: (d)-(f) periodic man-structured objects.

Locally narrowband signals can model a variety of textured images like patterns formed by surface deformations, orientation-diffusion biological markings as well as man-made objects exhibiting periodic structure, like those in Fig. 1. Modulation, or AM-FM, models have been successfully applied to speech signal analysis [4], [35], and are ideally suited for the description of such image signals [3], [34]. Modeling signals in terms of non-stationary sinusoids,

$$f(x,y) = a(x,y)\cos(\phi(x,y)), \tag{1}$$

AM-FM models locally capture image contrast in terms of the amplitude modulating signal $a(x,y)$ and image structure (scale and orientation) in terms of the instantaneous frequency vector:

$$\vec{\omega}(x,y) = \nabla\phi(x,y) = \left(\frac{\partial\phi}{\partial x}, \frac{\partial\phi}{\partial y}\right)(x,y). \tag{2}$$

Even though many natural textures can be modeled in terms of a monocomponent AM-FM signal, images with 2D structure containing patterns like corners, crosses and junctions necessitate more than one component being simultaneously present in the local image spectrum. The multicomponent AM-FM model [19], [20] models an image $I$ as the superposition of locally narrowband sinusoidal components $f_k(x,y)$ corrupted by a white Gaussian noise field $w(x,y)$:

$$I(x,y) = \sum_{k=1}^{K} \underbrace{a_k(x,y)\cos(\phi_k(x,y))}_{f_k(x,y)} + w(x,y). \tag{3}$$

The fundamental problem of image demodulation is to estimate, for each of the $K$ components, the instantaneous amplitudes $a_k(x,y)$ and frequencies $\vec{\omega}_k(x,y) = \nabla\phi_k(x,y)$. The decomposition of an image in terms of this expression is an ill-posed problem, due to the existence of an infinity of modulating signal pairs and component superpositions satisfying (3).
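A small numerical check can make this ambiguity concrete. The sketch below is only an illustration under assumed values (the grid size and the frequencies `w1`, `w2` are hypothetical and not taken from the paper): the same noise-free image is written once as two constant-amplitude components of the form (3) and once as a single component of the form (1) with a slowly varying, signed amplitude.

```python
import numpy as np

# Hypothetical grid and frequencies, chosen only for illustration.
x, y = np.meshgrid(np.arange(64, dtype=float), np.arange(64, dtype=float))
w1 = np.array([0.50, 0.20])   # (omega_x, omega_y) of the first sinusoid
w2 = np.array([0.40, 0.30])   # (omega_x, omega_y) of the second sinusoid

def phase(w):
    """Linear phase w_x * x + w_y * y on the grid."""
    return w[0] * x + w[1] * y

# Reading 1: K = 2 components in (3), each with constant amplitude 1.
two_components = np.cos(phase(w1)) + np.cos(phase(w2))

# Reading 2: K = 1 component as in (1), with a slowly varying amplitude
# that is allowed to change sign.
amplitude = 2.0 * np.cos(phase((w1 - w2) / 2.0))
one_component = amplitude * np.cos(phase((w1 + w2) / 2.0))

# Largest pointwise difference between the two readings of the same image.
max_gap = np.max(np.abs(two_components - one_component))
```

Both readings agree up to floating-point error, by the identity cos A + cos B = 2 cos((A-B)/2) cos((A+B)/2), so the image alone cannot decide between them without further constraints such as a fixed filterbank.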
Even if a separation of $I$ into narrowband components $f_k(x,y)$ is known in advance, unavoidable modeling errors of any demodulation algorithm, the presence of noise, interference from neighboring spectral components, and discretization of the signal derivatives are possible sources of error in component estimation. Robustness in the AM-FM demodulation problem can be achieved by considering the following problems:

(P1) Reduction of the error in modeling each narrowband component $f_k(x,y)$ by a 2D AM-FM signal while maintaining smoothness in the estimated modulation signals.
(P2) Suppression of noise.
(P3) Suppression of neighboring spectral components while estimating one component.
(P4) Regularization of derivatives.

Simultaneously achieving all the above goals is a complex optimization task, which remains an unsolved problem. In the following subsections well-established solutions to problems (P1)-(P3) are presented, followed in subsection B by a novel algorithm that jointly considers all problems. In subsection C the DCA method is presented, together with a modified channel selection criterion that yields better-localized features.

A. AM-FM Demodulation

1) Energy Operators and Demodulation: At the heart of problem (P1) lies the fact that there is an infinite number of amplitude and phase combinations that satisfy (1) for a given $f$. An efficient scheme for the demodulation of the narrowband components into smooth modulating functions is provided by the multidimensional Energy Separation Algorithm (ESA) [34], which is based on a generalization to higher dimensions of the 1D Teager-Kaiser energy operator [35]:

$$\Psi(f)(x,y) \triangleq \|\nabla f(x,y)\|^2 - f(x,y)\,\nabla^2 f(x,y). \tag{4}$$

Let now $f$ be a 2D spatial AM-FM signal as in (1).
Under realistic assumptions [34], applying $\Psi$ to $f$ yields the energy product of the squared instantaneous amplitude and frequency magnitude,

$$\Psi[a\cos(\phi)] \approx a^2\|\vec{\omega}\|^2, \tag{5}$$

with an approximation error bounded within negligible range. This quantity may be interpreted as the component modulation energy. Applying $\Psi$ to the partial derivatives $f_x = \partial f/\partial x$, $f_y = \partial f/\partial y$, and combining all energies yields the 2D continuous Energy Separation Algorithm [34]:

$$\frac{\Psi(f)}{\sqrt{\Psi(f_x) + \Psi(f_y)}} \approx |a(x,y)|, \tag{6}$$

$$\sqrt{\frac{\Psi(f_x)}{\Psi(f)}} \approx |\omega_1(x,y)|, \qquad \sqrt{\frac{\Psi(f_y)}{\Psi(f)}} \approx |\omega_2(x,y)|, \tag{7}$$

which can estimate at each location $(x,y)$ the amplitude envelope and the magnitudes of the instantaneous frequencies of the non-stationary AM-FM signal. The signs of the frequency signals can be implicitly obtained by the signs of the carrier, approximated by the filter central frequencies.

2) Multiband Gabor filtering and Demodulation: A simultaneous solution to problems (P2) and (P3) has been given in [3], [4] using a bank of bandpass filters densely covering the frequency plane. The filterbanks used for this task are typically 2D Gabor filters, favored due to their optimal joint spatial and spectral localization [14], [9]. Apart from component decoupling and robustness to noise, this approach specifies in advance the number and spectral localization of the different components, thereby constraining the decomposition of any given 2D signal to a fixed component configuration. In Fig. 2 we show visually the filterbank used in our experiments, while details are given in App. I.

Fig. 2 (axes: horizontal frequency, vertical frequency): Filterbank grid on the 2D frequency domain.
Contours correspond to half-peak bandwidth magnitude.

Demodulation via the ESA can be extended to the complex signals derived from convolution with complex Gabor filters; the energy for a complex-valued signal $f(x,y) = a(x,y)\exp(j\phi(x,y))$ is defined as

$$C(f) = \Psi[\mathrm{Re}\{f\}] + \Psi[\mathrm{Im}\{f\}] \tag{8}$$

and, based on the approximation (5), the operator response is $C[f] \approx 2a^2\|\vec{\omega}\|^2$. The averaging of operator responses results in smoother estimates of the modulating functions. Applying $C$ to $f = I * g$ and its partial derivatives $f_x, f_y$ results in a demodulation scheme where the frequencies are given by (7), and the amplitude by a slight modification of (6):

$$|a(x,y)| \approx \frac{C(f)}{\sqrt{2}\,\sqrt{C(f_x) + C(f_y)}}. \tag{9}$$

Another point is that Gabor filtering imposes a specific decomposition of an arbitrary signal of the form (3) into a sum of narrowband components, with the frequency content of each component localized around the corresponding Gabor filter's central frequency. However, the frequency content of the actual component may not be centered at the fixed central frequency of the Gabor filter, thereby resulting in a suppressed estimate $a_k$ of its amplitude, $A_k$. This can be compensated for by using the component's estimated instantaneous frequency $\vec{\omega}_k$; specifically, if $G_k(\cdot)$ is the frequency response of the Gabor filter, the approximation

$$A_k = \frac{a_k}{|G_k(\vec{\omega}_k)|} \tag{10}$$

yields an amplitude estimate that is insensitive to deviations from the corresponding filter central frequency [20].

B. Regularized Demodulation

A problem that emerges with ESA demodulation is that the signal derivatives can only be approximated using discrete differentiation operations. As a result, the two differential operators entailed in the Energy Operator responses may furnish inaccurate amplitude and frequency estimates.
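To make the discretization issue concrete, the following is a minimal sketch of the ESA (4), (6), (7) in which every derivative is replaced by a central difference. It is not the authors' implementation: the use of `np.gradient`, the test amplitude `a`, and the frequencies `u`, `v` are assumptions made for illustration, and the input is a noise-free stationary sinusoid.

```python
import numpy as np

def energy_operator(f):
    """Discrete version of (4): ||grad f||^2 - f * laplacian(f)."""
    fy, fx = np.gradient(f)            # axis 0 is y, axis 1 is x
    fxx = np.gradient(fx, axis=1)      # repeated central differences
    fyy = np.gradient(fy, axis=0)
    return fx**2 + fy**2 - f * (fxx + fyy)

def discrete_esa(f):
    """Amplitude and frequency magnitudes from (6) and (7)."""
    fy, fx = np.gradient(f)
    psi = energy_operator(f)
    psi_x = energy_operator(fx)
    psi_y = energy_operator(fy)
    amp = psi / np.sqrt(psi_x + psi_y)
    wx = np.sqrt(psi_x / psi)
    wy = np.sqrt(psi_y / psi)
    return amp, wx, wy

# Hypothetical stationary test signal a * cos(u x + v y).
a, u, v = 1.5, 0.6, 0.3
x, y = np.meshgrid(np.arange(64.0), np.arange(64.0))
f = a * np.cos(u * x + v * y)

# One-sided differences at the image border can produce invalid values,
# so warnings are silenced and only interior pixels are inspected.
with np.errstate(invalid="ignore", divide="ignore"):
    amp, wx, wy = discrete_esa(f)
inner = (slice(4, -4), slice(4, -4))
amp_in, wx_in, wy_in = amp[inner], wx[inner], wy[inner]
```

For this input the interior amplitude estimate equals `a`, but the frequency estimates equal `sin(u)` and `sin(v)` rather than `u` and `v`, because a central difference scales a sinusoid of frequency `u` by `sin(u)` instead of `u`. This is one simple instance of the systematic error attributed above to discrete differentiation.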
In what follows we present a theoretically sound approach to alleviate this problem, introducing a regularized 2D energy operator and a related regularized 2D ESA.

As analyzed in [46] for edge detection, two regularized solutions to the derivative estimation problem, which minimize the sum of the data approximation error and the energy of the second derivative of the approximating function, are (i) spline interpolation and (ii) convolution of the image data by a function that can be closely modeled by a Gaussian. In our problem, which deals with narrowband but not necessarily low-pass signals, the Gaussian filter response must be modulated by a sine with carrier equal to the spectral mean location of the signal. This yields a Gabor filter. In [10], the spline and the Gabor regularization of the energy operator and the ESA were compared for 1D signals, yielding a slight superiority of the Gabor ESA.

Motivated by the above, we propose a 2D Gabor ESA algorithm for simultaneous filtering and demodulation. Let $I(x,y)$ be the continuous image, $g(x,y)$ the impulse response of a real 2D Gabor filter and $f(x,y) = I(x,y) * g(x,y)$ its output. Since convolution commutes with differentiation, the continuous 2D energy operator combined with Gabor bandpass filtering becomes

$$\Psi(f) = \Psi(I * g) = \|I * \nabla g\|^2 - (I * g)(I * \nabla^2 g). \tag{11}$$

Thus, the differential operators have been replaced by filter derivatives which can be analytically estimated, thereby avoiding discretization errors.

Similarly, for the estimation of the instantaneous amplitude and frequency, the 2D Gabor ESA for demodulating $f = I * g$ consists of the following two steps:

(1) Use the Gabor Energy Operator to compute the instantaneous energies of three image functions, $\Psi(f)$, $\Psi(f_x)$, $\Psi(f_y)$, where

$$\Psi(f_x) = \|I * \nabla g_x\|^2 - (I * g_x)(I * \nabla^2 g_x). \tag{12}$$

(2) Use the evaluated energies in the formula of the 2D continuous ESA.

For all three energies we need seven Gabor differential formulae: $g_x, g_y, g_{xx}, g_{yy}, g_{xy}, \nabla^2 g_x, \nabla^2 g_y$; the Gabor ESA is thus computationally more intensive, since it requires more convolutions, but adds robustness and improved performance. For efficiency we use an FFT-based frequency-domain implementation of the Gabor ESA, using the equation

$$\mathcal{F}\left\{\frac{\partial^{k+\ell} g}{\partial x^k\,\partial y^\ell}\right\} = \mathcal{F}\{g\}\,(j\omega_x)^k (j\omega_y)^\ell \tag{13}$$

relating the Fourier transforms $\mathcal{F}\{\cdot\}$ of a signal and its derivatives.

Fig. 3: Regularized Demodulation: (a) Representative AM-FM signal of the family (14) obtained for modulation index $\alpha = 0.5$ and $\log(\mathrm{SNR}) = 6$. (b) Gabor filter used for demodulation. (c) Fourier transform magnitudes for the filters involved in the alternative demodulation schemes (Gabor, derivative, difference), demonstrating the deviation of the central difference filter $\Delta_x$ from the derivative operation $\partial/\partial x$. (d) Deviation of $\Delta_x * g$ from $\partial g/\partial x$ in the frequency domain.

In Table I the performance of the discrete ESA is compared to the Gabor-ESA scheme at varying degrees of noise and non-stationarity.
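Before turning to that comparison, the frequency-domain route through (11)-(13) can be sketched as follows. This is a hedged illustration rather than the paper's implementation: the isotropic Gaussian envelope, `sigma`, the grid size `N`, the helper names `filtered` and `gabor_energy`, and the periodic (circular-convolution) setting are all assumptions; the actual filterbank is specified in App. I, which is not reproduced here.

```python
import numpy as np

N = 64
k1, k2 = 8, 4                        # integer cycle counts keep the image periodic
u, v = 2 * np.pi * k1 / N, 2 * np.pi * k2 / N

coords = np.arange(N) - N // 2
X, Y = np.meshgrid(coords, coords)   # X varies along axis 1, Y along axis 0
image = np.cos(u * X + v * Y)        # stationary sinusoid with amplitude 1

# Hypothetical real Gabor filter: Gaussian envelope, carrier at (u, v).
sigma = 4.0
gabor = np.exp(-(X**2 + Y**2) / (2 * sigma**2)) * np.cos(u * X + v * Y)
G = np.fft.fft2(np.fft.ifftshift(gabor))
F_image = np.fft.fft2(image)

w = 2 * np.pi * np.fft.fftfreq(N)    # angular frequency of each FFT bin
WX, WY = np.meshgrid(w, w)

def filtered(k, l):
    """image * (d^(k+l) g / dx^k dy^l), with the derivative taken as in (13)."""
    spectrum = F_image * G * (1j * WX) ** k * (1j * WY) ** l
    return np.real(np.fft.ifft2(spectrum))

def gabor_energy(k, l):
    """Energy operator (11)/(12) applied to h = image * d^(k+l) g."""
    h = filtered(k, l)
    grad_sq = filtered(k + 1, l) ** 2 + filtered(k, l + 1) ** 2
    laplacian = filtered(k + 2, l) + filtered(k, l + 2)
    return grad_sq - h * laplacian

psi = gabor_energy(0, 0)
psi_x = gabor_energy(1, 0)
psi_y = gabor_energy(0, 1)

amp = psi / np.sqrt(psi_x + psi_y)           # (6)
wx_est = np.sqrt(psi_x / psi)                # (7)
wy_est = np.sqrt(psi_y / psi)
amp_compensated = amp / np.abs(G[k2, k1])    # (10), filter gain at (u, v)
```

In this noise-free periodic setting no finite differences are taken, so the frequency estimates match `u` and `v` and the compensated amplitude matches the true value of 1 up to floating-point error. With real images, border handling and noise would of course change the picture.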
Signals of the form

$$f(x,y) = [1 + \alpha A(x,y)]\cos(u_c x + v_c y + \alpha\theta(x,y)), \tag{14}$$

$$\theta(x,y) = \frac{1}{4}\left[2\cos\left(\frac{u_c}{30}x\right) + \cos\left(\frac{v_c}{30}y\right)\right], \tag{15}$$

$$A(x,y) = \exp\left(-\frac{x^2 + y^2}{10}\right) \tag{16}$$

are used, where $u_c, v_c$ are the central frequencies of the Gabor filter used for demodulation, shown in Fig. 3(b). The signal is immersed in white Gaussian noise at various Signal to Noise Ratios (SNRs), while the index $\alpha$ is varied to produce different degrees of non-stationarity.

TABLE I: Demodulation comparisons between Gabor-ESA and discrete-ESA.

Amplitude error $\langle(\hat{A} - A)^2\rangle$ for Gabor / discrete ESA (bold/plain in the original):

log SNR    α = 0 (Gabor)    α = 0 (discrete)    α = 1 (Gabor)    α = 1 (discrete)
10         4.2 × 10^-4      1.2 × 10^-2         2.1 × 10^-2      3.9 × 10^-2
6          6.9 × 10^-4      1.3 × 10^-3         2.2 × 10^-4      3.9 × 10^-2

Frequency error $\langle(\hat{\omega}_x - \omega_x)^2\rangle$ for Gabor / discrete ESA (bold/plain in the original):

log SNR    α = 0 (Gabor)    α = 0 (discrete)    α = 1 (Gabor)    α = 1 (discrete)
10         6.4 × 10^-5      1.9 × 10^-2         4.9 × 10^-3      2.2 × 10^-2
6          1.1 × 10^-4      1.9 × 10^-2         4.9 × 10^-3      2.3 × 10^-2

For $\alpha = 0$, i.e. a stationary sinusoid, the approximation in (5) becomes exact, so for Gabor-ESA the only source of error is noise. On the contrary, the differentiation scheme used in the discrete-ESA introduces systematic errors, as shown in Fig. 3(d), and results in inferior frequency and amplitude estimates. For higher degrees of non-stationarity Gabor ESA systematically yields better estimates, with the errors being solely due to the noise signal and the approximations of ESA.

C. Texture Features

The demodulation procedure furnishes a three-dimensional vector $(A_k, \nabla\phi_k)(x,y)$ for each of the components in (3), so demodulating the filterbank channel outputs yields a $3 \times K$-dimensional texture feature vector at each pixel.
This multidimensional feature extraction scheme, termed Channelized Component Analysis in [20], provides a rich image representation and can achieve accurate reconstructions of multicomponent signals; however, the high dimensionality of the feature vector may result in poor segmentations.

A compact texture description can be extracted using the Dominant Component Analysis method (DCA) [18], [20] that retains the most prominent structure of the texture signal. Assuming that a single narrowband component dominates the filter responses at pixel $(x,y)$, DCA selects pixelwise the channel $i(x,y)$ that is closest to the component, demodulates its output and uses the resulting AM-FM features for texture representation. The channel $i(x,y)$ is chosen among the $K$ filter responses by maximizing a criterion $\Gamma_k(x,y)$:

$$i(x,y) = \arg\max_{1 \le k \le K}\{\Gamma_k(x,y)\}, \tag{17}$$

$$A_{DCA}(x,y) = A_{i(x,y)}(x,y), \qquad \vec{\omega}_{DCA}(x,y) = \vec{\omega}_{i(x,y)}(x,y). \tag{18}$$

The choice of the dominant channel in the original work on DCA has been based on the maximization of the estimated amplitude envelopes:

$$\Gamma_k(x,y) = |a_k(x,y)|. \tag{19}$$

In Fig. 4 a locally narrowband signal is used to demonstrate the structure-capturing properties of this procedure. A texton dictionary-based method would break the image into pieces indicating which of the textons best match the input signal, yielding a discrete, texton-index tessellation of the image, while a filterbank-based feature descriptor would retain all filter responses, even though most offer no complementary information to that of the most active filter. On the other hand, using the DCA method, a single filter is automatically selected and a low-dimensional, smoothly varying feature vector is derived from it. Note that instead of the instantaneous frequency measurements, in Fig. 4 we use the phase estimate delivered by the complex Gabor filter, since it is better suited for visual display.

Fig. 4 (diagram labels: analysis from $I$ through the filters $g_i$, channel outputs $I * g_i$ and multiband demodulation $A_i\cos(\phi_i)$ to the DCA output $A_{DCA}\cos(\phi_{DCA})$; synthesis in the reverse direction): Dominant Components Analysis method for a locally narrowband signal: a set of bandpass Gabor filters is initially used to isolate and demodulate the individual components of (3). The dominant channel is subsequently chosen at each image location, and its AM-FM parameters are used as a local texture descriptor. The principal structure of the textured signal is thus captured by the DCA parameters.

The refined frequency and amplitude estimates (7), (10) furnished by the demodulation algorithm allow us to move from the quantized set of orientations and scales used by the front-end filterbank to a continuous representation.

1) Energy-based Dominant Component Analysis (EDCA): As an alternative to amplitude-based dominant component extraction, termed ADCA henceforth, we have considered an energy channel selection criterion, based on the modulation product (5), leading to the Energy-based Dominant Component Analysis (EDCA) scheme. Intuitively, if we think of texture signals as produced by physical oscillating sources in different scales and orientations, the selection of the dominant component could be based on the maximum-energy source that accounts for producing the local texture modulations. According to this scheme, modulation features are chosen from the filter output of dominating energy:

$$\Gamma_k(x,y) = \Psi[(I * g_k)(x,y)], \tag{20}$$

where the complex energy operator (8) is used for a complex filter $g_k$.

Using the modulation energy for DCA results in improved localization in texture and object boundaries: since the 2D energy operator jointly captures contrast and frequency information in the modulation product (5), the scheme can effectively consider channels with low amplitude (i.e. contrast) variations but high instantaneous frequency magnitude.

To illustrate their differences, in Fig.
5 we compare the features extracted using the original and the alternative energy-based method. Comparing the second and third columns, we see that the EDCA measurements are sharper around object boundaries, with improved localization and detail preservation. We observe for example that the diffusion effects around the borders of the tiger and the zebra are alleviated using EDCA. The reconstructions delivered by the two schemes reveal the preservation of finer structure in the energy-based scheme; as an indicative example, notice that ADCA interprets the feet of the zebra as a slowly varying horizontal oscillation, while EDCA focuses on the smaller scale structure of the vertical zebra skin pattern.

We note here that the DCA model is designed primarily for 1-D features, like sinusoidal signals, and requires additional AM-FM components to model 0-D and 2-D features like blobs and crosses, respectively. It would be beneficial to account for such patterns in our front-end system, but we have observed in practice that, as seen also in Fig. 5, for images exhibiting such patterns a perceptually meaningful part of the image structure is captured by the DCA features.

III. LOCAL GENERATIVE MODELS FOR TEXTURE AND EDGES

In this section we justify probabilistically the channel selection of DCA, introducing a generative model that accounts for the locality of the decision process. Based on this model