  • 1.
    Hu, Hongmei
    et al.
    ISVR, University of Southampton.
    Mohammadiha, Nasser
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Taghia, Jalil
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Lutman, Mark E
    ISVR, University of Southampton.
    Wang, Shouyan
    ISVR, University of Southampton.
    Sparsity level in a non-negative matrix factorization based speech strategy in cochlear implants. 2012. In: 2012 Proceedings Of The 20th European Signal Processing Conference (EUSIPCO), IEEE Computer Society, 2012, p. 2432-2436. Conference paper (Refereed)
    Abstract [en]

    Non-negative matrix factorization (NMF) has increasingly been used as a tool in signal processing in recent years, but it has not been used in cochlear implants (CIs). To improve the performance of CIs in noisy environments, a novel sparse strategy is proposed by applying NMF to the envelopes of 22 channels. In the new algorithm, the noisy speech is first transferred to the time-frequency domain via a 22-channel filter bank and the envelope in each frequency channel is extracted; secondly, NMF is applied to the envelope matrix (envelopegram); finally, a sparsity condition is applied to the coefficient matrix to obtain a sparser representation. A speech reception threshold (SRT) subjective experiment was performed in combination with five objective measurements in order to choose the proper parameters for the sparse NMF model.
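
The three steps in this abstract (envelope matrix, NMF, sparsity on the coefficient matrix) can be illustrated with a minimal sketch. It is not the authors' implementation: the Euclidean cost, the L1 penalty, the multiplicative updates, the column normalisation heuristic, and the names `sparse_nmf`, `rank`, and `sparsity` are all assumptions chosen for illustration, and a random matrix stands in for a real 22-channel envelopegram.

```python
import numpy as np

def sparse_nmf(V, rank, sparsity=0.1, n_iter=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (channels x frames) as W @ H.

    Assumed formulation: multiplicative updates for the Euclidean cost
    with an L1 penalty (weight `sparsity`) on the coefficient matrix H.
    """
    rng = np.random.default_rng(seed)
    n_channels, n_frames = V.shape
    W = rng.random((n_channels, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    for _ in range(n_iter):
        # The L1 penalty shows up as an extra constant in the denominator,
        # which shrinks small coefficients towards zero.
        H *= (W.T @ V) / (W.T @ W @ H + sparsity + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        # Heuristic: keep basis columns at unit norm so the penalty on H
        # cannot be avoided by rescaling W; the product W @ H is unchanged.
        norms = np.linalg.norm(W, axis=0) + eps
        W /= norms
        H *= norms[:, None]
    return W, H

# Stand-in for a 22-channel envelopegram with 100 frames (synthetic data).
rng = np.random.default_rng(1)
V = rng.random((22, 3)) @ rng.random((3, 100))
W, H = sparse_nmf(V, rank=3, sparsity=0.1)
print(W.shape, H.shape)
```

Raising `sparsity` trades reconstruction accuracy for a sparser `H`; the abstract describes choosing such parameters from listening tests and objective measures rather than from a fixed rule.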

  • 2. Hu, H.
    et al.
    Taghia, Jalil
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Sang, J.
    Mohammadiha, Nasser
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Azarpour, M.
    Dokku, R.
    Wang, S.
    Lutman, M. E.
    Bleeck, S.
    Speech enhancement via combination of Wiener filter and blind source separation. 2011. In: Proceedings of the Sixth International Conference on Intelligent Systems and Knowledge Engineering, Shanghai, China (ISKE2011), 2011, p. 485-494. Conference paper (Refereed)
    Abstract [en]

    Automatic speech recognition (ASR) often fails in acoustically noisy environments. Aimed to improve speech recognition scores of an ASR in a real-life like acoustical environment, a speech pre-processing system is proposed in this paper, which consists of several stages: First, a convolutive blind source separation (BSS) is applied to the spectrogram of the signals that are pre-processed by binaural Wiener filtering (BWF). Secondly, the target speech is detected by an ASR system recognition rate based on a Hidden Markov Model (HMM). To evaluate the performance of the proposed algorithm, the signal-to-interference ratio (SIR), the improvement signal-to-noise ratio (ISNR) and the speech recognition rates of the output signals were calculated using the signal corpus of the CHiME database. The results show an improvement in SIR and ISNR, but no obvious improvement of speech recognition scores. Improvements for future research are suggested.

  • 3.
    Taghia, Jalil
    et al.
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Variational Inference for Watson Mixture Model. In: IEEE Transactions on Pattern Analysis and Machine Intelligence, ISSN 0162-8828, E-ISSN 1939-3539. Article in journal (Other academic)
    Abstract [en]

    This paper addresses modelling data using the multivariate Watson distributions. The Watson distribution is one of the simplest distributions for analyzing axially symmetric data. This distribution has gained some attention in recent years due to its modeling capability. However, its Bayesian inference is fairly understudied due to difficulty in handling the normalization factor. Recent development of Markov chain Monte Carlo (MCMC) sampling methods can be applied for this purpose. However, these methods can be prohibitively slow for practical applications. A deterministic alternative is provided by variational methods that convert inference problems into optimization problems. In this paper, we present a variational inference for Watson mixture model. First, the variational framework is used to side-step the intractability arising from the coupling of latent states and parameters. Second, the variational free energy is further lower bounded in order to avoid intractable moment computation. The proposed approach provides a lower bound on the log marginal likelihood and retains distributional information over all parameters. Moreover, we show that it can regulate its own complexity by pruning unnecessary mixture components while avoiding over-fitting. We discuss potential applications of the modeling with Watson distributions in the problem of blind source separation, and clustering gene expression data sets.

  • 4. Ma, Zhanyu
    et al.
    Rana, Pravin Kumar
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Taghia, Jalil
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Flierl, Markus
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Bayesian estimation of Dirichlet mixture model with variational inference. 2014. In: Pattern Recognition, ISSN 0031-3203, E-ISSN 1873-5142, Vol. 47, no 9, p. 3143-3157. Article in journal (Refereed)
    Abstract [en]

    In statistical modeling, parameter estimation is an essential and challenging task. Estimation of the parameters in the Dirichlet mixture model (DMM) is analytically intractable, due to the integral expressions of the gamma function and its corresponding derivatives. We introduce a Bayesian estimation strategy to estimate the posterior distribution of the parameters in DMM. By assuming the gamma distribution as the prior to each parameter, we approximate both the prior and the posterior distribution of the parameters with a product of several mutually independent gamma distributions. The extended factorized approximation method is applied to introduce a single lower-bound to the variational objective function and an analytically tractable estimation solution is derived. Moreover, there is only one function that is maximized during iterations and, therefore, the convergence of the proposed algorithm is theoretically guaranteed. With synthesized data, the proposed method shows the advantages over the EM-based method and the previously proposed Bayesian estimation method. With two important multimedia signal processing applications, the good performance of the proposed Bayesian estimation method is demonstrated.

  • 5. Ma, Zhanyu
    et al.
    Teschendorff, Andrew E.
    Yu, Hong
    Taghia, Jalil
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Guo, Jun
    Comparisons of Non-Gaussian Statistical Models in DNA Methylation Analysis. 2014. In: International Journal of Molecular Sciences, ISSN 1422-0067, E-ISSN 1422-0067, Vol. 15, no 6, p. 10835-10854. Article in journal (Refereed)
    Abstract [en]

    As a key regulatory mechanism of gene expression, DNA methylation patterns are widely altered in many complex genetic diseases, including cancer. DNA methylation is naturally quantified by bounded support data; therefore, it is non-Gaussian distributed. In order to capture such properties, we introduce some non-Gaussian statistical models to perform dimension reduction on DNA methylation data. Afterwards, non-Gaussian statistical model-based unsupervised clustering strategies are applied to cluster the data. Comparisons and analysis of different dimension reduction strategies and unsupervised clustering methods are presented. Experimental results show that the non-Gaussian statistical model-based methods are superior to the conventional Gaussian distribution-based method. They are meaningful tools for DNA methylation analysis. Moreover, among several non-Gaussian methods, the one that captures the bounded nature of DNA methylation data reveals the best clustering performance.

  • 6.
    Mohammadiha, Nasser
    et al.
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Taghia, Jalil
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Single channel speech enhancement using Bayesian NMF with recursive temporal updates of prior distributions. 2012. In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2012, IEEE conference proceedings, 2012, p. 4561-4564. Conference paper (Refereed)
    Abstract [en]

    We present a speech enhancement algorithm which is based on a Bayesian Nonnegative Matrix Factorization (NMF). Both Minimum Mean Square Error (MMSE) and Maximum a-Posteriori (MAP) estimates of the magnitude of the clean speech DFT coefficients are derived. To exploit the temporal continuity of the speech and noise signals, a proper prior distribution is introduced by widening the posterior distribution of the NMF coefficients at the previous time frames. To do so, a recursive temporal update scheme is proposed to obtain the mean value of the prior distribution; also, the uncertainty of the prior information is governed by the shape parameter of the distribution which is learnt automatically based on the nonstationarity of the signals. Simulations show a considerable improvement compared to the maximum likelihood NMF based speech enhancement algorithm for different input SNRs.

  • 7.
    Rana, Pravin Kumar
    et al.
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Ma, Zhanyu
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Taghia, Jalil
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Flierl, Markus
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Multiview Depth Map Enhancement by Variational Bayes Inference Estimation of Dirichlet Mixture Models. 2013. In: 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, 2013, p. 1528-1532. Conference paper (Refereed)
    Abstract [en]

    High quality view synthesis is a prerequisite for future free-viewpoint television. It will enable viewers to move freely in a dynamic real world scene. Depth image based rendering algorithms will play a pivotal role when synthesizing an arbitrary number of novel views by using a subset of captured views and corresponding depth maps only. Usually, each depth map is estimated individually by stereo-matching algorithms and, hence, shows lack of inter-view consistency. This inconsistency affects the quality of view synthesis negatively. This paper enhances the inter-view consistency of multiview depth imagery. First, our approach classifies the color information in the multiview color imagery by modeling color with a mixture of Dirichlet distributions where the model parameters are estimated in a Bayesian framework with variational inference. Second, using the resulting color clusters, we classify the corresponding depth values in the multiview depth imagery. Each clustered depth image is subject to further sub-clustering. Finally, the resulting mean of each sub-cluster is used to enhance the depth imagery at multiple viewpoints. Experiments show that our approach improves the average quality of virtual views by up to 0.8 dB when compared to views synthesized by using conventionally estimated depth maps.

  • 8.
    Rana, Pravin Kumar
    et al.
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Taghia, Jalil
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Flierl, Markus
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Statistical methods for inter-view depth enhancement. 2014. In: 2014 3DTV-Conference: The True Vision - Capture, Transmission and Display of 3D Video (3DTV-CON), IEEE, 2014, p. 6874755. Conference paper (Refereed)
    Abstract [en]

    This paper briefly presents and evaluates recent advances in statistical methods for improving inter-view inconsistency in multiview depth imagery. View synthesis is vital in free-viewpoint television in order to allow viewers to move freely in a dynamic scene. Here, depth image-based rendering plays a pivotal role by synthesizing an arbitrary number of novel views by using a subset of captured views and corresponding depth maps only. Usually, each depth map is estimated individually at different viewpoints by stereo matching and, hence, shows lack of inter-view consistency. This lack of consistency affects the quality of view synthesis negatively. This paper discusses two different approaches to enhance the inter-view depth consistency. The first one uses generative models based on multiview color and depth classification to assign a probabilistic weight to each depth pixel. The weighted depth pixels are utilized to enhance depth maps. The second one performs inter-view consistency testing in depth difference space to enhance the depth maps at multiple viewpoints. We comparatively evaluate these two methods and discuss their pros and cons for future work.

  • 9.
    Rana, Pravin Kumar
    et al.
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Taghia, Jalil
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Flierl, Markus
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    A Variational Bayesian Inference Framework for Multiview Depth Image Enhancement. 2012. In: Proceedings - 2012 IEEE International Symposium on Multimedia, ISM 2012, IEEE, 2012, p. 183-190. Conference paper (Refereed)
    Abstract [en]

    In this paper, a general model-based framework for multiview depth image enhancement is proposed. Depth imagery plays a pivotal role in emerging free-viewpoint television. This technology requires high quality virtual view synthesis to enable viewers to move freely in a dynamic real world scene. Depth imagery of different viewpoints is used to synthesize an arbitrary number of novel views. Usually, the depth imagery is estimated individually by stereo-matching algorithms and, hence, shows lack of inter-view consistency. This inconsistency affects the quality of view synthesis negatively. This paper enhances the inter-view consistency of multiview depth imagery by using a variational Bayesian inference framework. First, our approach classifies the color information in the multiview color imagery. Second, using the resulting color clusters, we classify the corresponding depth values in the multiview depth imagery. Each clustered depth image is subject to further subclustering. Finally, the resulting mean of the sub-clusters is used to enhance the depth imagery at multiple viewpoints. Experiments show that our approach improves the quality of virtual views by up to 0.25 dB.

  • 10.
    Rana, Pravin Kumar
    et al.
    KTH, School of Electrical Engineering (EES), Communication Theory. KTH, School of Electrical Engineering (EES), Centres, ACCESS Linnaeus Centre.
    Taghia, Jalil
    KTH, School of Electrical Engineering (EES), Communication Theory. KTH, School of Electrical Engineering (EES), Centres, ACCESS Linnaeus Centre.
    Ma, Zhanyu
    Flierl, Markus
    KTH, School of Electrical Engineering (EES), Communication Theory. KTH, School of Electrical Engineering (EES), Centres, ACCESS Linnaeus Centre.
    Probabilistic Multiview Depth Image Enhancement Using Variational Inference. 2015. In: IEEE Journal on Selected Topics in Signal Processing, ISSN 1932-4553, E-ISSN 1941-0484, Vol. 9, no 3, p. 435-448. Article in journal (Refereed)
    Abstract [en]

    An inference-based multiview depth image enhancement algorithm is introduced and investigated in this paper. Multiview depth imagery plays a pivotal role in free-viewpoint television. This technology requires high-quality virtual view synthesis to enable viewers to move freely in a dynamic real world scene. Depth imagery of different viewpoints is used to synthesize an arbitrary number of novel views. Usually, the depth imagery is estimated individually by stereo-matching algorithms and, hence, shows inter-view inconsistency. This inconsistency affects the quality of view synthesis negatively. This paper enhances the multiview depth imagery at multiple viewpoints by probabilistic weighting of each depth pixel. First, our approach classifies the color pixels in the multiview color imagery. Second, using the resulting color clusters, we classify the corresponding depth values in the multiview depth imagery. Each clustered depth image is subject to further subclustering. Clustering based on generative models is used for assigning probabilistic weights to each depth pixel. Finally, these probabilistic weights are used to enhance the depth imagery at multiple viewpoints. Experiments show that our approach consistently improves the quality of virtual views by 0.2 dB to 1.6 dB, depending on the quality of the input multiview depth imagery.

  • 11.
    Taghia, Jalal
    et al.
    Institute of Communication Acoustics, Ruhr-Universität Bochum.
    Martin, Rainer
    Institute of Communication Acoustics, Ruhr-Universität Bochum.
    Taghia, Jalil
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    An Investigation on Mutual Information for the Linear Predictive System and the Extrapolation of Speech Signals. 2012. In: Speech Communication; 10. ITG Symposium; Proceedings of, 2012, p. 1-4. Conference paper (Refereed)
    Abstract [en]

    Mutual information (MI) is an important information theoretic concept which has many applications in telecommunications, in blind source separation, and in machine learning. More recently, it has been also employed for the instrumental assessment of speech intelligibility where traditionally correlation based measures are used. In this paper, we address the difference between MI and correlation from the viewpoint of discovering dependencies between variables in the context of speech signals. We perform our investigation by considering the linear predictive approximation and the extrapolation of speech signals as examples. We compare a parametric MI estimation approach based on a Gaussian mixture model (GMM) with the k-nearest neighbor (KNN) approach which is a well-known non-parametric method available to estimate the MI. We show that the GMM-based MI estimator leads to more consistent results.

  • 12. Taghia, Jalal
    et al.
    Martin, Rainer
    Taghia, Jalil
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Dual-channel noise reduction based on a mixture of circular-symmetric complex Gaussians on unit hypersphere. 2013. In: ICASSP IEEE Int Conf Acoust Speech Signal Process Proc, 2013, p. 7289-7293. Conference paper (Refereed)
    Abstract [en]

    In this paper a model-based dual-channel noise reduction approach is presented which is an alternative to conventional noise reduction algorithms essentially due to its independence of the noise power spectral density estimation and of any prior knowledge about the spatial noise field characteristics. We use a mixture of circular-symmetric complex-Gaussian distributions projected on the unit hypersphere for modeling the complex discrete Fourier transform coefficients of noisy speech signals in the frequency domain. According to the derived mixture model, clustering of the noise and the target speech components is performed depending on their direction of arrival. A soft masking strategy is proposed for speech enhancement based on responsibilities assigned to the target speech class in each time-frequency bin. Our experimental results show that the proposed approach is more robust than conventional dual-channel noise reduction systems based on the single- and dual-channel noise power spectral density estimators.

  • 13.
    Taghia, Jalal
    et al.
    Ruhr University of Bochum.
    Taghia, Jalil
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Mohammadiha, Nasser
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Sang, Jinqiu
    University of Southampton.
    Bouse, Vaclav
    Siemens.
    Martin, Rainer
    Ruhr University of Bochum.
    An Evaluation of noise power spectral density estimation algorithms in adverse acoustic environments. 2011. In: 36th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011, IEEE, 2011, p. 4640-4643. Conference paper (Refereed)
    Abstract [en]

    Noise power spectral density estimation is an important component of speech enhancement systems due to its considerable effect on the quality and the intelligibility of the enhanced speech. Recently, many new algorithms have been proposed and significant progress in noise tracking has been made. In this paper, we present an evaluation framework for measuring the performance of some recently proposed and some well-known noise power spectral density estimators and compare their performance in adverse acoustic environments. In this investigation we do not only consider the performance in the mean of a spectral distance measure but also evaluate the variance of the estimators, as the latter is related to undesirable fluctuations also known as musical noise. By providing a variety of different non-stationary noises, the robustness of noise estimators in adverse environments is examined.

  • 14.
    Taghia, Jalil
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Bayesian Modeling of Directional Data with Acoustic and Other Applications. 2014. Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    A direction is defined here as a multi-dimensional unit vector. Such unit vectors form directional data. Closely related to directional data are axial data for which each direction is equivalent to the opposite direction. Directional data and axial data arise in various fields of science. In probabilistic modeling of such data, probability distributions are needed which account for the structure of the space from which data samples are collected. Such distributions are known as directional distributions and axial distributions. This thesis studies the von Mises-Fisher (vMF) distribution and the (complex) Watson distribution as representatives of directional and axial distributions.

    Probabilistic models of the data are defined through a set of parameters. In the Bayesian view to uncertainty, these parameters are regarded as random variables in the learning inference. The primary goal of this thesis is to develop Bayesian inference for directional and axial models, more precisely, vMF and (complex) Watson distributions, and parametric mixture models of such distributions. The Bayesian inference is realized using a family of optimization methods known as variational inference. With the proposed variational methods, the intractable Bayesian inference problem is cast as an optimization problem.

    The variational inference for vMF and Watson models shall open up new applications and advance existing application domains by reducing restrictive assumptions made by current modelling techniques. This is the central theme of the thesis in all studied applications. Unsupervised clustering of gene-expression and gene-microarray data is an existing application domain, which has been further advanced in this thesis. This thesis also advances application of the complex Watson models in the problem of blind source separation (BSS) with acoustic applications. Specifically, it is shown that the restrictive assumption of prior knowledge on the true number of sources can be relaxed by the desirable pruning property in Bayesian learning, resulting in BSS methods which can estimate the number of sources.

    Furthermore, this thesis introduces a fully Bayesian recursive framework for the BSS task. This is an attempt toward realization of an online BSS method. In order to reduce the well-known problem of permutation ambiguity in the frequency domain, the complete BSS problem is solved in one unified modeling step, combining the frequency bin-wise source estimation with the permutation problem. To realize this, all time frames and frequency bins are connected using a first order Markov chain. The model can capture dependencies across both time frames and frequency bins, simultaneously, using a feed-forward two-dimensional hidden Markov model (2-D HMM).

  • 15.
    Taghia, Jalil
    et al.
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Doostari, Mohammad Ali
    Shahed University, Khalije Fars Highway, Teheran, Iran.
    Subband-based Single-channel Source Separation of Instantaneous Audio Mixtures. 2009. In: World Applied Sciences Journal, ISSN 1818-4952, E-ISSN 1991-6426, Vol. 6, no 6, p. 784-792. Article in journal (Refereed)
    Abstract [en]

    In this paper, a new algorithm is developed to separate the audio sources from a single instantaneous mixture. The algorithm is based on subband decomposition and uses a hybrid system of Empirical Mode Decomposition (EMD) and Principal Component Analysis (PCA) to construct artificial observations from the single mixture. In the separation stage of the algorithm, we use Independent Component Analysis (ICA) to find independent components. At first the observed mixture is divided into a finite number of subbands through filtering with a parallel bank of FIR band-pass filters. Then EMD is employed to extract Intrinsic Mode Functions (IMFs) in each subband. By applying PCA to the extracted components, we find uncorrelated components which are the artificial observations. Then we obtain independent components by applying ICA to the uncorrelated components. Finally, we carry out a subband synthesis process to reconstruct fullband separated signals. The experimental results substantiate that the proposed method truly performs the task of source separation from a single instantaneous mixture.

  • 16. Taghia, Jalil
    et al.
    Doostari, Mohammad Ali
    Taghia, Jalal
    An Image Watermarking Method Based on Bidimensional Empirical Mode Decomposition. 2008. In: CISP 2008: FIRST INTERNATIONAL CONGRESS ON IMAGE AND SIGNAL PROCESSING, VOL 5, PROCEEDINGS / [ed] Li, D; Deng, G, China, 2008, p. 674-678. Conference paper (Refereed)
  • 17.
    Taghia, Jalil
    et al.
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Gerkmann, Timo
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Blind Source Separation of Nondisjoint Sources in The Time-Frequency Domain with Model-Based Determination of Source Contribution. 2011. In: 2011 IEEE International Symposium On Signal Processing And Information Technology (ISSPIT), New York: IEEE, 2011, p. 276-280. Conference paper (Refereed)
    Abstract [en]

    While most blind source separation (BSS) algorithms rely on the assumption that at most one source is dominant at each time-frequency (TF) point, recently, two BSS approaches, [1], [2], have been proposed that allow multiple active sources at time-frequency (TF) points under certain assumptions. In both algorithms, the active sources in every single TF point are found by an exhaustive search through an optimization procedure which is computationally expensive. In this work, we address this limitation and avoid the exhaustive search by determining the source contribution in every TF point. The source contributions are expressed by a set of posterior probabilities. Hereby, we propose a model-based blind source separation algorithm that allows sources to be nondisjoint in the TF domain while being computationally more tractable. The proposed BSS approach is shown to be robust with respect to different reverberation times and microphone spacings.

  • 18.
    Taghia, Jalil
    et al.
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Bayesian Recursive Blind Source Separation. In: Journal of machine learning research, ISSN 1532-4435, E-ISSN 1533-7928. Article in journal (Other academic)
    Abstract [en]

    We consider the problem of blind source separation (BSS) of convolutive mixtures in underdetermined scenarios, where there are more sources to estimate than recorded signals. This problem has been intensively studied in the literature. Many successful methods rely on batch processing of previously recorded signals, and hence are only best suited for noncausal systems. This paper addresses the problem of online BSS. To realize this, we develop a Bayesian recursive framework. The proposed Bayesian framework allows incorporating prior knowledge in a coherent way, and the recursive learning allows combining information gained from the current observation with all information from the previous observations. Experiments using live audio recordings show promising results.

  • 19.
    Taghia, Jalil
    et al.
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Separation of Unknown Number of Sources. 2014. In: IEEE Signal Processing Letters, ISSN 1070-9908, E-ISSN 1558-2361, Vol. 21, no 5, p. 625-629. Article in journal (Refereed)
    Abstract [en]

    We address the problem of blind source separation in acoustic applications where there is no prior knowledge about the number of mixing sources. The presented method employs a mixture of complex Watson distributions in its generative model with a sparse Dirichlet distribution over the mixture weights. The problem is formulated in a fully Bayesian inference with assuming prior distributions over all model parameters. The presented model can regulate its own complexity by pruning unnecessary components by which we can possibly relax the assumption of prior knowledge on the number of sources.

  • 20.
    Taghia, Jalil
    et al.
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES).
    Variational Inference for Watson Mixture Model. 2016. In: IEEE Transactions on Pattern Analysis and Machine Intelligence, ISSN 0162-8828, E-ISSN 1939-3539, Vol. 38, no 9, p. 1886-1900. Article in journal (Refereed)
    Abstract [en]

    This paper addresses modeling data using the Watson distribution. The Watson distribution is one of the simplest distributions for analyzing axially symmetric data. This distribution has gained some attention in recent years due to its modeling capability. However, its Bayesian inference is fairly understudied due to the difficulty of handling the normalization factor. Recently developed Markov chain Monte Carlo (MCMC) sampling methods can be applied for this purpose. However, these methods can be prohibitively slow for practical applications. A deterministic alternative is provided by variational methods that convert inference problems into optimization problems. In this paper, we present a variational inference for Watson mixture models. First, the variational framework is used to side-step the intractability arising from the coupling of latent states and parameters. Second, the variational free energy is further lower bounded in order to avoid intractable moment computation. The proposed approach provides a lower bound on the log marginal likelihood and retains distributional information over all parameters. Moreover, we show that it can regulate its own complexity by pruning unnecessary mixture components while avoiding over-fitting. We discuss potential applications of modeling with Watson distributions to blind source separation and to clustering gene expression data sets.

  • 21.
    Taghia, Jalil
    et al.
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Ma, Z.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Communication Theory.
    On von-Mises Fisher mixture model in Text-independent speaker identification. 2013. In: Proceedings of the 2013 INTERSPEECH, 2013, p. 2499-2503. Conference paper (Refereed)
    Abstract [en]

    This paper addresses text-independent speaker identification (SI) based on line spectral frequencies (LSFs). The LSFs are transformed to differential LSFs (MLSF) in order to exploit their boundary and ordering properties. We show that the square root of MLSF has interesting directional characteristics, implying that its distribution can be modeled by a mixture of von-Mises Fisher (vMF) distributions. We analytically estimate the mixture model parameters in a fully Bayesian treatment by using variational inference. With Bayesian inference, we can potentially determine the model complexity and avoid the overfitting problem associated with conventional approaches based on expectation maximization. The experimental results confirm the effectiveness of the proposed SI system.

  • 22.
    Taghia, Jalil
    et al.
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Ma, Zhanyu
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Bayesian Estimation of the von-Mises Fisher Mixture Model with Variational Inference. 2014. In: IEEE Transactions on Pattern Analysis and Machine Intelligence, ISSN 0162-8828, E-ISSN 1939-3539, Vol. 36, no 9, p. 1701-1715. Article in journal (Refereed)
    Abstract [en]

    This paper addresses the Bayesian estimation of the von-Mises Fisher (vMF) mixture model with variational inference (VI). The learning task in VI consists of optimization of the variational posterior distribution. However, the exact solution by VI does not lead to an analytically tractable solution due to the evaluation of intractable moments involving functional forms of the Bessel function in their arguments. To derive a closed-form solution, we further lower bound the evidence lower bound where the bound is tight at one point in the parameter distribution. While having the value of the bound guaranteed to increase during maximization, we derive an analytically tractable approximation to the posterior distribution which has the same functional form as the assigned prior distribution. The proposed algorithm requires no iterative numerical calculation in the re-estimation procedure, and it can potentially determine the model complexity and avoid the over-fitting problem associated with conventional approaches based on the expectation maximization. Moreover, we derive an analytically tractable approximation to the predictive density of the Bayesian mixture model of vMF distributions. The performance of the proposed approach is verified by experiments with both synthetic and real data.

  • 23.
    Taghia, Jalil
    et al.
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Mohammadiha, Nasser
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    A variational Bayes approach to the underdetermined blind source separation with automatic determination of the number of sources. 2012. In: Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on / [ed] IEEE, IEEE, 2012, p. 253-256. Conference paper (Refereed)
    Abstract [en]

    In this paper, we propose a variational Bayes approach to the underdetermined blind source separation and show how a variational treatment can open up the possibility of determining the actual number of sources. The procedure is performed in a frequency bin-wise manner. In every frequency bin, we model the time-frequency mixture by a variational mixture of Gaussians with a circular-symmetric complex-Gaussian density function. In the Bayesian inference, we consider appropriate conjugate prior distributions for modeling the parameters of this distribution. The learning task consists of estimating the hyper-parameters characterizing the parameter distributions for the optimization of the variational posterior distribution. The proposed approach requires no prior knowledge on the number of sources in a mixture.
