kth.se Publications
1 - 21 of 21
  • 1.
    Hongmei, Hu
    et al.
    ISVR, University of Southampton.
    Mohammadiha, Nasser
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Taghia, Jalil
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Lutman, Mark E
    ISVR, University of Southampton.
    Wang, Shouyan
    ISVR, University of Southampton.
    Sparsity level in a non-negative matrix factorization based speech strategy in cochlear implants. 2012. In: 2012 Proceedings Of The 20th European Signal Processing Conference (EUSIPCO), IEEE Computer Society, 2012, pp. 2432-2436. Conference paper (Refereed)
    Abstract [en]

    Non-negative matrix factorization (NMF) has increasingly been used as a tool in signal processing in recent years, but it has not been used in cochlear implants (CIs). To improve the performance of CIs in noisy environments, a novel sparse strategy is proposed by applying NMF on envelopes of 22 channels. In the new algorithm, the noisy speech is first transferred to the time-frequency domain via a 22-channel filter bank and the envelope in each frequency channel is extracted; secondly, NMF is applied to the envelope matrix (envelopegram); finally, the sparsity condition is applied to the coefficient matrix to obtain a sparser representation. A speech reception threshold (SRT) subjective experiment was performed in combination with five objective measurements in order to choose the proper parameters for the sparse NMF model.
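
The pipeline in this abstract (envelope matrix, NMF, sparsity on the coefficient matrix) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a Euclidean cost with an L1 penalty on the coefficient matrix, standard multiplicative updates, and a toy matrix in place of a 22-channel envelopegram. The names `sparse_nmf`, `sparsity`, and `frob_error` are hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def frob_error(v, w, h):
    """Squared Frobenius distance between v and the product w @ h."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rw in zip(v, wh) for x, y in zip(rv, rw))


def sparse_nmf(v, rank, sparsity=0.05, iters=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix v (channels x frames) as w @ h.

    Assumption: cost 0.5*||v - w h||^2 + sparsity * sum(h), minimised with
    multiplicative updates. Columns of w are rescaled to unit sum so the
    penalty on h cannot be evaded by inflating w.
    """
    rng = random.Random(seed)
    n_ch, n_fr = len(v), len(v[0])
    w = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(n_ch)]
    h = [[rng.uniform(0.1, 1.0) for _ in range(n_fr)] for _ in range(rank)]
    for _ in range(iters):
        # Coefficient update: the sparsity weight enters the denominator.
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[k][t] * num[k][t] / (den[k][t] + sparsity + eps)
              for t in range(n_fr)] for k in range(rank)]
        # Basis update: plain (unpenalised) multiplicative rule.
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[c][k] * num[c][k] / (den[c][k] + eps)
              for k in range(rank)] for c in range(n_ch)]
        # Rescale each basis column to unit sum; w @ h is left unchanged.
        for k in range(rank):
            s = sum(w[c][k] for c in range(n_ch)) + eps
            for c in range(n_ch):
                w[c][k] /= s
            h[k] = [x * s for x in h[k]]
    return w, h


if __name__ == "__main__":
    # Toy "envelopegram": 4 channels x 6 frames built from two patterns.
    v = [[1.0, 0.0, 2.0, 0.0, 1.0, 0.0],
         [1.0, 0.0, 2.0, 0.0, 1.0, 0.0],
         [0.0, 1.0, 0.0, 2.0, 0.0, 1.0],
         [0.0, 1.0, 0.0, 2.0, 0.0, 1.0]]
    w, h = sparse_nmf(v, rank=2)
    print("reconstruction error:", frob_error(v, w, h))
```

Raising `sparsity` trades reconstruction accuracy for smaller coefficients; the abstract reports that this trade-off was tuned with listening tests and objective measures, which the sketch does not attempt.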

  • 2. Hu, H.
    et al.
    Taghia, Jalil
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Sang, J.
    Mohammadiha, Nasser
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Azarpour, M.
    Dokku, R.
    Wang, S.
    Lutman, M. E.
    Bleeck, S.
    Speech enhancement via combination of Wiener filter and blind source separation. 2011. In: Proceedings of the Sixth International Conference on Intelligent Systems and Knowledge Engineering, Shanghai, China (ISKE2011), 2011, pp. 485-494. Conference paper (Refereed)
    Abstract [en]

    Automatic speech recognition (ASR) often fails in acoustically noisy environments. Aimed to improve speech recognition scores of an ASR in a real-life like acoustical environment, a speech pre-processing system is proposed in this paper, which consists of several stages: First, a convolutive blind source separation (BSS) is applied to the spectrogram of the signals that are pre-processed by binaural Wiener filtering (BWF). Secondly, the target speech is detected by an ASR system recognition rate based on a Hidden Markov Model (HMM). To evaluate the performance of the proposed algorithm, the signal-to-interference ratio (SIR), the improvement signal-to-noise ratio (ISNR) and the speech recognition rates of the output signals were calculated using the signal corpus of the CHiME database. The results show an improvement in SIR and ISNR, but no obvious improvement of speech recognition scores. Improvements for future research are suggested.

  • 3.
    Leijon, Arne
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    von Gablenz, Petra
    Institute of Hearing Technology and Audiology, Jade University of Applied Sciences, Oldenburg, Germany.
    Holube, Inga
    Institute of Hearing Technology and Audiology, Jade University of Applied Sciences, Oldenburg, Germany.
    Taghia, Jalil
    KTH, School of Electrical Engineering and Computer Science (EECS).
    Smeds, Karolina
    ORCA Europe, WS Audiology, Stockholm, Sweden.
    Bayesian analysis of Ecological Momentary Assessment (EMA) data collected in adults before and after hearing rehabilitation. 2023. In: Frontiers in Digital Health, E-ISSN 2673-253X, Vol. 5, article id 1100705. Journal article (Refereed)
    Abstract [en]

    This paper presents a new Bayesian method for analyzing Ecological Momentary Assessment (EMA) data and applies this method in a re-analysis of data from a previous EMA study. The analysis method has been implemented as a freely available Python package EmaCalc, RRID:SCR_022943. The analysis model can use EMA input data including nominal categories in one or more situation dimensions, and ordinal ratings of several perceptual attributes. The analysis uses a variant of ordinal regression to estimate the statistical relation between these variables. The Bayesian method has no requirements related to the number of participants or the number of assessments by each participant. Instead, the method automatically includes measures of the statistical credibility of all analysis results, for the given amount of data. For the previously collected EMA data, the analysis results demonstrate how the new tool can handle heavily skewed, scarce, and clustered data that were collected on ordinal scales, and present results on interval scales. The new method revealed results for the population mean that were similar to those obtained in the previous analysis by an advanced regression model. The Bayesian approach automatically estimated the inter-individual variability in the population, based on the study sample, and could show some statistically credible intervention results also for an unseen random individual in the population. Such results may be interesting, for example, if the EMA methodology is used by a hearing-aid manufacturer in a study to predict the success of a new signal-processing method among future potential customers.

  • 4.
    Ma, Zhanyu
    et al.
    Beijing Univ Posts & Telecommun, Beijing 100876, Peoples R China.
    Kim, Sunwoo
    Hanyang Univ, Seoul 04763, South Korea.
    Martinez-Gomez, Pascual
    Amazon, Seattle, WA 98109 USA.
    Taghia, Jalil
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Song, Yi-Zhe
    Univ Surrey, Guildford GU2 7XH, Surrey, England.
    Gao, Huiji
    LinkedIn, Sunnyvale, CA 94085 USA.
    IEEE Access Special Section Editorial: AI-Driven Big Data Processing: Theory, Methodology, and Applications. 2020. In: IEEE Access, E-ISSN 2169-3536, Vol. 8, pp. 199882-199898. Journal article (Other academic)
  • 5. Ma, Zhanyu
    et al.
    Rana, Pravin Kumar
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Taghia, Jalil
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Flierl, Markus
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Bayesian estimation of Dirichlet mixture model with variational inference. 2014. In: Pattern Recognition, ISSN 0031-3203, E-ISSN 1873-5142, Vol. 47, no. 9, pp. 3143-3157. Journal article (Refereed)
    Abstract [en]

    In statistical modeling, parameter estimation is an essential and challengeable task. Estimation of the parameters in the Dirichlet mixture model (DMM) is analytically intractable, due to the integral expressions of the gamma function and its corresponding derivatives. We introduce a Bayesian estimation strategy to estimate the posterior distribution of the parameters in DMM. By assuming the gamma distribution as the prior to each parameter, we approximate both the prior and the posterior distribution of the parameters with a product of several mutually independent gamma distributions. The extended factorized approximation method is applied to introduce a single lower-bound to the variational objective function and an analytically tractable estimation solution is derived. Moreover, there is only one function that is maximized during iterations and, therefore, the convergence of the proposed algorithm is theoretically guaranteed. With synthesized data, the proposed method shows the advantages over the EM-based method and the previously proposed Bayesian estimation method. With two important multimedia signal processing applications, the good performance of the proposed Bayesian estimation method is demonstrated.

  • 6. Ma, Zhanyu
    et al.
    Teschendorff, Andrew E.
    Yu, Hong
    Taghia, Jalil
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Guo, Jun
    Comparisons of Non-Gaussian Statistical Models in DNA Methylation Analysis. 2014. In: International Journal of Molecular Sciences, ISSN 1661-6596, E-ISSN 1422-0067, Vol. 15, no. 6, pp. 10835-10854. Journal article (Refereed)
    Abstract [en]

    As a key regulatory mechanism of gene expression, DNA methylation patterns are widely altered in many complex genetic diseases, including cancer. DNA methylation is naturally quantified by bounded support data; therefore, it is non-Gaussian distributed. In order to capture such properties, we introduce some non-Gaussian statistical models to perform dimension reduction on DNA methylation data. Afterwards, non-Gaussian statistical model-based unsupervised clustering strategies are applied to cluster the data. Comparisons and analysis of different dimension reduction strategies and unsupervised clustering methods are presented. Experimental results show that the non-Gaussian statistical model-based methods are superior to the conventional Gaussian distribution-based method. They are meaningful tools for DNA methylation analysis. Moreover, among several non-Gaussian methods, the one that captures the bounded nature of DNA methylation data reveals the best clustering performance.

  • 7.
    Mohammadiha, Nasser
    et al.
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Taghia, Jalil
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Single channel speech enhancement using Bayesian NMF with recursive temporal updates of prior distributions. 2012. In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2012, IEEE conference proceedings, 2012, pp. 4561-4564. Conference paper (Refereed)
    Abstract [en]

    We present a speech enhancement algorithm which is based on a Bayesian Nonnegative Matrix Factorization (NMF). Both Minimum Mean Square Error (MMSE) and Maximum a-Posteriori (MAP) estimates of the magnitude of the clean speech DFT coefficients are derived. To exploit the temporal continuity of the speech and noise signals, a proper prior distribution is introduced by widening the posterior distribution of the NMF coefficients at the previous time frames. To do so, a recursive temporal update scheme is proposed to obtain the mean value of the prior distribution; also, the uncertainty of the prior information is governed by the shape parameter of the distribution which is learnt automatically based on the nonstationarity of the signals. Simulations show a considerable improvement compared to the maximum likelihood NMF based speech enhancement algorithm for different input SNRs.

  • 8.
    Rana, Pravin Kumar
    et al.
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Ma, Zhanyu
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Taghia, Jalil
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Flierl, Markus
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Multiview Depth Map Enhancement by Variational Bayes Inference Estimation of Dirichlet Mixture Models. 2013. In: 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, 2013, pp. 1528-1532. Conference paper (Refereed)
    Abstract [en]

    High quality view synthesis is a prerequisite for future free-viewpoint television. It will enable viewers to move freely in a dynamic real world scene. Depth image based rendering algorithms will play a pivotal role when synthesizing an arbitrary number of novel views by using a subset of captured views and corresponding depth maps only. Usually, each depth map is estimated individually by stereo-matching algorithms and, hence, shows lack of inter-view consistency. This inconsistency affects the quality of view synthesis negatively. This paper enhances the inter-view consistency of multiview depth imagery. First, our approach classifies the color information in the multiview color imagery by modeling color with a mixture of Dirichlet distributions where the model parameters are estimated in a Bayesian framework with variational inference. Second, using the resulting color clusters, we classify the corresponding depth values in the multiview depth imagery. Each clustered depth image is subject to further sub-clustering. Finally, the resulting mean of each sub-cluster is used to enhance the depth imagery at multiple viewpoints. Experiments show that our approach improves the average quality of virtual views by up to 0.8 dB when compared to views synthesized by using conventionally estimated depth maps.

  • 9.
    Rana, Pravin Kumar
    et al.
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Taghia, Jalil
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Flierl, Markus
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Statistical methods for inter-view depth enhancement. 2014. In: 2014 3DTV-Conference: The True Vision - Capture, Transmission and Display of 3D Video (3DTV-CON), IEEE, 2014, p. 6874755. Conference paper (Refereed)
    Abstract [en]

    This paper briefly presents and evaluates recent advances in statistical methods for improving inter-view inconsistency in multiview depth imagery. View synthesis is vital in free-viewpoint television in order to allow viewers to move freely in a dynamic scene. Here, depth image-based rendering plays a pivotal role by synthesizing an arbitrary number of novel views by using a subset of captured views and corresponding depth maps only. Usually, each depth map is estimated individually at different viewpoints by stereo matching and, hence, shows lack of inter-view consistency. This lack of consistency affects the quality of view synthesis negatively. This paper discusses two different approaches to enhance the inter-view depth consistency. The first one uses generative models based on multiview color and depth classification to assign a probabilistic weight to each depth pixel. The weighted depth pixels are utilized to enhance depth maps. The second one performs inter-view consistency testing in depth difference space to enhance the depth maps at multiple viewpoints. We comparatively evaluate these two methods and discuss their pros and cons for future work.

  • 10.
    Rana, Pravin Kumar
    et al.
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Taghia, Jalil
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Flierl, Markus
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    A Variational Bayesian Inference Framework for Multiview Depth Image Enhancement. 2012. In: Proceedings - 2012 IEEE International Symposium on Multimedia, ISM 2012, IEEE, 2012, pp. 183-190. Conference paper (Refereed)
    Abstract [en]

    In this paper, a general model-based framework for multiview depth image enhancement is proposed. Depth imagery plays a pivotal role in emerging free-viewpoint television. This technology requires high quality virtual view synthesis to enable viewers to move freely in a dynamic real world scene. Depth imagery of different viewpoints is used to synthesize an arbitrary number of novel views. Usually, the depth imagery is estimated individually by stereo-matching algorithms and, hence, shows lack of inter-view consistency. This inconsistency affects the quality of view synthesis negatively. This paper enhances the inter-view consistency of multiview depth imagery by using a variational Bayesian inference framework. First, our approach classifies the color information in the multiview color imagery. Second, using the resulting color clusters, we classify the corresponding depth values in the multiview depth imagery. Each clustered depth image is subject to further subclustering. Finally, the resulting mean of the sub-clusters is used to enhance the depth imagery at multiple viewpoints. Experiments show that our approach improves the quality of virtual views by up to 0.25 dB.

  • 11.
    Rana, Pravin Kumar
    et al.
    KTH, School of Electrical Engineering (EES), Communication Theory. KTH, School of Electrical Engineering (EES), Centres, ACCESS Linnaeus Centre.
    Taghia, Jalil
    KTH, School of Electrical Engineering (EES), Communication Theory. KTH, School of Electrical Engineering (EES), Centres, ACCESS Linnaeus Centre.
    Ma, Zhanyu
    Flierl, Markus
    KTH, School of Electrical Engineering (EES), Communication Theory. KTH, School of Electrical Engineering (EES), Centres, ACCESS Linnaeus Centre.
    Probabilistic Multiview Depth Image Enhancement Using Variational Inference. 2015. In: IEEE Journal on Selected Topics in Signal Processing, ISSN 1932-4553, E-ISSN 1941-0484, Vol. 9, no. 3, pp. 435-448. Journal article (Refereed)
    Abstract [en]

    An inference-based multiview depth image enhancement algorithm is introduced and investigated in this paper. Multiview depth imagery plays a pivotal role in free-viewpoint television. This technology requires high-quality virtual view synthesis to enable viewers to move freely in a dynamic real world scene. Depth imagery of different viewpoints is used to synthesize an arbitrary number of novel views. Usually, the depth imagery is estimated individually by stereo-matching algorithms and, hence, shows inter-view inconsistency. This inconsistency affects the quality of view synthesis negatively. This paper enhances the multiview depth imagery at multiple viewpoints by probabilistic weighting of each depth pixel. First, our approach classifies the color pixels in the multiview color imagery. Second, using the resulting color clusters, we classify the corresponding depth values in the multiview depth imagery. Each clustered depth image is subject to further subclustering. Clustering based on generative models is used for assigning probabilistic weights to each depth pixel. Finally, these probabilistic weights are used to enhance the depth imagery at multiple viewpoints. Experiments show that our approach consistently improves the quality of virtual views by 0.2 dB to 1.6 dB, depending on the quality of the input multiview depth imagery.

  • 12. Taghia, J.
    et al.
    Martin, R.
    Taghia, Jalil
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    An investigation on mutual information for the linear predictive system and the extrapolation of speech signals. 2012. In: Proceedings of 10th ITG Symposium on Speech Communication, Institute of Electrical and Electronics Engineers (IEEE), 2012, article id 6309620. Conference paper (Refereed)
    Abstract [en]

    Mutual information (MI) is an important information theoretic concept which has many applications in telecommunications, in blind source separation, and in machine learning. More recently, it has been also employed for the instrumental assessment of speech intelligibility where traditionally correlation based measures are used. In this paper, we address the difference between MI and correlation from the viewpoint of discovering dependencies between variables in the context of speech signals. We perform our investigation by considering the linear predictive approximation and the extrapolation of speech signals as examples. We compare a parametric MI estimation approach based on a Gaussian mixture model (GMM) with the k-nearest neighbor (KNN) approach which is a well-known non-parametric method available to estimate the MI. We show that the GMM-based MI estimator leads to more consistent results.
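
The difference between MI and correlation that this abstract discusses can be seen in a small sketch. It uses a plain histogram (binning) MI estimator as a stand-in, which is neither the GMM-based nor the KNN estimator compared in the paper, and a synthetic quadratic dependence instead of speech signals. The function names `pearson` and `histogram_mi` and the bin count are hypothetical choices.

```python
import math
import random
from collections import Counter


def pearson(x, y):
    """Sample Pearson correlation coefficient (assumes non-constant inputs)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def _bin_indices(values, bins):
    """Map each value to one of `bins` equal-width bins over its own range."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    if width == 0.0:
        return [0] * len(values)
    return [min(int((v - lo) / width), bins - 1) for v in values]


def histogram_mi(x, y, bins=8):
    """Plug-in mutual information estimate in nats from a 2-D histogram."""
    bx, by = _bin_indices(x, bins), _bin_indices(y, bins)
    n = len(x)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    # Sum p(i,j) * log( p(i,j) / (p(i) p(j)) ) over occupied cells only.
    return sum((c / n) * math.log(c * n / (px[i] * py[j]))
               for (i, j), c in joint.items())


if __name__ == "__main__":
    rng = random.Random(0)
    x = [rng.uniform(-1.0, 1.0) for _ in range(5000)]
    y = [v * v for v in x]  # fully dependent on x, yet symmetric about zero
    print("correlation:", pearson(x, y))
    print("histogram MI (nats):", histogram_mi(x, y))
```

For the quadratic example the correlation is close to zero while the MI estimate is clearly positive, which is the kind of dependency a correlation-based measure misses. Histogram estimates depend on the bin count and are biased for small samples, which is one reason model-based or nearest-neighbour estimators are used in practice.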

  • 13. Taghia, Jalal
    et al.
    Martin, Rainer
    Taghia, Jalil
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Dual-channel noise reduction based on a mixture of circular-symmetric complex Gaussians on unit hypersphere. 2013. In: ICASSP IEEE Int Conf Acoust Speech Signal Process Proc, 2013, pp. 7289-7293. Conference paper (Refereed)
    Abstract [en]

    In this paper a model-based dual-channel noise reduction approach is presented which is an alternative to conventional noise reduction algorithms essentially due to its independence of the noise power spectral density estimation and of any prior knowledge about the spatial noise field characteristics. We use a mixture of circular-symmetric complex-Gaussian distributions projected on the unit hypersphere for modeling the complex discrete Fourier transform coefficients of noisy speech signals in the frequency domain. According to the derived mixture model, clustering of the noise and the target speech components is performed depending on their direction of arrival. A soft masking strategy is proposed for speech enhancement based on responsibilities assigned to the target speech class in each time-frequency bin. Our experimental results show that the proposed approach is more robust than conventional dual-channel noise reduction systems based on the single- and dual-channel noise power spectral density estimators.
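
The soft masking step mentioned in this abstract can be sketched generically. The sketch assumes a two-class mixture (target and noise) whose per-bin log-likelihoods are already available, and it does not implement the paper's projected complex-Gaussian model or its clustering by direction of arrival. The names `target_responsibility`, `soft_mask`, and `weight_target` are hypothetical.

```python
import math


def target_responsibility(log_lik_target, log_lik_noise, weight_target=0.5):
    """Posterior probability that one time-frequency bin belongs to the target.

    Assumes a two-component mixture with prior weight `weight_target`
    (strictly between 0 and 1) for the target class.
    """
    a = math.log(weight_target) + log_lik_target
    b = math.log(1.0 - weight_target) + log_lik_noise
    m = max(a, b)  # subtract the maximum for numerical stability
    ea, eb = math.exp(a - m), math.exp(b - m)
    return ea / (ea + eb)


def soft_mask(noisy_bins, log_lik_target, log_lik_noise, weight_target=0.5):
    """Scale each noisy (complex) coefficient by its target responsibility."""
    return [target_responsibility(lt, ln, weight_target) * y
            for y, lt, ln in zip(noisy_bins, log_lik_target, log_lik_noise)]


if __name__ == "__main__":
    # Three toy bins: clearly target, ambiguous, clearly noise.
    bins = [1.0 + 1.0j, 0.5 - 0.2j, -0.3 + 0.8j]
    enhanced = soft_mask(bins, [0.0, -1.0, -9.0], [-9.0, -1.0, 0.0])
    print(enhanced)
```

Unlike a binary mask, the responsibility lies between 0 and 1, so bins with ambiguous class membership are attenuated rather than discarded.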

  • 14.
    Taghia, Jalal
    et al.
    Ruhr University of Bochum.
    Taghia, Jalil
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Mohammadiha, Nasser
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Sang, Jinqiu
    University of Southampton.
    Bouse, Vaclav
    Siemens.
    Martin, Rainer
    Ruhr University of Bochum.
    An Evaluation of noise power spectral density estimation algorithms in adverse acoustic environments. 2011. In: 36th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011, IEEE, 2011, pp. 4640-4643. Conference paper (Refereed)
    Abstract [en]

    Noise power spectral density estimation is an important component of speech enhancement systems due to its considerable effect on the quality and the intelligibility of the enhanced speech. Recently, many new algorithms have been proposed and significant progress in noise tracking has been made. In this paper, we present an evaluation framework for measuring the performance of some recently proposed and some well-known noise power spectral density estimators and compare their performance in adverse acoustic environments. In this investigation we do not only consider the performance in the mean of a spectral distance measure but also evaluate the variance of the estimators as the latter is related to undesirable fluctuations also known as musical noise. By providing a variety of different non-stationary noises, the robustness of noise estimators in adverse environments is examined.

  • 15.
    Taghia, Jalil
    et al.
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Gerkmann, Timo
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Blind Source Separation of Nondisjoint Sources in The Time-Frequency Domain with Model-Based Determination of Source Contribution. 2011. In: 2011 IEEE International Symposium On Signal Processing And Information Technology (ISSPIT), New York: IEEE, 2011, pp. 276-280. Conference paper (Refereed)
    Abstract [en]

    While most blind source separation (BSS) algorithms rely on the assumption that at most one source is dominant at each time-frequency (TF) point, recently, two BSS approaches, [1], [2], have been proposed that allow multiple active sources at time-frequency (TF) points under certain assumptions. In both algorithms, the active sources in every single TF point are found by an exhaustive search through an optimization procedure which is computationally expensive. In this work, we address this limitation and avoid the exhaustive search by determining the source contribution in every TF point. The source contributions are expressed by a set of posterior probabilities. Hereby, we propose a model-based blind source separation algorithm that allows sources to be nondisjoint in the TF domain while being computationally more tractable. The proposed BSS approach is shown to be robust with respect to different reverberation times and microphone spacings.

  • 16.
    Taghia, Jalil
    et al.
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Separation of Unknown Number of Sources. 2014. In: IEEE Signal Processing Letters, ISSN 1070-9908, E-ISSN 1558-2361, Vol. 21, no. 5, pp. 625-629. Journal article (Refereed)
    Abstract [en]

    We address the problem of blind source separation in acoustic applications where there is no prior knowledge about the number of mixing sources. The presented method employs a mixture of complex Watson distributions in its generative model with a sparse Dirichlet distribution over the mixture weights. The problem is formulated in a fully Bayesian inference with assuming prior distributions over all model parameters. The presented model can regulate its own complexity by pruning unnecessary components by which we can possibly relax the assumption of prior knowledge on the number of sources.

  • 17.
    Taghia, Jalil
    et al.
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES).
    Variational Inference for Watson Mixture Model. 2016. In: IEEE Transactions on Pattern Analysis and Machine Intelligence, ISSN 0162-8828, E-ISSN 1939-3539, Vol. 38, no. 9, pp. 1886-1900. Journal article (Refereed)
    Abstract [en]

    This paper addresses modelling data using the Watson distribution. The Watson distribution is one of the simplest distributions for analyzing axially symmetric data. This distribution has gained some attention in recent years due to its modeling capability. However, its Bayesian inference is fairly understudied due to difficulty in handling the normalization factor. Recent development of Markov chain Monte Carlo (MCMC) sampling methods can be applied for this purpose. However, these methods can be prohibitively slow for practical applications. A deterministic alternative is provided by variational methods that convert inference problems into optimization problems. In this paper, we present a variational inference for Watson mixture models. First, the variational framework is used to side-step the intractability arising from the coupling of latent states and parameters. Second, the variational free energy is further lower bounded in order to avoid intractable moment computation. The proposed approach provides a lower bound on the log marginal likelihood and retains distributional information over all parameters. Moreover, we show that it can regulate its own complexity by pruning unnecessary mixture components while avoiding over-fitting. We discuss potential applications of the modeling with Watson distributions in the problem of blind source separation, and clustering gene expression data sets.

  • 18.
    Taghia, Jalil
    et al.
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Ma, Z.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Communication Theory.
    On von-Mises Fisher mixture model in Text-independent speaker identification. 2013. In: Proceedings of the 2013 INTERSPEECH, 2013, pp. 2499-2503. Conference paper (Refereed)
    Abstract [en]

    This paper addresses text-independent speaker identification (SI) based on line spectral frequencies (LSFs). The LSFs are transformed to differential LSFs (ΔLSF) in order to exploit their boundary and ordering properties. We show that the square root of ΔLSF has interesting directional characteristics implying that their distribution can be modeled by a mixture of von-Mises Fisher (vMF) distributions. We analytically estimate the mixture model parameters in a fully Bayesian treatment by using variational inference. In the Bayesian inference, we can potentially determine the model complexity and avoid the overfitting problem associated with conventional approaches based on the expectation maximization. The experimental results confirm the effectiveness of the proposed SI system.

  • 19.
    Taghia, Jalil
    et al.
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Ma, Zhanyu
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Bayesian Estimation of the von-Mises Fisher Mixture Model with Variational Inference. 2014. In: IEEE Transactions on Pattern Analysis and Machine Intelligence, ISSN 0162-8828, E-ISSN 1939-3539, Vol. 36, no. 9, pp. 1701-1715. Journal article (Refereed)
    Abstract [en]

    This paper addresses the Bayesian estimation of the von-Mises Fisher (vMF) mixture model with variational inference (VI). The learning task in VI consists of optimization of the variational posterior distribution. However, the exact solution by VI does not lead to an analytically tractable solution due to the evaluation of intractable moments involving functional forms of the Bessel function in their arguments. To derive a closed-form solution, we further lower bound the evidence lower bound where the bound is tight at one point in the parameter distribution. While having the value of the bound guaranteed to increase during maximization, we derive an analytically tractable approximation to the posterior distribution which has the same functional form as the assigned prior distribution. The proposed algorithm requires no iterative numerical calculation in the re-estimation procedure, and it can potentially determine the model complexity and avoid the over-fitting problem associated with conventional approaches based on the expectation maximization. Moreover, we derive an analytically tractable approximation to the predictive density of the Bayesian mixture model of vMF distributions. The performance of the proposed approach is verified by experiments with both synthetic and real data.

  • 20.
    Taghia, Jalil
    et al.
    KTH, School of Electrical Engineering (EES), Communication Theory. Institute of Communication Acoustics, Ruhr-Universität Bochum, Bochum, Germany.
    Martin, Rainer
    Institute of Communication Acoustics, Ruhr-Universität Bochum, Bochum.
    Leijon, Arne
    An investigation on mutual information for the linear predictive system and the extrapolation of speech signals. 2020. In: Sprachkommunikation - 10. ITG-Fachtagung, VDE Verlag GmbH, 2020, pp. 227-230. Conference paper (Refereed)
    Abstract [en]

    Mutual information (MI) is an important information theoretic concept which has many applications in telecommunications, in blind source separation, and in machine learning. More recently, it has been also employed for the instrumental assessment of speech intelligibility where traditionally correlation based measures are used. In this paper, we address the difference between MI and correlation from the viewpoint of discovering dependencies between variables in the context of speech signals. We perform our investigation by considering the linear predictive approximation and the extrapolation of speech signals as examples. We compare a parametric MI estimation approach based on a Gaussian mixture model (GMM) with the k-nearest neighbor (KNN) approach which is a well-known non-parametric method available to estimate the MI. We show that the GMM-based MI estimator leads to more consistent results.

  • 21.
    Taghia, Jalil
    et al.
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Mohammadiha, Nasser
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    A variational Bayes approach to the underdetermined blind source separation with automatic determination of the number of sources. 2012. In: Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on / [ed] IEEE, IEEE, 2012, pp. 253-256. Conference paper (Refereed)
    Abstract [en]

    In this paper, we propose a variational Bayes approach to the underdetermined blind source separation and show how a variational treatment can open up the possibility of determining the actual number of sources. The procedure is performed in a frequency bin-wise manner. In every frequency bin, we model the time-frequency mixture by a variational mixture of Gaussians with a circular-symmetric complex-Gaussian density function. In the Bayesian inference, we consider appropriate conjugate prior distributions for modeling the parameters of this distribution. The learning task consists of estimating the hyper-parameters characterizing the parameter distributions for the optimization of the variational posterior distribution. The proposed approach requires no prior knowledge on the number of sources in a mixture.
