  • 1.
    Hongmei, Hu
    et al.
    ISVR, University of Southampton.
    Mohammadiha, Nasser
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Taghia, Jalil
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Lutman, Mark E
    ISVR, University of Southampton.
    Wang, Shouyan
    ISVR, University of Southampton.
    Sparsity level in a non-negative matrix factorization based speech strategy in cochlear implants. 2012. In: 2012 Proceedings Of The 20th European Signal Processing Conference (EUSIPCO), IEEE Computer Society, 2012, p. 2432-2436. Conference paper (Refereed)
    Abstract [en]

    Non-negative matrix factorization (NMF) has increasingly been used as a tool in signal processing in recent years, but it has not been used in cochlear implants (CIs). To improve the performance of CIs in noisy environments, a novel sparse strategy is proposed by applying NMF to the envelopes of 22 channels. In the new algorithm, the noisy speech is first transferred to the time-frequency domain via a 22-channel filter bank and the envelope in each frequency channel is extracted; secondly, NMF is applied to the envelope matrix (envelopegram); finally, the sparsity condition is applied to the coefficient matrix to obtain a sparser representation. A speech reception threshold (SRT) subjective experiment was performed in combination with five objective measurements in order to choose the proper parameters for the sparse NMF model.

  • 2. Hu, H.
    et al.
    Taghia, Jalil
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Sang, J.
    Mohammadiha, Nasser
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Azarpour, M.
    Dokku, R.
    Wang, S.
    Lutman, M. E.
    Bleeck, S.
    Speech enhancement via combination of Wiener filter and blind source separation. 2011. In: Proceedings of the Sixth International Conference on Intelligent Systems and Knowledge Engineering, Shanghai, China (ISKE2011), 2011, p. 485-494. Conference paper (Refereed)
    Abstract [en]

    Automatic speech recognition (ASR) often fails in acoustically noisy environments. To improve the speech recognition scores of an ASR system in a realistic acoustic environment, this paper proposes a speech pre-processing system consisting of several stages. First, a convolutive blind source separation (BSS) is applied to the spectrogram of the signals that are pre-processed by binaural Wiener filtering (BWF). Second, the target speech is detected using the recognition rate of an ASR system based on a hidden Markov model (HMM). To evaluate the performance of the proposed algorithm, the signal-to-interference ratio (SIR), the improvement in signal-to-noise ratio (ISNR) and the speech recognition rates of the output signals were calculated using the signal corpus of the CHiME database. The results show an improvement in SIR and ISNR, but no obvious improvement in speech recognition scores. Improvements for future research are suggested.

  • 3.
    Mohammadiha, Nasser
    Sharif University of Technology.
    Measuring the geometrical parameters of steel billets during the molding process by image processing. 2006. Independent thesis Advanced level (degree of Master (Two Years)), 60 credits / 90 HE credits. Student thesis
    Abstract [en]

    In this project we present a machine vision system to measure the geometrical parameters and dimensional defects of steel billets (blooms/slabs). Geometrical parameters include width, height and length, and dimensional defects include camber, rhomboid difference and torsion. The system has been equipped with a color camera, an industrial computer and other peripheral equipment such as a lens, an Ethernet Cat-5 cable and a camera housing. Digital image processing techniques have been used to analyze the single-view images. To do so, the image is enhanced first and then it is reregistered with a constant background. After that, the billet's motion is calculated and the image is segmented. The billet boundary lines are then estimated using the billet's geometrical features. Then the Hough transform is followed by the Canny edge detector to detect and link the exact sides of the billets. Two calibration methods have been used to transform the measured values in pixels to world reference values in centimeters. These techniques remove the need for the multiple cameras used in similar projects while maintaining the accuracy.

  • 4.
    Mohammadiha, Nasser
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Speech Enhancement Using Nonnegative Matrix Factorization and Hidden Markov Models. 2013. Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    Reducing interference noise in a noisy speech recording has been a challenging task for many years yet has a variety of applications, for example, in handsfree mobile communications, in speech recognition, and in hearing aids. Traditional single-channel noise reduction schemes, such as Wiener filtering, do not work satisfactorily in the presence of non-stationary background noise. Alternatively, supervised approaches, where the noise type is known in advance, lead to higher-quality enhanced speech signals. This dissertation proposes supervised and unsupervised single-channel noise reduction algorithms. We consider two classes of methods for this purpose: approaches based on nonnegative matrix factorization (NMF) and methods based on hidden Markov models (HMM).

     The contributions of this dissertation can be divided into three main (overlapping) parts. First, we propose NMF-based enhancement approaches that use temporal dependencies of the speech signals. In a standard NMF, the important temporal correlations between consecutive short-time frames are ignored. We propose both continuous and discrete state-space nonnegative dynamical models. These approaches are used to describe the dynamics of the NMF coefficients or activations. We derive optimal minimum mean squared error (MMSE) or linear MMSE estimates of the speech signal using the probabilistic formulations of NMF. Our experiments show that using temporal dynamics in the NMF-based denoising systems improves the performance greatly. Additionally, this dissertation proposes an approach to learn the noise basis matrix online from the noisy observations. This relaxes the assumption of an a-priori specified noise type and enables us to use the NMF-based denoising method in an unsupervised manner. Our experiments show that the proposed approach with online noise basis learning considerably outperforms state-of-the-art methods in different noise conditions.

     Second, this thesis proposes two methods for NMF-based separation of sources with similar dictionaries. We suggest a nonnegative HMM (NHMM) for babble noise that is derived from a speech HMM. In this approach, speech and babble signals share the same basis vectors, whereas the activations of the basis vectors differ for the two signals over time. We derive an MMSE estimator for the clean speech signal using the proposed NHMM. The objective evaluations and a subjective listening test show that the proposed babble model and the final noise reduction algorithm noticeably outperform the conventional methods. Moreover, the dissertation proposes another solution to separate a desired source from a mixture with arbitrarily low artifacts.

     Third, an HMM-based algorithm to enhance the speech spectra using super-Gaussian priors is proposed. Our experiments show that speech discrete Fourier transform (DFT) coefficients have super-Gaussian rather than Gaussian distributions even if we limit the speech data to come from a specific phoneme. We derive a new MMSE estimator for the speech spectra that uses super-Gaussian priors. The results of our evaluations using the developed noise reduction algorithm support the super-Gaussianity hypothesis.

  • 5.
    Mohammadiha, Nasser
    et al.
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Nonnegative HMM for Babble Noise Derived from Speech HMM: Application to Speech Enhancement. 2013. In: IEEE Transactions on Audio, Speech, and Language Processing, ISSN 1558-7916, E-ISSN 1558-7924, Vol. 21, no 5, p. 998-1011. Article in journal (Refereed)
    Abstract [en]

    Deriving a good model for multitalker babble noise can facilitate different speech processing algorithms, e.g. noise reduction, to reduce the so-called cocktail party difficulty. In the available systems, the fact that the babble waveform is generated as a sum of N different speech waveforms is not exploited explicitly. In this paper, first we develop a gamma hidden Markov model for power spectra of the speech signal, and then formulate it as a sparse nonnegative matrix factorization (NMF). Second, the sparse NMF is extended by relaxing the sparsity constraint, and a novel model for babble noise (gamma nonnegative HMM) is proposed in which the babble basis matrix is the same as the speech basis matrix, and only the activation factors (weights) of the basis vectors are different for the two signals over time. Finally, a noise reduction algorithm is proposed using the derived speech and babble models. All of the stationary model parameters are estimated using the expectation-maximization (EM) algorithm, whereas the time-varying parameters, i.e. the gain parameters of speech and babble signals, are estimated using a recursive EM algorithm. The objective and subjective listening evaluations show that the proposed babble model and the final noise reduction algorithm significantly outperform the conventional methods.

  • 6.
    Mohammadiha, Nasser
    et al.
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Gerkmann, Timo
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    A New Approach for Speech Enhancement Based on a Constrained Nonnegative Matrix Factorization. 2011. In: IEEE International Symposium on Intelligent Signal Processing and Communication Systems, ISPACS 2011, IEEE, 2011. Conference paper (Refereed)
    Abstract [en]

    In this paper, a new approach is presented for single-channel speech enhancement which is based on Nonnegative Matrix Factorization (NMF). The proposed scheme combines the noise Power Spectral Density (PSD) estimation based on a constrained NMF and Wiener filtering to enhance the noisy speech. The imposed constraint is motivated by the time correlation of the underlying observations and enforces the NMF to give smoother estimates of the nonnegative factors. Compared to the standard NMF approach and Wiener filtering based on a recently developed noise PSD estimator, Source to Distortion Ratio (SDR) is

  • 7.
    Mohammadiha, Nasser
    et al.
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Gerkmann, Timo
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    A New Linear MMSE Filter for Single Channel Speech Enhancement Based on Nonnegative Matrix Factorization. 2011. In: IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, WASPAA 2011, IEEE, 2011. Conference paper (Refereed)
    Abstract [en]

    In this paper, a linear MMSE filter is derived for single-channel speech enhancement which is based on Nonnegative Matrix Factorization (NMF). Assuming an additive model for the noisy observation, an estimator is obtained by minimizing the mean square error between the clean speech and the estimated speech components in the frequency domain. In addition, the noise power spectral density (PSD) is estimated using NMF and the obtained noise PSD is used in a Wiener filtering framework to enhance the noisy speech. The results of both algorithms are compared to the result of the same Wiener filtering framework in which the noise PSD is estimated using a recently developed MMSE-based method. NMF-based approaches outperform the Wiener filter with the MMSE-based noise PSD tracker for different measures. Compared to the NMF-based Wiener filtering approach, Source to Distortion Ratio (SDR) is improved for the evaluated noise types for different input SNRs using the proposed linear MMSE filter.

  • 8.
    Mohammadiha, Nasser
    et al.
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Kleijn, W. Bastiaan
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Gamma Hidden Markov Model as a Probabilistic Nonnegative Matrix Factorization. 2013. In: 2013 Proceedings of the 21st European Signal Processing Conference (EUSIPCO), European Signal Processing Conference, 2013, p. 6811626. Conference paper (Refereed)
    Abstract [en]

    Among different Nonnegative Matrix Factorization (NMF) approaches, probabilistic NMFs are particularly valuable when dealing with stochastic signals, like speech. In the current literature, little attention has been paid to developing NMF methods that take advantage of the temporal dependencies of data. In this paper, we develop a hidden Markov model (HMM) with a gamma distribution as output density function. Then, we reformulate the gamma HMM as a probabilistic NMF. This shows the analogy of the proposed HMM and NMF, and leads to a new probabilistic NMF approach in which the temporal dependencies are also captured inherently by the model. Furthermore, we propose an expectation maximization (EM) algorithm to estimate all the model parameters. Compared to the available probabilistic NMFs that model data with Poisson, multinomial, or exponential distributions, the proposed NMF is more suitable for continuous-valued data. Our experiments using speech signals show that the proposed approach leads to a better compromise between sparsity, goodness of fit, and temporal modeling compared to the state of the art.

  • 9.
    Mohammadiha, Nasser
    et al.
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    MODEL ORDER SELECTION FOR NON-NEGATIVE MATRIX FACTORIZATION WITH APPLICATION TO SPEECH ENHANCEMENT. 2011. Report (Other academic)
    Abstract [en]

    This report deals with the application of non-negative matrix factorization (NMF) in speech processing. A Bayesian NMF is used to find the optimal number of basis vectors for the speech signal. The result is validated by performing a speech enhancement task for a set of different numbers of basis vectors. The algorithm performance is measured with the Source to Distortion Ratio (SDR) that represents the overall quality of speech. The results show that for medium input SNRs, 60 basis vectors for each speaker are sufficient to model the speech spectrogram. NMF produced better SDR results than a recently developed version of the Spectral Subtraction algorithm. The window length was found to have a great effect on the results, but zero padding did not influence the results.

  • 10.
    Mohammadiha, Nasser
    et al.
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Nonnegative Matrix Factorization Using Projected Gradient Algorithms with Sparseness Constraints. 2009. In: 2009 IEEE INTERNATIONAL SYMPOSIUM ON SIGNAL PROCESSING AND INFORMATION TECHNOLOGY (ISSPIT 2009), NEW YORK: IEEE conference proceedings, 2009, p. 418-423. Conference paper (Refereed)
    Abstract [en]

    Recently, projected gradient (PG) approaches have found many applications in solving the minimization problems underlying nonnegative matrix factorization (NMF). NMF is a linear representation of data that can lead to a sparse representation of natural images. To improve the parts-based representation of data, some sparseness constraints have been proposed. In this paper, the efficiency and execution time of five different PG algorithms and the basic multiplicative algorithm for NMF are compared. The factorization is done for an existing and a proposed sparse NMF, and the results are compared for all these PG methods. To compare the algorithms, the resulting factorizations are used for a hand-written digit classifier.

  • 11.
    Mohammadiha, Nasser
    et al.
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Martin, Rainer
    Ruhr-University Bochum.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Spectral Domain Speech Enhancement Using HMM State-Dependent Super-Gaussian Priors. 2013. In: IEEE Signal Processing Letters, ISSN 1070-9908, E-ISSN 1558-2361, Vol. 20, no 3, p. 253-256. Article in journal (Refereed)
    Abstract [en]

    The derivation of MMSE estimators for the DFT coefficients of speech signals, given an observed noisy signal and super-Gaussian prior distributions, has received a lot of interest recently. In this letter, we look at the distribution of the periodogram coefficients of different phonemes, and show that they have a gamma distribution with shape parameters less than one. This verifies that the DFT coefficients for not only the whole speech signal but also for individual phonemes have super-Gaussian distributions. We develop a spectral domain speech enhancement algorithm, and derive hidden Markov model (HMM) based MMSE estimators for speech periodogram coefficients under this gamma assumption in both a high uniform resolution and a reduced-resolution Mel domain. The simulations show that the performance is improved using a gamma distribution compared to the exponential case. Moreover, we show that, even though beneficial in some aspects, the Mel-domain processing does not lead to better results than the algorithms in the high-resolution domain.

  • 12.
    Mohammadiha, Nasser
    et al.
    Sharif University.
    Sahraeian, M.
    Sharif University.
    Vosoughi Vahdat, Bijan
    Sharif University.
    Azizi, A.
    TSA Company.
    Shah Ahmadi, A.
    TSA Company.
    Measuring the Geometrical Parameters of Steel Billets during Molding Process Using Image Processing. 2006. In: 2006 IEEE International Symposium on Signal Processing and Information Technology, Vols 1 and 2, IEEE, 2006, p. 59-63. Conference paper (Refereed)
    Abstract [en]

    This paper presents a system designed for measuring three dimensions (height, width and length) of steel billets during the molding process. The system has been equipped with a Giga-Ethernet camera and an industrial computer (IPC). The billet is moved in front of the inspection system with an external moving mechanism. Image processing techniques and calibration are used to measure dimensions from the succeeding single-view images.

  • 13.
    Mohammadiha, Nasser
    et al.
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Smaragdis, Paris
    University of Illinois at Urbana-Champaign.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Supervised and unsupervised speech enhancement using nonnegative matrix factorization. 2013. In: IEEE Transactions on Audio, Speech, and Language Processing, ISSN 1558-7916, E-ISSN 1558-7924, Vol. 21, no 10, p. 2140-2151. Article in journal (Refereed)
    Abstract [en]

    Reducing the interference noise in a monaural noisy speech signal has been a challenging task for many years. Compared to traditional unsupervised speech enhancement methods, e. g., Wiener filtering, supervised approaches, such as algorithms based on hidden Markov models (HMM), lead to higher-quality enhanced speech signals. However, the main practical difficulty of these approaches is that for each noise type a model is required to be trained a priori. In this paper, we investigate a new class of supervised speech denoising algorithms using nonnegative matrix factorization (NMF). We propose a novel speech enhancement method that is based on a Bayesian formulation of NMF (BNMF). To circumvent the mismatch problem between the training and testing stages, we propose two solutions. First, we use an HMM in combination with BNMF (BNMF-HMM) to derive a minimum mean square error (MMSE) estimator for the speech signal with no information about the underlying noise type. Second, we suggest a scheme to learn the required noise BNMF model online, which is then used to develop an unsupervised speech enhancement system. Extensive experiments are carried out to investigate the performance of the proposed methods under different conditions. Moreover, we compare the performance of the developed algorithms with state-of-the-art speech enhancement schemes using various objective measures. Our simulations show that the proposed BNMF-based methods outperform the competing algorithms substantially.

  • 14.
    Mohammadiha, Nasser
    et al.
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Smaragdis, Paris
    University of Illinois at Urbana-Champaign.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Low-artifact Source Separation Using Probabilistic Latent Component Analysis. 2013. In: 2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), IEEE Signal Processing Society, 2013, p. 6701837. Conference paper (Refereed)
    Abstract [en]

    We propose a method based on the probabilistic latent component analysis (PLCA) in which we use exponential distributions as priors to decrease the activity level of a given basis vector. A straightforward application of this method is when we try to extract a desired source from a mixture with low artifacts. For this purpose, we propose a maximum a posteriori (MAP) approach to identify the common basis vectors between two sources. A low-artifact estimate can now be obtained by using a constraint such that the common basis vectors in the interfering signal's dictionary tend to remain inactive. We discuss applications of this method in source separation with similar-gender speakers and in enhancing a speech signal that is contaminated with babble noise. Our simulations show that the proposed method not only reduces the artifacts but also increases the overall quality of the estimated signal.

  • 15.
    Mohammadiha, Nasser
    et al.
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Smaragdis, Paris
    University of Illinois at Urbana-Champaign.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Prediction Based Filtering and Smoothing to Exploit Temporal Dependencies in NMF. 2013. In: 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE Signal Processing Society, 2013, p. 873-877. Conference paper (Refereed)
    Abstract [en]

    Nonnegative matrix factorization is an appealing technique for many audio applications. However, in its basic form it does not use temporal structure, which is an important source of information in speech processing. In this paper, we propose NMF-based filtering and smoothing algorithms that are related to Kalman filtering and smoothing. While our prediction step is similar to that of Kalman filtering, we develop a multiplicative update step which is more convenient for nonnegative data analysis and in line with existing NMF literature. The proposed smoothing approach introduces an unavoidable processing delay, but the filtering algorithm does not and can be readily used for on-line applications. Our experiments using the proposed algorithms show a significant improvement over the baseline NMF approaches. In the case of speech denoising with factory noise at 0 dB input SNR, the smoothing algorithm outperforms NMF by 3.2 dB in SDR and around 0.5 MOS in PESQ. Likewise, source separation experiments result in improved performance due to taking advantage of the temporal regularities in speech.

  • 16.
    Mohammadiha, Nasser
    et al.
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Smaragdis, Paris
    University of Illinois at Urbana-Champaign.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Simultaneous Noise Classification and Reduction Using a Priori Learned Models. 2013. In: 2013 IEEE International Workshop on Machine Learning for Signal Processing (MLSP), IEEE Signal Processing Society, 2013, p. 6661951. Conference paper (Refereed)
    Abstract [en]

    Classifying the acoustic environment is an essential part of a practical supervised source separation algorithm where a model is trained for each source offline. In this paper, we present a classification scheme that is combined with a probabilistic nonnegative matrix factorization (NMF) based speech denoising algorithm. We model the acoustic environment with a hidden Markov model (HMM) whose emission distributions are assumed to be of NMF type. We derive a minimum mean square error (MMSE) estimator of the clean speech signal in which the state-dependent speech estimators are weighted according to the state posterior probabilities (or probabilities of different noise environments) and are summed. Our experiments show that the proposed method substantially outperforms the state of the art and that its performance is very close to an oracle case where the noise type is known in advance.

  • 17.
    Mohammadiha, Nasser
    et al.
    University of Oldenburg, Germany.
    Smaragdis, Paris
    University of Illinois.
    Panahandeh, Ghazaleh
    KTH, School of Electrical Engineering (EES), Signal Processing. KTH, School of Electrical Engineering (EES), Centres, ACCESS Linnaeus Centre.
    Doclo, Simon
    University of Oldenburg, Germany.
    A state-space approach to dynamic nonnegative matrix factorization. 2015. In: IEEE Transactions on Signal Processing, ISSN 1053-587X, E-ISSN 1941-0476, Vol. 63, no 4, p. 949-959. Article in journal (Refereed)
    Abstract [en]

    Nonnegative matrix factorization (NMF) has been actively investigated and used in a wide range of problems in the past decade. A significant amount of attention has been given to develop NMF algorithms that are suitable to model time series with strong temporal dependencies. In this paper, we propose a novel state-space approach to perform dynamic NMF (D-NMF). In the proposed probabilistic framework, the NMF coefficients act as the state variables and their dynamics are modeled using a multi-lag nonnegative vector autoregressive (N-VAR) model within the process equation. We use expectation maximization and propose a maximum-likelihood estimation framework to estimate the basis matrix and the N-VAR model parameters. Interestingly, the N-VAR model parameters are obtained by simply applying NMF. Moreover, we derive a maximum a posteriori estimate of the state variables (i.e., the NMF coefficients) that is based on a prediction step and an update step, similarly to the Kalman filter. We illustrate the benefits of the proposed approach using different numerical simulations where D-NMF significantly outperforms its static counterpart. Experimental results for three different applications show that the proposed approach outperforms two state-of-the-art NMF approaches that exploit temporal dependencies, namely a nonnegative hidden Markov model and a frame stacking approach, while it requires less memory and computational power.

  • 18.
    Mohammadiha, Nasser
    et al.
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Taghia, Jalil
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Single channel speech enhancement using Bayesian NMF with recursive temporal updates of prior distributions. 2012. In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2012, IEEE conference proceedings, 2012, p. 4561-4564. Conference paper (Refereed)
    Abstract [en]

    We present a speech enhancement algorithm which is based on a Bayesian Nonnegative Matrix Factorization (NMF). Both Minimum Mean Square Error (MMSE) and Maximum a-Posteriori (MAP) estimates of the magnitude of the clean speech DFT coefficients are derived. To exploit the temporal continuity of the speech and noise signals, a proper prior distribution is introduced by widening the posterior distribution of the NMF coefficients at the previous time frames. To do so, a recursive temporal update scheme is proposed to obtain the mean value of the prior distribution; also, the uncertainty of the prior information is governed by the shape parameter of the distribution which is learnt automatically based on the nonstationarity of the signals. Simulations show a considerable improvement compared to the maximum likelihood NMF based speech enhancement algorithm for different input SNRs.

  • 19.
    Mohammadiha, Nasser
    et al.
    Sharif University.
    Vosoughi Vahdat, B.
    Sharif University.
    Fatemizadeh, E.
    Sharif University.
    A Machine Vision Measurement of Still Billet Camber During Molding Process. 2007. In: Pattern recognition and information processing, PRIP'2007: Proceedings of the ninth international conference, 22-24 May 2007, Minsk, Belarus, Vol I-II, Minsk: United Institute of Informatics Problems of National Academy of Sciences of Belarus, 2007, p. 11-17. Conference paper (Refereed)
  • 20.
    Panahandeh, Ghazaleh
    et al.
    KTH, School of Electrical Engineering (EES), Signal Processing. KTH, School of Electrical Engineering (EES), Centres, ACCESS Linnaeus Centre.
    Mohammadiha, Nasser
    KTH, School of Electrical Engineering (EES), Communication Theory. KTH, School of Electrical Engineering (EES), Centres, ACCESS Linnaeus Centre.
    Jansson, Magnus
    KTH, School of Electrical Engineering (EES), Signal Processing. KTH, School of Electrical Engineering (EES), Centres, ACCESS Linnaeus Centre.
    Ground Plane Feature Detection in Mobile Vision-Aided Inertial Navigation. 2012. In: Intelligent Robots and Systems (IROS), 2012 IEEE/RSJ International Conference on / [ed] IEEE, IEEE, 2012, p. 3605-3611. Conference paper (Refereed)
    Abstract [en]

    In this paper, a method for determining ground plane features in a sequence of images captured by a mobile camera is presented. The hardware of the mobile system consists of a monocular camera that is mounted on an inertial measurement unit (IMU). An image processing procedure is proposed, first to extract image features and match them across consecutive image frames, and second to detect the ground plane features using a two-step algorithm. In the first step, the planar homography of the ground plane is constructed using an IMU-camera motion estimation approach. The obtained homography constraints are used to detect the most likely ground features in the sequence of images. To reject the remaining outliers, as the second step, a new plane normal vector computation approach is proposed. To obtain the normal vector of the ground plane, only three pairs of corresponding features are used for a general camera transformation. The normal-based computation approach generalizes the existing methods that are developed for specific camera transformations. Experimental results on real data validate the reliability of the proposed method.

  • 21.
    Panahandeh, Ghazaleh
    et al.
    KTH, School of Electrical Engineering (EES), Signal Processing.
    Mohammadiha, Nasser
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Kasaei, Shohreh
    Sharif University.
    A Fast and Adaptive Boundary Matching Algorithm for Video Error Concealment. 2010. In: 4th International Conference on Signal Processing and Communication Systems, ICSPCS'2010, IEEE, 2010. Conference paper (Refereed)
    Abstract [en]

    Low-complexity error concealment techniques for missing macroblock (MB) recovery based on the boundary matching principle are extensively studied and evaluated. In this paper, an improved boundary matching algorithm (BMA) using adaptive search is presented to conceal channel errors in inter-frames of video images. The proposed scheme adaptively selects proper candidate regions to conceal the artifact of a lost block. The candidate regions are examined based on analyzing motion activity of the neighboring MBs. Simulations show that the proposed scheme outperforms existing methods in both PSNR and visual quality, with a PSNR gain of about 1-4 dB.

  • 22.
    Panahandeh, Ghazaleh
    et al.
    KTH, School of Electrical Engineering (EES), Signal Processing.
    Mohammadiha, Nasser
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Händel, Peter
    KTH, School of Electrical Engineering (EES), Signal Processing.
    Chest-Mounted Inertial Measurement Unit for Pedestrian Motion Classification Using Continuous Hidden Markov Model. 2012. In: 2012 IEEE International Instrumentation and Measurement Technology Conference (I2MTC), IEEE, 2012, p. 991-995. Conference paper (Refereed)
    Abstract [en]

    This paper presents a method for pedestrian motion classification based on MEMS inertial measurement unit (IMU) mounted on the chest. The choice of mounting the IMU on the chest provides the potential application of the current study in camera-aided inertial navigation for positioning and personal assistance. In the present work, five categories of the pedestrian motion including standing, walking, running, going upstairs, and going down the stairs are considered in the classification procedure. As the classification method, the continuous hidden Markov model (HMM) is used in which the output density functions are assumed to be Gaussian mixture models (GMMs). The correct recognition rates based on the experimental results are about 95%.

  • 23.
    Panahandeh, Ghazaleh
    et al.
    KTH, School of Electrical Engineering (EES), Signal Processing. KTH, School of Electrical Engineering (EES), Centres, ACCESS Linnaeus Centre.
    Mohammadiha, Nasser
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Händel, Peter
    KTH, School of Electrical Engineering (EES), Signal Processing. KTH, School of Electrical Engineering (EES), Centres, ACCESS Linnaeus Centre.
    Continuous Hidden Markov Model for Pedestrian Activity Classification and Gait Analysis. 2013. In: IEEE Transactions on Instrumentation and Measurement, ISSN 0018-9456, E-ISSN 1557-9662, Vol. 62, no 5, p. 1073-1083. Article in journal (Refereed)
    Abstract [en]

    This paper presents a method for pedestrian activity classification and gait analysis based on the microelectromechanical-systems inertial measurement unit (IMU). The work targets two groups of applications, including the following: 1) human activity classification and 2) joint human activity and gait-phase classification. In the latter case, the gait phase is defined as a substate of a specific gait cycle, i.e., the states of the body between the stance and swing phases. We model the pedestrian motion with a continuous hidden Markov model (HMM) in which the output density functions are assumed to be Gaussian mixture models. For the joint activity and gait-phase classification, motivated by the cyclical nature of the IMU measurements, each individual activity is modeled by a "circular HMM." For both the proposed classification methods, proper feature vectors are extracted from the IMU measurements. In this paper, we report the results of conducted experiments where the IMU was mounted on the humans' chests. This permits the potential application of the current study in camera-aided inertial navigation for positioning and personal assistance for future research works. Five classes of activity, including walking, running, going upstairs, going downstairs, and standing, are considered in the experiments. The performance of the proposed methods is illustrated in various ways, and as an objective measure, the confusion matrix is computed and reported. The achieved relative figures of merit using the collected data validate the reliability of the proposed methods for the desired applications.

  • 24.
    Panahandeh, Ghazaleh
    et al.
    KTH, School of Electrical Engineering (EES), Signal Processing. KTH, School of Electrical Engineering (EES), Centres, ACCESS Linnaeus Centre.
    Mohammadiha, Nasser
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Händel, Peter
    KTH, School of Electrical Engineering (EES), Signal Processing. KTH, School of Electrical Engineering (EES), Centres, ACCESS Linnaeus Centre.
    Pedestrian Motion Classification via Body-Mounted Inertial Measurement Unit. 2012. Conference paper (Other academic)
  • 25.
    Taghia, Jalal
    et al.
    Ruhr University of Bochum.
    Taghia, Jalil
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Mohammadiha, Nasser
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Sang, Jinqiu
    University of Southampton.
    Bouse, Vaclav
    Siemens.
    Martin, Rainer
    Ruhr University of Bochum.
    An Evaluation of noise power spectral density estimation algorithms in adverse acoustic environments. 2011. In: 36th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011, IEEE, 2011, p. 4640-4643. Conference paper (Refereed)
    Abstract [en]

    Noise power spectral density estimation is an important component of speech enhancement systems due to its considerable effect on the quality and the intelligibility of the enhanced speech. Recently, many new algorithms have been proposed and significant progress in noise tracking has been made. In this paper, we present an evaluation framework for measuring the performance of some recently proposed and some well-known noise power spectral density estimators and compare their performance in adverse acoustic environments. In this investigation we do not only consider the performance in the mean of a spectral distance measure but also evaluate the variance of the estimators as the latter is related to undesirable fluctuations also known as musical noise. By providing a variety of different non-stationary noises, the robustness of noise estimators in adverse environments is examined.

  • 26.
    Taghia, Jalil
    et al.
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Mohammadiha, Nasser
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    A variational Bayes approach to the underdetermined blind source separation with automatic determination of the number of sources. 2012. In: Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on / [ed] IEEE, IEEE, 2012, p. 253-256. Conference paper (Refereed)
    Abstract [en]

    In this paper, we propose a variational Bayes approach to the underdetermined blind source separation and show how a variational treatment can open up the possibility of determining the actual number of sources. The procedure is performed in a frequency bin-wise manner. In every frequency bin, we model the time-frequency mixture by a variational mixture of Gaussians with a circular-symmetric complex-Gaussian density function. In the Bayesian inference, we consider appropriate conjugate prior distributions for modeling the parameters of this distribution. The learning task consists of estimating the hyper-parameters characterizing the parameter distributions for the optimization of the variational posterior distribution. The proposed approach requires no prior knowledge on the number of sources in a mixture.
