kth.se Publications
51 - 79 of 79
  • 51. Molin, Elisabet
    et al.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Wallsten, H
    Spectro-temporal discrimination in cochlear implant users (2005). In: 2005 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005, p. 25-28. Conference paper (Refereed)
    Abstract [en]

    A novel test method was designed to measure the spectro-temporal discrimination ability of cochlear implant (CI) users. The test signals are bandpass-filtered, speech-weighted noise with the long-term spectrum of speech. The goal of the test is to measure the amplitude difference between spectral bands in two presented signals that is required for the listener to just discriminate between the two sounds. Twenty CI users were tested with the spectro-temporal discrimination test and a conventional speech recognition test. For test stimuli differing in only two spectral bands, CI and normal-hearing listeners show nearly equally good results on the spectro-temporal test. For spectra with four or more bands, the spread in the CI users' results is large.

  • 52.
    Nordqvist, Peter
    et al.
    KTH, Superseded Departments (pre-2005), Speech, Music and Hearing.
    Leijon, Arne
    KTH, Superseded Departments (pre-2005), Speech, Music and Hearing.
    An efficient robust sound classification algorithm for hearing aids (2004). In: Journal of the Acoustical Society of America, ISSN 0001-4966, E-ISSN 1520-8524, Vol. 115, no 6, p. 3033-3041. Article in journal (Refereed)
    Abstract [en]

    An efficient robust sound classification algorithm based on hidden Markov models is presented. The system would enable a hearing aid to automatically change its behavior for differing listening environments according to the user's preferences. This work attempts to distinguish between three listening environment categories: speech in traffic noise, speech in babble, and clean speech, regardless of the signal-to-noise ratio. The classifier uses only the modulation characteristics of the signal. The classifier ignores the absolute sound pressure level and the absolute spectrum shape, resulting in an algorithm that is robust against irrelevant acoustic variations. The measured classification hit rate was 96.7%-99.5% when the classifier was tested with sounds representing one of the three environment categories included in the classifier. False-alarm rates were 0.2%-1.7% in these tests. The algorithm is robust and efficient, requiring few processor instructions and little memory. It is fully possible to implement the classifier in a DSP-based hearing instrument.

  • 53.
    Nordqvist, Peter
    et al.
    KTH, Superseded Departments (pre-2005), Signals, Sensors and Systems.
    Leijon, Arne
    KTH, Superseded Departments (pre-2005), Signals, Sensors and Systems.
    Automatic classification of the telephone listening environment in a hearing aid (2002). In: Trita-TMH / Royal Institute of Technology, Speech, Music and Hearing, ISSN 1104-5787, Vol. 43, no 1, p. 45-49. Article in journal (Refereed)
    Abstract [en]

    An algorithm is developed for automatic classification of the telephone-listening environment in a hearing instrument. The system would enable the hearing aid to automatically change its behavior when it is used for a telephone conversation (e.g., decrease the amplification in the hearing aid, or adapt the feedback suppression algorithm for reflections from the telephone handset). Two listening environments are included in the classifier. The first is a telephone conversation in quiet or in traffic noise and the second is a face-to-face conversation in quiet or in traffic. Each listening environment is modeled with two or three discrete Hidden Markov Models. The probabilities for the different listening environments are calculated with the forward algorithm for each frame of the input sound, and are compared with each other in order to detect the telephone-listening environment. The results indicate that the classifier can distinguish between the two listening environments used in the test material: telephone conversation and face-to-face conversation.
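The decision rule described above, computing each environment model's likelihood with the forward algorithm and picking the winner, can be sketched as follows. This is a generic textbook scaled forward recursion, not the paper's code, and the single-state models in the usage example are illustrative only.

```python
import math

def forward_loglik(obs, pi, A, B):
    """Log-likelihood of a discrete observation sequence under an HMM with
    initial probabilities pi, transition matrix A, and emission matrix B,
    computed with the scaled forward algorithm (avoids underflow)."""
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    loglik = 0.0
    for o in obs[1:]:
        scale = sum(alpha)
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]
        # propagate one step through A and absorb the next observation
        alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o]
                 for j in range(n)]
    return loglik + math.log(sum(alpha))

def classify(obs, models):
    """Pick the listening environment whose HMM explains obs best."""
    return max(models, key=lambda name: forward_loglik(obs, *models[name]))
```

In the paper each listening environment is represented by two or three such discrete HMMs; a single model per class suffices here to show the comparison of per-frame likelihoods.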

  • 54.
    Nordqvist, Peter
    et al.
    KTH, Superseded Departments (pre-2005), Speech, Music and Hearing.
    Leijon, Arne
    KTH, Superseded Departments (pre-2005), Speech, Music and Hearing.
    Hearing-aid automatic gain control adapting to two sound sources in the environment, using three time constants (2004). In: Journal of the Acoustical Society of America, ISSN 0001-4966, E-ISSN 1520-8524, Vol. 116, no 5, p. 3152-3155. Article in journal (Refereed)
    Abstract [en]

    A hearing aid AGC algorithm is presented that uses a richer representation of the sound environment than previous algorithms. The proposed algorithm is designed to (1) adapt slowly (in approximately 10 s) between different listening environments, e.g., when the user leaves a single-talker lecture for a multi-babble coffee break; (2) switch rapidly (about 100 ms) between different dominant sound sources within one listening situation, such as the change from the user's own voice to a distant speaker's voice in a quiet conference room; (3) instantly reduce gain for strong transient sounds and then quickly return to the previous gain setting; and (4) not change the gain in silent pauses but instead keep the gain setting of the previous sound source. An acoustic evaluation showed that the algorithm worked as intended. The algorithm was evaluated together with a reference algorithm in a pilot field test. When evaluated by nine users in a set of speech recognition tests, the algorithm showed similar results to the reference algorithm.
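A common way to realize multiple time constants in a level tracker is a first-order smoother whose coefficient depends on whether the input is rising or falling. The sketch below is a generic fast-attack/slow-release envelope tracker for intuition only; it is not the published three-time-constant algorithm, and the time constants are made-up values.

```python
import math

def smoothing_coeff(tau_s, fs_hz):
    # one-pole coefficient for time constant tau_s at sample rate fs_hz
    return math.exp(-1.0 / (tau_s * fs_hz))

def track_level(env, fs_hz, tau_attack=0.1, tau_release=10.0):
    """Track an envelope with a fast attack (rising input) and a slow
    release (falling input), switching the smoothing coefficient
    sample by sample."""
    a_att = smoothing_coeff(tau_attack, fs_hz)
    a_rel = smoothing_coeff(tau_release, fs_hz)
    level, out = env[0], []
    for x in env:
        a = a_att if x > level else a_rel
        level = a * level + (1.0 - a) * x
        out.append(level)
    return out
```

A real hearing-aid AGC derives the gain from such a tracked level via the compression rule; the paper's contribution is in when each of its three time constants is engaged.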

  • 55.
    Nordqvist, Peter
    et al.
    KTH, Superseded Departments (pre-2005), Signals, Sensors and Systems.
    Leijon, Arne
    KTH, Superseded Departments (pre-2005), Signals, Sensors and Systems.
    Speech Recognition in Hearing Aids (2004). In: EURASIP Journal on Wireless Communications and Networking, ISSN 1687-1472, E-ISSN 1687-1499. Article in journal (Other academic)
  • 56.
    Panahandeh, Ghazaleh
    et al.
    KTH, School of Electrical Engineering (EES), Signal Processing.
    Mohammadiha, Nasser
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Händel, Peter
    KTH, School of Electrical Engineering (EES), Signal Processing.
    Chest-Mounted Inertial Measurement Unit for Pedestrian Motion Classification Using Continuous Hidden Markov Model (2012). In: 2012 IEEE International Instrumentation and Measurement Technology Conference (I2MTC), IEEE, 2012, p. 991-995. Conference paper (Refereed)
    Abstract [en]

    This paper presents a method for pedestrian motion classification based on a MEMS inertial measurement unit (IMU) mounted on the chest. The choice of mounting the IMU on the chest provides the potential application of the current study in camera-aided inertial navigation for positioning and personal assistance. In the present work, five categories of pedestrian motion, including standing, walking, running, going upstairs, and going downstairs, are considered in the classification procedure. As the classification method, the continuous hidden Markov model (HMM) is used, in which the output density functions are assumed to be Gaussian mixture models (GMMs). The correct recognition rates based on the experimental results are about 95%.

  • 57.
    Panahandeh, Ghazaleh
    et al.
    KTH, School of Electrical Engineering (EES), Signal Processing. KTH, School of Electrical Engineering (EES), Centres, ACCESS Linnaeus Centre.
    Mohammadiha, Nasser
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Händel, Peter
    KTH, School of Electrical Engineering (EES), Signal Processing. KTH, School of Electrical Engineering (EES), Centres, ACCESS Linnaeus Centre.
    Continuous Hidden Markov Model for Pedestrian Activity Classification and Gait Analysis (2013). In: IEEE Transactions on Instrumentation and Measurement, ISSN 0018-9456, E-ISSN 1557-9662, Vol. 62, no 5, p. 1073-1083. Article in journal (Refereed)
    Abstract [en]

    This paper presents a method for pedestrian activity classification and gait analysis based on the microelectromechanical-systems inertial measurement unit (IMU). The work targets two groups of applications, including the following: 1) human activity classification and 2) joint human activity and gait-phase classification. In the latter case, the gait phase is defined as a substate of a specific gait cycle, i.e., the states of the body between the stance and swing phases. We model the pedestrian motion with a continuous hidden Markov model (HMM) in which the output density functions are assumed to be Gaussian mixture models. For the joint activity and gait-phase classification, motivated by the cyclical nature of the IMU measurements, each individual activity is modeled by a "circular HMM." For both the proposed classification methods, proper feature vectors are extracted from the IMU measurements. In this paper, we report the results of conducted experiments where the IMU was mounted on the humans' chests. This permits the potential application of the current study in camera-aided inertial navigation for positioning and personal assistance for future research works. Five classes of activity, including walking, running, going upstairs, going downstairs, and standing, are considered in the experiments. The performance of the proposed methods is illustrated in various ways, and as an objective measure, the confusion matrix is computed and reported. The achieved relative figure of merits using the collected data validates the reliability of the proposed methods for the desired applications.
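The "circular HMM" used above for joint activity and gait-phase classification constrains transitions to follow the gait cycle: each state (gait phase) either persists or advances to the next, and the last phase wraps back to the first. A minimal sketch of such a transition structure, with illustrative (not the paper's) state count and probabilities:

```python
def circular_transitions(n_states, p_stay=0.9):
    """Cyclic left-to-right HMM transition matrix: state i stays put
    with probability p_stay or advances to state (i+1) mod n_states,
    so the chain cycles through the gait phases in order, matching
    the periodicity of walking."""
    A = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states):
        A[i][i] = p_stay
        A[i][(i + 1) % n_states] = 1.0 - p_stay
    return A
```

Combined with GMM output densities per state, this topology forces the decoded state sequence to respect the stance-to-swing ordering of the gait cycle.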

  • 58.
    Panahandeh, Ghazaleh
    et al.
    KTH, School of Electrical Engineering (EES), Signal Processing. KTH, School of Electrical Engineering (EES), Centres, ACCESS Linnaeus Centre.
    Mohammadiha, Nasser
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Händel, Peter
    KTH, School of Electrical Engineering (EES), Signal Processing. KTH, School of Electrical Engineering (EES), Centres, ACCESS Linnaeus Centre.
    Pedestrian Motion Classification via Body-Mounted Inertial Measurement Unit (2012). Conference paper (Other academic)
  • 59. Smeds, K.
    et al.
    Leijon, Arne
    KTH, Superseded Departments (pre-2005), Signals, Sensors and Systems.
    Threshold-based fitting methods for non-linear (WDRC) hearing instruments - comparison of acoustic characteristics (2001). In: Scandinavian Audiology, ISSN 0105-0397, E-ISSN 1940-2872, Vol. 30, no 4, p. 213-222. Article in journal (Refereed)
    Abstract [en]

    Six threshold-based prescriptive methods for non-linear hearing instruments were compared for a standard audiogram and three simulated listening situations. Six hearing aids were programmed according to the manufacturers' recommended initial fittings for the specified audiogram. Coupler gain measurements were then made with speech-like signals, and loudness and speech intelligibility index (SII) were calculated. Large differences between estimated insertion gain-frequency responses were seen. These differences resulted in large differences in calculated loudness, whereas the SII calculations showed only small differences between the fitting methods. For two of the methods, DSL[i/o] and FIG6, a comparison between the original prescriptions and the hearing aid manufacturers' implementations of the prescriptions was made. The results showed large differences between prescribed and implemented gain.

  • 60. Smeds, K.
    et al.
    Wolters, F.
    Nilsson, Anders Christian
    KTH, School of Engineering Sciences (SCI), Aeronautical and Vehicle Engineering, Marcus Wallenberg Laboratory MWL. Widex A/S ORCA Europe, Stockholm, Sweden .
    Båsjö, S.
    Hertzman, S.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Objective measures to quantify the perceptual effects of noise reduction in hearing aids (2011). In: Proceedings of the AES International Conference, 2011, p. 101-108. Conference paper (Refereed)
    Abstract [en]

    Twenty listeners with hearing impairment evaluated three noise-reduction algorithms using paired comparisons of speech clarity, noise loudness, and preference. The subjective test produces results in terms of physical signal-to-noise ratios that correspond to equal subjective performance with and without the noise-reduction algorithms. This facilitates a direct test of how well a number of objective performance measures correspond with the subjective test results.

  • 61.
    Smeds, Karolina
    et al.
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Keidser, G.
    Zakis, J.
    Dillon, H.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Grant, F.
    Convery, E.
    Brew, C.
    Preferred overall loudness. I: Sound field presentation in the laboratory (2006). In: International Journal of Audiology, ISSN 1499-2027, E-ISSN 1708-8186, Vol. 45, no 1, p. 2-11. Article in journal (Refereed)
    Abstract [en]

    This study questions the basic assumption that prescriptive methods for nonlinear, wide dynamic range compression (WDRC) hearing aids should restore overall loudness to normal. Fifteen normal-hearing listeners and twenty-four hearing-impaired listeners (with mild to moderate hearing loss, twelve with and twelve without hearing aid experience) participated in laboratory tests. The participants first watched and listened to video sequences and rated how loud and how interesting the situations were. For the hearing-impaired participants, gain was applied according to the NAL-NL1 prescription. Despite the fact that the NAL-NL1 prescription led to less than normal overall calculated loudness, according to the loudness model of Moore and Glasberg (1997), the hearing-impaired participants rated loudness higher than the normal-hearing participants. The participants then adjusted a volume control to preferred overall loudness. Both normal-hearing and hearing-impaired participants preferred less than normal overall calculated loudness. The results from the two groups of hearing-impaired listeners did not differ significantly.

  • 62.
    Smeds, Karolina
    et al.
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Keidser, G.
    Zakis, J.
    Dillon, H.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Grant, F.
    Convery, E.
    Brew, C.
    Preferred overall loudness. II: Listening through hearing aids in field and laboratory tests (2006). In: International Journal of Audiology, ISSN 1499-2027, E-ISSN 1708-8186, Vol. 45, no 1, p. 12-25. Article in journal (Refereed)
    Abstract [en]

    In a laboratory study, we found that normal-hearing and hearing-impaired listeners preferred less than normal overall calculated loudness (according to a loudness model of Moore & Glasberg, 1997). The current study verified those results using a research hearing aid. Fifteen hearing-impaired and eight normal-hearing participants used the hearing aid in the field and adjusted a volume control to give preferred loudness. The hearing aid logged the preferred volume control setting and the calculated loudness at that setting. The hearing-impaired participants preferred, in the median, loudness levels of -14 phon re normal for input levels from 50 to 89 dB SPL. The normal-hearing participants preferred close to normal overall loudness. In subsequent laboratory tests, using the same hearing aid, both hearing-impaired and normal-hearing listeners preferred less than normal overall calculated loudness, and larger reductions for higher input levels. In summary, the hearing-impaired listeners preferred less than normal overall calculated loudness, whereas the results for the normal-hearing listeners were inconclusive.

  • 63.
    Smeds, Karolina
    et al.
    KTH, School of Electrical Engineering (EES).
    Leijon, Arne
    KTH, School of Electrical Engineering (EES).
    Hörapparatutprovning [Hearing aid fitting] (2000). Book (Other academic)
  • 64. Smeds, Karolina
    et al.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Loudness and hearing loss (2011). In: Loudness / [ed] M. Florentine, A. N. Popper, R. R. Fay, Springer-Verlag New York, 2011, p. 223-259. Chapter in book (Other academic)
    Abstract [en]

    A hearing loss affects many aspects of sound perception. The study of loudness in relation to hearing loss is scientifically important and interesting for several reasons. It is clinically interesting to understand the physiological reasons for the abnormal loudness perception that is common in people with hearing losses. Knowledge about individual loudness perception is central for the habilitation/rehabilitation of people with impaired hearing. Hearing aids are designed and individually adjusted to compensate, as much as possible, for abnormal loudness perception. General knowledge about loudness perception can be gained by studying the effects of hearing loss on loudness perception.

  • 65.
    Smeds, Karolina
    et al.
    ORCA-Europe/Widex.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Communication Theory. ORCA-Europe/Widex.
    Wolters, Florian
    ORCA-Europe/Widex.
    Hammarstedt, Anders
    ORCA-Europe/Widex.
    Båsjö, Sara
    ORCA-Europe/Widex.
    Hertzman, Sofia
    ORCA-Europe/Widex.
    Comparison of predictive measures of speech recognition after noise reduction processing (2014). In: Journal of the Acoustical Society of America, ISSN 0001-4966, E-ISSN 1520-8524, Vol. 136, no 3, p. 1363-1374. Article in journal (Refereed)
    Abstract [en]

    A number of measures were evaluated with regard to their ability to predict the speech-recognition benefit of single-channel noise reduction (NR) processing. Three NR algorithms and a reference condition were used in the evaluation. Twenty listeners with impaired hearing and ten listeners with normal hearing participated in a blinded laboratory study. An adaptive speech test was used. The speech test produces results in terms of signal-to-noise ratios that correspond to equal speech recognition performance (in this case 80% correct) with and without the NR algorithms. This facilitates a direct comparison between predicted and experimentally measured effects of noise reduction algorithms on speech recognition. The experimental results were used to evaluate nine different predictive measures, one in two variants. The best predictions were found with the Coherence Speech Intelligibility Index (CSII) [Kates and Arehart (2005), J. Acoust. Soc. Am. 117(4), 2224-2237]. In general, measures using correlation between the clean speech and the processed noisy speech, as well as other measures that are based on short-time analysis of speech and noise, seemed most promising.

  • 66.
    Stadler, Svante
    et al.
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Prediction of Speech Recognition in Cochlear Implant Users by Adapting Auditory Models to Psychophysical Data (2009). In: EURASIP Journal on Advances in Signal Processing, ISSN 1687-6172, Vol. 2009, article 175243. Article in journal (Refereed)
    Abstract [en]

    Users of cochlear implants (CIs) vary widely in their ability to recognize speech in noisy conditions. There are many factors that may influence their performance. We have investigated to what degree it can be explained by the users' ability to discriminate spectral shapes. A speech recognition task has been simulated using both a simple and a complex model of CI hearing. The models were individualized by adapting their parameters to fit the results of a spectral discrimination test. The predicted speech recognition performance was compared to experimental results, and they were significantly correlated. The presented framework may be used to simulate the effects of changing the CI encoding strategy.

  • 67.
    Stadler, Svante
    et al.
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Dijkstra, T.
    de Vries, B.
    Bayesian Optimal Pure Tone Audiometry with Prior Knowledge. Manuscript (preprint) (Other academic)
  • 68.
    Stadler, Svante
    et al.
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Hagerman, Björn
    An Information Theoretic Approach to Predict Speech Intelligibility for Listeners with Normal and Impaired Hearing (2007). In: INTERSPEECH 2007: 8th Annual Conference of the International Speech Communication Association, Vols 1-4, Baixas, France: ISCA, 2007, p. 1345-1348. Conference paper (Refereed)
    Abstract [en]

    A computational method to predict speech intelligibility in noisy environments has been developed. By modeling speech and noise as stochastic signals, the information transmission through a given auditory model can be estimated. Rate-distortion theory is then applied to predict speech recognition performance. Results are compared with subjective tests on normal and hearing impaired listeners. It is found that the method underestimates the supra-threshold deficits of hearing impairment, which is believed to be due to an overly simple auditory model and a small dictionary size.

  • 69.
    Taal, Cees H.
    et al.
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Jensen, Jesper
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    On Optimal Linear Filtering of Speech for Near-End Listening Enhancement (2013). In: IEEE Signal Processing Letters, ISSN 1070-9908, E-ISSN 1558-2361, Vol. 20, no 3, p. 225-228. Article in journal (Refereed)
    Abstract [en]

    In this letter the focus is on linear filtering of speech before degradation due to additive background noise. The goal is to design the filter such that the speech intelligibility index (SII) is maximized when the speech is played back in a known noisy environment. Moreover, a power constraint is taken into account to prevent uncomfortable playback levels and deal with loudspeaker constraints. Previous methods use linear approximations of the SII in order to find a closed-form solution. However, as we show, these linear approximations introduce errors in low SNR regions and are therefore suboptimal. In this work we propose a nonlinear approximation of the SII which is accurate for all SNRs. Experiments show large intelligibility improvements with the proposed method over the unprocessed noisy speech and better performance than one state-of-the-art method.
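The SII that this filter maximizes is, at its core, an importance-weighted sum of clipped band SNRs. The toy function below captures only that core idea (band audibility = (SNR + 15)/30, clipped to [0, 1]); the full ANSI S3.5 procedure adds masking and level-distortion corrections that are omitted here, and the band values in the test are illustrative.

```python
def simple_sii(band_snr_db, band_importance):
    """Very reduced SII-style index: per-band audibility derived from
    the band SNR in dB, weighted by the band-importance function
    (importance weights are assumed to sum to 1)."""
    index = 0.0
    for snr, w in zip(band_snr_db, band_importance):
        # SNRs below -15 dB contribute nothing; above +15 dB they saturate
        audibility = (min(max(snr, -15.0), 15.0) + 15.0) / 30.0
        index += w * audibility
    return index
```

The clipping is what makes linear approximations of the SII break down at low SNRs: below -15 dB per band, extra speech power in that band buys no audibility at all.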

  • 70. Taghia, J.
    et al.
    Martin, R.
    Taghia, Jalil
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    An investigation on mutual information for the linear predictive system and the extrapolation of speech signals (2012). In: Proceedings of the 10th ITG Symposium on Speech Communication, Institute of Electrical and Electronics Engineers (IEEE), 2012, article id 6309620. Conference paper (Refereed)
    Abstract [en]

    Mutual information (MI) is an important information theoretic concept which has many applications in telecommunications, in blind source separation, and in machine learning. More recently, it has been also employed for the instrumental assessment of speech intelligibility where traditionally correlation based measures are used. In this paper, we address the difference between MI and correlation from the viewpoint of discovering dependencies between variables in the context of speech signals. We perform our investigation by considering the linear predictive approximation and the extrapolation of speech signals as examples. We compare a parametric MI estimation approach based on a Gaussian mixture model (GMM) with the k-nearest neighbor (KNN) approach which is a well-known non-parametric method available to estimate the MI. We show that the GMM-based MI estimator leads to more consistent results.
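One way to see the MI-versus-correlation distinction discussed above: for jointly Gaussian variables, MI is a closed-form, monotone function of the correlation coefficient, so the two measures carry the same information; they can only diverge for non-Gaussian data such as speech, where MI also captures nonlinear dependence. A small illustration of the Gaussian special case (general MI estimation, as with the GMM or KNN estimators in the paper, has no such closed form):

```python
import math

def gaussian_mi(rho):
    """Mutual information in nats between two jointly Gaussian random
    variables with correlation coefficient rho:
        I(X; Y) = -0.5 * ln(1 - rho**2)
    Zero iff the variables are uncorrelated, and strictly increasing
    in |rho|."""
    return -0.5 * math.log(1.0 - rho ** 2)
```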

  • 71. Taghia, Jalal
    et al.
    Martin, Rainer
    Taghia, Jalil
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Dual-channel noise reduction based on a mixture of circular-symmetric complex Gaussians on unit hypersphere (2013). In: ICASSP IEEE Int Conf Acoust Speech Signal Process Proc, 2013, p. 7289-7293. Conference paper (Refereed)
    Abstract [en]

    In this paper a model-based dual-channel noise reduction approach is presented as an alternative to conventional noise reduction algorithms, mainly because it requires neither an estimate of the noise power spectral density nor any prior knowledge about the spatial noise field characteristics. We use a mixture of circular-symmetric complex-Gaussian distributions projected on the unit hypersphere for modeling the complex discrete Fourier transform coefficients of noisy speech signals in the frequency domain. According to the derived mixture model, clustering of the noise and the target speech components is performed depending on their direction of arrival. A soft masking strategy is proposed for speech enhancement based on responsibilities assigned to the target speech class in each time-frequency bin. Our experimental results show that the proposed approach is more robust than conventional dual-channel noise reduction systems based on single- and dual-channel noise power spectral density estimators.

  • 72.
    Taghia, Jalil
    et al.
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Gerkmann, Timo
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Blind Source Separation of Nondisjoint Sources in the Time-Frequency Domain with Model-Based Determination of Source Contribution (2011). In: 2011 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT), New York: IEEE, 2011, p. 276-280. Conference paper (Refereed)
    Abstract [en]

    While most blind source separation (BSS) algorithms rely on the assumption that at most one source is dominant at each time-frequency (TF) point, recently, two BSS approaches, [1], [2], have been proposed that allow multiple active sources at time-frequency (TF) points under certain assumptions. In both algorithms, the active sources in every single TF point are found by an exhaustive search through an optimization procedure which is computationally expensive. In this work, we address this limitation and avoid the exhaustive search by determining the source contribution in every TF point. The source contributions are expressed by a set of posterior probabilities. Hereby, we propose a model-based blind source separation algorithm that allows sources to be nondisjoint in the TF domain while being computationally more tractable. The proposed BSS approach is shown to be robust with respect to different reverberation times and microphone spacings.
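Soft masking of the kind described, assigning each TF point to sources fractionally via posterior probabilities rather than exclusively to one source, amounts to scaling each bin of the mixture by the posterior that the target is active there. A minimal sketch (the nested-list spectrogram layout is an assumption for illustration, not taken from the paper):

```python
def soft_mask(mixture_tf, target_posterior):
    """Estimate one source by weighting every time-frequency bin of the
    mixture spectrogram (rows = frames, columns = frequency bins) with
    the posterior probability, in [0, 1], that the target source is
    active in that bin. A binary TF mask is the special case where
    every posterior is exactly 0 or 1."""
    return [[m * p for m, p in zip(m_row, p_row)]
            for m_row, p_row in zip(mixture_tf, target_posterior)]
```

Because the posteriors for all sources in a bin sum to one, the soft masks distribute the mixture energy across sources without the exhaustive per-bin search over active-source subsets.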

  • 73.
    Taghia, Jalil
    et al.
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Bayesian Recursive Blind Source Separation. In: Journal of Machine Learning Research, ISSN 1532-4435, E-ISSN 1533-7928. Article in journal (Other academic)
    Abstract [en]

    We consider the problem of blind source separation (BSS) of convolutive mixtures in underdetermined scenarios, where there are more sources to estimate than recorded signals. This problem has been intensively studied in the literature. Many successful methods rely on batch processing of previously recorded signals, and hence are best suited only for noncausal systems. This paper addresses the problem of online BSS. To realize this, we develop a Bayesian recursive framework. The proposed Bayesian framework allows incorporating prior knowledge in a coherent way, and the recursive learning allows combining information gained from the current observation with all information from the previous observations. Experiments using live audio recordings show promising results.

  • 74.
    Taghia, Jalil
    et al.
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Separation of Unknown Number of Sources (2014). In: IEEE Signal Processing Letters, ISSN 1070-9908, E-ISSN 1558-2361, Vol. 21, no 5, p. 625-629. Article in journal (Refereed)
    Abstract [en]

    We address the problem of blind source separation in acoustic applications where there is no prior knowledge about the number of mixing sources. The presented method employs a mixture of complex Watson distributions in its generative model, with a sparse Dirichlet distribution over the mixture weights. The problem is formulated as fully Bayesian inference, assuming prior distributions over all model parameters. The presented model can regulate its own complexity by pruning unnecessary components, by which we can possibly relax the assumption of prior knowledge on the number of sources.
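The pruning behavior of a sparse Dirichlet prior can be illustrated with expected mixture weights: under a symmetric Dirichlet(alpha0) prior with alpha0 much smaller than 1, components that attract no observations keep an expected weight of roughly alpha0 divided by the total count and can be dropped. The counts and threshold below are made-up illustrative numbers, not values from the paper.

```python
def prune_components(counts, alpha0=1e-3, threshold=1e-2):
    """Expected mixture weights under a symmetric Dirichlet(alpha0)
    posterior, where counts[k] is the total responsibility mass
    assigned to component k; components whose expected weight falls
    below the threshold are pruned.  Returns (weights, kept_indices)."""
    total = sum(counts) + alpha0 * len(counts)
    weights = [(c + alpha0) / total for c in counts]
    kept = [i for i, w in enumerate(weights) if w >= threshold]
    return weights, kept
```

This is the mechanism by which the model order (here, the number of active sources) is inferred rather than fixed in advance.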

  • 75.
    Taghia, Jalil
    et al.
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES).
    Variational Inference for Watson Mixture Model (2016). In: IEEE Transactions on Pattern Analysis and Machine Intelligence, ISSN 0162-8828, E-ISSN 1939-3539, Vol. 38, no 9, p. 1886-1900. Article in journal (Refereed)
    Abstract [en]

    This paper addresses modelling data using the Watson distribution. The Watson distribution is one of the simplest distributions for analyzing axially symmetric data. This distribution has gained some attention in recent years due to its modeling capability. However, its Bayesian inference is fairly understudied due to difficulty in handling the normalization factor. Recent development of Markov chain Monte Carlo (MCMC) sampling methods can be applied for this purpose. However, these methods can be prohibitively slow for practical applications. A deterministic alternative is provided by variational methods that convert inference problems into optimization problems. In this paper, we present a variational inference for Watson mixture models. First, the variational framework is used to side-step the intractability arising from the coupling of latent states and parameters. Second, the variational free energy is further lower bounded in order to avoid intractable moment computation. The proposed approach provides a lower bound on the log marginal likelihood and retains distributional information over all parameters. Moreover, we show that it can regulate its own complexity by pruning unnecessary mixture components while avoiding over-fitting. We discuss potential applications of the modeling with Watson distributions in the problem of blind source separation, and clustering gene expression data sets.

  • 76.
    Taghia, Jalil
    et al.
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Ma, Z.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Communication Theory.
    On von-Mises Fisher mixture model in Text-independent speaker identification2013In: Proceedings of the 2013 INTERSPEECH, 2013, p. 2499-2503Conference paper (Refereed)
    Abstract [en]

    This paper addresses text-independent speaker identification (SI) based on line spectral frequencies (LSFs). The LSFs are transformed to differential LSFs (MLSF) in order to exploit their boundary and ordering properties. We show that the square root of the MLSF has interesting directional characteristics, implying that its distribution can be modeled by a mixture of von Mises-Fisher (vMF) distributions. We estimate the mixture model parameters analytically in a fully Bayesian treatment using variational inference. The Bayesian inference can potentially determine the model complexity and avoid the overfitting problem associated with conventional approaches based on expectation maximization. The experimental results confirm the effectiveness of the proposed SI system.
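The directional characteristic claimed here can be made concrete: differential LSFs are positive and sum to a fixed constant, so their element-wise square root has fixed norm, i.e. it lies on a hypersphere where vMF modeling applies. The sketch below is an assumed illustration with toy LSF values (the paper's exact feature normalization may differ).

```python
# Illustrative sketch (assumed construction): differential LSFs sum to pi,
# so their square root is a fixed-norm, i.e. directional, feature vector.
import numpy as np

lsf = np.array([0.3, 0.7, 1.2, 1.9, 2.5])               # toy LSFs in (0, pi)
mlsf = np.diff(np.concatenate(([0.0], lsf, [np.pi])))   # differential LSFs
x = np.sqrt(mlsf)                                       # directional feature
print("sum of MLSF:", mlsf.sum(), " norm of sqrt-MLSF:", np.linalg.norm(x))
```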

  • 77.
    Taghia, Jalil
    et al.
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Ma, Zhanyu
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Communication Theory.
    Bayesian Estimation of the von-Mises Fisher Mixture Model with Variational Inference2014In: IEEE Transactions on Pattern Analysis and Machine Intelligence, ISSN 0162-8828, E-ISSN 1939-3539, Vol. 36, no 9, p. 1701-1715Article in journal (Refereed)
    Abstract [en]

    This paper addresses the Bayesian estimation of the von Mises-Fisher (vMF) mixture model with variational inference (VI). The learning task in VI consists of optimizing the variational posterior distribution. However, exact VI does not lead to an analytically tractable solution, due to intractable moments involving functional forms of the Bessel function in their arguments. To derive a closed-form solution, we further lower bound the evidence lower bound, with the bound tight at one point in the parameter distribution. With the value of the bound guaranteed to increase during maximization, we derive an analytically tractable approximation to the posterior distribution that has the same functional form as the assigned prior distribution. The proposed algorithm requires no iterative numerical calculation in the re-estimation procedure, and it can potentially determine the model complexity and avoid the over-fitting problem associated with conventional approaches based on expectation maximization. Moreover, we derive an analytically tractable approximation to the predictive density of the Bayesian mixture model of vMF distributions. The performance of the proposed approach is verified by experiments with both synthetic and real data.
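The "bound tight at one point" idea can be illustrated with a much simpler analogue than the paper's Bessel-function moments: a tangent bound on the logarithm, log(x) <= x/a + log(a) - 1, which holds for all x > 0 and is exact at x = a. This is only an assumed analogue of the general technique, not the paper's actual bound.

```python
# Illustrative sketch of a one-point-tight tangent bound (an analogue of
# the bounding technique, not the paper's Bessel-function bound).
import numpy as np

def upper_bound(x, a):
    """Tangent bound: log(x) <= x/a + log(a) - 1, with equality at x = a."""
    return x / a + np.log(a) - 1.0

x = np.linspace(0.5, 3.0, 50)
a = 1.5
print("max gap:", np.max(upper_bound(x, a) - np.log(x)))
```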

  • 78.
    Taghia, Jalil
    et al.
    KTH, School of Electrical Engineering (EES), Communication Theory. Institute of Communication Acoustics, Ruhr-Universität Bochum, Bochum, Germany.
    Martin, Rainer
    Institute of Communication Acoustics, Ruhr-Universität Bochum, Bochum.
    Leijon, Arne
    An investigation on mutual information for the linear predictive system and the extrapolation of speech signals2020In: Sprachkommunikation - 10. ITG-Fachtagung, VDE Verlag GmbH , 2020, p. 227-230Conference paper (Refereed)
    Abstract [en]

    Mutual information (MI) is an important information-theoretic concept with many applications in telecommunications, blind source separation, and machine learning. More recently, it has also been employed for the instrumental assessment of speech intelligibility, where correlation-based measures are traditionally used. In this paper, we address the difference between MI and correlation from the viewpoint of discovering dependencies between variables in the context of speech signals. We perform our investigation by considering the linear predictive approximation and the extrapolation of speech signals as examples. We compare a parametric MI estimation approach based on a Gaussian mixture model (GMM) with the k-nearest neighbor (KNN) approach, a well-known non-parametric method for estimating MI. We show that the GMM-based MI estimator leads to more consistent results.
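The MI-versus-correlation distinction the abstract draws can be seen in a tiny assumed example: for jointly Gaussian variables with correlation rho, MI has the closed form I = -0.5*log(1 - rho^2), while correlation alone can entirely miss a nonlinear dependence such as Y = X^2. This sketch is illustrative only and uses neither the paper's GMM nor its KNN estimator.

```python
# Illustrative sketch: correlation misses nonlinear dependence that MI
# captures; Gaussian MI has a closed form in the correlation coefficient.
import numpy as np

def gaussian_mi(rho):
    """MI (in nats) of a bivariate Gaussian with correlation rho."""
    return -0.5 * np.log(1.0 - rho ** 2)

rng = np.random.default_rng(2)
x = rng.normal(size=100_000)
y = x ** 2                                  # strong, purely nonlinear dependence
rho = np.corrcoef(x, y)[0, 1]
print(f"corr(X, X^2) = {rho:.3f}")          # near zero despite the dependence
print(f"Gaussian MI at rho = 0.8: {gaussian_mi(0.8):.3f} nats")
```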

  • 79.
    Taghia, Jalil
    et al.
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Mohammadiha, Nasser
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    Leijon, Arne
    KTH, School of Electrical Engineering (EES), Sound and Image Processing.
    A variational Bayes approach to the underdetermined blind source separation with automatic determination of the number of sources2012In: Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on / [ed] IEEE, IEEE , 2012, p. 253-256Conference paper (Refereed)
    Abstract [en]

    In this paper, we propose a variational Bayes approach to underdetermined blind source separation and show how a variational treatment opens up the possibility of determining the actual number of sources. The procedure is performed in a frequency bin-wise manner. In every frequency bin, we model the time-frequency mixture by a variational mixture of Gaussians with a circularly symmetric complex-Gaussian density function. In the Bayesian inference, we consider appropriate conjugate prior distributions for modeling the parameters of this distribution. The learning task consists of estimating the hyper-parameters characterizing the parameter distributions for the optimization of the variational posterior distribution. The proposed approach requires no prior knowledge of the number of sources in a mixture.
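The circularly symmetric complex Gaussian used per frequency bin has density (1/(pi*v)) * exp(-|x|^2 / v), which depends on x only through its magnitude and is therefore invariant to phase rotation. The sketch below illustrates only this property; it is not the paper's separation algorithm, and the function name and toy values are assumptions.

```python
# Illustrative sketch: the circularly symmetric complex Gaussian log-density
# is invariant to a phase rotation exp(j*theta) of its argument.
import numpy as np

def cs_cgauss_logpdf(x, var):
    """Log-density of a zero-mean circularly symmetric complex Gaussian."""
    return -np.log(np.pi * var) - np.abs(x) ** 2 / var

x = 0.7 + 0.4j                              # a toy time-frequency coefficient
rotated = x * np.exp(1j * 1.3)              # same magnitude, rotated phase
print(cs_cgauss_logpdf(x, 2.0), cs_cgauss_logpdf(rotated, 2.0))
```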
