Publications (10 of 12)
Zea, E., Laudato, M. & Andén, J. (2025). Introduction of the boostlet transform for acoustic signal processing. In: Proceedings of the 11th EAA Annual European Conference on Acoustics and Noise Control Engineering. Paper presented at Forum Acusticum/Euronoise 2025, Malaga, Spain, June 23rd to 26th 2025 (pp. 1-8). European Acoustics Association (EAA)
2025 (English). In: Proceedings of the 11th EAA Annual European Conference on Acoustics and Noise Control Engineering, European Acoustics Association (EAA), 2025, p. 1-8. Conference paper, Published paper (Other academic)
Abstract [en]

This paper introduces the boostlet transform to analyze and reconstruct spatiotemporal acoustic fields measured in 2D space-time. The transform builds upon the insight that sparse multi-scale representations learned from natural wavefields perform geometric transformations that preserve the dispersion relation. The boostlet transform decomposes a spatiotemporal wavefield using a collection of wavelet-like functions parametrized by dilations, hyperbolic rotations, and translations in space-time. From a physical viewpoint, boostlets encompass global and localized waveforms with variable band-limited frequency and phase-speed content. We demonstrate applications of the transform to wavefront segmentation and sparse reconstruction of room impulse responses. In particular, we find that boostlet decompositions excel at representing localized wavefront phenomena typical of the early part of such room recordings. At the same time, plane waves perform as well as or better than boostlets in the late part.

Place, publisher, year, edition, pages
European Acoustics Association (EAA), 2025
Keywords
acoustic signal processing, boostlets, space-time, multi-scale representations, sparse reconstruction
National Category
Fluid Mechanics; Signal Processing
Research subject
Engineering Mechanics
Identifiers
urn:nbn:se:kth:diva-366194 (URN)
Conference
Forum Acusticum/Euronoise 2025, Malaga, Spain, June 23rd to 26th 2025
Funder
Swedish Research Council, 2020-04668
Note

QC 20250728

Available from: 2025-07-04; Created: 2025-07-04; Last updated: 2025-07-28; Bibliographically approved
Häggbom, M., Karlsmark, M. & Andén, J. (2025). Mean-Field Microcanonical Gradient Descent. In: Proceedings of the 28th International Conference on Artificial Intelligence and Statistics, AISTATS 2025. Paper presented at 28th International Conference on Artificial Intelligence and Statistics, AISTATS 2025, Mai Khao, Thailand, May 3 2025 - May 5 2025 (pp. 5185-5193). ML Research Press, Vol. 258
2025 (English). In: Proceedings of the 28th International Conference on Artificial Intelligence and Statistics, AISTATS 2025, ML Research Press, 2025, Vol. 258, p. 5185-5193. Conference paper, Published paper (Refereed)
Abstract [en]

Microcanonical gradient descent is a sampling procedure for energy-based models allowing for efficient sampling of distributions in high dimension. It works by transporting samples from a high-entropy distribution, such as Gaussian white noise, to a low-energy region using gradient descent. We put this model in the framework of normalizing flows, showing how it can often overfit by losing an unnecessary amount of entropy in the descent. As a remedy, we propose a mean-field microcanonical gradient descent that samples several weakly coupled data points simultaneously, allowing for better control of the entropy loss while paying little in terms of likelihood fit. We study these models in the context of stationary time series and 2D textures.
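The descent step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the summary statistic (mean squared value), the target value, the step size, and the function names are all assumptions chosen to keep the example small. It only shows the general idea of starting from Gaussian white noise and running gradient descent toward a low-energy region.

```python
import random

def power(x):
    """Hypothetical summary statistic: mean squared value of the signal."""
    return sum(v * v for v in x) / len(x)

def energy(x, target):
    """Squared mismatch between the statistic of x and the target value."""
    return (power(x) - target) ** 2

def descend(x, target, step=0.1, n_steps=200):
    """Plain gradient descent on the energy, starting from the sample x."""
    x = list(x)
    n = len(x)
    for _ in range(n_steps):
        diff = power(x) - target
        # d/dx_i (power(x) - target)^2 = 2 * diff * (2 * x_i / n)
        x = [v - step * 4.0 * diff * v / n for v in x]
    return x

rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(64)]  # high-entropy starting point
target = 4.0                                      # assumed target statistic
sample = descend(noise, target)
```

The mean-field variant proposed in the paper couples several samples during the descent; that coupling is not shown here.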

Place, publisher, year, edition, pages
ML Research Press, 2025
National Category
Probability Theory and Statistics
Identifiers
urn:nbn:se:kth:diva-370313 (URN); 2-s2.0-105014328045 (Scopus ID)
Conference
28th International Conference on Artificial Intelligence and Statistics, AISTATS 2025, Mai Khao, Thailand, May 3 2025 - May 5 2025
Note

QC 20250925

Available from: 2025-09-25; Created: 2025-09-25; Last updated: 2025-09-25; Bibliographically approved
Zea, E., Laudato, M. & Andén, J. (2025). Sparse wavefield reconstruction and denoising with boostlets. In: Proceedings of the 15th International Conference on Sampling Theory and Applications (SampTA), Vienna, Austria, July 28-Aug 1, 2025. Paper presented at International Conference on Sampling Theory and Applications (SampTA) (pp. 1-5). New York, USA: IEEE, Article ID 77.
2025 (English). In: Proceedings of the 15th International Conference on Sampling Theory and Applications (SampTA), Vienna, Austria, July 28-Aug 1, 2025, New York, USA: IEEE, 2025, p. 1-5, article id 77. Conference paper, Published paper (Refereed)
Abstract [en]

Boostlets are spatiotemporal functions that decompose nondispersive wavefields into a collection of localized waveforms parametrized by dilations, hyperbolic rotations, and translations. We study the sparsity properties of boostlets and find that the resulting decompositions are significantly sparser than those of other state-of-the-art representation systems, such as wavelets and shearlets. This translates into improved denoising performance when hard-thresholding the boostlet coefficients. The results suggest that boostlets offer a natural framework for sparsely decomposing wavefields in unified space–time.
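Hard-thresholding, as mentioned in this abstract, keeps only the transform coefficients whose magnitude exceeds a threshold. The sketch below shows just that step on a plain list of numbers; the coefficient values and the threshold are made up for illustration, and the forward and inverse boostlet transforms that would surround this step are not shown.

```python
def hard_threshold(coeffs, tau):
    """Keep coefficients whose magnitude exceeds tau; zero out the rest."""
    return [c if abs(c) > tau else 0.0 for c in coeffs]

# Illustrative coefficients: two large entries plus small noise-like ones.
coeffs = [5.0, -0.2, 0.1, -3.5, 0.05, 0.3]
denoised = hard_threshold(coeffs, tau=1.0)
```

The sparser the representation of the clean signal, the fewer coefficients survive the threshold, which is why sparsity and denoising performance are linked.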

Place, publisher, year, edition, pages
New York, USA: IEEE, 2025
Series
2025 International Conference on Sampling Theory and Applications (SampTA), ISSN 2831-5480, E-ISSN 2694-0108
Keywords
wavefields, sparse reconstruction, denoising, multi-scale representations, boostlets
National Category
Fluid Mechanics; Signal Processing; Computational Mathematics
Research subject
Engineering Mechanics; Applied and Computational Mathematics
Identifiers
urn:nbn:se:kth:diva-369265 (URN); 10.1109/SampTA64769.2025.11133531 (DOI); 979-8-3315-0251-5 (ISBN); 979-8-3315-0250-8 (ISBN)
Conference
International Conference on Sampling Theory and Applications (SampTA)
Funder
Swedish Research Council, 2020-04668
Note

QC 20250904

Available from: 2025-09-02; Created: 2025-09-02; Last updated: 2025-09-04; Bibliographically approved
Zea, E., Brandão, E., Nolan, M., Cuenca, J., Andén, J. & Svensson, U. P. (2023). Sound absorption estimation of finite porous samples with deep residual learning. Journal of the Acoustical Society of America, 154(4), 2321-2332
2023 (English). In: Journal of the Acoustical Society of America, ISSN 0001-4966, E-ISSN 1520-8524, Vol. 154, no 4, p. 2321-2332. Article in journal (Refereed), Published
Abstract [en]

This work proposes a method to predict the sound absorption coefficient of finite porous absorbers using a residual neural network and a single-layer microphone array. The goal is to mitigate the discrepancies between predicted and measured data due to the finite-size effect for a wide range of rectangular absorbers with varying dimensions and flow resistivity and for various source-receiver locations. Data for training, validation, and testing are generated with a boundary element model consisting of a baffled porous layer on a rigid backing using the Delany–Bazley–Miki model. In effect, the network learns relevant features from the array pressure amplitude to predict the sound absorption as if the porous material were infinite. The method’s performance is quantified with the error between the predicted and theoretical sound absorption coefficients and compared with the two-microphone method. For array distances close to the porous sample, the proposed method performs at least as well as the two-microphone method and significantly better than it for frequencies below 400 Hz and small absorber sizes (e.g., 20 × 20 cm²). The significance of the study lies in the possibility of measuring sound absorption on-site in the presence of strong edge diffraction.

Place, publisher, year, edition, pages
Acoustical Society of America (ASA), 2023
National Category
Fluid Mechanics; Probability Theory and Statistics
Research subject
Vehicle and Maritime Engineering; Applied and Computational Mathematics
Identifiers
urn:nbn:se:kth:diva-338244 (URN); 10.1121/10.0021333 (DOI); 001085116800003 (); 37843379 (PubMedID); 2-s2.0-85174925611 (Scopus ID)
Funder
Swedish Research Council, 2020-04668
Note

QC 20231017

Available from: 2023-10-17; Created: 2023-10-17; Last updated: 2025-02-09; Bibliographically approved
Warrick, P. A., Lostanlen, V., Eickenberg, M., Homsi, M. N., Campoy Rodriguez, A. & Andén, J. (2022). Arrhythmia classification of 12-lead and reduced-lead electrocardiograms via recurrent networks, scattering, and phase harmonic correlation. Physiological Measurement, 43(9), Article ID 094002.
2022 (English). In: Physiological Measurement, ISSN 0967-3334, E-ISSN 1361-6579, Vol. 43, no 9, article id 094002. Article in journal (Refereed), Published
Abstract [en]

We describe an automatic classifier of arrhythmias based on 12-lead and reduced-lead electrocardiograms. Our classifier comprises four modules: scattering transform (ST), phase harmonic correlation (PHC), depthwise separable convolutions (DSC), and a long short-term memory (LSTM) network. It is trained on PhysioNet/Computing in Cardiology Challenge 2021 data. The ST captures short-term temporal ECG modulations while the PHC characterizes the phase dependence of coherent ECG components. Both reduce the sampling rate to a few samples per typical heartbeat. We pass the output of the ST and PHC to a depthwise-separable convolution layer (DSC) which combines lead responses separately for each ST or PHC coefficient and then combines resulting values across all coefficients. At a deeper level, two LSTM layers integrate local variations of the input over long time scales. We train in an end-to-end fashion as a multilabel classification problem with a normal and 25 arrhythmia classes. Lastly, we use canonical correlation analysis (CCA) for transfer learning from 12-lead ST and PHC representations to reduced-lead ones. After local cross-validation on the public data from the challenge, our team 'BitScattered' achieved the following results: 0.682 ± 0.0095 for 12-lead; 0.666 ± 0.0257 for 6-lead; 0.674 ± 0.0185 for 4-lead; 0.661 ± 0.0098 for 3-lead; and 0.662 ± 0.0151 for 2-lead.

Place, publisher, year, edition, pages
IOP Publishing, 2022
Keywords
electrocardiography, scattering transform, phase harmonic correlation, canonical correlation analysis, convolutional neural networks, long short-term memory networks
National Category
Cardiology and Cardiovascular Disease
Identifiers
urn:nbn:se:kth:diva-319098 (URN); 10.1088/1361-6579/ac77d1 (DOI); 000852329400001 (); 35688143 (PubMedID); 2-s2.0-85138128248 (Scopus ID)
Note

QC 20220926

Available from: 2022-09-26; Created: 2022-09-26; Last updated: 2025-02-10; Bibliographically approved
Langfield, C., Carmichael, J., Wright, G., Andén, J. & Singer, A. (2022). Representing Steerable Bases for cryo-EM in ASPIRE. In: 2022 IEEE 18TH INTERNATIONAL CONFERENCE ON E-SCIENCE (ESCIENCE 2022). Paper presented at IEEE 18th International Conference on E-Science (E-Science), OCT 10-14, 2022, Salt Lake City, UT (pp. 417-418). Institute of Electrical and Electronics Engineers (IEEE)
2022 (English). In: 2022 IEEE 18TH INTERNATIONAL CONFERENCE ON E-SCIENCE (ESCIENCE 2022), Institute of Electrical and Electronics Engineers (IEEE), 2022, p. 417-418. Conference paper, Published paper (Refereed)
Abstract [en]

An introduction to the mathematical problem of cryo-electron microscopy (cryo-EM) is given, along with an overview of ASPIRE, an open-source Python package for processing cryo-EM image data. ASPIRE uses unique Fourier basis representations for images of cryo-EM particles. The challenge of representing these mathematical structures within the ASPIRE codebase was addressed by building an extensible class hierarchy using mixins.
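As a rough illustration of what a mixin-based class hierarchy for steerable bases might look like, consider the sketch below. None of the class or method names come from ASPIRE; they are hypothetical, and the sign convention of the rotation phase is an assumption. The point is only that a mixin can add behaviour (here, rotation as a per-coefficient phase shift) to any class that stores angular indices.

```python
import cmath

class Basis:
    """Hypothetical base class: stores one angular index per coefficient."""
    def __init__(self, angular_indices):
        self.angular_indices = list(angular_indices)

class SteerableMixin:
    """Hypothetical mixin: rotation acts as a phase shift on each coefficient."""
    def rotate(self, coeffs, theta):
        # Assumed convention: coefficient with angular index k picks up exp(-i k theta).
        return [c * cmath.exp(-1j * k * theta)
                for c, k in zip(coeffs, self.angular_indices)]

class ToySteerableBasis(SteerableMixin, Basis):
    """Combines storage from Basis with behaviour from SteerableMixin."""

basis = ToySteerableBasis([0, 1, 2])
rotated = basis.rotate([1.0, 1.0, 1.0], cmath.pi)
```

Listing the mixin before the base class follows the usual Python convention, so that the mixin's methods take precedence in the method resolution order.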

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2022
Series
Proceeding IEEE International Conference on e-Science (e-Science), ISSN 2325-372X
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-324633 (URN); 10.1109/eScience55777.2022.00066 (DOI); 000927625900052 (); 2-s2.0-85145433832 (Scopus ID)
Conference
IEEE 18th International Conference on E-Science (E-Science), OCT 10-14, 2022, Salt Lake City, UT
Note

Part of proceedings: ISBN 978-1-6654-6124-5

QC 20230309

Available from: 2023-03-09; Created: 2023-03-09; Last updated: 2023-03-09; Bibliographically approved
Warrick, P. A., Lostanlen, V., Eickenberg, M., Homsi, M. N., Rodriguez, A. C. & Andén, J. (2021). Arrhythmia Classification of Reduced-Lead Electrocardiograms by Scattering-Recurrent Networks. In: 2021 COMPUTING IN CARDIOLOGY (CINC). Paper presented at Conference on Computing in Cardiology (CinC), 12-15 September, 2021, Brno, Czech Republic. Institute of Electrical and Electronics Engineers (IEEE)
2021 (English). In: 2021 COMPUTING IN CARDIOLOGY (CINC), Institute of Electrical and Electronics Engineers (IEEE), 2021. Conference paper, Published paper (Refereed)
Abstract [en]

We describe an automatic classifier of arrhythmias based on 12-lead and reduced-lead electrocardiograms. Our classifier composes the scattering transform (ST) and a long short-term memory (LSTM) network. It is trained on PhysioNet/Computing in Cardiology Challenge 2021 data. The ST captures short-term temporal ECG modulations while reducing its sampling rate to a few samples per typical heartbeat. We pass the output of the ST to a depthwise-separable convolution layer which combines lead responses separately for each ST coefficient and then combines resulting values across ST coefficients. At a deeper level, 2 LSTM layers integrate local variations of the input over long time scales. We train in an end-to-end fashion as a multilabel classification problem with a normal and 25 arrhythmia classes. We used canonical correlation analysis (CCA) for transfer learning from 12-lead ST representations to reduced-lead ones. For 12-, 6-, 4-, 3- and 2-leads, team "BitScattered" Challenge metrics on the hidden validation set were 0.46, 0.44, 0.45, 0.46 and 0.43; and on the hidden test set were 0.10, 0.11, 0.10, 0.10 and 0.10, respectively, ranking 34th on the hidden test set.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2021
Series
Computing in Cardiology Conference, ISSN 2325-8861
Keywords
Cardiology, Classification (of information), Diseases, Long short-term memory, Arrhythmia classification, Arrythmias, Automatic classifiers, Memory network, PhysioNet, Recurrent networks, Sampling rates, Scattering transforms, Test sets, Transform coefficients, Electrocardiography
National Category
Cardiology and Cardiovascular Disease; Bioinformatics and Computational Biology
Identifiers
urn:nbn:se:kth:diva-315829 (URN); 10.23919/CinC53138.2021.9662908 (DOI); 000821955000197 (); 2-s2.0-85124762505 (Scopus ID)
Conference
Conference on Computing in Cardiology (CinC), 12-15 September, 2021, Brno, Czech Republic
Note

Part of proceedings: ISBN 978-1-6654-7916-5

QC 20220721

Available from: 2022-07-21; Created: 2022-08-16; Last updated: 2025-02-10; Bibliographically approved
Shih, Y.-h., Wright, G., Andén, J., Blaschke, J. & Barnett, A. H. (2021). cuFINUFFT: a load-balanced GPU library for general-purpose nonuniform FFTs. In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW). Paper presented at 35th IEEE International Parallel and Distributed Processing Symposium (IPDPS), JUN 17-21, 2021, Portland, OR (pp. 688-697). Institute of Electrical and Electronics Engineers (IEEE)
2021 (English). In: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), Institute of Electrical and Electronics Engineers (IEEE), 2021, p. 688-697. Conference paper, Published paper (Refereed)
Abstract [en]

Nonuniform fast Fourier transforms dominate the computational cost in many applications including image reconstruction and signal processing. We thus present a general-purpose GPU-based CUDA library for type 1 (nonuniform to uniform) and type 2 (uniform to nonuniform) transforms in dimensions 2 and 3, in single or double precision. It achieves high performance for a given user-requested accuracy, regardless of the distribution of nonuniform points, via cache-aware point reordering, and load-balanced blocked spreading in shared memory. At low accuracies, this gives on-GPU throughputs around 10^9 nonuniform points per second, and (even including host-device transfer) is typically 4-10x faster than the latest parallel CPU code FINUFFT (at 28 threads). It is competitive with two established GPU codes, being up to 90x faster at high accuracy and/or type 1 clustered point distributions. Finally we demonstrate a 5-12x speedup versus CPU in an X-ray diffraction 3D iterative reconstruction task at 10^-12 accuracy, observing excellent multi-GPU weak scaling up to one rank per GPU.
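For orientation, a type 1 (nonuniform to uniform) transform can be written as a direct sum, which is what fast libraries approximate to a requested accuracy. The sketch below is a naive one-dimensional reference evaluation, not the library's algorithm or API: the function name, the mode ordering, and the default sign of the exponent are assumptions, and the library described above targets dimensions 2 and 3.

```python
import cmath

def nudft_type1(points, strengths, n_modes, sign=1):
    """Direct O(N*M) evaluation of f_k = sum_j c_j * exp(sign * i * k * x_j)
    for integer modes k = -(n_modes // 2), ..., (n_modes - 1) // 2."""
    k_min = -(n_modes // 2)
    return [
        sum(c * cmath.exp(sign * 1j * k * x) for x, c in zip(points, strengths))
        for k in range(k_min, k_min + n_modes)
    ]

# Two nonuniform points with strengths 1.0 and 0.5, mapped to 4 uniform modes.
modes = nudft_type1([0.0, 1.0], [1.0, 0.5], n_modes=4)
```

The direct sum costs one complex exponential per (point, mode) pair, which is why fast approximate algorithms matter when the number of points is large.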

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2021
Keywords
Nonuniform FFT, GPU, load balancing
National Category
Signal Processing
Identifiers
urn:nbn:se:kth:diva-302675 (URN); 10.1109/IPDPSW52791.2021.00105 (DOI); 000689576200084 (); 2-s2.0-85114440464 (Scopus ID)
Conference
35th IEEE International Parallel and Distributed Processing Symposium (IPDPS), JUN 17-21, 2021, Portland, OR
Note

QC 20210930

Available from: 2021-09-30; Created: 2021-09-30; Last updated: 2022-06-25; Bibliographically approved
Zea, E., Brandão, E., Nolan, M., Andén, J., Cuenca, J. & Svensson, U. P. (2021). Learning the finite size effect for in-situ absorption measurement. Paper presented at Euronoise 2021 (e-Congress), Madeira, Portugal, October 25-27, 2021 (pp. 1477-1486).
2021 (English). Conference paper, Published paper (Other academic)
Abstract [en]

In this paper we propose the use of neural networks to predict the sound absorption coefficient spectra of finite porous samples with microphone arrays. The main goal is to train a model that can effectively mitigate the errors caused by the finite size effect. A convolutional neural network architecture is used to map the array data to the absorption coefficient at five frequencies. The training, validation and test data are numerically produced with a boundary element method, modelling a baffled, locally reacting porous absorber on a rigid backing with a Delany–Bazley–Miki model, for varying sample size, thickness, flow resistivity, incidence angle and frequency. The strength of using machine learning in this context is that no hypotheses are made about the sound field or the absorber, as the networks learn the necessary relationships from the data. We show that the network approximates the absorption coefficient well, as if the sample were infinite, in a wide range of cases.

Keywords
Sound absorption, in-situ measurement, convolutional neural networks, finite size effect, Delany–Bazley–Miki model
National Category
Fluid Mechanics; Computer Sciences; Probability Theory and Statistics
Research subject
Vehicle and Maritime Engineering; Applied and Computational Mathematics; Computer Science
Identifiers
urn:nbn:se:kth:diva-304029 (URN)
Conference
Euronoise 2021 (e-Congress), Madeira, Portugal, October 25-27, 2021
Funder
Swedish Research Council, 2020-04668
Note

QC 20211103

Available from: 2021-10-26; Created: 2021-10-26; Last updated: 2025-02-09; Bibliographically approved
Lostanlen, V., El-Hajj, C., Rossignol, M., Lafay, G., Andén, J. & Lagrange, M. (2021). Time-frequency scattering accurately models auditory similarities between instrumental playing techniques. EURASIP Journal on Audio, Speech, and Music Processing, 2021(1), Article ID 3.
2021 (English). In: EURASIP Journal on Audio, Speech, and Music Processing, ISSN 1687-4714, E-ISSN 1687-4722, Vol. 2021, no 1, article id 3. Article in journal (Refereed), Published
Abstract [en]

Instrumental playing techniques such as vibratos, glissandos, and trills often denote musical expressivity, both in classical and folk contexts. However, most existing approaches to music similarity retrieval fail to describe timbre beyond the so-called "ordinary" technique, use instrument identity as a proxy for timbre quality, and do not allow for customization to the perceptual idiosyncrasies of a new subject. In this article, we ask 31 human participants to organize 78 isolated notes into a set of timbre clusters. Analyzing their responses suggests that timbre perception operates within a more flexible taxonomy than those provided by instruments or playing techniques alone. In addition, we propose a machine listening model to recover the cluster graph of auditory similarities across instruments, mutes, and techniques. Our model relies on joint time-frequency scattering features to extract spectrotemporal modulations as acoustic features. Furthermore, it minimizes triplet loss in the cluster graph by means of the large-margin nearest neighbor (LMNN) metric learning algorithm. Over a dataset of 9346 isolated notes, we report a state-of-the-art average precision at rank five (AP@5) of 99.0% ± 1. An ablation study demonstrates that removing either the joint time-frequency scattering transform or the metric learning algorithm noticeably degrades performance.

Place, publisher, year, edition, pages
Springer Nature, 2021
Keywords
Audio databases, Audio similarity, Continuous wavelet transform, Demodulation, Distance learning, Human-computer interaction, Music information retrieval
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:kth:diva-289494 (URN); 10.1186/s13636-020-00187-z (DOI); 000607607700001 (); 33488686 (PubMedID); 2-s2.0-85099090941 (Scopus ID)
Note

QC 20210203

Available from: 2021-02-03; Created: 2021-02-03; Last updated: 2022-06-25; Bibliographically approved
Identifiers
ORCID iD: orcid.org/0000-0002-3377-813x
