Publications (10 of 20)
Nilsson, V., Samaddar, A., Madireddy, S. & Nyquist, P. (2024). REMEDI: Corrective Transformations for Improved Neural Entropy Estimation. In: International Conference on Machine Learning, ICML 2024. Paper presented at 41st International Conference on Machine Learning, ICML 2024, Vienna, Austria, Jul 21 2024 - Jul 27 2024 (pp. 38207-38236). ML Research Press
2024 (English). In: International Conference on Machine Learning, ICML 2024, ML Research Press, 2024, p. 38207-38236. Conference paper, Published paper (Refereed)
Abstract [en]

Information theoretic quantities play a central role in machine learning. The recent surge in the complexity of data and models has increased the demand for accurate estimation of these quantities. However, as the dimension grows the estimation presents significant challenges, with existing methods struggling already in relatively low dimensions. To address this issue, in this work, we introduce REMEDI for efficient and accurate estimation of differential entropy, a fundamental information theoretic quantity. The approach combines the minimization of the cross-entropy for simple, adaptive base models and the estimation of their deviation, in terms of the relative entropy, from the data density. Our approach demonstrates improvement across a broad spectrum of estimation tasks, encompassing entropy estimation on both synthetic and natural data. Further, we extend important theoretical consistency results to a more generalized setting required by our approach. We illustrate how the framework can be naturally extended to information theoretic supervised learning models, with a specific focus on the Information Bottleneck approach. It is demonstrated that the method delivers better accuracy compared to the existing methods in Information Bottleneck. In addition, we explore a natural connection between REMEDI and generative modeling using rejection sampling and Langevin dynamics.

Place, publisher, year, edition, pages
ML Research Press, 2024
National Category
Probability Theory and Statistics
Identifiers
urn:nbn:se:kth:diva-353945 (URN); 2-s2.0-85203821749 (Scopus ID)
Conference
41st International Conference on Machine Learning, ICML 2024, Vienna, Austria, Jul 21 2024 - Jul 27 2024
Note

QC 20240926

Available from: 2024-09-25. Created: 2024-09-25. Last updated: 2024-09-26. Bibliographically approved
Peletier, M., Gavish, N. & Nyquist, P. (2023). Large Deviations and Gradient Flows for the Brownian One-Dimensional Hard-Rod System. Potential Analysis, 58(1), 71-121
2023 (English). In: Potential Analysis, ISSN 0926-2601, E-ISSN 1572-929X, Vol. 58, no 1, p. 71-121. Article in journal (Refereed) Published
Abstract [en]

We study a system of hard rods of finite size in one space dimension, which move by Brownian noise while avoiding overlap. We consider a scaling in which the number of particles tends to infinity while the volume fraction of the rods remains constant; in this limit the empirical measure of the rod positions converges almost surely to a deterministic limit evolution. We prove a large-deviation principle on path space for the empirical measure, by exploiting a one-to-one mapping between the hard-rod system and a system of non-interacting particles on a contracted domain. The large-deviation principle naturally identifies a gradient-flow structure for the limit evolution, with clear interpretations for both the driving functional (an ‘entropy’) and the dissipation, which in this case is the Wasserstein dissipation. This study is inspired by recent developments in the continuum modelling of multiple-species interacting particle systems with finite-size effects; for such systems many different modelling choices appear in the literature, raising the question how one can understand such choices in terms of more microscopic models. The results of this paper give a clear answer to this question, albeit for the simpler one-dimensional hard-rod system. For this specific system this result provides a clear understanding of the value and interpretation of different modelling choices, while giving hints for more general systems.

Place, publisher, year, edition, pages
Springer Nature, 2023
Keywords
Brownian motion, Continuum limit, Hard-rod, Hard-sphere, Large deviations, Steric interaction, Volume exclusion
National Category
Probability Theory and Statistics
Identifiers
urn:nbn:se:kth:diva-310625 (URN); 10.1007/s11118-021-09933-0 (DOI); 000672266000001 (); 2-s2.0-85110699583 (Scopus ID)
Note

QC 20250327

Available from: 2022-04-06. Created: 2022-04-06. Last updated: 2025-03-27. Bibliographically approved
Djehiche, B., Hult, H. & Nyquist, P. (2022). Importance Sampling for a Simple Markovian Intensity Model Using Subsolutions. ACM Transactions on Modeling and Computer Simulation, 32(2), 1-25, Article ID 14.
2022 (English). In: ACM Transactions on Modeling and Computer Simulation, ISSN 1049-3301, E-ISSN 1558-1195, Vol. 32, no 2, p. 1-25, article id 14. Article in journal (Refereed) Published
Abstract [en]

This article considers importance sampling for estimation of rare-event probabilities in a specific collection of Markovian jump processes used for, e.g., modeling of credit risk. Previous attempts at designing importance sampling algorithms have resulted in poor performance and the main contribution of the article is the design of efficient importance sampling algorithms using subsolutions. The dynamics of the jump processes cause the corresponding Hamilton-Jacobi equations to have an intricate state-dependence, which makes the design of efficient algorithms difficult. We provide theoretical results that quantify the performance of importance sampling algorithms in general and construct asymptotically optimal algorithms for some examples. The computational gain compared to standard Monte Carlo is illustrated by numerical examples.

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2022
Keywords
Large deviations, Monte Carlo, importance sampling, Markovian intensity models, credit risk
National Category
Probability Theory and Statistics
Identifiers
urn:nbn:se:kth:diva-310757 (URN); 10.1145/3502432 (DOI); 000772649100007 (); 2-s2.0-85127447384 (Scopus ID)
Note

QC 20220408

Available from: 2022-04-08. Created: 2022-04-08. Last updated: 2022-06-25. Bibliographically approved
Budhiraja, A., Dupuis, P., Nyquist, P. & Wu, G.-J. (2022). Quasistationary distributions and ergodic control problems. Stochastic Processes and their Applications, 145, 143-164
2022 (English). In: Stochastic Processes and their Applications, ISSN 0304-4149, E-ISSN 1879-209X, Vol. 145, p. 143-164. Article in journal (Refereed) Published
Abstract [en]

We introduce and study the basic properties of two ergodic stochastic control problems associated with the quasistationary distribution (QSD) of a diffusion process X relative to a bounded domain. The two problems are in some sense dual, with one defined in terms of the generator associated with X and the other in terms of its adjoint. Besides proving wellposedness of the associated Hamilton-Jacobi-Bellman equations, we describe how they can be used to characterize important properties of the QSD. Of particular note is that the QSD itself can be identified, up to normalization, in terms of the cost potential of the control problem associated with the adjoint.

Place, publisher, year, edition, pages
Elsevier BV, 2022
Keywords
Quasistationary distribution, Diffusion process, Ergodic control, Hamilton-Jacobi-Bellman equation, Q-processes, Dirichlet eigenvalue problems
National Category
Mathematical Analysis
Identifiers
urn:nbn:se:kth:diva-312690 (URN); 10.1016/j.spa.2021.12.004 (DOI); 000789706700006 (); 2-s2.0-85122007111 (Scopus ID)
Note

QC 20220524

Available from: 2022-05-24. Created: 2022-05-24. Last updated: 2022-06-25. Bibliographically approved
Bierkens, J., Nyquist, P. & Schlottke, M. (2021). Large deviations for the empirical measure of the zig-zag process. The Annals of Applied Probability, 31(6)
2021 (English). In: The Annals of Applied Probability, ISSN 1050-5164, E-ISSN 2168-8737, Vol. 31, no 6. Article in journal (Refereed) Published
Abstract [en]

The zig-zag process is a piecewise deterministic Markov process in position and velocity space. The process can be designed to have an arbitrary Gibbs type marginal probability density for its position coordinate, which makes it suitable for Monte Carlo simulation of continuous probability distributions. An important question in assessing the efficiency of this method is how fast the empirical measure converges to the stationary distribution of the process. In this paper we provide a partial answer to this question by characterizing the large deviations of the empirical measure from the stationary distribution. Based on the Feng-Kurtz approach, we develop an abstract framework aimed at encompassing piecewise deterministic Markov processes in position-velocity space. We derive explicit conditions for the zig-zag process to allow the Donsker-Varadhan variational formulation of the rate function, both for a compact setting (the torus) and one-dimensional Euclidean space. Finally we derive an explicit expression for the Donsker-Varadhan functional for the case of a compact state space and use this form of the rate function to address a key question concerning the optimal choice of the switching rate of the zig-zag process.

Place, publisher, year, edition, pages
Institute of Mathematical Statistics, 2021
Keywords
Large deviations, empirical measure, piecewise deterministic Markov process, zig-zag process
National Category
Probability Theory and Statistics
Research subject
Mathematics
Identifiers
urn:nbn:se:kth:diva-288530 (URN); 10.1214/21-AAP1663 (DOI); 000730108300008 (); 2-s2.0-85121326052 (Scopus ID)
Note

QC 20211223

Available from: 2021-01-08. Created: 2021-01-08. Last updated: 2022-06-25. Bibliographically approved
Ringqvist, C., Nyquist, P. & Hult, H. (2020). Infinite Swapping Algorithm for Training Restricted Boltzmann Machines. In: Monte Carlo and Quasi-Monte Carlo Methods. Paper presented at MCQMC: International Conference on Monte Carlo and Quasi-Monte Carlo Methods in Scientific Computing, Rennes, France, July 1–6 (pp. 285-307). Springer Nature
2020 (English). In: Monte Carlo and Quasi-Monte Carlo Methods, Springer Nature, 2020, p. 285-307. Conference paper, Published paper (Refereed)
Abstract [en]

Given the important role latent variable models play, for example in statistical learning, there is currently a growing need for efficient Monte Carlo methods for conducting inference on the latent variables given data. Recently, Desjardins et al. (JMLR Workshop and Conference Proceedings: AISTATS 2010, pp. 145–152, 2010 [3]) explored the use of the parallel tempering algorithm for training restricted Boltzmann machines, showing considerable improvement over the previous state-of-the-art. In this paper we continue their efforts by comparing previous methods, including parallel tempering, with the infinite swapping algorithm, an MCMC method first conceived when attempting to optimise performance of parallel tempering (Dupuis et al. in J. Chem. Phys. 137, 2012 [7]), for the training task. We implement a Gibbs-sampling version of infinite swapping and evaluate its performance on a number of test cases, concluding that the algorithm enjoys better mixing properties than both persistent contrastive divergence and parallel tempering for complex energy landscapes associated with restricted Boltzmann machines.

Place, publisher, year, edition, pages
Springer Nature, 2020
Series
Springer Proceedings in Mathematics & Statistics ; 324
National Category
Probability Theory and Statistics
Identifiers
urn:nbn:se:kth:diva-295215 (URN); 10.1007/978-3-030-43465-6_14 (DOI); 000871735000014 (); 2-s2.0-85089429593 (Scopus ID)
Conference
MCQMC: International Conference on Monte Carlo and Quasi-Monte Carlo Methods in Scientific Computing, Rennes, France, July 1–6
Note

Part of book: ISBN 978-3-030-43465-6

QC 20210519

Available from: 2021-05-18. Created: 2021-05-18. Last updated: 2023-09-21. Bibliographically approved
Doll, J., Dupuis, P. & Nyquist, P. (2018). A large deviation analysis of certain qualitative properties of parallel tempering and infinite swapping algorithms. Applied mathematics and optimization, 78(1), 103-144
2018 (English). In: Applied mathematics and optimization, ISSN 0095-4616, E-ISSN 1432-0606, Vol. 78, no 1, p. 103-144. Article in journal (Refereed) Published
Abstract [en]

Parallel tempering, or replica exchange, is a popular method for simulating complex systems. The idea is to run parallel simulations at different temperatures, and at a given swap rate exchange configurations between the parallel simulations. From the perspective of large deviations it is optimal to let the swap rate tend to infinity and it is possible to construct a corresponding simulation scheme, known as infinite swapping. In this paper we propose a novel use of large deviations for empirical measures for a more detailed analysis of the infinite swapping limit in the setting of continuous time jump Markov processes. Using the large deviations rate function and associated stochastic control problems we consider a diagnostic based on temperature assignments, which can be easily computed during a simulation. We show that the convergence of this diagnostic to its a priori known limit is a necessary condition for the convergence of infinite swapping. The rate function is also used to investigate the impact of asymmetries in the underlying potential landscape, and where in the state space poor sampling is most likely to occur.

Place, publisher, year, edition, pages
Springer, 2018
Keywords
Large deviations, MCMC, parallel tempering, infinite swapping, ergodic control
National Category
Probability Theory and Statistics
Identifiers
urn:nbn:se:kth:diva-198612 (URN); 10.1007/s00245-017-9401-9 (DOI); 000438412600004 (); 2-s2.0-85011844953 (Scopus ID)
Note

QC 20181128

Available from: 2016-12-19. Created: 2016-12-19. Last updated: 2024-03-15. Bibliographically approved
Nyquist, P. (2017). Moderate deviation principles for importance sampling estimators of risk measures. Journal of Applied Probability, 54(2), 490-506
2017 (English). In: Journal of Applied Probability, ISSN 0021-9002, E-ISSN 1475-6072, Vol. 54, no 2, p. 490-506. Article in journal (Refereed) Published
Abstract [en]

Importance sampling has become an important tool for the computation of extreme quantiles and tail-based risk measures. For estimation of such nonlinear functionals of the underlying distribution, the standard efficiency analysis is not necessarily applicable. In this paper we therefore study importance sampling algorithms by considering moderate deviations of the associated weighted empirical processes. Using a delta method for large deviations, combined with classical large deviation techniques, the moderate deviation principle is obtained for importance sampling estimators of two of the most common risk measures: value at risk and expected shortfall.

Place, publisher, year, edition, pages
Cambridge University Press, 2017
Keywords
Large deviation, moderate deviation, risk measure, empirical process, asymptotics, importance sampling, Monte Carlo
National Category
Probability Theory and Statistics
Identifiers
urn:nbn:se:kth:diva-211029 (URN); 10.1017/jpr.2017.13 (DOI); 000404012100010 (); 2-s2.0-85021080099 (Scopus ID)
Note

Not duplicate with DiVA 603119

QC 20230202

Available from: 2017-07-12. Created: 2017-07-12. Last updated: 2023-02-02. Bibliographically approved
Nyquist, P. (2017). Moderate deviation principles for importance sampling estimators of risk measures. Journal of Applied Probability
2017 (English). In: Journal of Applied Probability, ISSN 0021-9002, E-ISSN 1475-6072. Article in journal (Refereed) Accepted
Abstract [en]

Importance sampling has become an important tool for the computation of tail-based risk measures. Since such quantities are often determined mainly by rare events, standard Monte Carlo can be inefficient and importance sampling provides a way to speed up computations. This paper considers moderate deviations for the weighted empirical process, the process analogue of the weighted empirical measure, arising in importance sampling. The moderate deviation principle is established as an extension of existing results. Using a delta method for large deviations established by Gao and Zhao (Ann. Statist., 2011) together with classical large deviation techniques, the moderate deviation principle for the weighted empirical process is extended to functionals of the weighted empirical process which correspond to risk measures. The main results are moderate deviation principles for importance sampling estimators of the quantile function of a distribution and Expected Shortfall.

Keywords
Large deviations, moderate deviations, empirical processes, importance sampling, risk measures
National Category
Probability Theory and Statistics
Identifiers
urn:nbn:se:kth:diva-117808 (URN)
Note

QCR 20161219

Available from: 2013-02-05. Created: 2013-02-05. Last updated: 2024-03-15. Bibliographically approved
Doll, J., Dupuis, P. & Nyquist, P. (2017). Thermodynamic integration methods, infinite swapping and the calculation of generalized averages. Journal of Chemical Physics, 146
2017 (English). In: Journal of Chemical Physics, ISSN 0021-9606, E-ISSN 1089-7690, Vol. 146. Article in journal (Refereed) Published
Abstract [en]

In the present paper we examine the risk-sensitive and sampling issues associated with the problem of calculating generalized averages. By combining thermodynamic integration and Stationary Phase Monte Carlo techniques, we develop an approach for such problems and explore its utility for a prototypical class of applications.

Keywords
Statistical mechanics, Monte Carlo, infinite swapping, large deviations, stationary phase Monte Carlo
National Category
Probability Theory and Statistics; Physical Chemistry; Other Physics Topics
Research subject
Mathematics
Identifiers
urn:nbn:se:kth:diva-198614 (URN); 10.1063/1.4979493 (DOI); 000399073300014 (); 28390386 (PubMedID); 2-s2.0-85017098908 (Scopus ID)
Note

QC 20170125

Available from: 2016-12-19. Created: 2016-12-19. Last updated: 2024-03-15. Bibliographically approved
Identifiers
ORCID iD: orcid.org/0000-0001-8702-2293