A deterministic policy gradient method for order execution and option hedging in the presence of market impact
KTH, School of Engineering Sciences (SCI), Mathematics (Dept.), Probability, Mathematical Physics and Statistics; SEB Group, Stockholm, Sweden. ORCID iD: 0000-0002-0067-4908
KTH, School of Engineering Sciences (SCI), Mathematics (Dept.). ORCID iD: 0000-0001-9210-121X
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Decision and Control Systems (Automatic Control). ORCID iD: 0000-0002-4679-4673
SEB Group, Stockholm, Sweden.
2024 (English). In: Journal of Financial Data Science, ISSN 2640-3943, Vol. 6, no 3, p. 81-114. Article in journal (Refereed). Published.
Abstract [en]

In this article, an iterative deterministic policy gradient method for finding optimal strategies in the presence of market impact is introduced. The derivation of the policy gradient sheds light on a proper way of handling the market impact of trades in the context of reinforcement learning. Similar to many machine learning methods, the proposed deterministic policy gradient method is based on mini-batch stochastic gradient descent optimization. The method is demonstrated to consistently perform well for several different objectives and market dynamics when applied to the financial applications of order execution and option hedging.

Place, publisher, year, edition, pages
With Intelligence LLC, 2024. Vol. 6, no 3, p. 81-114
National Category
Computational Mathematics
Identifiers
URN: urn:nbn:se:kth:diva-353470
DOI: 10.3905/jfds.2024.1.164
Scopus ID: 2-s2.0-85202532970
OAI: oai:DiVA.org:kth-353470
DiVA, id: diva2:1899145
Note

QC 20240924

Available from: 2024-09-19. Created: 2024-09-19. Last updated: 2024-11-20. Bibliographically approved.
In thesis
1. Topics on Machine Learning for Algorithmic Trading
2024 (English). Doctoral thesis, comprehensive summary (Other academic).
Abstract [en]

Recent advancements in machine learning have opened up new possibilities for algorithmic trading, enabling the optimization of trading strategies in complex market environments. This thesis aims to improve algorithmic trading methods by developing machine learning models for the realistic simulation of limit order books and the learning of optimal strategies. Consisting of three papers, the thesis combines theoretical insights with practical applications.

The first paper presents a generative model for the dynamic evolution of a limit order book, using recurrent neural networks. The model captures the complete dynamics of the limit order book by decomposing the probability of each transition of the limit order book into a product of conditional probabilities for order type, price level, order size, and time delay. Each of these conditional probabilities is modeled by a recurrent neural network. Additionally, the paper introduces several evaluation metrics for generative models related to order execution. The generative model is trained on both synthetic data generated by a Markov model and real data from the Nasdaq Stockholm exchange.
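The decomposition described in this paragraph can be written with the chain rule of probability. The following is only a sketch: the symbols (e_t for the t-th transition, y_t for order type, \ell_t for price level, q_t for order size, \tau_t for time delay, h_{t-1} for the history of the book) and the order of conditioning are assumptions chosen for illustration, not notation taken from the paper.

```latex
\begin{aligned}
p(e_t \mid h_{t-1})
  ={}& p(y_t \mid h_{t-1}) \\
  &\times p(\ell_t \mid y_t, h_{t-1}) \\
  &\times p(q_t \mid y_t, \ell_t, h_{t-1}) \\
  &\times p(\tau_t \mid y_t, \ell_t, q_t, h_{t-1}),
  \qquad e_t = (y_t, \ell_t, q_t, \tau_t).
\end{aligned}
```

Under this reading, each of the four factors would be the output of one recurrent neural network that takes a summary of h_{t-1} as input.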

The second paper proposes an iterative deterministic policy gradient method for stochastic control problems in finance, which incorporates both temporary and permanent market impact. The method is based on a derived policy gradient theorem and uses mini-batch stochastic gradient descent for optimization. It is applied to both order execution and option hedging, demonstrating consistently strong performance across several objectives and market dynamics. 
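To make the ingredients named in this paragraph concrete, the following toy sketch combines a deterministic policy, temporary and permanent market impact, and mini-batch stochastic gradient descent for order execution. It is not the method of the paper: the open-loop selling schedule, the linear impact model, the function names `cost_and_gradient` and `optimise_schedule`, and all parameter values (`eta`, `kappa`, `sigma`, learning rate, batch size) are assumptions made for illustration.

```python
import numpy as np


def cost_and_gradient(schedule, rng, batch_size,
                      s0=1.0, eta=0.1, kappa=0.05, sigma=0.02):
    """Mini-batch estimate of the expected cost (negative revenue) of a
    deterministic selling schedule, and its pathwise gradient.

    Assumed toy dynamics (hypothetical, not taken from the paper):
      mid price before trade t: S_t = s0 - kappa * sum_{j<t} a_j + sigma * W_t
      execution price of trade t: S_t - eta * a_t
    kappa is the permanent impact, eta the temporary impact.
    """
    n_steps = schedule.size
    eps = rng.standard_normal((batch_size, n_steps))
    # Shares sold strictly before each step (drives the permanent impact).
    sold_before = np.concatenate(([0.0], np.cumsum(schedule)[:-1]))
    # Accumulated price noise strictly before each step.
    noise_before = np.concatenate(
        (np.zeros((batch_size, 1)), np.cumsum(eps, axis=1)[:, :-1]), axis=1)
    mid = s0 - kappa * sold_before + sigma * noise_before
    revenue = np.sum(schedule * (mid - eta * schedule), axis=1)
    # A trade at step k also lowers the price of every later trade, so
    # dR/da_k = S_k - 2 * eta * a_k - kappa * sum_{t>k} a_t.
    sold_after = np.cumsum(schedule[::-1])[::-1] - schedule
    revenue_grad = mid - 2.0 * eta * schedule - kappa * sold_after
    return -revenue.mean(), -revenue_grad.mean(axis=0)


def optimise_schedule(total_shares=1.0, n_steps=5, n_iters=500,
                      batch_size=256, lr=0.5, seed=0):
    """Mini-batch SGD on the schedule, keeping the total sold fixed."""
    rng = np.random.default_rng(seed)
    # Free parameters are the first n_steps - 1 trades; the last trade
    # sells whatever remains. Start from a front-loaded schedule.
    free = np.zeros(n_steps - 1)
    free[0] = total_shares
    for _ in range(n_iters):
        schedule = np.append(free, total_shares - free.sum())
        _, grad = cost_and_gradient(schedule, rng, batch_size)
        # Chain rule through the constraint a_last = total - sum(free).
        free -= lr * (grad[:-1] - grad[-1])
    return np.append(free, total_shares - free.sum())


if __name__ == "__main__":
    print(optimise_schedule())
```

In this assumed linear model with eta > kappa / 2, the expected cost is minimised by selling equal amounts at every step, so the learned schedule should settle close to 0.2 per step. The point of the sketch is the term `kappa * sold_after` in the gradient: a trade changes the prices of all later trades, and leaving that term out would ignore the permanent impact.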

The third paper studies a policy gradient method with parameter-based exploration, where a single deterministic policy is sampled at the beginning of an episode and used throughout the whole episode. A marginal equivalence between parameter-based and action-based exploration is shown, facilitating the adaptation of previously established convergence results for policy gradient methods with action-based exploration. Convergence rates to first-order stationary points are derived under mild assumptions, and global convergence is established under an introduced Fisher-non-degenerate condition for parameter-based exploration.

Abstract [sv, English translation]

Recent advances in machine learning have created new opportunities for algorithmic trading and enabled the optimization of trading strategies in complex environments. The aim of this thesis is to improve methods for algorithmic trading by developing machine-learning-based models for the realistic simulation of order books and for learning optimal strategies. The thesis consists of three papers and combines theoretical insights with practical applications.

The first paper develops a generative model for the dynamic evolution of an order book, based on recurrent neural networks. The model captures the complete dynamics of the order book by decomposing the probability of each change of the order book into a product of conditional probabilities for order type, price level, order size, and time delay. Each of these conditional probabilities is modeled by a recurrent neural network. In addition, the paper introduces several evaluation methods for generative models related to order execution. The generative model is trained successfully on both synthetic data, generated by a Markov model, and real data from Nasdaq Stockholm.

The second paper proposes an iterative deterministic policy gradient method for stochastic control problems in finance that includes both temporary and permanent market impact. The method is based on a derived policy gradient theorem and uses stochastic gradient descent for optimization. It is applied successfully to both order execution and option hedging and shows consistently good results for varying objectives and market dynamics.

The third paper studies a policy gradient method with parameter-based exploration, in which a single deterministic policy is chosen at random at the beginning of an episode and used throughout the episode. An equivalence between parameter-based and action-based exploration is shown, which enables the adaptation of previously established convergence results for policy gradient methods with action-based exploration. Convergence rates to first-order stationary points are derived under mild assumptions, and global convergence is established under an introduced Fisher-non-degeneracy condition for parameter-based exploration.

Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2024. p. 251
Series
TRITA-SCI-FOU ; 2024:55
National Category
Probability Theory and Statistics
Research subject
Applied and Computational Mathematics; Applied and Computational Mathematics, Mathematical Statistics
Identifiers
URN: urn:nbn:se:kth:diva-356595
ISBN: 978-91-8106-126-0
Public defence
2024-12-16, F3, Lindstedtsvägen 26, Stockholm, 14:00 (English)
Opponent
Supervisors
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Note

QC 2024-11-20

Available from: 2024-11-20. Created: 2024-11-20. Last updated: 2024-12-03. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Authority records

Hultin, Hanna; Hult, Henrik; Proutiere, Alexandre
