Navigation in a simplified urban flow through deep reinforcement learning
KTH, School of Engineering Sciences (SCI), Engineering Mechanics, Fluid Mechanics. ORCID iD: 0000-0002-2797-349X
Independent Researcher, Oslo, 0854, Norway.
KTH, School of Engineering Sciences (SCI), Engineering Mechanics, Fluid Mechanics. ORCID iD: 0000-0001-6570-5499
2025 (English). In: Journal of Computational Physics, ISSN 0021-9991, E-ISSN 1090-2716, Vol. 538, article id 114194. Article in journal (Refereed), Published
Abstract [en]

The increasing number of unmanned aerial vehicles (UAVs) in urban environments requires strategies to minimize their environmental impact, in terms of both energy efficiency and noise reduction, and calls for novel approaches to prediction modeling and flight-planning optimization. Our goal is to develop deep reinforcement learning (DRL) algorithms that enable the autonomous navigation of UAVs in urban environments, taking into account the complexity of these environments, and that optimize trajectories to reduce both energy consumption and noise. Fluid flow simulations represent the environment in which the UAV navigates, and the UAV is trained as an agent that interacts with this urban environment. In this work, the domain is a two-dimensional flow field with obstacles, ideally representing buildings, extracted from a three-dimensional high-fidelity numerical simulation. The methodology, which uses proximal policy optimization (PPO) with long short-term memory (LSTM) cells, was validated by reproducing a simple but fundamental navigation problem, namely Zermelo's problem, which deals with the trajectory optimization of a vessel navigating in a turbulent flow. The method shows a significant improvement over both vanilla PPO and the TD3 algorithm: the PPO + LSTM policy achieves a success rate (SR) of 98.7 % and a crash rate (CR) of 0.1 %, outperforming both PPO (SR = 75.6 %, CR = 18.6 %) and TD3 (SR = 77.4 %, CR = 14.5 %). This is a first step towards DRL strategies that will guide UAVs in a three-dimensional flow field using real-time signals, making navigation efficient in terms of flight time and avoiding damage to the vehicle.
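
For readers unfamiliar with Zermelo's problem, the following is a minimal sketch of its kinematics: a vessel moves at constant speed relative to the fluid, the only control is its heading, and the background flow is added to its velocity. Everything specific here is an illustrative assumption and is not taken from the paper: the analytic shear flow in `flow`, the speed, the time step, the tolerance, and the naive `aim_at_target` policy, which simply stands in for a trained policy such as the PPO + LSTM one described in the abstract.

```python
import math


def flow(x, y):
    """Hypothetical background flow (assumption): uniform shear along x."""
    return 0.5 * y, 0.0


def step(x, y, heading, speed=1.0, dt=0.05):
    """One explicit Euler step: vessel velocity relative to the fluid plus the flow."""
    u, v = flow(x, y)
    x_new = x + (speed * math.cos(heading) + u) * dt
    y_new = y + (speed * math.sin(heading) + v) * dt
    return x_new, y_new


def aim_at_target(x, y, target):
    """Naive placeholder policy (assumption): always point straight at the target."""
    return math.atan2(target[1] - y, target[0] - x)


def rollout(start, target, policy, tol=0.1, max_steps=2000):
    """Run one episode; return (reached, number_of_steps_taken)."""
    x, y = start
    for n in range(max_steps):
        if math.hypot(target[0] - x, target[1] - y) < tol:
            return True, n
        x, y = step(x, y, policy(x, y, target))
    return False, max_steps


if __name__ == "__main__":
    reached, steps = rollout((0.0, 0.0), (2.0, 1.0), aim_at_target)
    print("reached target:", reached, "steps:", steps)
```

In a learning setting, a success rate like the one quoted above would be the fraction of such episodes in which `rollout` reports that the target was reached; the crash rate would additionally require obstacles, which this sketch omits.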

Place, publisher, year, edition, pages
Elsevier BV, 2025. Vol. 538, article id 114194
Keywords [en]
Deep reinforcement learning, Turbulent flow fields, UAVs, Urban flows
National Category
Computer Sciences; Robotics and automation; Fluid Mechanics
Identifiers
URN: urn:nbn:se:kth:diva-368752
DOI: 10.1016/j.jcp.2025.114194
ISI: 001521515400001
Scopus ID: 2-s2.0-105008877387
OAI: oai:DiVA.org:kth-368752
DiVA, id: diva2:1990855
Note

QC 20250821

Available from: 2025-08-21. Created: 2025-08-21. Last updated: 2025-10-03. Bibliographically approved.

Open Access in DiVA

No full text in DiVA


Authority records

Tonti, Federica; Vinuesa, Ricardo
