kth.sePublications
Application of Simulation-assisted Machine Learning for Yard Departure Prediction
KTH, School of Architecture and the Built Environment (ABE), Civil and Architectural Engineering, Transport planning. (Train traffic and logistics) ORCID iD: 0000-0002-4945-3663
University of Texas at Austin, Department of Civil, Architectural and Environmental Engineering, Texas Railway Analysis & Innovation Node (TRAIN).
University of Texas at Austin, Department of Civil, Architectural and Environmental Engineering, Texas Railway Analysis & Innovation Node (TRAIN). ORCID iD: 0000-0002-2527-1320
KTH, School of Architecture and the Built Environment (ABE), Civil and Architectural Engineering, Transport planning. School of Innovation, Design and Technology, Malardalen University, Eskilstuna, Sweden. (Train traffic and logistics) ORCID iD: 0000-0003-1597-6738
2023 (English) Conference paper, Oral presentation with published abstract (Refereed)
Abstract [en]

Increasing the modal share of rail freight is an ongoing goal in Europe and North America. Yards can play an important role in reaching this goal through reliable and predictable performance. We aim to predict yard departures with a simulation-assisted machine learning model, using two concepts for including the predictors: a general concept and a step-wise concept. The former adds all predictors at once; the latter adds them according to their availability or sub-yard. The training data consist of a one-year real-world operational data set from a European hump yard and multiple two-year simulation data sets from a representative hump yard in North America. To the best of our knowledge, no previous research has attempted to implement a prediction model that generalizes between the European and North American contexts. The model is developed with a decision tree algorithm using a 10-fold cross-validation process. Comparing model performance on three data sets (the real-world data, a baseline simulation, and an ultimate-randomness simulation) shows that the model performs similarly on the first two, with R-squared values of 0.90 and 0.87 respectively, indicating that it captures most of the variance in the data. However, adding large randomness to the simulation decreases the R-squared to 0.70. The results of the step-wise inclusion of predictors differ between the real-world and simulation data. For the former, adding more operational predictors does not change the model performance, whereas for the latter, adding departure yard predictors increases the R-squared substantially. The global feature importance shows that for the real-world data almost all predictors contribute substantially to the predictions, with maximum planned length, departure weekday, and the number of arriving trains contributing the most, whereas for the simulation data the departure yard predictors contribute the most.
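
The evaluation described in the abstract (a tree-based regressor scored by R-squared under 10-fold cross-validation) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' implementation: the one-split "stump" stands in for a full decision tree, the data are synthetic, and all names (`fit_stump`, `cross_validated_r2`, `lengths`, `deviations`) are hypothetical.

```python
import random
from statistics import mean


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_stump(xs, ys):
    """Depth-1 regression tree on a single predictor (illustrative stand-in
    for a full decision tree): choose the split with the lowest squared error."""
    best = None
    for threshold in sorted(set(xs))[1:]:
        left = [y for x, y in zip(xs, ys) if x < threshold]
        right = [y for x, y in zip(xs, ys) if x >= threshold]
        l_mean, r_mean = mean(left), mean(right)
        sse = (sum((y - l_mean) ** 2 for y in left)
               + sum((y - r_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, l_mean, r_mean)
    if best is None:  # all predictor values identical: predict the mean
        overall = mean(ys)
        return lambda x: overall
    _, threshold, l_mean, r_mean = best
    return lambda x: l_mean if x < threshold else r_mean


def cross_validated_r2(xs, ys, k=10, seed=0):
    """Pooled out-of-fold R-squared over k folds."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    preds = [0.0] * len(xs)
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = fit_stump([xs[i] for i in train], [ys[i] for i in train])
        for i in fold:
            preds[i] = model(xs[i])  # predict only on unseen observations
    return r_squared(ys, preds)


# Synthetic data (assumption): planned train length in metres versus
# departure deviation in minutes, with Gaussian noise.
rng = random.Random(42)
lengths = [rng.uniform(300, 750) for _ in range(200)]
deviations = [(-20.0 if x < 500 else 15.0) + rng.gauss(0, 5) for x in lengths]
score = cross_validated_r2(lengths, deviations)
```

Increasing the noise standard deviation in the synthetic data lowers `score`, which mirrors, in a toy setting only, the drop in R-squared that the abstract reports when large randomness is added to the simulation.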

Place, publisher, year, edition, pages
2023.
Keywords [en]
Yards, machine learning, simulation, delay prediction, rail freight
National Category
Transport Systems and Logistics; Computer Sciences
Research subject
Transport Science; Transport Science, Transport Systems; Järnvägsgruppen - Effektiva tågsystem för godstrafik; Järnvägsgruppen - Kapacitet
Identifiers
URN: urn:nbn:se:kth:diva-327017; OAI: oai:DiVA.org:kth-327017; DiVA, id: diva2:1757626
Conference
10th International Conference on Railway Operations Modelling and Analysis
Projects
FR8RAIL III; PRATA; Shift2Rail
Funder
Swedish Transport Administration
Note

QC 20230517

Available from: 2023-05-17 Created: 2023-05-17 Last updated: 2023-05-17 Bibliographically approved
In thesis
1. Application of Predictive Analytics for Shunting Yard Delays
2023 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Increasing the modal share of rail freight transport is one of the main ways to achieve carbon neutrality in Europe. The perceived low reliability and predictability of rail freight services is one of the main challenges to overcome in reaching this target. Shunting yards play an important role in providing more reliable and predictable freight trains. Shunting yard departure deviations impact other trains on mixed-traffic railway networks. Predictable departures from shunting yards increase the overall predictability of freight train runs along the network.

The primary focus of this thesis is on how to apply data-driven approaches to increase the predictability of shunting yard departures. Descriptive analytics were used to provide enhanced insight into shunting yard departures, and predictive analytics were applied to develop shunting yard departure deviation prediction models. Finally, hybrid modeling was used to integrate the yard departure prediction model with other simulation models for wider application. The results from this thesis contribute to providing a deeper understanding of shunting yard departure deviations, interactions between shunting yards and the network through departure and arrival deviations, and how to model these deviations by applying data-driven approaches. These results from five published research papers are included and presented in this doctoral thesis.

Descriptive analytics methods are applied in papers I and II to explore the probability distribution of departure deviations and the impact of the network on departure delays. The results show that positive and negative departure deviations have different distributions for different shunting yards. Moreover, network usage fluctuations over shorter timespans impact departure delays, whereas no correlation is established between network impact, defined as congestion in the arrival yard, and departure delays.

Predictive analytics is applied in paper III by developing tree-based algorithms to classify the status of shunting yard departures. The departure statuses are imbalanced: the majority are early, and the minority are delayed. The results show that applying methods to overcome imbalanced data sets can improve the prediction of delayed departures.
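
The abstract does not state which imbalance-handling methods were used. As one common example only, random oversampling duplicates minority-class observations until every class is as frequent as the largest one. The sketch below assumes hypothetical labels ("early", "delayed") and a hypothetical helper name, `random_oversample`.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until all classes
    have as many observations as the most frequent class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for label, count in counts.items():
        pool = [r for r, l in zip(rows, labels) if l == label]
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))  # sample with replacement
            out_labels.append(label)
    return out_rows, out_labels


# Toy example (assumption): eight early departures, two delayed ones.
rows = list(range(10))
labels = ["early"] * 8 + ["delayed"] * 2
balanced_rows, balanced_labels = random_oversample(rows, labels)
```

Oversampling should be applied to the training folds only, so that duplicated observations never appear in the data used to score the classifier.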

The models developed in paper III are extended in papers IV and V to predict departure deviations in a combined modeling approach for two separate applications. In paper IV, a machine learning-assisted macro simulation model framework is introduced to integrate yard departure predictions into a macro simulation network model and predict the arrivals to the next yard. The results show improved prediction accuracy compared to a basic machine learning model and a baseline timetable model.

Finally, in paper V, the generalization of the yard departure prediction model is explored by applying a simulation-assisted machine learning modeling approach where the model is trained on real-world European yard data and North American simulation yard data. The results show the model has a notable generalized performance with both data types.

Abstract [sv]

Ett av de huvudsakliga målen för att uppnå koldioxidneutralitet i Europa är att öka den modala andelen av godstransporter på järnväg. En av de stora utmaningarna är att övervinna uppfattningen om att godstrafik på järnväg har en låg tillförlitlighet och förutsägbarhet. Gods- och rangerbangårdar har en viktig roll i att tillhandahålla godståg med högre tillförlitlighet och förutsägbarhet. Avvikelser från godstågens planerade avgångstider från godsbangårdar påverkar i förlängningen andra tåg i järnvägsnätet. En högre förutsägbarhet vad gäller godstågens avgångstider från godsbangårdar innebär även en högre förutsägbarhet för tågens körning i nätverket.

Huvudfokus i avhandlingen är att tillämpa datadrivna metoder för att öka förutsägbarheten i godstågens avgångar från godsbangårdar. Deskriptiv analys har använts för att ge en ökad insikt över fördelningen av avgångar från godsbangårdar. Prediktiv analys har tillämpats för att utveckla prediktionsmodeller för avgångar. Slutligen används hybridmodellering för att integrera (koppla ihop) en prediktiv avgångsmodell med andra simuleringsmodeller för större tillämpningar. Doktorsavhandlingen omfattar fem publicerade forskningsartiklar från vilka resultaten presenteras.

I artikel I och II tillämpas deskriptiva analysmetoder för att undersöka sannolikhetsfördelningar för avgångsavvikelser och nätverkets inverkan på avgångsförseningar. Resultaten visar att fördelningar för positiva och negativa avvikelser skiljer sig mellan olika godsbangårdar.  Dessutom påverkar fluktuationer i nätverkets utnyttjandegrad inom kortare tidsperioder avgångsförseningarna. Däremot påvisas ingen korrelation mellan nätverkets påverkan, här definierat som trängsel på ankomstbangården, och avgångsförseningar.

I artikel III tillämpas prediktiv analys genom att utveckla trädbaserade algoritmer för att klassificera status/tillstånden för avgångarna från en godsbangård. Avgångsstatus/avgångstillstånden är obalanserade, en majoritet av tågen är tidiga och en minoritet är försenade. Resultaten visar att prediktionen av försenade avgångar kan förbättras genom att tillämpa metoder för att hantera obalans i data.

De modeller som utvecklats i artikel III utvecklas och utökas vidare i artikel IV och V för att prediktera avgångsavvikelser med en kombinerad modelleringsmetod för två olika tillämpningar. I artikel IV introduceras ett koncept med en maskininlärningsassisterad makrosimuleringsmodell med syftet att integrera avgångsprediktioner från en godsbangård i en makroskopisk nätverkssimuleringsmodell och prediktera godstågens ankomster till nästa godsbangård. Resultaten indikerar en förbättring i prediktionsnoggrannhet jämfört med en grundläggande maskininlärningsmodell och en baslinjemodell för tidtabell.

I artikel V undersöks generaliserbarheten av avgångsprediktionsmodellen genom att tillämpa en ansats med en simuleringsassisterad maskininlärningsmodell och där modellen tränas på verklig data från godsbangårdar i Europa och simuleringsdata från Nordamerika. Resultaten visar att modellens prestanda generellt är god med båda datatyperna.  

Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2023. p. 62
Series
TRITA-ABE-DLT ; 2322
Keywords
Shunting yards, train delays, machine learning, simulation, freight transport, Godsbangårdar, tåg förseningar, maskininlärning, simulering, godstransport
National Category
Transport Systems and Logistics
Research subject
Transport Science, Transport Systems
Identifiers
urn:nbn:se:kth:diva-327021 (URN); 978-91-8040-610-9 (ISBN)
Public defence
2023-06-15, Kollegiesalen, Brinellvägen 8, KTH Campus, video conference link: https://kth-se.zoom.us/j/69650875724, Stockholm, 13:00 (English)
Opponent
Supervisors
Projects
Shift2Rail; FR8HUB; FR8RAIL III; PRATA
Funder
Swedish Transport Administration
Note

QC 20230522

Available from: 2023-05-22 Created: 2023-05-17 Last updated: 2023-05-29 Bibliographically approved

Open Access in DiVA

No full text in DiVA

Authority records

Minbashi, Niloofar; Bohlin, Markus

Search in DiVA

By author/editor
Minbashi, Niloofar; Dick, C. Tyler; Bohlin, Markus
By organisation
Transport planning
Transport Systems and Logistics; Computer Sciences
