Punctuality prediction: combined probability approach and random forest modelling with railway delay statistics in Sweden
KTH, School of Architecture and the Built Environment (ABE), Urban Planning and Environment, System Analysis and Economics. KTH, School of Architecture and the Built Environment (ABE), Centres, Centre for Transport Studies, CTS.
2019 (English). In: Proceedings 2019 IEEE Intelligent Transportation Systems Conference, ITSC 2019, Institute of Electrical and Electronics Engineers Inc., 2019, p. 2797-2802. Conference paper, Published paper (Refereed)
Abstract [en]

Understanding the distribution and propagation of train delay is crucial for railway management. This paper combines the interpretability of logistic regression models with the robustness and accuracy of Random Forest models in a combined model that is applied to predict punctuality. The data consists of relative timetable deviations of train movements at all stations, as well as punctuality observations at destination stations. The data was recorded for both passenger and freight trains in Sweden during 2017 and 2018. The data contains many policy and categorical variables, such as train operator, which are known to indirectly affect delay risk but are labeled as insignificant in classical regression, making their coefficients unstable and difficult to interpret. For this reason, the study applies a logistic regression model with the variables of interest, such as train type and first registered delay (relative deviation compared with the timetable), along with "bagging" of Random Forest to capture indirect and/or sensitive predictors. This semi-parametric logistic regression model was trained using 2017 data and was accurate and robust when tested using the 2018 data. It has been shown to be capable of handling delays caused by unforeseen disruptions, such as abnormal weather in the test year. In this paper we show that the semi-parametric model has significantly better prediction performance than linear models, Weibull distributions, binomial logistic regression and Random Forest alone. Furthermore, the semi-parametric model maintains its interpretability whilst producing accurate predictions with new data.
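
The abstract does not say how the logistic regression and the Random Forest are joined, so the following is only a minimal sketch of one possible reading, not the authors' implementation. It assumes that a bagged ensemble score for a categorical variable (here a hypothetical operator id) enters a logistic regression as one extra term next to the interpretable variables (first registered delay and train type). A small bootstrap average of per-operator punctuality rates stands in for a real Random Forest, and the data is synthetic; all names, coefficients and sample sizes are illustrative assumptions.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def make_year(rng, n):
    """Synthetic stand-in for one year of records (NOT the Swedish data)."""
    op_effect = [0.8, 0.2, -0.3, -0.9]   # hypothetical operator effects
    rows = []
    for _ in range(n):
        delay = rng.uniform(0.0, 1.0)    # first registered delay, units of 10 min
        freight = rng.randint(0, 1)      # train type: 0 passenger, 1 freight
        op = rng.randrange(4)            # categorical operator id
        z = 1.5 - 3.0 * delay - 0.5 * freight + op_effect[op]
        punctual = 1 if rng.random() < sigmoid(z) else 0
        rows.append((delay, freight, op, punctual))
    return rows

def bagged_operator_score(rows, rng, n_bags=25):
    """Bagging: average per-operator punctuality rates over bootstrap resamples.
    This is a simplified stand-in for a Random Forest, not a real one."""
    totals = [0.0] * 4
    for _ in range(n_bags):
        sample = [rng.choice(rows) for _ in rows]
        for op in range(4):
            ys = [r[3] for r in sample if r[2] == op]
            # Laplace smoothing keeps the rate strictly inside (0, 1).
            totals[op] += (sum(ys) + 1.0) / (len(ys) + 2.0)
    return [t / n_bags for t in totals]

def features(row, score):
    # Interpretable terms plus the logit of the ensemble score.
    delay, freight, op, _ = row
    s = score[op]
    return [1.0, delay, float(freight), math.log(s / (1.0 - s))]

def fit_logistic(X, y, lr=0.5, steps=300):
    # Plain batch gradient ascent on the mean log-likelihood.
    w = [0.0] * len(X[0])
    for _ in range(steps):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = yi - sigmoid(sum(a * b for a, b in zip(w, xi)))
            for j, a in enumerate(xi):
                grad[j] += err * a
        w = [wj + lr * g / len(X) for wj, g in zip(w, grad)]
    return w

rng = random.Random(0)
# Train on one synthetic "year", evaluate on another, mirroring the 2017/2018 split.
train, test = make_year(rng, 1000), make_year(rng, 500)
score = bagged_operator_score(train, rng)
w = fit_logistic([features(r, score) for r in train], [r[3] for r in train])
probs = [sigmoid(sum(a * b for a, b in zip(w, features(r, score)))) for r in test]
accuracy = sum((p >= 0.5) == bool(r[3]) for p, r in zip(probs, test)) / len(test)
print("coefficients [intercept, delay, freight, ensemble score]:", w)
print("hold-out accuracy:", round(accuracy, 3))
```

In this sketch the coefficients on delay and train type stay directly readable as log-odds effects, while the operator information is absorbed into a single ensemble term. That is the kind of interpretability the abstract attributes to the semi-parametric model.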

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers Inc., 2019. p. 2797-2802
Keywords [en]
Decision trees, Intelligent systems, Intelligent vehicle highway systems, Railroad transportation, Railroads, Regression analysis, Scheduling, Weibull distribution, Binomial logistic regressions, Categorical variables, Logistic Regression modeling, Logistic regression models, Prediction performance, Probability approach, Relative deviations, Semi-parametric modeling, Forecasting
National Category
Transport Systems and Logistics
Identifiers
URN: urn:nbn:se:kth:diva-268042; DOI: 10.1109/ITSC.2019.8916892; ISI: 000521238102135; Scopus ID: 2-s2.0-85076815387; ISBN: 9781538670248 (print); OAI: oai:DiVA.org:kth-268042; DiVA, id: diva2:1417270
Conference
2019 IEEE Intelligent Transportation Systems Conference, ITSC 2019, Auckland, New Zealand, October 27-30, 2019
Note

QC 20200327

Available from: 2020-03-27. Created: 2020-03-27. Last updated: 2020-04-30. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Authority records BETA

Persson, Christer
