Enhancing Short-Term Traffic Prediction for Large-Scale Transport Networks by Spatio-Temporal Clustering
KTH, School of Architecture and the Built Environment (ABE), Civil and Architectural Engineering, Transport planning. (Urban mobility group) ORCID iD: 0000-0002-8499-0843
2021 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Congestion in large cities causes extra travel time, noise, air pollution, CO2 emissions, and more. Transport is one of the main recognized contributors to global warming and climate change, which are receiving increasing attention from authorities and societies around the world. The European Commission recognizes better utilization of existing resources through Intelligent Transport Systems (ITS) and digital technologies as having enormous potential to lower the negative impacts associated with high traffic volumes in urban areas.

The main focus of this work is short-term traffic prediction, an essential tool in ITS. Combined with information provision, it enables proactive decisions that reduce the severity of congestion, whether it occurs regularly or is caused by incidents. The main contribution of this work is a methodological framework, together with a demonstration of how it enhances short-term prediction in large-scale transport networks. It is expected to contribute to more robust and accurate predictions by ITS in traffic management centers.

Traffic patterns in large-scale networks, including urban streets, can be heterogeneous during the day and from day to day. This work investigates spatio-temporal clustering of heterogeneous data sets into smaller, more homogeneous subsets. This is expected to produce more robust, accurate, scalable, and cost-effective prediction models.

This thesis is a collection of five papers that contribute to enhancing short-term traffic prediction in this context. Clustering is shown to boost prediction performance in Papers II, III, IV, and V. Paper II considers network partitioning, and the last three papers study day clustering. The prediction models used across the included papers are naive historical-mean models and more advanced models such as probabilistic principal component analysis (PPCA) and exponential smoothing. Paper I considers and facilitates the use of floating car data (FCD) as a cost-effective, opportunistic source of speed and travel time data with extensive network coverage.

Common practice in determining the number of clusters is to rely on internal evaluation indices, which are very efficient to compute but isolated from the application. Paper IV tests this practice by also considering performance in the short-term prediction application. Our results show that relying on these indices can lead to a loss of prediction accuracy of about 20%, depending on the prediction model considered. Dimensionality reduction has a minimal effect on the resulting prediction performance, while clustering needs 20 times less computational time and only 0.1% of the original information.

Finally, Paper V examines the similarity of representative day clusters derived from speeds and from flows. Substituting flow-based day-type centroids for speed-based ones proves robust when predicting speeds, which is not the case when predicting flows using speed-based day-type centroids and observations.
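
As a rough illustration of how day clustering can feed a short-term predictor, the sketch below averages historical daily profiles per day-type, matches a partially observed day to the nearest centroid, and uses that centroid as the forecast for the remaining intervals. The toy speeds, the labels "workday" and "weekend", the function names, and the nearest-centroid matching rule are assumptions made for illustration; the abstract does not specify these details.

```python
from statistics import fmean

def centroids_by_day_type(days, labels):
    """Average the daily profiles that share a day-type label."""
    groups = {}
    for profile, label in zip(days, labels):
        groups.setdefault(label, []).append(profile)
    return {label: [fmean(col) for col in zip(*profiles)]
            for label, profiles in groups.items()}

def classify_partial_day(observed, centroids):
    """Pick the day-type whose centroid is closest to the intervals seen so far."""
    def sq_dist(label):
        # zip stops at len(observed), so only the observed prefix is compared
        return sum((o - c) ** 2 for o, c in zip(observed, centroids[label]))
    return min(centroids, key=sq_dist)

def predict_rest_of_day(observed, centroids):
    """Forecast the unobserved intervals with the matched centroid."""
    label = classify_partial_day(observed, centroids)
    return label, centroids[label][len(observed):]

# Toy speeds (km/h) for four intervals per day; the day-type labels are assumed.
days = [[80, 40, 45, 80], [78, 42, 47, 82], [80, 75, 78, 80], [82, 77, 76, 78]]
labels = ["workday", "workday", "weekend", "weekend"]
centroids = centroids_by_day_type(days, labels)
label, forecast = predict_rest_of_day([79, 41], centroids)
```

With a single cluster this reduces to the naive historical-mean predictor; splitting days into more homogeneous day-types is what lets the forecast follow the pattern of the current day.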

Abstract [sv, translated]

Congestion in large cities leads to extra travel time, noise, air pollution, carbon dioxide emissions, and more. Transport is one of the main recognized contributors to global warming and climate change, which are receiving increasing attention from authorities and societies around the world. Better utilization of existing resources through intelligent transport systems (ITS) and digital technology is identified by the European Commission as having enormous potential to reduce these negative effects associated with large traffic volumes in urban areas.

The main focus of this work is short-term traffic forecasting, an important tool within ITS. Combined with information provision, such forecasts enable proactive decisions to reduce the extent of congestion that occurs regularly or is caused by incidents. The most important contribution of this work is to develop a methodological framework and demonstrate its improving effects on short-term forecasts for large-scale transport networks. It is expected to contribute to more robust and accurate forecasts by ITS in traffic management centres.

Traffic patterns in large-scale networks, including urban streets, can be heterogeneous during the day and from day to day. This work investigates spatial and temporal clustering of heterogeneous data sets into smaller, more homogeneous data sets. This is expected to yield more robust, accurate, scalable, and cost-effective forecasting models.

The thesis is a collection of five papers that contribute to improving short-term traffic forecasting in this context. Clustering is shown to increase prediction performance in Papers II, III, IV, and V. Paper II considers network partitioning, and the last three papers consider clustering. The prediction models used in the included papers are naive historical-mean prediction models and more advanced parametric prediction models, e.g. probabilistic principal component analysis (PPCA) and exponential smoothing. Paper I considers and exploits probe vehicle data (FCD) as a cost-effective, opportunistic source of speed and travel time data with extensive network coverage.

The established method for determining the number of clusters is to rely on internal evaluation indices, which are very efficient but isolated from the application. Paper IV tests this practice by also considering performance in a short-term forecasting application. Our results show that relying on these indices can lead to a loss of prediction performance of about 20%, depending on the forecasting model used. Dimensionality reduction has a minimal effect on the resulting prediction performance, but clustering requires 20 times less computation time and only 0.1% of the original information.

Finally, Paper V examines the similarities between representative day clusters formed from speeds and from flows. In addition, the interchangeability of day-type centroids from speeds to flows proves robust when predicting speeds, which is not the case when predicting flows.

Place, publisher, year, edition, pages
Stockholm, Sweden: KTH Royal Institute of Technology, 2021, p. 58
Series
TRITA-ABE-DLT ; 2143
Keywords [en]
short-term prediction, clustering, spatio-temporal clustering, day-types, speed-flow relationship, large-scale
National Category
Transport Systems and Logistics
Research subject
Transport Science, Transport Systems
Identifiers
URN: urn:nbn:se:kth:diva-304732; ISBN: 978-91-8040-071-8 (print); OAI: oai:DiVA.org:kth-304732; DiVA, id: diva2:1610284
Public defence
2021-12-09, F3, Lindstedtsvägen 26, KTH Campus, Zoom: https://kth-se.zoom.us/j/66844011086, Stockholm, 13:00 (English)
Available from: 2021-11-15. Created: 2021-11-10. Last updated: 2022-09-19. Bibliographically approved.
List of papers
1. Integrated framework for real-time urban network travel time prediction on sparse probe data
2018 (English). In: IET Intelligent Transport Systems, ISSN 1751-956X, E-ISSN 1751-9578, Vol. 12, no 1, p. 66-74. Article in journal (Refereed). Published
Abstract [en]

The study presents the methodology and system architecture of an integrated urban road network travel time prediction framework based on low-frequency probe vehicle data. Intended applications include real-time network traffic management, vehicle routing and information provision. The framework integrates methods for receiving a stream of probe vehicle data, map matching and path inference, link travel time estimation, calibration of prediction model parameters and network travel time prediction in real time. The system design satisfies three crucial aspects: computational efficiency of prediction, internal consistency between components and robustness against noisy and missing data. Prediction is based on a multivariate hybrid method of probabilistic principal component analysis, which captures global correlation patterns between links and time intervals, and local smoothing, which considers local correlations among neighbouring links. Computational experiments for the road network of Stockholm, Sweden and probe data from taxis show that the system provides high accuracy for both peak and off-peak traffic conditions. The computational efficiency of the framework makes it capable of real-time prediction for large-scale networks. For links with large speed variations between days, prediction significantly outperforms the historical mean. Furthermore, prediction is reliable also for links with high proportions of missing data.
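
To make the PPCA component more concrete, the following is a minimal sketch of how a probabilistic PCA model can fill in unobserved link or interval values from observed ones. It assumes the standard closed-form maximum-likelihood PPCA fit and the Gaussian conditional mean; it is not the paper's framework, it omits the local smoothing step, and the function names, the synthetic data and the choice of one latent component are hypothetical. It requires NumPy.

```python
import numpy as np

def fit_ppca(X, q):
    """Closed-form maximum-likelihood PPCA fit; returns mean and model covariance."""
    mu = X.mean(axis=0)
    S = np.cov(X, rowvar=False, bias=True)            # sample covariance (d x d)
    eigvals, eigvecs = np.linalg.eigh(S)              # eigenvalues in ascending order
    eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]
    sigma2 = eigvals[q:].mean()                       # noise variance: mean discarded eigenvalue
    W = eigvecs[:, :q] * np.sqrt(np.maximum(eigvals[:q] - sigma2, 0.0))
    C = W @ W.T + sigma2 * np.eye(X.shape[1])         # implied covariance W W^T + sigma^2 I
    return mu, C

def predict_missing(x, observed, mu, C):
    """Fill unobserved entries with their conditional mean given the observed ones."""
    x_hat = np.asarray(x, dtype=float).copy()
    o = np.asarray(observed, dtype=bool)
    u = ~o
    residual = x_hat[o] - mu[o]
    x_hat[u] = mu[u] + C[np.ix_(u, o)] @ np.linalg.solve(C[np.ix_(o, o)], residual)
    return x_hat

# Synthetic example: six variables driven by one shared factor plus small noise.
rng = np.random.default_rng(0)
pattern = np.arange(1.0, 7.0)
history = rng.normal(size=(200, 1)) * pattern + 0.01 * rng.normal(size=(200, 6))
mu, C = fit_ppca(history, q=1)
today = 2.0 * pattern                                 # ground truth for the new day
observed = np.array([True, True, True, False, False, False])
estimate = predict_missing(np.where(observed, today, 0.0), observed, mu, C)
```

Because the model covariance couples all variables, the same conditional-mean step handles both forecasting future intervals and imputing links with missing data, which is one plausible reading of why the abstract reports reliable prediction for links with many missing observations.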

Place, publisher, year, edition, pages
Institution of Engineering and Technology, 2018
National Category
Transport Systems and Logistics
Identifiers
urn:nbn:se:kth:diva-219361 (URN); 10.1049/iet-its.2017.0113 (DOI); 000426045200009 (); 2-s2.0-85041135912 (Scopus ID)
Note

QC 20180206

Available from: 2017-12-04. Created: 2017-12-04. Last updated: 2024-03-18. Bibliographically approved.
2. Spatio-Temporal Partitioning of Large Urban Networks for Travel Time Prediction
2018 (English). In: 2018 21st International Conference on Intelligent Transportation Systems (ITSC), IEEE, 2018, p. 1390-1395. Conference paper, Published paper (Refereed)
Abstract [en]

The paper explores the potential of spatio-temporal network partitioning to improve travel time prediction accuracy and computational costs in the context of large-scale urban road networks (including motorways/freeways, arterials and urban streets). Forecasting in this context is challenging due to the complexity, heterogeneity, noisy data, unexpected events and the size of the traffic network. The proposed spatio-temporal network partitioning methodology is versatile and can be applied to any source of travel time data and any multivariate travel time prediction method. A case study of Stockholm, Sweden considers a network exceeding 11,000 links and uses taxi probe data as the source of travel time data. Probabilistic Principal Component Analysis (PPCA) is used to predict the travel times. Results show that spatio-temporal network partitioning provides a more appropriate bias-variance tradeoff, and that choosing the proper number of clusters improves both prediction accuracy and computational costs, supporting robust large-scale travel time prediction.

Place, publisher, year, edition, pages
IEEE, 2018
Series
IEEE International Conference on Intelligent Transportation Systems-ITSC, ISSN 2153-0009
National Category
Transport Systems and Logistics
Identifiers
urn:nbn:se:kth:diva-244586 (URN); 10.1109/ITSC.2018.8569648 (DOI); 000457881301060 (); 2-s2.0-85060452125 (Scopus ID); 978-1-7281-0323-5 (ISBN)
Conference
21st IEEE International Conference on Intelligent Transportation Systems (ITSC), NOV 04-07, 2018, Maui, HI
Note

QC 20190304

Available from: 2019-03-04. Created: 2019-03-04. Last updated: 2024-03-18. Bibliographically approved.
3. 3D Speed Maps and Mean Observations Vectors for Short-Term Urban Traffic Prediction
2019 (English). In: TRB Annual Meeting Online, Washington DC, US, 2019, p. 1-20. Conference paper, Published paper (Refereed)
Abstract [en]

City-wide travel time prediction in real time is an important enabler for efficient use of the road network. It can be used in traveler information to enable more efficient routing of individual vehicles, as well as in decision support for traffic management applications such as directed information campaigns or incident management. 3D speed maps have been shown to be a promising methodology for revealing day-to-day regularities of city-level travel times and possibly also for short-term prediction. In this paper, we aim to further evaluate and benchmark the use of 3D speed maps for short-term travel time prediction. To enable scenario-based evaluation of traffic management actions, we also evaluate the framework for traffic flow prediction. The 3D speed map methodology is adapted to short-term prediction and benchmarked against the historical mean as well as against Probabilistic Principal Component Analysis (PPCA). The benchmarking and analysis use one year of travel time and traffic flow data for the city of Stockholm, Sweden. The case study shows very promising results for the 3D speed map methodology in short-term prediction of both travel times and traffic flows. The modified version of the 3D speed map prediction outperforms both the historical mean prediction and the PPCA method. Further work includes an extended evaluation of the method under different conditions in terms of underlying sensor infrastructure, preprocessing and spatio-temporal aggregation, as well as benchmarking against other prediction methods.

Place, publisher, year, edition, pages
Washington DC, US, 2019
Keywords
3D speed map, short-term prediction, travel time prediction, traffic prediction, large-scale prediction, clustering, partitioning, spatio-temporal partitioning
National Category
Transport Systems and Logistics
Research subject
Transport Science
Identifiers
urn:nbn:se:kth:diva-250647 (URN)
Conference
Transportation research board annual meeting (TRB)
Note

QC 20190502

Available from: 2019-05-01. Created: 2019-05-01. Last updated: 2024-03-18. Bibliographically approved.
4. Revealing representative day-types in transport networks using traffic data clustering
(English). Manuscript (preprint) (Other academic)
Abstract [en]

Recognition of spatio-temporal traffic patterns at the network-wide level plays an important role in data-driven intelligent transport systems (ITS) and is a basis for applications such as short-term prediction and scenario-based traffic management. Common practice in the transport literature is to rely on well-known general unsupervised machine-learning methods (e.g., k-means, hierarchical, spectral, DBSCAN) to select the most representative structure and number of day-types based solely on internal evaluation indices. These are easy to calculate but are limited since they only use information in the clustered dataset itself. In addition, the quality of clustering should ideally be demonstrated by external validation criteria, by expert assessment or the performance in its intended application. The main contribution of this paper is to test and compare the common practice of internal validation with external validation criteria represented by the application to short-term prediction, which also serves as a proxy for more general traffic management applications. When compared to external evaluation using short-term prediction, internal evaluation methods have a tendency to underestimate the number of representative day-types needed for the application. Additionally, the paper investigates the impact of using dimensionality reduction. By using just 0.1% of the original dataset dimensions, very similar clustering and prediction performance can be achieved, with up to 20 times lower computational costs, depending on the clustering method. K-means and agglomerative clustering may be the most scalable methods, using up to 60 times fewer computational resources for very similar prediction performance to the p-median clustering.
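
The contrast between internal and external validation can be sketched on toy data. Below, the silhouette coefficient stands in for an internal index, and the mean absolute error of a nearest-centroid forecast on held-out days stands in for the external, application-based criterion. The data are constructed so that the two criteria disagree; they are not the paper's data or results, and the function names, the choice of silhouette, and the forecasting rule are assumptions for illustration.

```python
from math import dist
from statistics import fmean

def silhouette(points, labels):
    """Mean silhouette coefficient (internal index); needs at least two clusters."""
    scores = []
    for i, (p, own) in enumerate(zip(points, labels)):
        by_cluster = {}
        for j, (q, lab) in enumerate(zip(points, labels)):
            if i != j:
                by_cluster.setdefault(lab, []).append(dist(p, q))
        if own not in by_cluster:                  # singleton cluster scores 0
            scores.append(0.0)
            continue
        a = fmean(by_cluster[own])                 # mean distance within own cluster
        b = min(fmean(d) for lab, d in by_cluster.items() if lab != own)
        scores.append(0.0 if max(a, b) == 0 else (b - a) / max(a, b))
    return fmean(scores)

def prediction_mae(train, train_labels, test, n_observed):
    """External criterion: error when forecasting the rest of held-out days."""
    groups = {}
    for day, lab in zip(train, train_labels):
        groups.setdefault(lab, []).append(day)
    centroids = [[fmean(col) for col in zip(*g)] for g in groups.values()]
    errors = []
    for day in test:
        best = min(centroids, key=lambda c: dist(c[:n_observed], day[:n_observed]))
        errors += [abs(f - y) for f, y in zip(best[n_observed:], day[n_observed:])]
    return fmean(errors)

# Toy day profiles: types A and B are close to each other, type C is far away.
train = [[20, 20, 20, 20], [20, 24, 20, 24],      # A
         [26, 20, 28, 20], [26, 24, 28, 24],      # B
         [80, 80, 80, 80], [80, 84, 80, 84]]      # C
labels_k2 = [0, 0, 0, 0, 1, 1]                    # A and B merged
labels_k3 = [0, 0, 1, 1, 2, 2]                    # A and B kept apart
test_days = [[20, 22, 20, 22], [26, 22, 28, 22]]  # held-out A-like and B-like days

internal = {2: silhouette(train, labels_k2), 3: silhouette(train, labels_k3)}
external = {2: prediction_mae(train, labels_k2, test_days, 2),
            3: prediction_mae(train, labels_k3, test_days, 2)}
```

On this constructed example the silhouette favours the two-cluster solution, while the forecast error is lower with three clusters, which mirrors the direction of the tendency described in the abstract without reproducing its magnitude.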

Keywords
clustering, network-wide, day clustering, cluster validity, external indices, internal indices, prediction, dimensionality reduction
National Category
Transport Systems and Logistics
Research subject
Transport Science, Transport Systems; Transport Science
Identifiers
urn:nbn:se:kth:diva-304729 (URN)
Note

QC 20211116

Available from: 2021-11-10. Created: 2021-11-10. Last updated: 2022-06-25. Bibliographically approved.
5. Similarity and Interchangeability of Flow and Speed Data for Transport Network Day-Type Clustering and Prediction
(English). Manuscript (preprint) (Other academic)
Abstract [en]

Prediction of future traffic states is an essential part of traffic management and intelligent transportation systems. Previous work has shown that spatio-temporal clustering of traffic data such as flows or speeds into network day-types improves both the performance and the robustness of traffic predictions. Since some data types may not be available at a network-wide level, or only for certain periods, this paper investigates how similar such representative day-types are if based on different data types. The similarity of day-type clusters is evaluated with qualitative calendar visualization and two quantitative metrics: the Adjusted Mutual Information (AMI), which considers day-to-cluster assignments, and a newly proposed Centroids Similarity Score (CSS), which compares centroids. The paper also explores the impact on flow and speed prediction performance of substituting one data type for the other in the clustering or classification phases. Using microwave sensor data from the Stockholm motorway network, our findings show that clusterings based on flows and speeds and across a range of clustering methods have reasonably high similarity. CSS is found to be a more relevant similarity indicator than AMI in the prediction application context. By capturing more relevant traffic state information, flow-based clustering and classification are robust for both flow and speed predictions, while speed-based clustering significantly degrades flow prediction performance.
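
For readers unfamiliar with AMI, the sketch below computes it for two day-to-cluster assignments using the standard definition: mutual information corrected by its expected value under random labelling, normalised by the arithmetic mean of the two entropies. The normalisation choice, the function names and the example label vectors are assumptions; the abstract does not state which variant the paper uses, and CSS is not sketched because the abstract does not define it.

```python
from collections import Counter
from math import exp, lgamma, log

def entropy(counts, n):
    """Shannon entropy (natural log) of a partition given its cluster sizes."""
    return -sum(c / n * log(c / n) for c in counts)

def adjusted_mutual_information(labels_a, labels_b):
    """AMI of two labelings of the same days, arithmetic-mean normalisation."""
    n = len(labels_a)
    a = Counter(labels_a)                          # cluster sizes in labeling A
    b = Counter(labels_b)                          # cluster sizes in labeling B
    joint = Counter(zip(labels_a, labels_b))       # contingency table
    mi = sum(nij / n * log(n * nij / (a[i] * b[j]))
             for (i, j), nij in joint.items())
    emi = 0.0                                      # expected MI under random labelling
    for ai in a.values():
        for bj in b.values():
            for nij in range(max(1, ai + bj - n), min(ai, bj) + 1):
                log_prob = (lgamma(ai + 1) + lgamma(bj + 1)
                            + lgamma(n - ai + 1) + lgamma(n - bj + 1)
                            - lgamma(n + 1) - lgamma(nij + 1)
                            - lgamma(ai - nij + 1) - lgamma(bj - nij + 1)
                            - lgamma(n - ai - bj + nij + 1))
                emi += nij / n * log(n * nij / (ai * bj)) * exp(log_prob)
    mean_entropy = (entropy(a.values(), n) + entropy(b.values(), n)) / 2
    if mean_entropy - emi == 0:                    # both labelings are one cluster
        return 1.0
    return (mi - emi) / (mean_entropy - emi)

# Hypothetical day-type assignments from speed-based and flow-based clustering.
speed_types = [0, 0, 1, 1, 2, 2]
flow_types = ["x", "x", "y", "y", "z", "z"]
score = adjusted_mutual_information(speed_types, flow_types)
```

AMI depends only on which days are grouped together, not on the cluster names or on how far apart the centroids are, which is consistent with the abstract's point that a centroid-based score can be more relevant when the clusters are used for prediction.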

Keywords
clustering, pattern recognition, machine-learning, day type, intelligent transportation systems, traffic prediction, short-term prediction, speed-flow relationship
National Category
Transport Systems and Logistics
Research subject
Transport Science; Transport Science, Transport Systems
Identifiers
urn:nbn:se:kth:diva-304731 (URN)
Note

QC 20211116

Available from: 2021-11-10. Created: 2021-11-10. Last updated: 2022-06-25. Bibliographically approved.

Open Access in DiVA

summary (6104 kB), 633 downloads
File information
File name: SUMMARY01.pdf
File size: 6104 kB
Checksum (SHA-512): a72242e72a812239c94163800e7a828c7bed0f1e4e365cb4a065d14d813cfc0ddf2822d9a29deb7131bd093215ff98b5a24c85938582df7f8be197fd4462d19a
Type: fulltext
Mimetype: application/pdf

Authority records

Cebecauer, Matej
