Defining Differentiable Neighborhoods in Stockholm by Clustering Apartments with Machine Learning
KTH, School of Electrical Engineering and Computer Science (EECS).
KTH, School of Electrical Engineering and Computer Science (EECS).
KTH, School of Electrical Engineering and Computer Science (EECS).
2018 (English). Independent thesis, Basic level (degree of Bachelor), 10 credits / 15 HE credits. Student thesis.
Abstract [en]

Despite the rise of digital platforms for the real estate market in Sweden, and their record of transaction data, the available data is still not properly utilized or presented. The traditional geographical city areas are usually too large and too varied to support accurate analysis. This report explores the possibility of dividing central Stockholm's predefined city areas into smaller submarkets using data-driven methods. The smaller submarkets would provide a more homogeneous description of their respective areas and serve as a better basis for valuation estimators. The submarkets are created through clustering, a subfield of machine learning. Different clustering algorithms are tried in order to test how well they fit the model. Results are evaluated by analyzing the variance of attributes within and between the clusters, with the aim of low variance within clusters and high variance between them. The results are also compared to the predefined city areas, in order to ascertain the improvement achieved with the data-driven model. The data output is presented graphically in Google Maps for visual evaluation, while also allowing ease of use for potential commercial customers. The result was an interactive map with differentiated and mostly non-overlapping clusters. The best clustering algorithm was hierarchical clustering, which lowered the internal variance by 33% and increased the external variance by 171% compared to the predefined city areas. A potential future use of properly delineated submarkets could include higher-precision valuation estimators or more relevant apartment recommendations for a company such as Booli.
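
The evaluation idea described in the abstract (low variance within groups, high variance between groups) can be illustrated with a minimal sketch. This is not the thesis's implementation: the function name `within_between_variance`, the size-weighted variance decomposition, and all numbers below are assumptions chosen for illustration, and the cluster labels are supplied by hand rather than produced by a clustering algorithm.

```python
from statistics import mean


def within_between_variance(values, labels):
    """Split the variance of one attribute into within-group and between-group parts.

    within  = mean squared deviation of each value from its own group's mean
    between = size-weighted mean squared deviation of group means from the overall mean
    (Hypothetical helper; the thesis may define its variance measures differently.)
    """
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)

    n = len(values)
    overall = mean(values)
    within = sum(
        sum((v - mean(g)) ** 2 for v in g) for g in groups.values()
    ) / n
    between = sum(
        len(g) * (mean(g) - overall) ** 2 for g in groups.values()
    ) / n
    return within, between


# Made-up price per square metre (thousand SEK) for eight apartments.
prices = [80, 82, 100, 102, 60, 62, 90, 92]
# Two coarse groups standing in for predefined city areas.
areas = ["A", "A", "A", "A", "B", "B", "B", "B"]
# Four finer groups standing in for data-driven submarkets.
clusters = [0, 0, 1, 1, 2, 2, 3, 3]

area_within, area_between = within_between_variance(prices, areas)
cluster_within, cluster_between = within_between_variance(prices, clusters)

print(f"areas:    within={area_within:.2f}, between={area_between:.2f}")
print(f"clusters: within={cluster_within:.2f}, between={cluster_between:.2f}")
```

In this toy data the finer grouping has lower within-group and higher between-group variance than the coarse grouping, which is the direction of improvement the abstract reports for hierarchical clustering against the predefined city areas.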

Abstract [sv] (translated to English)

Despite the growing use of digital platforms for the housing market in Sweden and the amount of transaction data available, few actors make full use of it. The traditional city areas are often too large and too varied to support precise analysis. This report explores the possibility of dividing central Stockholm's predefined city areas into smaller submarkets using data-driven methods. These submarkets would give a more homogeneous description of their respective areas and serve as a better starting point for price valuation.

The submarkets are created through clustering, a branch of machine learning. Different clustering algorithms are implemented to test their explanatory value.

The results were evaluated by analysing the variance of the attributes within and between the clusters, given that the variance should be low within clusters and high between them. The results were also compared with the predefined city areas to confirm the improvement achieved by the data-driven model.

The data output is presented graphically in Google Maps for visual evaluation, while also allowing easy use by potential end consumers. The result is an interactive map with differentiated and mostly non-overlapping clusters. This report found that the best clustering model was hierarchical clustering, which had 33% lower variance within the clusters and 171% higher variance between the clusters compared with the predefined city areas. A potential future use of clustered submarkets could be more precise price valuations or more relevant housing recommendations for companies such as Booli.

Place, publisher, year, edition, pages
2018.
Series
TRITA-EECS-EX ; 2018:427
Keywords [en]
Clustering, Machine Learning, Housing market, Submarkets, Data visualization, Digital real estate platforms
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:kth:diva-240984; OAI: oai:DiVA.org:kth-240984; DiVA, id: diva2:1275710
Available from: 2019-01-09; Created: 2019-01-07; Last updated: 2019-01-09. Bibliographically approved.

Open Access in DiVA

fulltext (1621 kB), 61 downloads
File information
File name: FULLTEXT01.pdf; File size: 1621 kB; Checksum (SHA-512):
c3a7c4cd8efe897c4d42fd2b3dbc9079d755c11cf95b72cef9b6c6f479114607f49e5384cd56c43f1fe9f7ce30c221555317fd9ae06f44af35d5a3a060d193c8
Type: fulltext; Mimetype: application/pdf