Semi-Supervised Multiple Disambiguation
KTH, School of Information and Communication Technology (ICT), Software and Computer systems, SCS. SICS Sweden. ORCID iD: 0000-0003-1007-8533
KTH, School of Information and Communication Technology (ICT), Software and Computer systems, SCS.
KTH, School of Information and Communication Technology (ICT), Software and Computer systems, SCS. ORCID iD: 0000-0003-4516-7317
2015 (English). In: IEEE Computer Society Conference Publishing Services / [ed] IEEE, IEEE, 2015. Conference paper, Published paper (Refereed)
Abstract [en]

Determining the true entity behind an ambiguous word is an NP-hard problem known as disambiguation. Previous solutions often disambiguate a single ambiguous mention across multiple documents. They assume that each document contains only a single ambiguous word and a rich set of unambiguous context words. However, we now require fast disambiguation of short texts (such as news feeds, reviews, or Tweets) with few context words and multiple ambiguous words. In this research we focus on Multiple Disambiguation (MD), in contrast to Single Disambiguation (SD). Our solution is inspired by a recent algorithm developed for SD. That algorithm categorizes documents by first transforming them into a graph and then clustering the graph based on its topological structure. We changed the algorithm's graph-based document modeling to account for MD. We also added a new parameter that controls the resolution of the clustering. Then, we used a supervised sampling approach to merge clusters when appropriate. Compared with the original model, our algorithm achieved 10% higher quality in terms of F1-score while sampling only 4% of the dataset.
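
The pipeline outlined in the abstract (model documents as a graph, cluster it, then merge clusters using a small labeled sample) can be illustrated with a toy sketch. This is not the authors' algorithm: it assumes a much simpler stand-in in which two mentions of the same ambiguous word are linked when their documents share at least `min_shared` context words, clusters are connected components, and `min_shared` plays a loosely analogous role to a resolution parameter. All names (`DisjointSet`, `cluster_mentions`, `merge_with_sample`, `groups`) and the example documents are hypothetical.

```python
from itertools import combinations


class DisjointSet:
    """Minimal union-find used to track clusters of mentions."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        self.parent[self.find(a)] = self.find(b)


def cluster_mentions(docs, ambiguous, min_shared=1):
    """Cluster mentions (doc_id, word) of ambiguous words.

    Assumption: two mentions of the same word are linked when their
    documents share at least `min_shared` unambiguous context words.
    """
    ds = DisjointSet()
    context_of = {}
    for doc_id, words in enumerate(docs):
        context = set(words) - ambiguous
        # A document may contain several ambiguous words (the MD setting).
        for word in set(words) & ambiguous:
            context_of[(doc_id, word)] = context
            ds.find((doc_id, word))  # register the mention
    for a, b in combinations(context_of, 2):
        if a[1] == b[1] and len(context_of[a] & context_of[b]) >= min_shared:
            ds.union(a, b)
    return ds


def merge_with_sample(ds, labeled):
    """Merge clusters whose sampled mentions carry the same entity label."""
    first_seen = {}
    for mention, entity in labeled.items():
        key = (mention[1], entity)
        if key in first_seen:
            ds.union(first_seen[key], mention)
        else:
            first_seen[key] = mention


def groups(ds):
    """Return the clusters as a sorted list of sorted mention lists."""
    out = {}
    for mention in list(ds.parent):
        out.setdefault(ds.find(mention), set()).add(mention)
    return sorted(sorted(g) for g in out.values())


docs = [
    ["apple", "iphone", "launch"],
    ["apple", "iphone", "jaguar", "car"],  # two ambiguous words in one text
    ["jaguar", "car", "engine"],
    ["jaguar", "zoo", "cat"],
    ["apple", "pie", "recipe"],
    ["apple", "orchard", "fruit"],
]
ds = cluster_mentions(docs, {"apple", "jaguar"})
print(groups(ds))
# A small labeled sample joins the two "fruit" clusters.
merge_with_sample(ds, {(4, "apple"): "fruit", (5, "apple"): "fruit"})
print(groups(ds))
```

In this toy setting the unsupervised step leaves the two fruit-related mentions of "apple" in separate clusters because their short texts share no context words; the labeled sample is what brings them together, which is the intuition behind the sampling-based merge step.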

Place, publisher, year, edition, pages
IEEE, 2015.
National Category
Computer Science
Identifiers
URN: urn:nbn:se:kth:diva-184878
DOI: 10.1109/Trustcom.2015.566
OAI: oai:DiVA.org:kth-184878
DiVA: diva2:917395
Conference
The 9th IEEE International Conference on Big Data Science and Engineering (IEEE BigDataSE-15)
Note

QC 20160407

Available from: 2016-04-06 Created: 2016-04-06 Last updated: 2016-05-30 Bibliographically approved

Open Access in DiVA

fulltext (1233 kB), 40 downloads
File information
File name: FULLTEXT01.pdf
File size: 1233 kB
Checksum (SHA-512): c5cba9aac7dc8e89c0df13d383ddef4b4da8b5c677a98690f9df0b7f1c9674f3d3c8fab0e9bb852ab5420ab2a4da46bc4385c56f5292f7e98982e855f361e051
Type: fulltext
Mimetype: application/pdf

Authors
Ghoorchian, Kambiz; Rahimian, Fatemeh; Girdzijauskas, Sarunas