Automatic Bilingual Lexicon Acquisition Using Random Indexing of Parallel Corpora
Swedish Institute of Computer Science, Sweden. ORCID iD: 0000-0003-4042-4919
2005 (English) In: Natural Language Engineering, ISSN 1351-3249, E-ISSN 1469-8110, Vol. 11, no 3, 327-341 p. Article in journal (Refereed) Published
Abstract [en]

This paper presents a very simple and effective approach to using parallel corpora for automatic bilingual lexicon acquisition. The approach, which uses the Random Indexing vector space methodology, is based on finding correlations between terms based on their distributional characteristics. The approach requires a minimum of preprocessing and linguistic knowledge, and is efficient, fast and scalable. In this paper, we explain how our approach differs from traditional cooccurrence-based word alignment algorithms, and we demonstrate how to extract bilingual lexica using the Random Indexing approach applied to aligned parallel data. The acquired lexica are evaluated by comparing them to manually compiled gold standards, and we report overlap of around 60%. We also discuss methodological problems with evaluating lexical resources of this kind.
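
The general idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each aligned sentence pair is given one sparse random index vector, and that every word on either side accumulates the index vectors of the pairs it occurs in. The function names, the dimensionality, the number of non-zero entries, and the toy English-Swedish data are all hypothetical choices made for this example.

```python
import math
import random
from collections import defaultdict


def index_vector(dim, nonzeros, rng):
    """Sparse ternary random vector: a few +1/-1 entries, zeros elsewhere."""
    vec = [0] * dim
    for i, pos in enumerate(rng.sample(range(dim), nonzeros)):
        vec[pos] = 1 if i % 2 == 0 else -1
    return vec


def build_context_vectors(aligned_pairs, dim=512, nonzeros=8, seed=0):
    """Accumulate, per word, the index vectors of the aligned pairs it occurs in.

    dim, nonzeros and seed are illustrative values, not taken from the paper.
    """
    rng = random.Random(seed)
    src_vecs = defaultdict(lambda: [0] * dim)
    tgt_vecs = defaultdict(lambda: [0] * dim)
    for src_sentence, tgt_sentence in aligned_pairs:
        idx = index_vector(dim, nonzeros, rng)  # one random label per aligned pair
        for vectors, sentence in ((src_vecs, src_sentence), (tgt_vecs, tgt_sentence)):
            for word in sentence.lower().split():
                vec = vectors[word]
                for i, value in enumerate(idx):
                    if value:
                        vec[i] += value
    return dict(src_vecs), dict(tgt_vecs)


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def translate(word, src_vecs, tgt_vecs, top_n=1):
    """Rank target-language words by similarity to the source word's context vector."""
    query = src_vecs[word]
    ranked = sorted(tgt_vecs, key=lambda t: cosine(query, tgt_vecs[t]), reverse=True)
    return ranked[:top_n]


# Toy aligned data (hypothetical); real parallel corpora are far larger and noisier.
pairs = [
    ("the dog sleeps", "hunden sover"),
    ("the cat sleeps", "katten sover"),
    ("the dog runs", "hunden springer"),
    ("the cat runs", "katten springer"),
]
src, tgt = build_context_vectors(pairs)
print(translate("dog", src, tgt))
```

In this toy setting, words that occur in exactly the same aligned pairs end up with identical context vectors, so they rank first as translation candidates. Because the vectors have a fixed dimensionality, adding more aligned data only updates existing vectors rather than enlarging a cooccurrence matrix.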

Place, publisher, year, edition, pages
2005. Vol. 11, no 3, 327-341 p.
Keyword [en]
Algorithms, Computational linguistics, Correlation methods, Mathematical models, Matrix algebra, Probability, Text processing, Vectors
National Category
Language Technology (Computational Linguistics)
Identifiers
URN: urn:nbn:se:kth:diva-116278
DOI: 10.1017/S1351324905003876
Scopus ID: 2-s2.0-25844468192
OAI: oai:DiVA.org:kth-116278
DiVA: diva2:775531
Note

QC 20150211

Available from: 2015-01-03 Created: 2013-01-16 Last updated: 2017-12-05 Bibliographically approved

Open Access in DiVA

No full text

Authority records

Karlgren, Jussi
