Amharic-English information retrieval with pseudo relevance feedback
KTH, School of Information and Communication Technology (ICT), Computer and Systems Sciences, DSV.
2007 (English). In: CLEF2007 Working Notes, CEUR-WS, 2007. Conference paper, published paper (refereed).
Abstract [en]

We describe cross-language retrieval experiments using Amharic queries and an English-language document collection, from our participation in the bilingual ad hoc track at CLEF 2007. Two monolingual and eight bilingual runs were submitted. The bilingual experiments varied in their use of long and short queries, the presence of pseudo relevance feedback (PRF), and three approaches to word sense disambiguation (maximal expansion, first-translation-given, manual). We used an Amharic-English machine-readable dictionary (MRD) and an online Amharic-English dictionary for the lookup translation of query terms. In both resources, matching query term bigrams were always given precedence over unigrams. Out-of-dictionary Amharic query terms were taken to be possible named entities in the language and were filtered further by restricted fuzzy matching based on edit distance. The fuzzy matching was performed for each of these terms against automatically extracted English proper names. The Lemur toolkit for language modeling and information retrieval was used for indexing and retrieval. Although the experiments are too limited to draw conclusions from, the results indicate that longer queries tend to perform similarly to short ones, that PRF improves performance considerably, and that queries tend to fare better when we use the first translation given in the MRD rather than maximal expansion, that is, taking all the translations given in the MRD.
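The lookup procedure summarised in the abstract (bigrams before unigrams, then edit-distance matching of out-of-dictionary terms against English proper names) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function names, the toy dictionary entries (`w1`, `w2`, ...), the lowercasing step, the `max_distance` threshold of 2, and the assumption that query terms are already transliterated into Latin script are all hypothetical, since the record does not specify them.

```python
from typing import Dict, List, Optional, Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def fuzzy_match(term: str, names: Sequence[str],
                max_distance: int = 2) -> Optional[str]:
    """Return the closest proper name within max_distance, else None.

    The threshold is an assumed stand-in for the "restricted" matching
    mentioned in the abstract.
    """
    best, best_d = None, max_distance + 1
    for name in names:
        d = edit_distance(term.lower(), name.lower())
        if d < best_d:
            best, best_d = name, d
    return best


def translate_query(tokens: Sequence[str],
                    dictionary: Dict[str, List[str]],
                    proper_names: Sequence[str],
                    max_distance: int = 2) -> List[List[str]]:
    """Translate tokens left to right, preferring bigram entries."""
    out: List[List[str]] = []
    i = 0
    while i < len(tokens):
        bigram = " ".join(tokens[i:i + 2])
        if i + 1 < len(tokens) and bigram in dictionary:
            out.append(dictionary[bigram])   # bigram takes precedence
            i += 2
            continue
        if tokens[i] in dictionary:
            out.append(dictionary[tokens[i]])
        else:
            # Out-of-dictionary term: treat as a possible named entity.
            match = fuzzy_match(tokens[i], proper_names, max_distance)
            out.append([match] if match else [])
        i += 1
    return out


# Toy data: placeholder tokens, not real Amharic dictionary entries.
toy_dict = {"w1 w2": ["phrase"], "w1": ["one", "single"], "w3": ["three"]}
names = ["London", "Paris"]
candidates = translate_query(
    ["w1", "w2", "w3", "londen", "w1", "qqqq"], toy_dict, names)

# Maximal expansion keeps every candidate; first-translation-given keeps one.
first_given = [c[:1] for c in candidates]
```

In this sketch `candidates` corresponds to maximal expansion, and `first_given` to the first-translation-given strategy; the manual disambiguation runs have no automatic counterpart here.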

Place, publisher, year, edition, pages
CEUR-WS, 2007.
Series
CEUR Workshop Proceedings, ISSN 1613-0073 ; 1173
Keywords [en]
Amharic, Cross language information retrieval, Query analysis
National subject category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:kth:diva-164388
Scopus ID: 2-s2.0-84921973057
OAI: oai:DiVA.org:kth-164388
DiVA, id: diva2:805696
Conference
2007 Working Notes for CLEF Workshop, CLEF 2007 - Co-located with the 11th European Conference on Digital Libraries, ECDL 2007, 19 September 2007 through 21 September 2007
Note

QC 20150416

Available from: 2015-04-16. Created: 2015-04-16. Last updated: 2018-01-11. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

By author/editor
Argaw, Atelach Alemu
By organisation
Computer and Systems Sciences, DSV
Computer and Information Sciences