Indexing strategies for Swedish full text retrieval under different user scenarios
University College of Borås. ORCID iD: 0000-0003-0229-3073
2007 (English). In: Information Processing & Management, ISSN 0306-4573, E-ISSN 1873-5371, Vol. 43, no. 1, pp. 81-102. Article in journal (Refereed), Published
Abstract [en]

This paper deals with Swedish full text retrieval and the problem of morphological variation of query terms in the document database. The effects of combination of indexing strategies with query terms on retrieval effectiveness were studied. Three of five tested combinations involved indexing strategies that used conflation, in the form of normalization. Further, two of these three combinations used indexing strategies that employed compound splitting. Normalization and compound splitting were performed by SWETWOL, a morphological analyzer for the Swedish language. A fourth combination attempted to group related terms by right hand truncation of query terms. The four combinations were compared to each other and to a baseline combination, where no attempt was made to counteract the problem of morphological variation of query terms in the document database. The five combinations were evaluated under six different user scenarios, where each scenario simulated a certain user type. The four alternative combinations outperformed the baseline, for each user scenario. The truncation combination had the best performance under each user scenario. The main conclusion of the paper is that normalization and right hand truncation (performed by a search expert) enhanced retrieval effectiveness in comparison to the baseline. The performance of the three combinations of indexing strategies with query terms based on normalization was not far below the performance of the truncation combination. (c) 2006 Elsevier Ltd. All rights reserved.
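
The difference between the baseline, right hand truncation, and normalization can be illustrated with a toy sketch. This is not the paper's experimental setup: the word list, the `LEMMAS` lookup table (a hypothetical stand-in for a morphological analyzer such as SWETWOL), and all function names are assumptions made only for illustration.

```python
# Toy index vocabulary: inflected forms of "bil" (car), plus "bild" (picture)
# and forms of "bok" (book). Purely illustrative data.
INDEX_TERMS = {"bil", "bilar", "bilen", "bilarna", "bild", "bok", "böcker"}

# Hypothetical lemma table standing in for a real morphological analyzer.
LEMMAS = {
    "bilar": "bil", "bilen": "bil", "bilarna": "bil",
    "böcker": "bok",
}


def baseline_match(query_term, index_terms):
    """Baseline: only the exact surface form of the query term matches."""
    return sorted(t for t in index_terms if t == query_term)


def truncation_match(stem, index_terms):
    """Right hand truncation: every index term starting with the stem matches."""
    return sorted(t for t in index_terms if t.startswith(stem))


def normalize(term):
    """Map a word form to its base form; unknown forms are left unchanged."""
    return LEMMAS.get(term, term)


def normalized_match(query_term, index_terms):
    """Normalization: compare base forms of the query term and index terms."""
    target = normalize(query_term)
    return sorted(t for t in index_terms if normalize(t) == target)


print(baseline_match("bilar", INDEX_TERMS))    # ['bilar']
print(truncation_match("bil", INDEX_TERMS))    # ['bil', 'bilar', 'bilarna', 'bild', 'bilen']
print(normalized_match("bilar", INDEX_TERMS))  # ['bil', 'bilar', 'bilarna', 'bilen']
```

In this toy data the truncated query `bil` also retrieves the unrelated word "bild", which hints at why the choice of truncation point may call for some search expertise, while the normalized match groups only the inflected forms listed in the lookup table.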

Place, publisher, year, edition, pages
2007. Vol. 43, no 1, 81-102 p.
National Category
Information Studies
Identifiers
URN: urn:nbn:se:kth:diva-171391
DOI: 10.1016/j.ipm.2006.03.003
ISI: 000241539500007
OAI: oai:DiVA.org:kth-171391
DiVA: diva2:843566
Note

NR 20150817

Available from: 2015-07-29. Created: 2015-07-29. Last updated: 2017-12-04. Bibliographically approved.

Authority records BETA

Ahlgren, Per
