Bibliographic attributes extraction with layer-upon-layer tagging
2007 (English). In: ICDAR 2007: Ninth International Conference On Document Analysis And Recognition, Vols I And II, Proceedings / [ed] Werner, B, 2007, 804-808 p. Conference paper (Refereed)
Bibliographic attribute extraction is an important research topic for digital libraries. In this paper we propose a rule-based method for bibliographic attribute extraction with Layer-upon-Layer Tagging (LLT). The method analyzes the appearance and punctuation of bibliographic attributes to perform format and semantic tagging on two defined parsing layers. The method also resorts to specially constructed lexicons to achieve high accuracy in semantic tagging. In an experimental evaluation on 1,000 reference strings, the accuracy of author tagging reaches 96.8% and the accuracy of whole-reference tagging is 82.9%. The experimental results demonstrate that the proposed LLT method can tag bibliographic attributes in reference strings with a high degree of accuracy.
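The two-layer idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' LLT implementation: the abstract does not give the actual rules, layer definitions, or lexicons, so everything here is an assumption made for illustration. That includes the function names `format_layer`, `semantic_layer`, and `tag_reference`, the tag names, the tiny `VENUE_WORDS` lexicon, and the simplified "Surname Initials" reference style. The first layer tags segments by appearance and punctuation; the second maps those format tags to bibliographic attributes with the help of the lexicon.

```python
import re

# Hypothetical mini-lexicon; the paper's real lexicons are not given in the abstract.
VENUE_WORDS = {"proceedings", "conference", "journal", "transactions"}

# Assumed surface patterns for the format layer.
YEAR_RE = re.compile(r"(19|20)\d{2}")
NAME_RE = re.compile(r"[A-Z][a-z]+ [A-Z]{1,2}")  # e.g. "Smith J"


def format_layer(reference):
    """Layer 1: split on sentence-style periods and tag segments by appearance."""
    segments = re.split(r"\.\s+", reference.strip().rstrip("."))
    tagged = []
    for segment in segments:
        parts = [p.strip() for p in segment.split(",")]
        if YEAR_RE.fullmatch(segment):
            tagged.append((segment, "YEAR"))
        elif all(NAME_RE.fullmatch(p) for p in parts):
            tagged.append((segment, "NAME_LIST"))
        else:
            tagged.append((segment, "TEXT"))
    return tagged


def semantic_layer(tagged):
    """Layer 2: map format tags to attributes, consulting the lexicon for TEXT."""
    result = []
    for segment, tag in tagged:
        if tag == "YEAR":
            label = "year"
        elif tag == "NAME_LIST":
            label = "author"
        elif any(word.lower() in VENUE_WORDS for word in segment.split()):
            label = "venue"
        else:
            label = "title"
        result.append((segment, label))
    return result


def tag_reference(reference):
    """Run the format layer, then the semantic layer on its output."""
    return semantic_layer(format_layer(reference))
```

For a made-up reference string such as `"Smith J, Lee K. Tagging reference strings. Journal of Examples. 2007."`, `tag_reference` labels the four segments as author, title, venue, and year in that order. Real reference strings vary far more in style, which is why a practical rule set and lexicon would need to be much richer than this sketch.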
Place, publisher, year, edition, pages
2007. 804-808 p.
Computer and Information Science
Identifiers
URN: urn:nbn:se:kth:diva-41147
DOI: 10.1109/ICDAR.2007.4377026
ISI: 000252162600161
ScopusID: 2-s2.0-51149094559
ISBN: 978-0-7695-2822-9
OAI: oai:DiVA.org:kth-41147
DiVA: diva2:443460
9th International Conference on Document Analysis and Recognition, ICDAR 2007; Curitiba; 23 September 2007 through 26 September 2007