Multi Document Summarization.
KTH, School of Computer Science and Communication (CSC).
2011 (English). Independent thesis, Advanced level (professional degree), 20 credits / 30 HE credits. Student thesis.
Abstract [en]

This thesis describes the development and deployment of a search-based, automatic multi-document summarization system, and shows that technologies commonly used in search engines are applicable to automatic summarization. The summarizer first indexes a set of documents and extracts sentences from them. It can then summarize document collections that are constructed dynamically from the documents in the index. The system is implemented as an addition to the Solr search server, which provides an interface to the Lucene index; the summarizer uses Lucene to find the sentences from which it builds summaries. Because the summarizer uses methods similar to those of search engines, it benefits from the same speed that modern search engines provide.
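
The pipeline in the abstract (index documents, extract sentences, then retrieve sentences for a dynamically chosen document collection) can be illustrated with a minimal sketch. This is not the thesis implementation: it replaces Solr and Lucene with a toy in-memory inverted index, and the class `SentenceIndex`, the function `summarize`, the TF-IDF scoring, and the "most frequent terms" query are all assumptions made for illustration only.

```python
import math
import re
from collections import Counter, defaultdict


def split_sentences(text):
    # Naive splitter (assumption): break after '.', '!' or '?' plus whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    return re.findall(r"[a-z0-9]+", sentence.lower())


class SentenceIndex:
    """Toy inverted index over sentences; a stand-in for a Lucene index."""

    def __init__(self):
        self.sentences = []                # sentence id -> (doc id, text)
        self.postings = defaultdict(dict)  # term -> {sentence id: term frequency}

    def add_document(self, doc_id, text):
        # Index every sentence of the document as its own searchable unit.
        for sentence in split_sentences(text):
            sid = len(self.sentences)
            self.sentences.append((doc_id, sentence))
            for term, tf in Counter(tokenize(sentence)).items():
                self.postings[term][sid] = tf

    def search(self, query_terms, doc_ids, k):
        """Return the k best TF-IDF-scored sentences from the given documents."""
        n = len(self.sentences)
        scores = Counter()
        for term in set(query_terms):
            hits = self.postings.get(term, {})
            if not hits:
                continue
            idf = math.log(1 + n / len(hits))
            for sid, tf in hits.items():
                if self.sentences[sid][0] in doc_ids:
                    scores[sid] += tf * idf
        # Highest score first; ties broken by original sentence order.
        ranked = sorted(scores, key=lambda sid: (-scores[sid], sid))
        return [self.sentences[sid][1] for sid in ranked[:k]]


def summarize(index, doc_ids, k=2, n_terms=5):
    # Hypothetical query construction: the most frequent longer terms
    # in the selected document collection.
    counts = Counter()
    for doc_id, sentence in index.sentences:
        if doc_id in doc_ids:
            counts.update(t for t in tokenize(sentence) if len(t) > 3)
    query = [term for term, _ in counts.most_common(n_terms)]
    return index.search(query, doc_ids, k)
```

Because the collection to summarize is just a set of document ids passed at query time, the same index can serve any subset of documents without re-indexing, which mirrors the "dynamically constructed" collections described above.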

Abstract [sv]

This report describes the development of a search-based system for automatic summarization of multiple documents, and shows that methods commonly used in search engines are useful for automatic summarization. The summarizer works by first indexing a number of documents and extracting all sentences from them. It can then summarize document collections that have been constructed from documents in the index. This is done via the Solr search server, which provides an interface to the Lucene index; the summarizer uses Lucene to find sentences that can be used to create summaries. The summarizer uses methods similar to those in modern search engines, and these methods are as fast when summarizing as they are when searching.

Place, publisher, year, edition, pages
2011.
Series
Trita-CSC-E, ISSN 1653-5715 ; 2011:090
National Category
Computer Science
Identifiers
URN: urn:nbn:se:kth:diva-130712
OAI: oai:DiVA.org:kth-130712
DiVA: diva2:654159
Educational program
Master of Science in Engineering - Computer Science and Technology
Uppsok
Technology
Supervisors
Examiners
Available from: 2013-10-07 Created: 2013-10-07

Open Access in DiVA

No full text

Other links

http://www.nada.kth.se/utbildning/grukth/exjobb/rapportlistor/2011/rapporter11/hagerstrand_anton_11090.pdf
