Multi Document Summarization.
KTH, School of Computer Science and Communication (CSC).
2011 (English). Independent thesis, Advanced level (professional degree), 20 credits / 30 HE credits. Student thesis.
Abstract [en]

This thesis describes the development and deployment of a search-based, automatic, multi-document summarization system, and shows that technologies regularly used in search engines are applicable to the task of automatic summarization. The summarizer first indexes a set of documents and extracts the sentences from them. It can then summarize document collections that are constructed dynamically from the documents in the index. The summarizer is implemented as an addition to the Solr search server, which provides an interface to the Lucene index; it uses Lucene to find sentences from which it can create summaries. Because the summarizer uses methods similar to those of search engines, it enjoys the same speed benefits that modern search engines provide.
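The index-then-retrieve idea in the abstract can be illustrated with a toy sketch. This is not the thesis's implementation: a small in-memory class stands in for the Solr/Lucene index, and the TF-IDF-style scoring, the frequent-term query construction, and all names (`SentenceIndex`, `summarize`, `STOPWORDS`) are assumptions made purely for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"a", "an", "the", "is", "are", "was", "of", "and", "to", "in"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase alphanumeric tokens with stopwords removed."""
    return [t for t in re.findall(r"[a-z0-9]+", sentence.lower())
            if t not in STOPWORDS]


class SentenceIndex:
    """Toy in-memory stand-in for a search index over sentences."""

    def __init__(self):
        self.sentences = []        # (doc_id, sentence text), in insertion order
        self.term_counts = []      # per-sentence Counter of terms
        self.sent_freq = Counter() # number of sentences containing each term

    def add_document(self, doc_id, text):
        """Index step: extract sentences and record their term statistics."""
        for sentence in split_sentences(text):
            counts = Counter(tokenize(sentence))
            if not counts:
                continue
            self.sentences.append((doc_id, sentence))
            self.term_counts.append(counts)
            self.sent_freq.update(counts.keys())

    def search(self, query_terms, doc_ids, limit):
        """Rank sentences of the selected documents by a simple TF-IDF score."""
        n = len(self.sentences)
        scored = []
        for i, (doc_id, _) in enumerate(self.sentences):
            if doc_id not in doc_ids:
                continue
            counts = self.term_counts[i]
            score = sum(counts[t] * math.log(1 + n / self.sent_freq[t])
                        for t in query_terms if t in counts)
            if score > 0:
                scored.append((score, i))
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return [i for _, i in scored[:limit]]


def summarize(index, doc_ids, num_terms=5, num_sentences=2):
    """Summarize a dynamically chosen collection of indexed documents."""
    doc_ids = set(doc_ids)
    totals = Counter()
    for i, (doc_id, _) in enumerate(index.sentences):
        if doc_id in doc_ids:
            totals.update(index.term_counts[i])
    # Use the collection's most frequent terms as the search query.
    query = [term for term, _ in totals.most_common(num_terms)]
    hits = index.search(query, doc_ids, num_sentences)
    # Present the selected sentences in their original order.
    return [index.sentences[i][1] for i in sorted(hits)]
```

Documents are indexed once with `add_document`; `summarize` can then be called repeatedly with different sets of document identifiers, so each call costs only a query against the existing index.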

Abstract [sv]

This report describes the development of a search-based system for automatic summarization of multiple documents, and shows that methods commonly used in search engines are useful for automatic summarization. The summarizer works by first indexing a number of documents and extracting all sentences from them. The summarizer can then summarize document collections constructed from documents in the index. This is done through the Solr search server, which provides an interface to the Lucene index; the summarizer uses Lucene to find sentences that can be used to create summaries. The summarizer uses methods similar to those in modern search engines, and these methods are as fast when applied to summarization as they are in search.

Place, publisher, year, edition, pages
Trita-CSC-E, ISSN 1653-5715 ; 2011:090
National Category
Computer Science
URN: urn:nbn:se:kth:diva-130712
OAI: diva2:654159
Educational program
Master of Science in Engineering - Computer Science and Technology
Available from: 2013-10-07 Created: 2013-10-07

Open Access in DiVA

No full text
