Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
User-written movie reviews carry a substantial number of movie-related
features, such as descriptions of location, time period, genres, and
characters. Using natural language processing and topic modeling
techniques, it is possible to extract features from movie reviews and find
movies with similar features. In this thesis, a feature extraction method
is presented and the use of the extracted features in finding similar
movies is investigated. We first pre-process the text of a collection of
movie reviews. We then extract topics from the collection using topic
modeling techniques and store the topic distribution for each movie.
Distance measures such as the Hellinger distance are then used to find
movies with similar topic distributions. Furthermore, the extracted topics are
used as an explanation during subjective evaluation. Experimental results
show that our extracted topics represent useful movie features and
that they can be used to find similar movies efficiently.
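
As an illustration of the similarity step, the following is a minimal sketch of ranking movies by the Hellinger distance between their topic distributions. It is not the thesis implementation: the function names, the movie titles, and the three-topic distributions are hypothetical and chosen only to make the example self-contained. For discrete distributions p and q, the Hellinger distance is the square root of half the sum of squared differences between the square roots of their entries, so it ranges from 0 (identical) to 1 (no overlap).

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete distributions of equal length."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of topics")
    squared_diff = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared_diff / 2.0)


def most_similar(query, topic_dists, k=2):
    """Return the k movies closest to `query`, as (title, distance) pairs."""
    scores = [
        (title, hellinger(topic_dists[query], dist))
        for title, dist in topic_dists.items()
        if title != query
    ]
    scores.sort(key=lambda pair: pair[1])  # smaller distance = more similar
    return scores[:k]


# Hypothetical per-movie topic distributions (each sums to 1).
topic_dists = {
    "movie_a": [0.7, 0.2, 0.1],
    "movie_b": [0.6, 0.3, 0.1],
    "movie_c": [0.1, 0.1, 0.8],
}

for title, distance in most_similar("movie_a", topic_dists):
    print(f"{title}: {distance:.3f}")
```

In this sketch, movies whose reviews concentrate on the same topics end up with a small distance, and the topics with the largest shared weight could serve as the explanation shown to an evaluator.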
2014. 53 p.