Texts rating products and services of all kinds are omnipresent on the internet. They come in various languages and often in such large numbers that getting an overview of all reviews is very time-consuming. The goal of this work is to facilitate the summarization of opinions written in multiple languages, exemplified on a corpus of English and Finnish reviews. To this end, we propose a framework that extracts aspect terms from reviews and groups them into multilingual topic clusters. For aspect extraction, we process the texts of each language separately. We evaluate three methods, all based on neural networks: a supervised method, an unsupervised method based on an attention mechanism, and a rule-based hybrid method. We then group the extracted aspect terms into multilingual clusters. Here we evaluate three clustering methods and compare two strategies: creating clusters directly from multilingual word embeddings, and first creating monolingual clusters for each language separately and then merging them. We report results from a variety of experiments and observe the best results when clustering aspect terms extracted by the supervised method, using the k-means algorithm on multilingual embeddings.
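
To make the best-performing configuration more concrete, the following is a minimal sketch of k-means applied to aspect terms that share one multilingual embedding space. It is not the implementation used in this work. The three-dimensional vectors are made-up toy values standing in for real multilingual word embeddings, the function names are hypothetical, and the deterministic farthest-point initialization is a simplification chosen so that the example is reproducible.

```python
import math


def farthest_point_init(points, k):
    """Pick k initial centroids deterministically (a simplification, assumed here):
    start with the first point, then repeatedly take the point farthest from
    all centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(nxt)
    return [list(c) for c in centroids]


def kmeans(points, k, max_iters=100):
    """Plain Lloyd-style k-means; returns one cluster index per input point."""
    centroids = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:  # assignments stable, so we have converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels


# Toy aspect terms in English and Finnish with made-up embedding vectors.
terms = ["battery", "akku", "screen", "näyttö", "price", "hinta"]
vectors = [
    [0.9, 0.1, 0.0],  # battery (en)
    [0.8, 0.2, 0.1],  # akku (fi)
    [0.1, 0.9, 0.0],  # screen (en)
    [0.2, 0.8, 0.1],  # näyttö (fi)
    [0.0, 0.1, 0.9],  # price (en)
    [0.1, 0.0, 0.8],  # hinta (fi)
]

labels = kmeans(vectors, k=3)

# Group the terms by cluster index to obtain multilingual topic clusters.
clusters = {}
for term, label in zip(terms, labels):
    clusters.setdefault(label, []).append(term)
```

Because both languages live in the same vector space, translation pairs that lie close together end up in the same cluster without any explicit merging step, which is the idea behind clustering directly on multilingual embeddings.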