The phenomenon of big data makes managing, processing, and extracting valuable information from the Web an increasingly challenging task. In particular, the abundance of user-generated content with opinions about products or brands calls for appropriate tools to capture consumer sentiment. Such tools can aggregate content by means of sentiment summarization techniques, which extract text segments that reflect the overall sentiment of a text in a compressed form. We explore which features distinguish relevant from irrelevant text segments in terms of the extent to which they reflect the overall sentiment of conversational documents. In our empirical study on a collection of Dutch conversational documents, we find that segments with opinions, segments with arguments supporting these opinions, segments discussing aspects of the subject of a text, and relatively long sentences are key indicators of text segments that summarize the sentiment conveyed by a text as a whole.