Automatic Prediction of Text Aesthetics and Interestingness

Abstract

This paper investigates the problem of automated text aesthetics prediction. The availability of user-generated content and ratings, for example on Flickr, has stimulated research in aesthetics prediction for non-text domains, particularly for photographic images. This problem, however, has not yet been explored for the text domain. Because text aesthetics is highly subjective, it is difficult to compile human-annotated data with a fair degree of inter-annotator agreement through methods such as crowdsourcing. The availability of the Kindle "popular highlights" data has motivated us to compile a dataset of human-annotated, aesthetically pleasing and interesting text passages. We then take a supervised classification approach to predicting text aesthetics, constructing a real-valued feature vector from each text passage. In particular, the features that we use for this classification task are word length, repetitions, polarity, part-of-speech, semantic distances, and topic generality and diversity. A traditional binary classification approach is not effective in this case because the non-highlighted passages surrounding the highlighted ones do not necessarily represent the other extreme of unpleasant, low-quality text. Because real negative class samples are absent, we employ the mapping convergence (MC) algorithm, in which training can be initiated with instances from the positive class only. On each successive iteration, the algorithm selects new strong negative samples from the unlabeled class and retrains itself. The results show that the MC algorithm with a Gaussian kernel for the mapping phase and a linear kernel for the convergence phase yields the best results, achieving satisfactory accuracy, precision and recall values of about 74%, 42% and 54%, respectively.
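
The positive-and-unlabeled training loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. To stay dependency-free it swaps in simple stand-ins: distance from the positive centroid replaces the Gaussian-kernel classifier of the mapping phase, and a nearest-centroid rule replaces the linear-kernel classifier of the convergence phase. The function names (`mapping_stage`, `convergence_stage`), the `quantile` parameter, and the toy 2-D feature vectors are all hypothetical.

```python
import math

def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def distance(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mapping_stage(positives, unlabeled, quantile=0.8):
    """Mapping phase (stand-in): treat the unlabeled points farthest from
    the positive centroid as 'strong negatives'. Assumes unlabeled is non-empty."""
    c_pos = centroid(positives)
    ranked = sorted(unlabeled, key=lambda p: distance(p, c_pos))
    cut = int(len(ranked) * quantile)
    return ranked[cut:], ranked[:cut]  # (strong negatives, remaining unlabeled)

def convergence_stage(positives, negatives, remaining, max_iter=20):
    """Convergence phase (stand-in): retrain a nearest-centroid classifier,
    move unlabeled points it predicts as negative into the negative set,
    and stop when an iteration adds no new negatives."""
    negatives, remaining = list(negatives), list(remaining)
    c_pos = centroid(positives)
    for _ in range(max_iter):
        c_neg = centroid(negatives)
        new_neg = [p for p in remaining if distance(p, c_neg) < distance(p, c_pos)]
        if not new_neg:
            break
        remaining = [p for p in remaining if distance(p, c_neg) >= distance(p, c_pos)]
        negatives.extend(new_neg)
    c_neg = centroid(negatives)

    def classify(p):
        """True if p is closer to the positive centroid than to the negative one."""
        return distance(p, c_pos) <= distance(p, c_neg)

    return classify, negatives, remaining

# Toy data (hypothetical): positives cluster near (1, 1); the unlabeled pool
# mixes similar points with points near (5, 5).
positives = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.2]]
unlabeled = [[1.0, 0.9], [0.9, 1.2], [3.0, 3.0], [4.8, 5.1], [5.2, 4.9],
             [5.0, 5.0], [4.5, 4.7], [1.3, 1.0], [4.0, 4.2], [5.5, 5.3]]

strong_neg, rest = mapping_stage(positives, unlabeled)
classify, negatives, remaining = convergence_stage(positives, strong_neg, rest)
print(len(negatives), len(remaining), classify([1.1, 1.0]), classify([5.0, 5.0]))
```

In this sketch the negative set grows over successive iterations as the decision boundary moves toward the positive class, which is the behavior the abstract attributes to the MC algorithm. A closer reproduction would presumably replace the two stand-ins with kernel classifiers from a machine learning library.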

Bibliographic details

Source: International Conference on Computational Linguistics | 2014 | pp. 905-916 | 12 pages
Authors: Debasis Ganguly; Johannes Leveling; Gareth J.F. Jones
Original format: PDF
