On developing robust models for favourability analysis: Model choice, feature sets and imbalanced data

Peter C.R. Lane; Daoud Clarke; Paul Hender

Abstract

Locating documents carrying positive or negative favourability is an important application within media analysis. This article presents some empirical results on the challenges facing a machine-learning approach to this kind of opinion mining. Some of the challenges include the often considerable imbalance in the distribution of positive and negative samples, changes in the documents over time, and effective training and evaluation procedures for the models. This article presents results on three data sets generated by a media-analysis company, classifying documents in two ways: detecting the presence of favourability, and assessing negative vs. positive favourability. We describe our experiments in developing a machine-learning approach to automate the classification process. We explore the effect of using five different types of features, the robustness of the models when tested on data taken from a later time period, and the effect of balancing the input data by undersampling. We find varying choices for the optimum classifier, feature set and training strategy depending on the task and data set.
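The abstract mentions balancing the input data by undersampling but does not say which scheme was used. As a minimal sketch, assuming plain random undersampling down to the size of the rarest class, the idea could look like the following. The function name `undersample`, the `seed` parameter, and the toy documents are hypothetical and are not taken from the article.

```python
import random
from collections import defaultdict


def undersample(documents, labels, seed=0):
    """Randomly drop samples so that every class keeps as many
    documents as the rarest class (assumed scheme, not the authors')."""
    rng = random.Random(seed)

    # Group the documents by their class label.
    by_label = defaultdict(list)
    for doc, label in zip(documents, labels):
        by_label[label].append(doc)

    # Every class is cut down to the size of the smallest class.
    target = min(len(docs) for docs in by_label.values())

    balanced = []
    for label, docs in by_label.items():
        for doc in rng.sample(docs, target):
            balanced.append((doc, label))

    # Shuffle so the classes are not grouped together in the output.
    rng.shuffle(balanced)
    return balanced


# Toy data: 8 positive documents and 2 negative ones.
docs = [f"doc{i}" for i in range(10)]
labels = ["positive"] * 8 + ["negative"] * 2
balanced = undersample(docs, labels)
```

Only the training split would normally be balanced this way; a held-out set, such as documents from a later time period, would keep its natural class distribution so that the evaluation reflects real conditions.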


Bibliographic details

Source
Decision Support Systems | 2012, Issue 4 | pp. 712-718 | 7 pages
Authors
Peter C.R. Lane; Daoud Clarke; Paul Hender
Affiliations

School of Computer Science, University of Hertfordshire, College Lane, Hatfield AL10 9AB, Hertfordshire, UK;

School of Computer Science, University of Hertfordshire, College Lane, Hatfield AL10 9AB, Hertfordshire, UK; Metrica, Banner Street, London EC1V 9BJ, UK;

Metrica, Banner Street, London EC1V 9BJ, UK

Indexed in: Science Citation Index (SCI); Engineering Index (EI)
Original format: PDF
Language: English
Keywords
Bayesian models; favourability analysis; imbalanced data; machine learning; sentiment analysis; support-vector machines

