
EMPIRICAL METHODS FOR FINE-GRAINED OPINION EXTRACTION FROM TEXT

Abstract

Opinions are everywhere. The op-ed pages of newspapers, political blogs, and consumer websites like epinions.com are just some examples of the textual opinions available to readers. Many consumers are interested in following these opinions: intelligence analysts who track the opinions of foreign countries, public relations firms who want to ensure positive opinions of their clients, pollsters who want to know the public's opinions about politicians, and companies who want to know customers' opinions about their products. The problem faced by all of these consumers of opinion is that there is too much text for anyone to read it all. Central to processing the opinions in this text is solving two specific problems: identifying expressions of opinion, and identifying their hierarchical structure. We demonstrate solutions involving empirical natural language processing techniques. Although empirical, data-driven methods such as these have become the norm in natural language processing, little work has been done to analyze their impact on the reproducibility, efficiency, and effectiveness of research. We address two specific problems in this area. First, we introduce a lightweight computational workflow system to improve the reproducibility and efficiency of machine learning and natural language processing experiments. Second, we investigate the process of feature generation, setting out desiderata for an ideal process and exploring the effectiveness of several alternatives. Both are investigated in the context of the natural language learning tasks set out earlier.

Bibliographic record

  • Author

    Breck, Eric

  • Author affiliation
  • Year: 2008
  • Total pages
  • Original format: PDF
  • Language of text: en_US
  • Chinese Library Classification
