Journal: Procedia Computer Science

A New Method to Identify Short-Text Authors Using Combinations of Machine Learning and Natural Language Processing Techniques



Abstract

Identifying authors by their style of writing is a very challenging task. This problem has several applications, one of which is to identify fake online reviews written by spam accounts. The existence of such fake reviews degrades the credibility of the whole review collection, hence these fake reviews should be identified and removed. This process, however, needs to be automated since it is impossible to perform it manually in large review collections. Current authorship identification approaches identify authors based on large-scale texts such as documents. For this reason, these methods do not scale well to short texts such as online reviews that have limited features to learn from. This paper introduces a new method of author identification in short texts using combinations of machine learning algorithms and natural language processing techniques. The experiments we conducted on Yelp reviews gave promising results.
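The abstract does not say which algorithms or features the method combines, so the following is only a generic illustration of the problem setting, not the paper's approach. It is a minimal sketch that assumes character trigrams as stylistic features and a multinomial Naive Bayes classifier with add-one smoothing. The class name `NgramAuthorClassifier`, the helper `char_ngrams`, and the toy reviews and author labels are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=3):
    """Return the overlapping character n-grams of a lowercased text."""
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]


class NgramAuthorClassifier:
    """Multinomial Naive Bayes over character n-grams (illustrative only)."""

    def __init__(self, n=3):
        self.n = n
        self.counts = defaultdict(Counter)  # author -> n-gram counts
        self.docs = Counter()               # author -> number of training texts
        self.vocab = set()                  # all n-grams seen in training

    def fit(self, texts, authors):
        for text, author in zip(texts, authors):
            grams = char_ngrams(text, self.n)
            self.counts[author].update(grams)
            self.docs[author] += 1
            self.vocab.update(grams)
        return self

    def predict(self, text):
        grams = char_ngrams(text, self.n)
        total_docs = sum(self.docs.values())
        vocab_size = len(self.vocab)
        best_author, best_score = None, -math.inf
        for author, counter in self.counts.items():
            total = sum(counter.values())
            # Log prior from the share of training texts by this author.
            score = math.log(self.docs[author] / total_docs)
            # Add-one smoothed log likelihood of each n-gram in the text.
            for gram in grams:
                score += math.log((counter[gram] + 1) / (total + vocab_size))
            if score > best_score:
                best_author, best_score = author, score
        return best_author


# Hypothetical toy data: two invented authors with different habits.
train_texts = [
    "Loved it!!! Great tacos!!!",
    "Amazing!!! Best burgers ever!!!",
    "The service was slow, and the food was bland.",
    "The room was clean, but the staff was rude.",
]
train_authors = ["author_a", "author_a", "author_b", "author_b"]

model = NgramAuthorClassifier(n=3).fit(train_texts, train_authors)
```

Character n-grams are used here because they capture punctuation and spelling habits even when a review is only a sentence or two long, which is the limited-feature situation the abstract describes. A real system would need far more data per author and a held-out evaluation.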
