Conference: Annual meeting of the American Society for Information Science and Technology
The Art of Creating an Informative Data Collection for Automated Deception Detection: A Corpus of Truths and Lies

Abstract

One of the novel research directions in Natural Language Processing and Machine Learning involves creating and developing methods for automatic discernment of deceptive messages from truthful ones. Mistaking intentionally deceptive pieces of information for authentic ones (true to the writer’s beliefs) can create negative consequences, since our everyday decision-making, actions, and mood are often impacted by information we encounter. Such research is vital today as it aims to develop tools for the automated recognition of deceptive, disingenuous or fake information (the kind intended to create false beliefs or conclusions in the reader’s mind). The ultimate goal is to support truthfulness ratings that signal the trustworthiness of the retrieved information, or alert information seekers to potential deception. To proceed with this agenda, we require elicitation techniques for obtaining samples of both deceptive and truthful messages from study participants in various subject areas. A data collection, or a corpus of truths and lies, should meet certain basic criteria to allow for meaningful analysis and comparison of socio-linguistic behaviors. In this paper we propose solutions and weigh pros and cons of various experimental set-ups in the art of corpus building. The outcomes of three experiments demonstrate certain limitations with using online crowdsourcing for data collection of this type. Incorporating motivation in the task descriptions, and the role of visual context in creating deceptive narratives are other factors that should be addressed in future efforts to build a quality dataset.