2012 IEEE Workshop on Spoken Language Technology

Crowdsourcing the acquisition of natural language corpora: Methods and observations

Abstract

We study the opportunity for using crowdsourcing methods to acquire language corpora for use in natural language processing systems. Specifically, we empirically investigate three methods for eliciting natural language sentences that correspond to a given semantic form. The methods convey frame semantics to crowd workers by means of sentences, scenarios, and list-based descriptions. We discuss various performance measures of the crowdsourcing process, and analyze the semantic correctness, naturalness, and biases of the collected language. We highlight research challenges and directions in applying these methods to acquire corpora for natural language processing applications.