Conference: LREC-2012

The MASC Word Sense Sentence Corpus

Abstract

The MASC project has produced a multi-genre corpus with multiple layers of linguistic annotation, together with a sentence corpus containing WordNet 3.1 sense tags, produced by multiple annotators, for 1000 occurrences of each of 100 words and accompanied by in-depth inter-annotator agreement data. Here we give an overview of the contents of MASC and then focus on the word sense sentence corpus, describing the characteristics that differentiate it from other word sense corpora and detailing the inter-annotator agreement studies that have been performed on the annotations. Finally, we discuss the potential to grow the word sense sentence corpus through crowdsourcing, and the plan to enhance the content and annotations of MASC through a community-based collaborative effort.
