Conference: 9th International Conference on Language Resources and Evaluation

The CUHK Discourse TreeBank for Chinese: Annotating Explicit Discourse Connectives for the Chinese TreeBank



Abstract

The lack of an open discourse corpus for Chinese limits many natural language processing tasks. In this work, we present the first open discourse treebank for Chinese, the Discourse Treebank for Chinese (DTBC). At the current stage, we have annotated explicit intra-sentence discourse connectives, together with their arguments and senses, for all 890 documents of the Chinese Treebank 5. We began by analysing the characteristics of discourse annotation for Chinese, then adapted the annotation scheme of the Penn Discourse Treebank 2 (PDTB2) to Chinese while maintaining compatibility as far as possible. Drawing on previous work in Chinese linguistics, we adjusted three essential aspects of the scheme: the sense hierarchy, the argument scope, and the semantics of arguments. An agreement study showed that our annotation scheme can achieve highly reliable results.
