Conference: 9th International Conference on Language Resources and Evaluation

The CUHK Discourse TreeBank for Chinese: Annotating Explicit Discourse Connectives for the Chinese TreeBank


Abstract

The lack of an open discourse corpus for Chinese limits many natural language processing tasks. In this work, we present the first open discourse treebank for Chinese, namely the Discourse Treebank for Chinese (DTBC). At the current stage, we have annotated explicit intra-sentence discourse connectives, together with their corresponding arguments and senses, for all 890 documents of the Chinese Treebank 5. We started by analysing the characteristics of discourse annotation for Chinese and adapted the annotation scheme of the Penn Discourse Treebank 2 (PDTB2) to Chinese while maintaining compatibility as far as possible. Drawing on previous studies in Chinese linguistics, we adjusted three essential aspects of the scheme: the sense hierarchy, the argument scope, and the semantics of arguments. An agreement study showed that our annotation scheme could achieve highly reliable results.
