Conference: 9th International Conference on Language Resources and Evaluation

Automatic Refinement of Syntactic Categories in Chinese Word Structures


Abstract

Annotated word structures are useful for various Chinese NLP tasks, such as word segmentation, POS tagging, and syntactic parsing. Because Chinese word formation is syntactic in nature, Chinese word structures are often represented by binary trees whose nodes are labeled with syntactic categories. It is desirable to refine the annotation by labeling the nodes of word structure trees with more appropriate syntactic categories, so that the combinatorial properties of the word formation process are better captured. This can lead to improved performance on tasks that exploit word structure annotations. We propose syntactically inspired algorithms that automatically induce the syntactic categories of word structure trees using a POS-tagged corpus and the branching in existing Chinese word structure trees. We evaluate the quality of our annotation by comparing the performance of models based on our annotation with that of models based on another publicly available annotation. The results on two variations of the Chinese word segmentation task show that using our annotation can lead to significant performance improvements.
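
To make the representation described in the abstract concrete, the following is a minimal sketch of a binary word structure tree whose nodes carry syntactic category labels. The class name `WordNode`, its fields, the example word, and the `NN` labels are all assumptions chosen for illustration; they are not taken from the paper or from any published annotation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WordNode:
    """A node in a binary word structure tree (hypothetical representation)."""
    label: str                           # syntactic category label (illustrative)
    char: Optional[str] = None           # set only on leaves (single characters)
    left: Optional["WordNode"] = None    # left child of an internal node
    right: Optional["WordNode"] = None   # right child of an internal node

    def is_leaf(self) -> bool:
        return self.char is not None

    def surface(self) -> str:
        """Concatenate the leaf characters from left to right."""
        if self.is_leaf():
            return self.char
        return self.left.surface() + self.right.surface()

    def bracketed(self) -> str:
        """Render the tree in labeled bracket notation."""
        if self.is_leaf():
            return f"({self.label} {self.char})"
        return f"({self.label} {self.left.bracketed()} {self.right.bracketed()})"


# Illustrative example only: a left-branching analysis of a three-character
# word, with every node labeled NN. The labels are assumptions for this sketch.
tree = WordNode(
    "NN",
    left=WordNode(
        "NN",
        left=WordNode("NN", char="图"),
        right=WordNode("NN", char="书"),
    ),
    right=WordNode("NN", char="馆"),
)

print(tree.surface())
print(tree.bracketed())
```

In this sketch, refining the annotation would amount to replacing the `label` values on the nodes while leaving the branching (the `left`/`right` structure) unchanged.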
