Building a Chinese collocation bank

Abstract

This paper presents the design and construction of an annotated Chinese collocation bank, a resource intended to support systematic research on Chinese collocations. The definition and properties of collocations are studied first. Based on combinations of these properties, a classification scheme is proposed that divides Chinese collocations into four types. With the help of computational tools, the bigram and n-gram collocations of 3,643 headwords are manually identified in a 5-million-word corpus. For each identified bigram collocation, its dependency relation, chunking information, and classification are then annotated to produce the collocation bank. The bank currently contains 23,581 bigram collocations and 2,752 n-gram collocations, making it a valuable resource for research on Chinese collocations. Statistical analysis of the bank reveals some interesting characteristics of Chinese bigram collocations, which are presented in this paper.
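
As a rough illustration of the annotation layers the abstract lists for each bigram collocation (dependency relation, chunking information, and one of four types), a bank entry might be modelled as below. This is only a sketch: the class name, field names, label strings, and sample entries are hypothetical placeholders, and the abstract does not name the four types, so they are represented here simply as the integers 1 to 4.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class BigramCollocation:
    """One annotated bigram entry (hypothetical layout, not the paper's format)."""
    headword: str
    collocate: str
    dependency: str        # dependency relation label (placeholder values below)
    chunk: str             # chunking information label (placeholder values below)
    collocation_type: int  # one of the four types, numbered 1-4 by assumption

    def __post_init__(self):
        # The abstract states there are four types; reject anything else.
        if self.collocation_type not in (1, 2, 3, 4):
            raise ValueError("collocation_type must be 1, 2, 3 or 4")


def type_distribution(entries):
    """Count how many entries fall into each collocation type."""
    return Counter(entry.collocation_type for entry in entries)


# Placeholder entries for illustration only; these are not data from the bank.
entries = [
    BigramCollocation("headword_a", "collocate_x", "DEP_1", "CHUNK_1", 1),
    BigramCollocation("headword_a", "collocate_y", "DEP_2", "CHUNK_1", 3),
    BigramCollocation("headword_b", "collocate_z", "DEP_1", "CHUNK_2", 1),
]

print(type_distribution(entries))
```

A per-type tally like this is the simplest kind of statistic one could compute over such records; the abstract does not say which statistics the authors actually report.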

Bibliographic details

  • Authors: Xu R; Lu Q; Wong KF; Li W
  • Year: 2009
  • Original format: PDF
  • Language: English
