Conference: International conference on machine translation & computer language information processing

Study of Segmentation Strategy on Ambiguous Phrases of Overlap Type

Abstract

Ambiguity resolution remains an open issue in the study of Chinese word segmentation. In corpora, most ambiguous phrases of the overlap type are pseudo-ambiguous: they can be segmented correctly using only the information within the phrase. Segmenting truly ambiguous phrases, by contrast, requires grammatical or even semantic information. In this paper, we present a segmentation strategy for ambiguous phrases of the overlap type, using rules based on the independent-wording-ability frequency and on the collocation of words and parts of speech, which greatly improves segmentation accuracy. In an open test on a Chinese corpus of 39,000 characters, the segmentation accuracy for ambiguous phrases of the overlap type reached 98
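
To make the idea of an overlap-type ambiguity concrete, the following is a minimal sketch, not the paper's rule set. It assumes a toy lexicon and made-up independent-word frequencies (both hypothetical, as are the function names), finds strings where two lexicon words share characters, and picks the split whose leftover piece is more likely to stand alone as a word. The collocation and part-of-speech rules mentioned in the abstract are not modeled here.

```python
from typing import Dict, List, Set, Tuple

# Toy lexicon, invented for illustration only.
LEXICON: Set[str] = {"结合", "合成", "分子"}

# Hypothetical frequencies with which a character is used as an
# independent word; these values are made up, not taken from the paper.
INDEPENDENT_FREQ: Dict[str, float] = {"结": 0.05, "成": 0.60}


def find_overlaps(text: str, lexicon: Set[str],
                  max_len: int = 4) -> List[Tuple[int, int, int, int]]:
    """Return (i, j, k, l) with i < j < k < l such that text[i:k] and
    text[j:l] are both lexicon words, i.e. they overlap on text[j:k]."""
    hits = []
    n = len(text)
    for i in range(n):
        for k in range(i + 2, min(n, i + max_len) + 1):
            if text[i:k] not in lexicon:
                continue
            for j in range(i + 1, k):
                for l in range(k + 1, min(n, j + max_len) + 1):
                    if text[j:l] in lexicon:
                        hits.append((i, j, k, l))
    return hits


def choose_split(text: str, hit: Tuple[int, int, int, int],
                 freq: Dict[str, float]) -> List[str]:
    """Pick between the two candidate splits of an overlapping span by
    comparing how likely each leftover piece is to be a word on its own."""
    i, j, k, l = hit
    left_first = [text[i:k], text[k:l]]    # e.g. 结合 / 成
    right_first = [text[i:j], text[j:l]]   # e.g. 结 / 合成
    score_left = freq.get(text[k:l], 0.0)   # leftover after the left word
    score_right = freq.get(text[i:j], 0.0)  # leftover before the right word
    return left_first if score_left >= score_right else right_first


sentence = "结合成分子"
overlaps = find_overlaps(sentence, LEXICON)
splits = [choose_split(sentence, h, INDEPENDENT_FREQ) for h in overlaps]
```

With these toy values, the span 结合成 is flagged because 结合 and 合成 share the character 合, and the split that leaves 成 as a standalone word is preferred because its assumed frequency is higher. A decision like this uses only information inside the phrase, which is the situation the abstract describes for pseudo-ambiguity; true ambiguity would need context beyond what this sketch inspects.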
