Foreign-language conference: Computational Information Systems

An Algorithm of Solving Interlink Overlapping Ambiguity and Combinatorial Ambiguity and Compound Ambiguity in Chinese Word Segmentation



Abstract

This article studies the characteristics of overlapping ambiguity and combinatorial ambiguity, and defines the interlink overlapping ambiguity field and the compound ambiguity field. It identifies shortcomings in how several mature Chinese automatic word segmentation systems handle interlink overlapping ambiguity, combinatorial ambiguity, and compound ambiguity fields. The systems examined include the Institute of Computing Technology Chinese Lexical Analysis System (ICTCLAS), the HLSplitWord intelligent system, and the online Chinese word segmentation system of the Institute of Computational Linguistics (ICL), Peking University. To resolve combinatorial ambiguity, the article corrects the semantic relevancy using contextual information based on HowNet, redefines the cost of a candidate word, and amends the maximum probability word segmentation algorithm. The article reports that the amended algorithm successfully resolves overlapping ambiguity, combinatorial ambiguity, interlink overlapping ambiguity, and compound ambiguity in Chinese word segmentation. Finally, it notes that the method for calculating the semantic relevancy between two words in a sentence still needs improvement.
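For readers unfamiliar with the baseline that the abstract says is amended, the following is a minimal sketch of plain maximum probability word segmentation, written as a minimum-cost path search with dynamic programming. It is not the article's algorithm: the lexicon, its probabilities, the `UNKNOWN_COST` penalty, and the function names are all hypothetical, and the cost here is simply the negative log of a word probability, without the HowNet-based semantic relevancy correction the article describes. The example sentence contains a commonly cited overlapping ambiguity ("研究生" versus "生命").

```python
import math

# Hypothetical toy lexicon with made-up probabilities (not taken from the article).
LEXICON = {
    "研究": 0.02,
    "研究生": 0.01,
    "生命": 0.015,
    "生": 0.004,
    "命": 0.005,
    "的": 0.06,
    "起源": 0.008,
}
MAX_WORD_LEN = max(len(w) for w in LEXICON)
UNKNOWN_COST = 20.0  # assumed penalty for a single character missing from the lexicon


def word_cost(word):
    """Cost of a candidate word: negative log of its assumed probability."""
    if word in LEXICON:
        return -math.log(LEXICON[word])
    if len(word) == 1:
        return UNKNOWN_COST
    return None  # longer strings outside the lexicon are not candidates


def segment(sentence):
    """Return the segmentation with the lowest total cost."""
    n = len(sentence)
    best = [math.inf] * (n + 1)  # best[i]: lowest cost of segmenting sentence[:i]
    back = [0] * (n + 1)         # back[i]: start index of the last word ending at i
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - MAX_WORD_LEN), end):
            cost = word_cost(sentence[start:end])
            if cost is None:
                continue
            if best[start] + cost < best[end]:
                best[end] = best[start] + cost
                back[end] = start
    # Walk the back-pointers from the end of the sentence to recover the words.
    words = []
    i = n
    while i > 0:
        words.append(sentence[back[i]:i])
        i = back[i]
    return words[::-1]


print(segment("研究生命的起源"))
```

With these toy probabilities the path "研究 / 生命 / 的 / 起源" costs less than "研究生 / 命 / 的 / 起源", so the overlapping ambiguity is resolved in favor of the former. A cost that depends only on word frequency cannot use context, which is the kind of gap a redefined candidate-word cost would be meant to address.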
