Source: Journal of the Chinese Institute of Engineers

Lexical analysis for Chinese - difficulties and possible solutions


Abstract

Chinese sentences are composed of strings of characters without blanks to mark word boundaries. However, the basic unit for sentence processing is the word, the smallest meaningful, freely used unit in any natural language. Lexical analysis is therefore the first step in processing Chinese sentences. Usually a lexicon is used to match words and to provide their syntactic and semantic information during lexical analysis. During word matching, two problems arise: segmentation ambiguity and unknown words. In this paper, the advantages and disadvantages of both statistical and rule-based methods for resolving segmentation ambiguities are discussed. For unknown word identification, off-line word extraction methods and on-line unknown word identification strategies are surveyed; the two approaches complement each other. The strategies and knowledge sources for implementing a practical system are also discussed.
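
To make the two problems concrete, the following is a minimal sketch of lexicon-based word matching. It uses greedy maximum matching in both directions, a common baseline that the abstract does not name and that is not necessarily the method the paper studies. The toy lexicon, the example sentence, and the function names forward_max_match and backward_max_match are all assumptions chosen for illustration.

```python
def forward_max_match(text, lexicon, max_len=4):
    """Segment left to right, taking the longest lexicon word at each position."""
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted as a fallback; it may be
            # part of a word that is missing from the lexicon (an unknown word).
            if size == 1 or candidate in lexicon:
                words.append(candidate)
                i += size
                break
    return words


def backward_max_match(text, lexicon, max_len=4):
    """Segment right to left, taking the longest lexicon word ending at each position."""
    words, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            candidate = text[j - size:j]
            if size == 1 or candidate in lexicon:
                words.insert(0, candidate)
                j -= size
                break
    return words


# Hypothetical toy lexicon and sentence, chosen only to expose an ambiguity.
LEXICON = {"研究", "研究生", "生命", "起源"}
SENTENCE = "研究生命起源"

fmm = forward_max_match(SENTENCE, LEXICON)
bmm = backward_max_match(SENTENCE, LEXICON)
print(fmm)  # ['研究生', '命', '起源']
print(bmm)  # ['研究', '生命', '起源']
```

The two directions disagree on the same string, which is one simple signal of segmentation ambiguity; choosing between the candidates is where the statistical or rule-based methods mentioned in the abstract would come in. The leftover single character 命 in the forward result shows how fragments that the lexicon cannot cover surface during matching, which is also how unknown words typically appear.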
