Conference: 2010 International Forum on Information Technology and Applications

A Pragmatic Approach to Increase Accuracy of Chinese Word-Segmentation



Abstract

Chinese word segmentation is essential for understanding and processing Chinese natural language, and it is an important component of search engines, text retrieval, speech recognition, and automatic translation. Segmentation is challenging because written Chinese has no spaces or other physical markers for word boundaries, and it is often difficult even to define what constitutes a word in Chinese. No fully mature, practical Chinese word segmentation system is yet available, particularly with respect to segmentation accuracy. This article presents a pragmatic approach to Chinese word segmentation that aims to increase segmentation accuracy. We introduce a mechanism for combining a hybrid dictionary with a universal dictionary, design a practical data structure, describe the word segmentation algorithm, and give the test results.
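The abstract does not spell out the algorithm or the data structure, so the following is only a generic illustration of dictionary-based segmentation, not the authors' method. It uses forward maximum matching, a common baseline technique, and assumes (hypothetically) that "combining" two dictionaries means accepting a word found in either one. The names `segment`, `general_dictionary`, and `domain_dictionary`, the sample entries, and the `max_word_len` limit are all invented for this sketch.

```python
def segment(text, dictionaries, max_word_len=4):
    """Forward maximum matching over several dictionaries.

    At each position, take the longest substring (up to max_word_len
    characters) that appears in any dictionary; if none matches,
    emit a single character. This is a baseline technique, assumed
    here for illustration only.
    """
    tokens = []
    i = 0
    while i < len(text):
        match = text[i]  # fallback: a single character
        longest = min(max_word_len, len(text) - i)
        for length in range(longest, 1, -1):
            candidate = text[i:i + length]
            if any(candidate in d for d in dictionaries):
                match = candidate
                break
        tokens.append(match)
        i += len(match)
    return tokens


# Hypothetical sample dictionaries (not taken from the paper).
general_dictionary = {"中文", "分词", "搜索", "引擎", "重要"}
domain_dictionary = {"中文分词", "搜索引擎"}

sentence = "中文分词对搜索引擎很重要"
print(segment(sentence, [general_dictionary]))
print(segment(sentence, [domain_dictionary, general_dictionary]))
```

With only the general dictionary, the sentence is split into short words such as 中文 and 分词; adding the second dictionary lets the longer terms 中文分词 and 搜索引擎 survive as single tokens, which is one plausible way a combined dictionary could affect segmentation results.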
