Pacific Asia Conference on Language, Information and Computation; November 1-3, 2006; Wuhan, China

A Comparative Study of the Effect of Word Segmentation On Chinese Terminology Extraction

Abstract

Automatic term extraction is the first step towards automatic or semi-automatic updating of an existing domain knowledge base. Most prior work applies word segmentation as a preprocessing step for Chinese term extraction. However, segmentation ambiguity is unavoidable, especially when identifying unknown words in Chinese. In this paper, we discuss the effect and limitations of segmentation on Chinese terminology extraction. A detailed study shows that errors propagated from word segmentation have a great impact on the results of terminology extraction. Our analysis and experiments show that character-based terminology extraction yields much better results than extraction that uses segmentation as a preprocessing step.
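To make the idea of character-based extraction concrete, the following is a minimal sketch of how term candidates might be collected directly from raw characters, without running a segmenter first. It is not the method of the paper, whose details the abstract does not give. The function name `char_ngram_candidates`, the n-gram length range, the frequency threshold, and the two-sentence toy corpus are all assumptions made for illustration.

```python
from collections import Counter


def char_ngram_candidates(sentences, min_n=2, max_n=4, min_freq=2):
    """Count character n-grams over raw sentences and keep the frequent ones.

    Hypothetical helper: no word segmenter is involved, so a segmentation
    mistake cannot remove a candidate before it is counted.
    """
    counts = Counter()
    for sentence in sentences:
        for n in range(min_n, max_n + 1):
            # Slide a window of n characters across the sentence.
            for i in range(len(sentence) - n + 1):
                counts[sentence[i:i + n]] += 1
    # Keep only n-grams that recur at least min_freq times (assumed filter).
    return {gram: c for gram, c in counts.items() if c >= min_freq}


# Toy corpus, invented for illustration only.
corpus = ["分词歧义影响术语提取", "术语提取依赖分词"]
candidates = char_ngram_candidates(corpus)
```

Because every character window is counted, a multi-character term such as 术语提取 stays a candidate even if a segmenter would have split it incorrectly. A practical system would need a further ranking or filtering step to discard the many non-term n-grams this produces.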