IEEE International Conference on Natural Language Processing and Knowledge Engineering (IEEE NLP-KE'05); October 30 to November 1, 2005; Wuhan, China

Hierarchical Iterative and Self-Supervised Method for Concept-Word Acquisition from Large-Scale Chinese Corpora

Abstract

This paper proposes a hierarchical, iterative, self-supervised method (HISS) for acquiring concept words from a large-scale, unsegmented Chinese corpus. The method has two levels of iteration. The inner iteration, made up of the EM-CLS algorithm and the Viterbi-C/S algorithm, generates concept words; the outer iteration consists of concept-word validation together with concept-word generation. Over multiple iterations, generation and validation are integrated into a single acquisition process. During acquisition, HISS copes with over-segmentation, over-combination, and data sparseness. Experimental results show that HISS is effective for concept-word acquisition and improves precision and recall simultaneously.
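
The two-level control flow described in the abstract can be illustrated with a minimal sketch. The abstract does not specify EM-CLS, Viterbi-C/S, or the validation step, so everything here is an assumption: a unigram Viterbi segmenter and hard count re-estimation stand in for the inner iteration, a simple frequency threshold stands in for validation, and all function names and parameters (`viterbi_segment`, `generate`, `validate`, `acquire`, `min_count`, and so on) are hypothetical. It is not the authors' algorithm.

```python
import math
from collections import Counter


def normalize(counts):
    """Turn raw counts into relative frequencies."""
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def initial_probs(corpus, max_len=4):
    """Seed the lexicon with every substring of up to max_len characters."""
    counts = Counter(
        sent[i:j]
        for sent in corpus
        for i in range(len(sent))
        for j in range(i + 1, min(len(sent), i + max_len) + 1)
    )
    return normalize(counts)


def viterbi_segment(text, probs, max_len=4, floor=1e-8):
    """Most probable segmentation of text under a unigram word model.

    Stand-in for Viterbi-C/S (assumption). Unknown single characters
    get a small floor probability so a segmentation always exists.
    """
    n = len(text)
    best = [0.0] + [-math.inf] * n  # best[i]: best log-prob of text[:i]
    back = [0] * (n + 1)            # back[i]: start index of the last word
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            word = text[start:end]
            p = probs.get(word)
            if p is None:
                if len(word) > 1:
                    continue  # unknown multi-character strings are not words
                p = floor
            score = best[start] + math.log(p)
            if score > best[end]:
                best[end], back[end] = score, start
    words, end = [], n
    while end > 0:
        words.append(text[back[end]:end])
        end = back[end]
    return words[::-1]


def generate(corpus, probs, max_len=4, inner_iters=5):
    """Inner iteration: alternate segmentation and count re-estimation.

    Stand-in for the EM-CLS / Viterbi-C/S loop (assumption).
    """
    counts = Counter()
    for _ in range(inner_iters):
        counts = Counter()
        for sent in corpus:
            counts.update(viterbi_segment(sent, probs, max_len))
        probs = normalize(counts)
    return counts


def validate(counts, min_count=2):
    """Hypothetical validation: keep multi-character words seen often enough."""
    return {w for w, c in counts.items() if len(w) > 1 and c >= min_count}


def acquire(corpus, max_len=4, outer_iters=3, inner_iters=5, min_count=2):
    """Outer iteration: generation followed by validation, fed back in."""
    probs = initial_probs(corpus, max_len)
    accepted = set()
    for _ in range(outer_iters):
        counts = generate(corpus, probs, max_len, inner_iters)
        accepted = validate(counts, min_count)
        # Feed validated words (plus single characters) into the next round.
        kept = Counter(
            {w: c for w, c in counts.items() if len(w) == 1 or w in accepted}
        )
        probs = normalize(kept)
    return accepted
```

Calling `acquire(corpus)` on a list of unsegmented strings returns the set of multi-character candidates that survive the final validation pass. The point of the sketch is only the nesting: rejected candidates are dropped from the lexicon before the next round of generation, which is one plausible way for validation to counteract over-combination.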
