Conference: Fourth International Conference on Genetic and Evolutionary Computing

Automatic Extraction of Multiword Expressions Combining Statistical and Similarity Approaches


Abstract

Multiword expressions (MWEs) are important for practical applications such as machine translation (henceforth, MT), multilingual information retrieval, data mining, and other natural language processing tasks. A method combining a similarity measure with a statistical tool is proposed for automatically extracting English MWEs from a corpus of Chinese government white papers and work reports from 1991 to 2010. A statistical approach is employed to calculate the co-occurrence affinity between two words. In addition, a similarity measure is used to compute the semantic relations between words in order to improve MWE coverage, with the aim of obtaining higher precision and recall when extracting candidate multiword expressions. Experimental results showed that the proposed technique effectively improved MWE extraction.
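The abstract names neither the statistic used for co-occurrence affinity nor the similarity measure. As an illustration only, the sketch below assumes pointwise mutual information (PMI) over adjacent word pairs for the statistical step and cosine similarity over context-count vectors for the similarity step. Both choices, and the function names `bigram_pmi` and `cosine`, are assumptions made for this example and are not taken from the paper.

```python
import math
from collections import Counter


def bigram_pmi(tokens, min_count=1):
    """Score adjacent word pairs by PMI (an assumed affinity statistic)."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)
    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # skip pairs that are too rare to score reliably
        p_xy = count / n_bi
        p_x = unigrams[w1] / n_uni
        p_y = unigrams[w2] / n_uni
        scores[(w1, w2)] = math.log2(p_xy / (p_x * p_y))
    return scores


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (dict-like)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty vector has no direction to compare
    return dot / (norm_u * norm_v)
```

Under these assumptions, high-PMI pairs would serve as MWE candidates, and a word whose context vector is close to that of a candidate's component could be substituted to propose further candidates, which is one way a similarity step might extend coverage.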
