International Conference on Intelligent Transportation, Big Data and Smart City

Knowledge Acquisition Method for Large-Scale Bilingual Corpus Based on Web Mining

Abstract

This paper describes a method for acquiring multi-word translation equivalents from English-Chinese parallel corpora based on Web mining. To solve the multi-word correspondence problem, an N-gram model is adopted to extract candidate translation units. Co-occurrence information is then used to acquire, from a search engine, subject words related to the resource proper noun. The subject terms are translated to perform cross-language query expansion, and the expanded query retrieves high-quality bilingual abstract resources from the search engine. Candidate translation units such as compound words and phrases are also extracted based on frequency-change information and adjacency information, and the final selection of proper-noun translations integrates transliteration features, statistical features, and template features. Experiments show that the proposed translation mining method performs well.
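
The candidate-extraction step can be illustrated with a minimal sketch. The abstract does not specify the exact statistics used, so the following is an assumption rather than the authors' implementation: it enumerates word N-grams, counts their frequency, and approximates "adjacency information" by the number of distinct left and right neighbouring tokens. The function name `extract_candidates` and the parameters `max_n` and `min_freq` are hypothetical.

```python
from collections import Counter, defaultdict


def extract_candidates(sentences, max_n=3, min_freq=2):
    """Collect multi-word candidate units from tokenized sentences.

    Returns a dict mapping each n-gram (2 <= n <= max_n) that occurs at
    least min_freq times to a tuple:
        (frequency, distinct left neighbours, distinct right neighbours)

    Assumption: neighbour variety stands in for the paper's
    "adjacency information", which the abstract does not define.
    """
    freq = Counter()
    left = defaultdict(set)    # n-gram -> tokens seen immediately before it
    right = defaultdict(set)   # n-gram -> tokens seen immediately after it

    for tokens in sentences:
        for n in range(2, max_n + 1):
            for i in range(len(tokens) - n + 1):
                gram = tuple(tokens[i:i + n])
                freq[gram] += 1
                if i > 0:
                    left[gram].add(tokens[i - 1])
                if i + n < len(tokens):
                    right[gram].add(tokens[i + n])

    return {
        gram: (count, len(left[gram]), len(right[gram]))
        for gram, count in freq.items()
        if count >= min_freq
    }


if __name__ == "__main__":
    corpus = [
        ["web", "mining", "is", "useful"],
        ["we", "use", "web", "mining", "daily"],
        ["web", "mining", "helps"],
    ]
    for gram, stats in extract_candidates(corpus).items():
        print(" ".join(gram), stats)
```

A unit that is frequent and appears next to many different neighbours is more likely to be a complete phrase than a fragment of a longer one. A fuller pipeline in the spirit of the abstract would then rank such candidates with transliteration and template features, which this sketch leaves out.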