Journal: IEEE Transactions on Knowledge and Data Engineering

Entity Translation Mining from Comparable Corpora: Combining Graph Mapping with Corpus Latent Features

Abstract

This paper addresses the problem of mining named entity translations from comparable corpora, specifically English-Chinese named entity translations. We first observe that existing approaches use one or more of the following named entity similarity metrics: entity, entity context, and relationship. Motivated by this observation, we propose a new holistic approach that 1) combines all of these similarity types and 2) additionally considers relationship context similarity between pairs of named entities, a missing quadrant in the taxonomy of similarity metrics. We abstract the named entity translation problem as the matching of two named entity graphs extracted from the comparable corpora. Specifically, named entity graphs are first constructed from the comparable corpora to capture the relationships between named entities. Entity similarity and entity context similarity are then calculated for every pair of bilingual named entities. A reinforcing method is used to reflect relationship similarity and relationship context similarity between named entities. We also discover "latent" features that are lost in the graph extraction process and integrate them into our framework. According to our experimental results, our holistic graph-based approach and its enhancement using corpus latent features are highly effective, and our framework significantly outperforms previous approaches.
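To make the graph-matching idea more concrete, here is a minimal Python sketch of one possible reinforcement scheme over two entity graphs. It is an illustration under stated assumptions, not the authors' algorithm: the update rule, the mixing weight `lam`, the function names `reinforce` and `best_match`, and the toy entities and scores are all hypothetical. The initial scores stand in for the entity and entity context similarities; the neighbor term stands in for relationship similarity.

```python
def reinforce(sim0, nbrs_a, nbrs_b, lam=0.5, iters=20):
    """Propagate similarity between two entity graphs (hypothetical update rule).

    sim0   : dict mapping (a, b) -> initial similarity in [0, 1]
    nbrs_a : dict mapping each source-language entity to its related entities
    nbrs_b : dict mapping each target-language entity to its related entities
    lam    : assumed weight given to neighbor support (not from the paper)
    """
    sim = dict(sim0)
    for _ in range(iters):
        new = {}
        for (a, b), base in sim0.items():
            na, nb = nbrs_a.get(a, ()), nbrs_b.get(b, ())
            if na and nb:
                # For each neighbor of a, take its best-matching neighbor of b,
                # then average: a pair is supported if its neighbors also match.
                support = sum(
                    max(sim.get((u, v), 0.0) for v in nb) for u in na
                ) / len(na)
            else:
                support = 0.0
            new[(a, b)] = (1 - lam) * base + lam * support
        sim = new
    return sim


def best_match(sim, a):
    """Return the target-language entity with the highest score for a."""
    candidates = {b: s for (x, b), s in sim.items() if x == a}
    return max(candidates, key=candidates.get)


# Toy graphs: one relationship edge on each side (illustrative values only).
nbrs_en = {"Beijing": ["China"], "China": ["Beijing"]}
nbrs_zh = {"北京": ["中国"], "中国": ["北京"]}

# Initial scores; note that "China" starts out closer to the wrong candidate.
sim0 = {
    ("Beijing", "北京"): 0.8,
    ("Beijing", "中国"): 0.1,
    ("China", "北京"): 0.3,
    ("China", "中国"): 0.2,
}

scores = reinforce(sim0, nbrs_en, nbrs_zh)
print(best_match(sim0, "China"), "->", best_match(scores, "China"))
```

In this toy case the strong Beijing/北京 pair lends support to its neighbors, so China/中国 overtakes China/北京 after reinforcement even though its initial score was lower. A real system would still need an entity extractor, the initial similarity functions, and a one-to-one assignment step, none of which are sketched here.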