Journal: Computational Linguistics

A Joint Model to Identify and Align Bilingual Named Entities

Abstract

In this article, an integrated model is derived that jointly identifies and aligns bilingual named entities (NEs) between Chinese and English. The model is motivated by the following observations: (1) whether an NE is translated semantically or phonetically depends greatly on its entity type, (2) entities within an aligned pair should share the same type, and (3) the initially detected NEs can act as anchors and provide further information while selecting NE candidates. Based on these observations, this article proposes a translation mode ratio feature (defined as the proportion of NE internal tokens that are semantically translated), enforces an entity type consistency constraint, and utilizes additional new NE likelihoods (based on the initially detected NE anchors).

Experiments show that this novel method significantly outperforms the baseline. The type-insensitive F-score of identified NE pairs increases from 78.4% to 88.0% (12.2% relative improvement) in our Chinese–English NE alignment task, and the type-sensitive F-score increases from 68.4% to 83.0% (21.3% relative improvement). Furthermore, the proposed model demonstrates its robustness when it is tested across different domains. Finally, when semi-supervised learning is conducted to train the adopted English NE recognition model, the proposed model also significantly boosts the English NE recognition type-sensitive F-score.
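
The two quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the "semantic"/"phonetic" token labels, and the sample input are assumptions made here for illustration. The first function computes the translation mode ratio exactly as the abstract defines it (the share of NE-internal tokens that are semantically translated); the second reproduces the relative-improvement arithmetic behind the reported F-scores.

```python
from typing import Sequence


def translation_mode_ratio(token_modes: Sequence[str]) -> float:
    """Share of NE-internal tokens that are translated semantically.

    `token_modes` holds one label per internal token of a named entity.
    The labels "semantic" and "phonetic" are hypothetical names chosen
    for this sketch; the abstract does not specify a label set.
    """
    if not token_modes:
        raise ValueError("a named entity needs at least one internal token")
    semantic = sum(1 for mode in token_modes if mode == "semantic")
    return semantic / len(token_modes)


def relative_improvement(baseline: float, improved: float) -> float:
    """Relative gain of `improved` over `baseline`, in percent."""
    return (improved - baseline) / baseline * 100.0


# Hypothetical four-token NE: one token transliterated, three translated by meaning.
ratio = translation_mode_ratio(["phonetic", "semantic", "semantic", "semantic"])

# F-scores reported in the abstract (type-insensitive, then type-sensitive).
gain_type_insensitive = relative_improvement(78.4, 88.0)
gain_type_sensitive = relative_improvement(68.4, 83.0)
```

Rounded to one decimal place, the two gains match the 12.2% and 21.3% relative improvements stated in the abstract. How the ratio enters the joint model as a feature is not described in the abstract and is left out of this sketch.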
