
Mining English translation of different types of Chinese OOV



Abstract

Out-of-vocabulary (OOV) terms are one of the major problems in machine translation and cross-language information retrieval (CLIR). New Chinese words appear over time, and because these words and their English translations are not yet collected in the bilingual dictionary used by the CLIR or machine translation system, they are OOV named entities. We observe that Chinese OOV named entities fall into several types. In this paper, we propose and implement a Chinese-English OOV translation mining system that follows a divide-and-conquer strategy: it categorizes Chinese OOV named entities into three types (foreign words, Chinese names, and Chinese abbreviations) and handles each type separately. Combined with a new-word mining system, this system can collect new named entities for the bilingual dictionary. Our experiments show that categorizing Chinese OOV words helps find their translations and yields decent results.
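The divide-and-conquer strategy described in the abstract can be pictured as a classify-then-dispatch step. The sketch below is only an illustration under stated assumptions: the abstract does not say how terms are classified or how translations are mined, so `classify_oov`, the toy character sets, and the lookup-table handlers are all hypothetical stand-ins, not the paper's method.

```python
from enum import Enum
from typing import Callable, Dict, Optional, Tuple


class OOVType(Enum):
    """The three OOV categories named in the abstract."""
    FOREIGN_WORD = "foreign word"
    CHINESE_NAME = "Chinese name"
    CHINESE_ABBREVIATION = "Chinese abbreviation"


# Hypothetical toy resources; a real system would need far richer evidence.
COMMON_SURNAMES = {"王", "李", "张", "刘", "陈"}
TRANSLITERATION_CHARS = set("斯尔克德特姆拉尼亚奥巴马")


def classify_oov(term: str) -> OOVType:
    """Assumed heuristic classifier (not taken from the paper)."""
    # Terms written only with characters typical of transliteration.
    if term and all(ch in TRANSLITERATION_CHARS for ch in term):
        return OOVType.FOREIGN_WORD
    # Two- or three-character terms that start with a common surname.
    if 2 <= len(term) <= 3 and term[0] in COMMON_SURNAMES:
        return OOVType.CHINESE_NAME
    # Everything else is treated as an abbreviation in this toy sketch.
    return OOVType.CHINESE_ABBREVIATION


Handler = Callable[[str], Optional[str]]


def mine_translation(
    term: str, handlers: Dict[OOVType, Handler]
) -> Tuple[OOVType, Optional[str]]:
    """Route a term to the handler for its type and return (type, translation)."""
    oov_type = classify_oov(term)
    return oov_type, handlers[oov_type](term)


# Placeholder handlers: plain lookup tables standing in for real mining steps.
handlers: Dict[OOVType, Handler] = {
    OOVType.FOREIGN_WORD: {"奥巴马": "Obama"}.get,
    OOVType.CHINESE_NAME: {"王小明": "Wang Xiaoming"}.get,
    OOVType.CHINESE_ABBREVIATION: {"北大": "Peking University"}.get,
}

for term in ("奥巴马", "王小明", "北大"):
    print(term, mine_translation(term, handlers))
```

Keeping one handler per type means each category's translation method can be swapped out without touching the others, which is the point of handling the types separately.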
