Conference on Intelligent Text Processing and Computational Linguistics

Building a Bilingual Dictionary from a Japanese-Chinese Patent Corpus

Abstract

In this paper, we propose an automatic method for building a bilingual dictionary from a Japanese-Chinese parallel corpus. The proposed method applies character similarity between Japanese and Chinese and a statistical machine translation (SMT) framework in a cascading manner. The first step extracts word translation pairs from the parallel corpus based on the similarity between Japanese kanji characters (Chinese characters used in Japanese writing) and simplified Chinese characters. The second step trains phrase tables using two different SMT training tools, then extracts the word translation pairs common to both. The third step trains an SMT system using the word translation pairs obtained in the first and second steps. According to the experimental results, the proposed method yields 59.3% to 92.1% accuracy for the extracted word translation pairs, depending on the cascading step.
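
The first two steps of the cascade can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not specify the similarity measure, so a Dice coefficient over character sets stands in for it; the `KANJI_TO_SIMPLIFIED` table is a tiny hypothetical sample rather than a complete mapping; and the function names (`char_similarity`, `step1_pairs`, `step2_pairs`) are not taken from the paper.

```python
# Hypothetical sample of a kanji -> simplified Chinese mapping.
# A real system would need a full conversion table.
KANJI_TO_SIMPLIFIED = {"発": "发", "気": "气", "圧": "压", "電": "电"}


def normalize(word: str) -> str:
    """Replace each Japanese kanji with its simplified Chinese form when known."""
    return "".join(KANJI_TO_SIMPLIFIED.get(ch, ch) for ch in word)


def char_similarity(ja_word: str, zh_word: str) -> float:
    """Dice coefficient over character sets (an assumed measure, not the paper's)."""
    ja_chars, zh_chars = set(normalize(ja_word)), set(zh_word)
    if not ja_chars or not zh_chars:
        return 0.0
    return 2 * len(ja_chars & zh_chars) / (len(ja_chars) + len(zh_chars))


def step1_pairs(ja_words, zh_words, threshold=1.0):
    """Step 1 sketch: pair words of one aligned sentence pair by character similarity."""
    return {
        (ja, zh)
        for ja in ja_words
        for zh in zh_words
        if char_similarity(ja, zh) >= threshold
    }


def step2_pairs(table_a, table_b):
    """Step 2 sketch: keep only the word pairs present in both phrase tables."""
    return set(table_a) & set(table_b)


# One pre-segmented, sentence-aligned example (illustrative data only).
ja_sentence = ["発電", "装置", "の", "気圧"]
zh_sentence = ["发电", "装置", "的", "气压"]
similar_pairs = step1_pairs(ja_sentence, zh_sentence)

# Word pairs as two different SMT tools might produce them (illustrative data only).
tool_a = [("半導体", "半导体"), ("基板", "基板")]
tool_b = [("基板", "基板"), ("回路", "电路")]
common_pairs = step2_pairs(tool_a, tool_b)
```

Requiring agreement between two independently trained phrase tables trades coverage for precision, which is consistent with the abstract's report that accuracy varies by cascading step.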
