Source: Proceedings of International Conference on Natural Language Processing and Knowledge Engineering

Extracting Historical Terms Based on Aligned Chinese-English Parallel Corpora

Abstract

This paper examines the feasibility of applying statistics-oriented term extraction and evaluation methods to extract historical terms from aligned parallel corpora of Chinese historical classics and their translations. It proposes using transliterations as anchor points to establish sentence-level alignment. It also investigates an approach to extracting term translation pairs from 4000 parallel sentences or sentence segments drawn from the Chinese historical classic Shi Ji (Records of the Historian) and its English translations by two well-known translators. The experimental results indicate that the statistically sound algorithm successfully extracts terms whose English translations are consistent throughout the corpus, as well as transliterated pairs, but fails to extract the translations of terms that the two translators render differently, even though both renderings may be equally acceptable English usage. The algorithm also fails to extract the highest-frequency terms whose meaning is ambiguous because their part of speech changes. The paper therefore suggests that insights from linguistics and translation studies can be integrated with statistical measures to improve extraction and validation results.
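The abstract does not name the statistic the authors use, so the following is only an illustrative sketch of how co-occurrence-based extraction over aligned segments might behave. It assumes the Dice coefficient as the association measure, and the function name `dice_scores`, the `min_count` threshold, and the toy segments are all hypothetical rather than taken from the paper. It shows why a consistently rendered or transliterated term can score highly while a term rendered differently across segments falls below the frequency threshold.

```python
from collections import Counter
from itertools import product


def dice_scores(pairs, min_count=2):
    """Score (source_term, target_word) candidates with the Dice coefficient.

    pairs: list of (source_tokens, target_tokens), one entry per aligned segment.
    min_count: hypothetical threshold on how often a pair must co-occur.
    """
    src_freq, tgt_freq, joint = Counter(), Counter(), Counter()
    for src_tokens, tgt_tokens in pairs:
        # Count each token at most once per segment.
        src_set, tgt_set = set(src_tokens), set(tgt_tokens)
        src_freq.update(src_set)
        tgt_freq.update(tgt_set)
        joint.update(product(src_set, tgt_set))
    # Dice = 2 * joint count / (source count + target count)
    return {
        (s, t): 2 * c / (src_freq[s] + tgt_freq[t])
        for (s, t), c in joint.items()
        if c >= min_count
    }


# Made-up toy segments (not from the paper's corpus): the name is
# transliterated consistently, while 王 is rendered as "king" in one
# segment and "prince" in another.
toy = [
    (["项羽", "王"], ["Xiang", "Yu", "king"]),
    (["项羽", "兵"], ["Xiang", "Yu", "troops"]),
    (["王", "兵"], ["prince", "soldiers"]),
]
scores = dice_scores(toy)
```

In this toy run only the transliterated name survives the threshold, paired with both "Xiang" and "Yu"; no pair containing 王 is kept, because neither of its two renderings co-occurs with it often enough. This mirrors, in a very simplified form, the failure mode the abstract reports for terms translated differently by the two translators.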