
Low-frequency words in bilingual corpora: a step towards automatic extraction of bilingual word pairs


Abstract

The high-frequency bilingual word pairs in bilingual corpora are already listed in dictionaries. It is the low-frequency pairs that we have to extract. Based on that idea, we examine the method for automatically extracting bilingual word pairs from corpora and show that the statistical method, which has been studied intensively so far, is not suitable for the task. If two words J1 and J2, which belong to the same language, always co-occur in the same alignments, the statistical method cannot determine which of them is the correct translation of a word F, which belongs to the other language. We found that many of the low-frequency words are in this situation.
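The tie described in the abstract can be illustrated with a small sketch. The abstract does not name a specific statistic, so the Dice coefficient is used here purely as an assumed stand-in for "the statistical method"; the function name `dice_scores` and the toy corpus are hypothetical and not taken from the paper.

```python
from collections import Counter

def dice_scores(aligned_pairs, target_word):
    """Dice coefficient between target_word and each word on the other side.

    aligned_pairs: list of (source_words, target_words) tuples, one per alignment.
    The Dice coefficient is an assumed example of a co-occurrence statistic.
    """
    src_freq = Counter()   # number of alignments containing each source word
    co_freq = Counter()    # number of alignments containing both the word and target_word
    tgt_freq = 0           # number of alignments containing target_word
    for src_words, tgt_words in aligned_pairs:
        src_set = set(src_words)
        src_freq.update(src_set)
        if target_word in tgt_words:
            tgt_freq += 1
            co_freq.update(src_set)
    return {w: 2 * co_freq[w] / (src_freq[w] + tgt_freq) for w in src_freq}

# Toy corpus (hypothetical): J1 and J2 appear only together, always alongside F.
corpus = [
    (["J1", "J2", "J3"], ["F", "G"]),
    (["J1", "J2", "J4"], ["F", "H"]),
    (["J3", "J4"], ["G", "H"]),
]

scores = dice_scores(corpus, "F")
for word in sorted(scores):
    print(word, scores[word])
```

Because J1 and J2 occur in exactly the same alignments, every count that enters the score is identical for both words, so any measure built only from co-occurrence counts ranks them equally as candidate translations of F. With low-frequency words there are too few alignments to break such ties.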
