Conference: 情報処理学会全国大会 (National Convention of the Information Processing Society of Japan)

Word Alignment Based Bilingual Terminology Extraction from a Chinese-Japanese Parallel Corpus



Abstract

The automatic extraction of bilingual single-word terms (SWTs) has been very successful, but for multi-word terms (MWTs) precision remains far from sufficient. This paper proposes a new approach to the automatic extraction of bilingual MWTs from a parallel corpus. We combine an existing monolingual term extractor with a word alignment tool to extract bilingual terms. We introduce a re-segmentation step that rewrites each MWT as a single lexical unit, so that the word aligner treats it as one token. We also improve the MWT extraction rules of the existing term extractor, and our experiment shows that this improvement is effective. In an experiment on a Chinese-Japanese parallel corpus, we obtained good precision and an improved BLEU score.
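The re-segmentation idea can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `merge_terms` and `term_pairs`, the `_` joiner, the greedy longest-match strategy, and the toy sentences are all assumptions made for illustration. The sketch assumes that monolingual term lists and word alignment links (pairs of source and target token indices) are produced by external tools, which the abstract does not name.

```python
from typing import Iterable, List, Sequence, Set, Tuple

JOINER = "_"  # hypothetical marker for a merged multi-word term


def merge_terms(tokens: Sequence[str],
                terms: Iterable[Sequence[str]]) -> List[str]:
    """Greedy longest-match: rewrite each known MWT as a single token."""
    term_set: Set[Tuple[str, ...]] = {tuple(t) for t in terms if len(t) > 1}
    max_len = max((len(t) for t in term_set), default=1)
    merged: List[str] = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate span first, down to length 2.
        for n in range(min(max_len, len(tokens) - i), 1, -1):
            if tuple(tokens[i:i + n]) in term_set:
                merged.append(JOINER.join(tokens[i:i + n]))
                i += n
                break
        else:
            # No term starts here: keep the token unchanged.
            merged.append(tokens[i])
            i += 1
    return merged


def term_pairs(src: Sequence[str], tgt: Sequence[str],
               links: Iterable[Tuple[int, int]]) -> List[Tuple[str, str]]:
    """Keep alignment links whose two ends are both merged MWTs."""
    return [(src[i], tgt[j]) for i, j in links
            if JOINER in src[i] and JOINER in tgt[j]]


# Toy example; the term lists and alignment links are made up.
zh = merge_terms(["机器", "翻译", "系统", "的", "评价"], [("机器", "翻译")])
ja = merge_terms(["機械", "翻訳", "システム", "の", "評価"], [("機械", "翻訳")])
links = [(0, 0), (1, 1), (2, 2), (3, 3)]  # as if output by a word aligner
pairs = term_pairs(zh, ja, links)
```

Because each MWT is a single token after merging, an aligner only has to link one token to one token, rather than recover a many-to-many link between the component words.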
