Conference: 20th International Conference on Computational Linguistics, vol. 2
Robust Sub-Sentential Alignment of Phrase-Structure Trees

Abstract

Data-Oriented Translation (DOT), based on Data-Oriented Parsing (DOP), is a language-independent MT engine which exploits parsed, aligned bitexts to produce very high-quality translations. However, data acquisition constitutes a serious bottleneck, as DOT requires parsed sentences aligned at both sentential and sub-structural levels. Manual sub-structural alignment is time-consuming, error-prone and requires considerable knowledge of both source and target languages and how they are related. Automating this process is essential in order to carry out the large-scale translation experiments necessary to assess the full potential of DOT. We present a novel algorithm which automatically induces sub-structural alignments between context-free phrase structure trees in a fast and consistent fashion, requiring little or no knowledge of the language pair. We present results from a number of experiments which indicate that our method provides a serious alternative to manual alignment.