Foreign-language conference: Progress in pattern recognition, image analysis, computer vision, and applications

Incorporating Linguistic Information to Statistical Word-Level Alignment


Abstract

Alignment algorithms enrich parallel texts by establishing a relationship between the structures of the languages involved. Depending on the alignment level, the enrichment operates on the paragraphs, sentences, or words of the source-language content and its translation. There are two main approaches to word-level alignment: statistical and linguistic. Because languages have dissimilar grammar rules, statistical algorithms usually give lower precision, which is why development generally targets a specific language pair and draws on linguistic techniques. This paper presents a hybrid alignment system that combines the two traditional approaches. It offers user-friendly configuration and adapts to the computational environment. The system uses linguistic resources and procedures such as cognate identification, morphological information, syntactic trees, dictionaries, and semantic domains. We show that the system outperforms existing algorithms.
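
To make the idea of a hybrid aligner concrete, the following is a minimal sketch, not the system described in the paper. It assumes two stand-in components that the abstract does not specify: a Dice coefficient over sentence-level co-occurrence as the statistical signal, and the longest common subsequence ratio (LCSR) as a simple cognate detector. The function names and the `weight` parameter are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import product


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of two strings."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def cognate_score(src: str, tgt: str) -> float:
    """LCSR in [0, 1]; an assumed stand-in for cognate identification."""
    if not src or not tgt:
        return 0.0
    return lcs_length(src.lower(), tgt.lower()) / max(len(src), len(tgt))


def dice_scores(corpus):
    """Dice coefficient per word pair; an assumed stand-in for the statistical model."""
    src_count, tgt_count, pair_count = Counter(), Counter(), Counter()
    for src_sent, tgt_sent in corpus:
        src_set, tgt_set = set(src_sent), set(tgt_sent)
        src_count.update(src_set)
        tgt_count.update(tgt_set)
        pair_count.update(product(src_set, tgt_set))
    return {
        (s, t): 2 * c / (src_count[s] + tgt_count[t])
        for (s, t), c in pair_count.items()
    }


def align(src_sent, tgt_sent, dice, weight=0.5):
    """Link each source word to the target word with the best blended score."""
    def blended(s, t):
        return (1 - weight) * dice.get((s, t), 0.0) + weight * cognate_score(s, t)

    return [(s, max(tgt_sent, key=lambda t: blended(s, t))) for s in src_sent]


# Toy English-Spanish corpus, invented for this example.
corpus = [
    (["the", "animal"], ["el", "animal"]),
    (["the", "hotel"], ["el", "hotel"]),
]
dice = dice_scores(corpus)
links = align(["the", "animal"], ["el", "animal"], dice)
```

With `weight=0` the sketch falls back to purely statistical linking, and with `weight=1` it relies on string similarity alone. A fuller hybrid would add the other resources the abstract lists, such as morphological information, syntactic trees, dictionaries, and semantic domains.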
