Source: Proceedings of the workshop on Student research

Efficient parsing strategies for syntactic analysis of closed captions

Abstract

We present an efficient multi-level chart parser designed for syntactic analysis of closed captions (subtitles) in a real-time Machine Translation (MT) system. To achieve high parsing speed, we divided an existing English grammar into multiple levels. The parser proceeds in stages, and at each stage only the rules belonging to one level are used. A constituent pruning step is added between levels to ensure that constituents unlikely to be part of the final parse are removed. This results in a significant reduction in parse time and ambiguity. Since the domain is unrestricted, out-of-coverage sentences are to be expected, and the parser might not produce a single analysis spanning the whole input. Despite the incomplete parsing strategy and the radical pruning, the initial evaluation results show that the loss of parsing accuracy is acceptable. The parsing time compares favorably with that of a Tomita parser and of a chart parser when run on the same grammar and lexicon.
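The staged strategy described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the names `Edge`, `apply_level`, `prune` and `parse`, the toy two-level grammar, and in particular the pruning heuristic (drop any constituent strictly contained in a longer constituent built at the current level). The abstract does not specify the authors' actual pruning criterion, so this is not their method.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    """A constituent `label` covering token positions [start, end)."""
    label: str
    start: int
    end: int


# A rule is (left-hand side, non-empty right-hand side of labels).
Rule = tuple[str, tuple[str, ...]]


def spans(chart, rhs, start):
    """Yield end positions of adjacent edge sequences labelled `rhs` from `start`."""
    if not rhs:
        yield start
        return
    for edge in chart:
        if edge.label == rhs[0] and edge.start == start:
            yield from spans(chart, rhs[1:], edge.end)


def apply_level(chart, rules):
    """Add every constituent derivable with one level's rules (fixed point)."""
    chart = set(chart)
    while True:
        starts = {e.start for e in chart}
        new = {
            Edge(lhs, s, e)
            for lhs, rhs in rules
            for s in starts
            for e in spans(chart, rhs, s)
        } - chart
        if not new:
            return chart
        chart |= new


def prune(chart, built):
    """Hypothetical heuristic: drop edges strictly inside a longer edge
    built at this level. The paper's real criterion is not given here."""
    def covered(e):
        return any(
            b.start <= e.start and e.end <= b.end
            and (b.end - b.start) > (e.end - e.start)
            for b in built
        )
    return {e for e in chart if not covered(e)}


def parse(tags, levels):
    """Run the grammar level by level, pruning between levels."""
    chart = {Edge(tag, i, i + 1) for i, tag in enumerate(tags)}
    for rules in levels:
        closed = apply_level(chart, rules)
        chart = prune(closed, closed - chart)
    return chart


# Toy grammar split into two levels (assumed, not from the paper).
LEVELS = [
    [("NP", ("Det", "N"))],
    [("VP", ("V",)), ("S", ("NP", "VP"))],
]

full = parse(["Det", "N", "V"], LEVELS)     # e.g. "the dog barks"
partial = parse(["Det", "N", "X"], LEVELS)  # "X" is out of coverage
```

With this toy grammar, `full` contains a single `S` edge spanning the whole input, while `partial` keeps an `NP` fragment next to the uncovered token, mirroring the abstract's remark that the parser might not produce one analysis for the whole sentence.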
