Complexity of lexical descriptions and its relevance to partial parsing.


Abstract

In this dissertation, we have proposed novel methods for robust parsing that integrate the flexibility of linguistically motivated lexical descriptions with the robustness of statistical techniques. Our thesis is that the computation of linguistic structure can be localized if lexical items are associated with rich descriptions (supertags) that impose complex constraints in a local context. However, increasing the complexity of descriptions makes the number of different descriptions for each lexical item much larger and hence increases the local ambiguity for a parser. This local ambiguity can be resolved by using supertag co-occurrence statistics collected from parsed corpora. We have explored these ideas in the context of the Lexicalized Tree-Adjoining Grammar (LTAG) framework, wherein supertag disambiguation provides a representation that is an almost parse. We have used the disambiguated supertag sequence in conjunction with a lightweight dependency analyzer to compute noun groups, verb groups, dependency linkages and even partial parses. We have shown that a trigram-based supertagger achieves an accuracy of 92.1% on Wall Street Journal (WSJ) texts. Furthermore, we have shown that the lightweight dependency analysis on the output of the supertagger identifies 83% of the dependency links accurately. We have exploited the representation of supertags with Explanation-Based Learning to improve parsing efficiency. In this approach, parsing in limited domains can be modeled as a Finite-State Transduction. We have implemented such a system for the ATIS domain which improves parsing efficiency by a factor of 15. We have used the supertagger in a variety of applications to provide lexical descriptions at an appropriate granularity. In an information retrieval application, we show that the supertag-based system performs at higher levels of precision than a system based on part-of-speech tags. In an information extraction task, supertags are used in specifying extraction patterns. For language modeling applications, we view supertags as syntactically motivated class labels in a class-based language model. The distinction between recursive and non-recursive supertags is exploited in a sentence simplification application.
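
The abstract's idea of resolving local ambiguity with supertag co-occurrence statistics can be illustrated with a minimal sketch of trigram disambiguation. Everything here is an assumption made for illustration: the toy corpus, the tag names (`DET`, `N_MOD`, and so on, which stand in for real LTAG elementary trees), the additive smoothing constant `alpha`, and the function names `train` and `supertag`. It is not the dissertation's implementation, only a small Viterbi search over a trigram tag model of the general kind the abstract describes.

```python
import math
from collections import defaultdict

# Toy supertagged corpus (hypothetical tags, not actual LTAG tree names).
CORPUS = [
    [("the", "DET"), ("price", "N"), ("includes", "V_TRANS"), ("the", "DET"), ("tax", "N")],
    [("the", "DET"), ("price", "N_MOD"), ("increase", "N"), ("includes", "V_TRANS"), ("tax", "N")],
    [("prices", "N"), ("increase", "V_INTRANS")],
]

def train(corpus):
    """Collect trigram tag counts, emission counts and a word -> tags lexicon."""
    tri = defaultdict(int)        # (t_{i-2}, t_{i-1}, t_i) counts
    ctx = defaultdict(int)        # (t_{i-2}, t_{i-1}) counts
    emit = defaultdict(int)       # (tag, word) counts
    tag_count = defaultdict(int)  # tag counts
    lexicon = defaultdict(set)    # supertags observed for each word
    for sent in corpus:
        tags = ["<s>", "<s>"] + [t for _, t in sent]
        for i in range(2, len(tags)):
            tri[tuple(tags[i - 2:i + 1])] += 1
            ctx[tuple(tags[i - 2:i])] += 1
        for word, tag in sent:
            emit[(tag, word)] += 1
            tag_count[tag] += 1
            lexicon[word].add(tag)
    return tri, ctx, emit, tag_count, lexicon

def supertag(words, model, alpha=0.1):
    """Return the most probable tag sequence under a smoothed trigram model."""
    tri, ctx, emit, tag_count, lexicon = model
    n_tags = len(tag_count)
    vocab = len(lexicon) + 1  # one extra slot for unknown words

    def log_trans(t2, t1, t):
        return math.log((tri[(t2, t1, t)] + alpha) / (ctx[(t2, t1)] + alpha * n_tags))

    def log_emit(t, w):
        return math.log((emit[(t, w)] + alpha) / (tag_count[t] + alpha * vocab))

    # Viterbi state is the pair of the two most recent tags.
    best = {("<s>", "<s>"): (0.0, [])}
    for w in words:
        # Known words are restricted to their observed supertags;
        # unknown words may take any tag.
        candidates = lexicon[w] if w in lexicon else list(tag_count)
        new_best = {}
        for (t2, t1), (score, seq) in best.items():
            for t in candidates:
                s = score + log_trans(t2, t1, t) + log_emit(t, w)
                key = (t1, t)
                if key not in new_best or s > new_best[key][0]:
                    new_best[key] = (s, seq + [t])
        best = new_best
    return max(best.values(), key=lambda item: item[0])[1]

model = train(CORPUS)
print(supertag(["the", "price", "increase", "includes", "tax"], model))
```

On this toy data the word "price" is ambiguous between `N` and `N_MOD`, and "increase" between `N` and `V_INTRANS`; the trigram context is what selects one reading per word. A real supertagger would face far more candidate tags per word and would need a more careful smoothing scheme than the additive one assumed here.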
