Modular stochastic HPSGs for question answering.

Abstract

At the highest level, we explore the practical problem of grammar modularity in natural language processing (NLP). Two aspects of the problem are modular design and modular use of NL grammars. We define grammar modules and describe two operations: merging two grammar modules into a larger module, and extracting a subgrammar module from a larger module given an application context, e.g., a text and the type of information needed. Grammar modularity can be applied to various domains, especially in distributed NLP, a synergetic area of the Internet and NLP techniques.

For the formal materialization of this higher-level approach we use HPSG, the Head-driven Phrase Structure Grammar formalism. We define the formalism in a concise way, which is more amenable to implementation in procedural programming languages than previous approaches. We define grammar modules and module merging in the context of this formalism, and present and analyze algorithms for subgrammar extraction for context-free and HPSG grammars.

On the practical side, we use the problem of open-domain question answering to illustrate the use and usefulness of the approach. The question-answering framework of the well-known TREC conference is used: the task is to find a short answer to a NL question as a substring of a document from the given document collection. We show that our novel approach can be successfully used with a classical information retrieval search engine.

We describe an implementation of the HPSG parser in Java. Motivated by the recent successes of probabilistic parsers, a stochastic component of the HPSG formalism is defined and implemented in the parser. The parser uses known techniques for efficient graph unification and parsing, such as hidden structure sharing.

The rest of our QA system is implemented in Perl. It includes the parts for managing grammar modules and for subgrammar extraction.

[…] is described. The advantages of this contribution include a more compact memory representation, efficient memory management within the algorithm, sub-node hidden structure sharing, and a flat structure without frequent function calls. A Java and a C implementation of the algorithm are given in appendices.
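To make the idea of subgrammar extraction for context-free grammars more concrete, here is a minimal Python sketch. It is not the algorithm from the thesis, which this record does not reproduce. It is the standard removal of useless rules, restricted to the words of a given text. The function name `extract_subgrammar`, the rule representation, and the toy grammar are all assumptions made for illustration.

```python
from typing import List, Set, Tuple

# A rule is (left-hand side, right-hand side symbols). Hypothetical representation.
Rule = Tuple[str, Tuple[str, ...]]


def extract_subgrammar(rules: List[Rule], start: str, tokens: Set[str]) -> List[Rule]:
    """Keep only rules that could take part in deriving, from `start`,
    a string built from the words in `tokens` (a superset filter, not a parser)."""
    nonterminals = {lhs for lhs, _ in rules}
    productive: Set[str] = set()

    def usable(rhs: Tuple[str, ...]) -> bool:
        # Nonterminals must be productive; terminals must occur in the text.
        return all((s in productive) if s in nonterminals else (s in tokens)
                   for s in rhs)

    # Step 1: fixed point over nonterminals that can derive some string over `tokens`.
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            if lhs not in productive and usable(rhs):
                productive.add(lhs)
                changed = True

    kept = [(lhs, rhs) for lhs, rhs in rules if usable(rhs)]

    # Step 2: drop rules whose left-hand side is unreachable from the start symbol.
    reachable = {start}
    frontier = [start]
    while frontier:
        sym = frontier.pop()
        for lhs, rhs in kept:
            if lhs == sym:
                for s in rhs:
                    if s in nonterminals and s not in reachable:
                        reachable.add(s)
                        frontier.append(s)

    return [(lhs, rhs) for lhs, rhs in kept if lhs in reachable]


# Toy grammar and text, invented for this example.
rules: List[Rule] = [
    ("S", ("NP", "VP")),
    ("NP", ("who",)),
    ("NP", ("the", "N")),
    ("N", ("parser",)),
    ("N", ("grammar",)),
    ("VP", ("V", "NP")),
    ("V", ("wrote",)),
    ("V", ("merged",)),
]
sub = extract_subgrammar(rules, "S", {"who", "wrote", "the", "parser"})
for lhs, rhs in sub:
    print(lhs, "->", " ".join(rhs))
```

For this text the lexical rules for "grammar" and "merged" are dropped, while every other rule is kept. The filter only checks necessary conditions, so the extracted module may still contain rules that no actual parse of the text uses.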

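Bibliographic record

  • Author: Keselj, Vlado
  • Author affiliation: University of Waterloo (Canada)
  • Degree-granting institution: University of Waterloo (Canada)
  • Discipline: Computer Science
  • Degree: Ph.D.
  • Year: 2002
  • Pages: 320 p.
  • Original format: PDF
  • Language of text: eng
  • Chinese Library Classification: Automation technology, computer technology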
