Natural language parsing as statistical pattern recognition.

Abstract

Traditional natural language parsers are based on rewrite rule systems developed in an arduous, time-consuming manner by grammarians. A majority of the grammarian's efforts are devoted to the disambiguation process, first hypothesizing rules which dictate constituent categories and relationships among words in ambiguous sentences, and then seeking exceptions and corrections to these rules.

In this work, I propose an automatic method for acquiring a statistical parser from a set of parsed sentences which takes advantage of some initial linguistic input, but avoids the pitfalls of the iterative and seemingly endless grammar development process. Based on distributionally-derived and linguistically-based features of language, this parser acquires a set of statistical decision trees which assign a probability distribution on the space of parse trees given the input sentence. These decision trees take advantage of a significant amount of contextual information, potentially including all of the lexical information in the sentence, to produce highly accurate statistical models of the disambiguation process. By basing the disambiguation criteria selection on entropy reduction rather than human intuition, this parser development method is able to consider more sentences than a human grammarian can when making individual disambiguation rules.

In experiments between a parser, acquired using this statistical framework, and a grammarian's rule-based parser, developed over a ten-year period, both using the same training material and test sentences, the decision tree parser significantly outperformed the grammar-based parser on the accuracy measure which the grammarian was trying to maximize, achieving an accuracy of 78% compared to the grammar-based parser's 69%.
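
The abstract's idea of choosing disambiguation criteria by entropy reduction can be illustrated with a minimal sketch. This is not the thesis's implementation: it is the generic information-gain computation used when growing a decision tree, and the toy attachment data and the names `EVENTS`, `QUESTIONS`, `entropy_reduction`, and `best_question` are hypothetical, introduced only for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of the empirical distribution of labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def entropy_reduction(events, question):
    """Entropy before a yes/no split minus the weighted entropy after it."""
    labels = [label for _, label in events]
    yes = [label for feats, label in events if question(feats)]
    no = [label for feats, label in events if not question(feats)]
    n = len(events)
    after = (len(yes) / n) * entropy(yes) + (len(no) / n) * entropy(no)
    return entropy(labels) - after


def best_question(events, questions):
    """Name of the candidate question with the largest entropy reduction."""
    return max(questions, key=lambda name: entropy_reduction(events, questions[name]))


# Hypothetical toy data: each event pairs context features with the
# correct attachment decision ("verb" or "noun") for a prepositional phrase.
EVENTS = [
    ({"verb": "saw", "prep": "with"}, "verb"),
    ({"verb": "saw", "prep": "of"}, "noun"),
    ({"verb": "ate", "prep": "with"}, "verb"),
    ({"verb": "ate", "prep": "of"}, "noun"),
]

# Hypothetical candidate questions about the context.
QUESTIONS = {
    "prep_is_with": lambda f: f["prep"] == "with",
    "verb_is_saw": lambda f: f["verb"] == "saw",
}

if __name__ == "__main__":
    for name, q in QUESTIONS.items():
        print(name, entropy_reduction(EVENTS, q))
    print("best:", best_question(EVENTS, QUESTIONS))
```

In this toy data the preposition fully determines the attachment while the verb tells nothing, so the preposition question is selected. A real decision-tree learner would apply the same selection recursively to each resulting subset, over far more events and candidate questions.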

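Bibliographic record

Author: Magerman, David Mitchell
Author affiliation: Stanford University
Degree-granting institution: Stanford University
Subject: Computer Science; Statistics; Language Linguistics
Degree: Ph.D.
Year: 1994
Pages: 158
Original format: PDF
Language of text: English
Chinese Library Classification: Automation technology, computer technology; statistics; linguistics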

Similar literature

1. On the relation between dependency distance, crossing dependencies, and parsing: Comment on "Dependency distance: a new perspective on syntactic patterns in natural languages" by Haitao Liu et al. [J]. Gomez-Rodriguez, Carlos. Physics of Life Reviews, 2017.
2. Lexicalized and Statistical Parsing of Natural Language Text in Tamil using Hybrid Language Models [J]. M. Selvam, A. M. Natarajan, R. Thangarajan. WSEAS Transactions on Computers, 2008, no. 8.
3. Statistical Grammar Induction for Natural Language Parsing [J]. Hiroyuki Shindo. NTT Technical Review, 2014, no. 1.
4. Can Modern Statistical Parsers Lead to Better Natural Language Understanding for Education? [C]. Umair Z. Ahmed, Arpit Kumar, Monojit Choudhury. International Conference on Intelligent Text Processing and Computational Linguistics (CICLing 2012), 2012.
5. Learning for semantic parsing and natural language generation using statistical machine translation techniques [D]. Wong, Yuk Wah. 2007.
6. Using a Statistical Natural Language Parser Augmented with the UMLS Specialist Lexicon to Assign SNOMED CT Codes to Anatomic Sites and Pathologic Diagnoses in Full Text Pathology Reports [O]. Henry J. Lowe, Yang Huang, Donald P. Regula. 2009.
7. Natural Language Parsing as Statistical Pattern Recognition [O]. Magerman, David M. 1994.
8. Natural Language Parsing as Statistical Pattern Recognition [R]. Magerman, D. M. 1994.
