Error-Driven Pruning of Treebank Grammars for Base Noun Phrase Identification


Abstract

Finding simple, non-recursive, base noun phrases is an important subtask for many natural language processing applications. While previous empirical methods for base NP identification have been rather complex, this paper instead proposes a very simple algorithm that is tailored to the relative simplicity of the task. In particular, we present a corpus-based approach for finding base NPs by matching part-of-speech tag sequences. The training phase of the algorithm is based on two successful techniques: first, the base NP grammar is read from a "treebank" corpus; then the grammar is improved by selecting rules with high "benefit" scores. Using this simple algorithm with a naive heuristic for matching rules, we achieve surprising accuracy in an evaluation on the Penn Treebank Wall Street Journal.
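
The two training steps and the matching step described in the abstract can be illustrated with a minimal Python sketch. It rests on assumptions the abstract does not spell out: a rule's "benefit" is taken to be its number of correct matches minus its number of incorrect matches on a held-out corpus, and the "naive heuristic" is taken to be a left-to-right longest match. The function names (extract_rules, match, prune), the threshold parameter, and the toy tagged sentences are all hypothetical and chosen only for illustration.

```python
from collections import Counter

def extract_rules(tagged_sentences):
    """Read the grammar: every annotated base NP contributes its POS-tag sequence."""
    rules = set()
    for tags, np_spans in tagged_sentences:
        for start, end in np_spans:  # spans are (start, end) with end exclusive
            rules.add(tuple(tags[start:end]))
    return rules

def match(tags, rules):
    """Assumed heuristic: scan left to right, taking the longest matching rule."""
    max_len = max((len(r) for r in rules), default=0)
    found, i = [], 0
    while i < len(tags):
        for n in range(min(max_len, len(tags) - i), 0, -1):
            if tuple(tags[i:i + n]) in rules:
                found.append((i, i + n))
                i += n
                break
        else:
            i += 1  # no rule starts here; move on one token
    return found

def prune(rules, held_out, threshold=0):
    """Assumed benefit: correct matches minus incorrect matches on held-out data."""
    benefit = Counter()
    for tags, np_spans in held_out:
        gold = set(np_spans)
        for start, end in match(tags, rules):
            rule = tuple(tags[start:end])
            benefit[rule] += 1 if (start, end) in gold else -1
    return {r for r in rules if benefit[r] >= threshold}

# Toy data (hypothetical): each item is (POS tags, gold base NP spans).
train = [
    (["DT", "NN", "VBZ", "DT", "JJ", "NN"], [(0, 2), (3, 6)]),
    (["VBG", "NNS", "VBD"], [(0, 2)]),
]
held_out = [
    (["DT", "NN", "VBD", "VBG", "NNS"], [(0, 2), (4, 5)]),
]

rules = extract_rules(train)
before = match(held_out[0][0], rules)
pruned = prune(rules, held_out)
after = match(held_out[0][0], pruned)
```

In this toy run the rule VBG NNS fires once on the held-out sentence and is wrong, so its benefit is negative and it is discarded, while rules that were correct or never fired are kept. A real setup would use a far larger treebank and might set the threshold differently.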

Bibliographic details

Source
Annual meeting of the Association for Computational Linguistics; International conference on computational linguistics; ICCL | 1998 | pp. 218-224 | 7 pages
Authors
Claire Cardie; David Pierce
Original format: PDF
Chinese Library Classification: Automation technology, computer technology

Similar literature

1. Highly accurate error-driven method for noun phrase detection [J]. Lourdes Araujo, J. Ignacio Serrano. Pattern Recognition Letters, 2008, No. 4.
2. Parsing Noun Phrases in the Penn Treebank [J]. David Vadas, James R. Curran. Computational Linguistics, 2011, No. 4.
3. Dependency grammar feature based noun phrase extraction for text summarization [J]. Mrs. Dipti Sakhare, Dr. Rajkumar. International Journal of Computer Trends and Technology, 2011, No. 1.
4. Error-Driven Pruning of Treebank Grammars for Base Noun Phrase Identification [C]. Claire Cardie, David Pierce. Annual meeting of the Association for Computational Linguistics, 1998.
5. Khmer nouns and noun phrases: A dependency grammar analysis [D]. Sak-Humphry, Chhany. 1996.
6. Improved Identification of Noun Phrases in Clinical Radiology Reports Using a High-Performance Statistical Natural Language Parser Augmented with the UMLS Specialist Lexicon [O]. Yang Huang, Henry J. Lowe, Dan Klein. 2005.
7. Error-Driven Pruning of Treebank Grammars for Base Noun Phrase Identification [O]. 2008.
8. Computational Structure of GPSG (Generalized Phrase Structure Grammar) Models: Revised Generalized Phrase Structure Grammar [R]. Ristad, E. S. 1989.
