Fast Statistical Parsing of Noun Phrases for Document Indexing


Abstract

Information Retrieval (IR) is an important application area of Natural Language Processing (NLP) where one encounters the genuine challenge of processing large quantities of unrestricted natural language text. While much effort has been made to apply NLP techniques to IR, very few NLP techniques have been evaluated on a document collection larger than several megabytes. Many NLP techniques are simply not efficient enough, and not robust enough, to handle a large amount of text. This paper proposes a new probabilistic model for noun phrase parsing, and reports on the application of such a parsing technique to enhance document indexing. The effectiveness of using syntactic phrases provided by the parser to supplement single words for indexing is evaluated with a 250-megabyte document collection. The experimental results show that supplementing single words with syntactic phrases for indexing consistently and significantly improves retrieval performance.


Bibliographic details

Source: Fifth Conference on Applied Natural Language Processing, 1997, pp. 312-319 (8 pages)
Conference location: Washington, DC (US)
Author: Chengxiang Zhai
Author affiliation: Laboratory for Computational Linguistics, Carnegie Mellon University, Pittsburgh, PA 15213

Original format: PDF
Language: English
Chinese Library Classification: Computer software

