Statistical Identification of Key Phrases for Text Classification

Abstract

Algorithms for text classification generally involve two stages, the first of which aims to identify textual elements (words and/or phrases) that may be relevant to the classification process. This stage often involves an analysis of the text that is both language-specific and possibly domain-specific, and may also be computationally costly. In this paper we examine a number of alternative keyword-generation methods and phrase-construction strategies that identify key words and phrases by simple, language-independent statistical properties. We present results that demonstrate that these methods can produce good classification accuracy, with the best results being obtained using a phrase-based approach.

Bibliographic record

Source
Machine Learning and Data Mining in Pattern Recognition (MLDM 2007); July 18-20, 2007; Leipzig (DE) | 2007 | pp. 838-853 | 16 pages
Conference location: Leipzig (DE)
Authors
Frans Coenen; Paul Leng; Robert Sanderson; Yanbo J. Wang
Author affiliation

Department of Computer Science, The University of Liverpool, Ashton Building, Ashton Street, Liverpool L69 3BX, United Kingdom

Original format: PDF
Text language: English (eng)
Chinese Library Classification: Computer applications
Keywords
text classification; text preprocessing;

Similar literature

1. Detection of Neutral Phrases and Polarity Shifting of Few Phrases for Effective Classification of Opinionated Texts [J]. M. K. Anil Kumar, Suresha. International Journal of Computational Intelligence Research. 2010, No. 1
2. Mitigating backdoor attacks in LSTM-based text classification systems by Backdoor Keyword Identification [J]. Chen Chuanshuai, Dai Jiazhu. Neurocomputing. 2021, Sep. 10 issue
3. Learning Phrase Patterns for Text Classification [J]. Bin Zhang, Marin A., Hutchinson B. IEEE Transactions on Audio, Speech, and Language Processing. 2013, No. 6
4. Statistical Identification of Key Phrases for Text Classification [C]. Frans Coenen, Paul Leng, Robert Sanderson. Machine Learning and Data Mining in Pattern Recognition International Conference. 2007
5. Noun phrases in documents: Preprocessing, automatic extraction, and statistical analysis in different categories of text. [D]. Kim, Youngin. 2002
6. A free-text processing system to capture physical findings: Canonical Phrase Identification System (CAPIS). [O]. R. Lin, L. Lenert, B. Middleton, 1991
7. Statistical Identification of Key Phrases for Text Classification [O]. Frans Coenen, Paul Leng, Robert S, 2009
