An ensemble based NLP feature assessment in binary classification

Abstract

Text feature selection plays an important role in text mining. Terms are the key elements of document representation, which supports applications such as indexing, summarization, classification, clustering, and filtering. Text instances pose the challenge of a high-dimensional feature space, yet the right features can be extremely useful in text analysis, so it is important to extract the important terms from a document. In this paper, we examine the impact of NLP features (stop word removal, stemming, and the combination of both) on the predictive performance of base classifiers and of Naive Bayes-based ensembles. We selected base classifiers from different categories, namely NB, SVM, KNN, and J48, because researchers use them frequently in text mining. The IMDB movie review dataset serves as the standard dataset for the experiments. We built ensembles of Naive Bayes with the base classifiers and found that the ensembles outperform the base classifiers on every NLP-preprocessed variant of the dataset. The ensemble of NB with SVM outperformed the other ensembles across the dataset variants.
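The three NLP feature settings named in the abstract (stop word removal, stemming, and both combined) can be pictured as alternative preprocessing passes over the same review text. The following is a minimal sketch under stated assumptions, not the authors' pipeline: the stop word list, the suffix-stripping `toy_stem` function, and every function name here are hypothetical stand-ins, and the abstract does not say which stop word list, stemmer, or toolkit was actually used.

```python
import re

# Hypothetical, deliberately tiny stop word list (an assumption for illustration).
STOP_WORDS = frozenset(
    {"a", "an", "the", "is", "was", "and", "of", "to", "in", "it", "this"}
)

# Suffixes for a toy stemmer that stands in for a real one such as Porter's.
SUFFIXES = ("ing", "ed", "ly", "es", "s")


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stop_words(tokens):
    """Drop every token that appears in the stop word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def toy_stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def stem_all(tokens):
    """Apply the toy stemmer to each token."""
    return [toy_stem(t) for t in tokens]


def build_variants(text):
    """Return the unprocessed tokens plus the three preprocessed variants."""
    tokens = tokenize(text)
    return {
        "raw": tokens,
        "stop_words_removed": remove_stop_words(tokens),
        "stemmed": stem_all(tokens),
        "stop_words_removed_and_stemmed": stem_all(remove_stop_words(tokens)),
    }


if __name__ == "__main__":
    review = "The acting was amazing and the movies ended slowly"
    for name, tokens in build_variants(review).items():
        print(name, tokens)
```

Each variant would then be vectorized and passed to the base classifiers and ensembles, so that any difference in accuracy can be attributed to the preprocessing step rather than to the model.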

Bibliographic details

Source
International Conference on Computing, Communication and Automation | 2017 | pp. 345-349 | 5 pages
Authors
Saurabh Kr. Srivasatava; Roshan Kumari; Sandeep Kr. Singh
Original format: PDF
Keywords
Text categorization; Support vector machines; Niobium; Motion pictures; Feature extraction; Text mining;

Similar literature

1. Featured Clustering and Ranking-Based Bad Cluster Removal for Hyperspectral Band Selection and Classification Using Ensemble of Binary SVM Classifiers [J]. Kalidindi Kishore Raju, Varma Pardha Saradhi G., Davuluri Rajyalakshmi. International Journal of Information Technology Project Management. 2021, No. 4

2. Compensation of feature selection biases accompanied with improved predictive performance for binary classification by using a novel ensemble feature selection approach [J]. Ursula Neumann, Mona Riemenschneider, Jan-Peter Sowa. BioData Mining. 2016, No. 1

3. Boosting Ensembles of Heavy Two-Layer Perceptrons for Increasing Classification Accuracy in Recognizing Shifted-Turned-Scaled Flat Images with Binary Features [J]. Vadim V. Romanuke. Journal of Information and Organizational Sciences. 2015, No. 1

4. An ensemble based NLP feature assessment in binary classification [C]. Saurabh Kr. Srivasatava, Roshan Kumari, Sandeep Kr. Singh. International Conference on Computing, Communication and Automation. 2017

5. Approaches to Feature Identification and Feature Selection for Binary and Multi-Class Classification [D]. Zhang, Zisheng. 2007

6. Compensation of feature selection biases accompanied with improved predictive performance for binary classification by using a novel ensemble feature selection approach [O]. Ursula Neumann, Mona Riemenschneider, Jan-Peter Sowa. 2016
