Effect of stop word removal on the performance of naive Bayesian methods for text classification in the Kannada language

R. Jayashree; K. Srikanta Murthy; Basavaraj S. Anami


Abstract

Stop words are high-frequency words in a document that impose unrealistic demands on the classifier in terms of both time and space complexity. A considerable amount of work has been done on information retrieval in English, but information retrieval in the Kannada language is a relatively new area. Identifying and removing stop words in Kannada is an important task, because eliminating them reduces the feature space, which in turn helps reduce time and space complexity. Notably, there is no standard stop word list for the Kannada language. This motivates us to develop an algorithm for removing structurally similar stop words. Although stop word removal reduces the feature space, it may not improve classifier performance, as our results show.
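The link between stop word removal and feature space size can be illustrated with a minimal sketch. It is not the authors' algorithm for structurally similar stop words, which the abstract does not describe. It assumes a simple document-frequency threshold instead, and the function names, the `min_doc_ratio` parameter, and the placeholder tokens (`s1`, `s2`, and the topic words) are all hypothetical.

```python
from collections import Counter


def find_stop_words(docs, min_doc_ratio=0.75):
    """Return tokens that occur in at least `min_doc_ratio` of the documents.

    Assumption: document frequency stands in for a curated stop word list,
    since no standard list exists for Kannada.
    """
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each token once per document
    n_docs = len(docs)
    return {tok for tok, count in doc_freq.items() if count / n_docs >= min_doc_ratio}


def remove_stop_words(docs, stop_words):
    """Drop every stop word from every tokenized document."""
    return [[tok for tok in tokens if tok not in stop_words] for tokens in docs]


def vocabulary(docs):
    """The feature space of a bag-of-words classifier: the set of distinct tokens."""
    return {tok for tokens in docs for tok in tokens}


# Toy tokenized corpus; "s1" and "s2" play the role of frequent function words.
docs = [
    ["s1", "s2", "cricket", "score"],
    ["s1", "film", "actor", "s2"],
    ["s1", "s2", "election", "vote"],
    ["s1", "cricket", "team"],
]

stop_words = find_stop_words(docs, min_doc_ratio=0.75)
filtered_docs = remove_stop_words(docs, stop_words)

size_before = len(vocabulary(docs))
size_after = len(vocabulary(filtered_docs))
print(f"stop words: {sorted(stop_words)}")
print(f"feature space: {size_before} -> {size_after} tokens")
```

A smaller vocabulary means fewer per-class word probabilities for a naive Bayes classifier to estimate and store, which is where the time and space savings come from. Whether accuracy changes is a separate empirical question, consistent with the abstract's observation that performance may not improve.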


Bibliographic details

Source
International journal of artificial intelligence and soft computing | 2014, Issue 3 | pp. 264-282 | 19 pages
Authors
R. Jayashree; K. Srikanta Murthy; Basavaraj S. Anami
Author affiliations

Department of Computer Science and Engineering, PES Institute of Technology, 100 ft. Ring Road, BSK Ⅲ stage, Bangalore-560085, India;

Department of Computer Science and Engineering, PES Institute of Technology, 100 ft. Ring Road, BSK Ⅲ stage, Bangalore-560085, India;

KLE Institute of Engineering, Hubli, India;

Original format: PDF
Language: English
Keywords
performance; classifier; stop word; indexing; information retrieval;


