High Dimensional Text Document Clustering and Classification using Machine Learning Methods

Abstract

High-dimensional documents can be used for classification, but their dimensionality is a major concern. Dimensionality reduction can have both positive and negative effects: if the reduction is not done correctly, classification on the lower-dimensional documents will not produce the desired results. Our research studies have centered on dimensionality reduction using the fundamental similarity property, but they have not yet addressed text classification. This paper addresses the classification and clustering tasks by presenting various algorithms, and applies clustering and classification methods to several datasets. It also proposes improvements to the efficiency of K-means clustering and the Naive Bayes classification technique. Standard metrics such as precision, recall, and F-score are used to evaluate the performance of the proposed method. The experimental results show that the proposed model outperforms current algorithms.
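
The abstract names precision, recall, and F-score as evaluation metrics but does not define them. The sketch below uses the standard single-class definitions and assumes the balanced F1 variant of the F-score, which the abstract does not specify. The function name and the labels are hypothetical illustrations, not code or data from the paper.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Return (precision, recall, F1) with one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against empty denominators (no predicted or no actual positives).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical labels for six documents (not data from the paper).
y_true = ["sports", "sports", "politics", "sports", "politics", "politics"]
y_pred = ["sports", "politics", "politics", "sports", "sports", "politics"]
p, r, f = precision_recall_f1(y_true, y_pred, positive="sports")
print(p, r, f)
```

For a multi-class corpus, these per-class values would typically be averaged across classes; which averaging scheme the paper uses is not stated in the abstract.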

Bibliographic details

Source
International Conference on Intelligent Computing and Control Systems | 2021 | pp. 1612-1617 | 6 pages
Authors
Vinay Kumar Kotte; Shankar Vuppu; Rachana Thadishetti
Original format: PDF
Keywords
Machine learning algorithms; Text recognition; Text categorization; Natural languages; Clustering algorithms; Machine learning; Control systems

Similar literature

1. On the influence of training data quality on text document classification using machine learning methods [J]. Jyri Saarikoski, Henry Joutsijoki, Kalervo Jaervelin. International Journal of Knowledge Engineering and Data Mining. 2015, No. 2

2. SVM based adaptive learning method for text classification from positive and unlabeled documents [J]. Tao Peng, Wanli Zuo, Fengling He. Knowledge and Information Systems. 2008, No. 3

3. A Low-Dimensional Representation Learning Method for Text Classification and Clustering [C]. Xiang Wang, Yunfan Liao, Junxing Zhu. IEEE International Conference on Data Science in Cyberspace. 2020

4. Machine Learning and Text Analysis Using Clustering, Classification, Categorization for Applied Industry Research and Its Effect on Trends and Prediction Analysis of a Doctor of Professionals Studies in Computing Dissertation Categories [D]. Haigler, Ashley. 2021

5. What is relevant in a text document?: An interpretable machine learning approach [O]. Leila Arras, Franziska Horn, Grégoire Montavon

6. Classification of aortic stenosis using conventional machine learning and deep learning methods based on multi-dimensional cardio-mechanical signals [O]. Chenxi Yang, Banish D. Ojha, Nicole D. Aranoff. 2020
