ENSEMBLE MULTI-LABEL TEXT CATEGORIZATION BASED ON PYRAMIDAL CLUSTER MEMBERSHIP APPROACH

J. STALIN JOSE; DR. P. SURESH




Abstract

Text categorization is an active field within textual data mining, and it has attracted growing attention as the volume of textual documents has grown explosively. Documents are associated with multiple distinct categories (e.g., sports, medical, health, and Olympic Games). Text categorization therefore opens opportunities for multi-label learning approaches designed specifically for textual data. Text mining is the process of discovering useful knowledge patterns from textual data, and it is one of the foundations of automated text categorization, which is usually practiced by developing novel machine learning (ML) approaches. However, such ML models offer low expressivity. They are built under a train-test scenario, and when an existing model is found deficient, a train-test-retrain cycle is required, which is time-consuming. In this paper, we propose the "Pyramidal Cluster Membership Approach (PCMO)". It works in two models, namely a training model and a testing model. The training model comprises four phases: Pyramid-Fuzzy Transmutation, a novel k-edge classifier, cluster-to-category mapping, and boundary finding. The estimated boundaries are then applied to new textual data to assign categories. Experimental results on the Freebase dataset show that the proposed approach, based on the pyramidal membership method, achieves better classification accuracy than traditional approaches, especially when over-fitting document categories are included.
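The abstract does not define the four training phases, so the following is only a minimal sketch of the general idea of assigning multiple categories through cluster membership and boundaries, not the authors' PCMO. It assumes a triangular ("pyramid-shaped") fuzzy membership function, hand-picked cluster centers, a majority vote for cluster-to-category mapping, and a single membership threshold as the boundary. The k-edge classifier is not modeled, and every function name and parameter below is hypothetical.

```python
from collections import Counter, defaultdict


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def triangular_membership(distance, width):
    """Assumed pyramid-shaped membership: 1 at the center, 0 at `width` or beyond."""
    return max(0.0, 1.0 - distance / width)


def map_clusters_to_categories(centers, train_vectors, train_labels):
    """Give each cluster the most frequent label among its nearest training documents."""
    votes = defaultdict(Counter)
    for vec, labels in zip(train_vectors, train_labels):
        nearest = min(range(len(centers)), key=lambda i: euclidean(vec, centers[i]))
        votes[nearest].update(labels)
    return {i: counter.most_common(1)[0][0] for i, counter in votes.items()}


def categorize(vec, centers, cluster_category, width, boundary):
    """Return every category whose cluster membership reaches the boundary."""
    result = set()
    for i, center in enumerate(centers):
        membership = triangular_membership(euclidean(vec, center), width)
        if i in cluster_category and membership >= boundary:
            result.add(cluster_category[i])
    return result


# Toy data (illustrative only): 2-D document vectors and hand-picked centers.
centers = [(0, 0), (4, 0)]
train_vectors = [(0, 1), (1, 0), (4, 1), (5, 0)]
train_labels = [["sports"], ["sports"], ["health"], ["health"]]

cluster_category = map_clusters_to_categories(centers, train_vectors, train_labels)

# A document halfway between both centers has membership 0.5 in each cluster,
# so with a boundary of 0.4 it receives both labels.
print(sorted(categorize((2, 0), centers, cluster_category, width=4.0, boundary=0.4)))
# prints ['health', 'sports']
```

In this sketch a document can receive several labels because membership is graded rather than exclusive. Raising `boundary` makes assignments stricter, while lowering it lets more categories through.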


Bibliographic details

Source: Journal of Theoretical and Applied Information Technology, 2017, Issue 12, 1 page
Authors: J. STALIN JOSE; DR. P. SURESH
Original format: PDF
Chinese Library Classification: Computing technology, computer technology
