Journal: 计算机科学 (Computer Science)

Research on a Text Classification Algorithm Based on Triadic Concept Analysis

     

Abstract

With the emergence of three-dimensional data on the web, the advantages of triadic concept analysis (TCA) have gradually become apparent. TCA is a relatively new research field with broad prospects. This paper proposes a text classification method based on TCA, which is a new idea and an extension of TCA toward applications. The main idea of the algorithm is as follows. First, the dataset is preprocessed into a triadic context, and the binary relation in the context is extended to a fuzzy relation with values between 0 and 1, which expresses the membership degree of an attribute with respect to an object under a given condition. Triadic concepts are then built on this context and used to represent the ternary relation among texts, feature terms, and categories. Next, drawing on the closeness degree from fuzzy theory, a similarity between triadic concepts is derived by analogy, and this similarity measure is used to compute the similarity between the triadic concepts of the training set and a new text. Experimental results show that the proposed model is effective and that, on specific datasets, it achieves better classification performance than the Support Vector Machine (SVM), K-Nearest Neighbor (KNN), and convolutional neural network (CNN) algorithms, as well as a classification model based on formal concept analysis.
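The abstract gives no formulas, so the following is only a minimal sketch of the two ingredients it names: a fuzzy triadic context and a closeness-based similarity. Several choices here are assumptions and not taken from the paper: texts are treated as objects, feature terms as attributes, and categories as conditions; membership degrees are term frequencies divided by the largest frequency in the text; and the similarity is the standard min-max closeness degree from fuzzy set theory. The sketch does not construct triadic concepts. It compares a new text directly with each training text, and all function names are hypothetical.

```python
from collections import Counter


def memberships(tokens):
    """Map each term to a degree in [0, 1] (assumed: count / max count)."""
    counts = Counter(tokens)
    if not counts:
        return {}
    top = max(counts.values())
    return {term: n / top for term, n in counts.items()}


def closeness(a, b):
    """Min-max closeness of two fuzzy sets given as term -> degree dicts."""
    terms = set(a) | set(b)
    num = sum(min(a.get(t, 0.0), b.get(t, 0.0)) for t in terms)
    den = sum(max(a.get(t, 0.0), b.get(t, 0.0)) for t in terms)
    return num / den if den else 0.0


def build_context(training):
    """Fuzzy triadic context: (text id, term, category) -> membership degree."""
    context = {}
    for text_id, (tokens, category) in training.items():
        for term, degree in memberships(tokens).items():
            context[(text_id, term, category)] = degree
    return context


def classify(context, tokens):
    """Return the category of the training text closest to the new text."""
    new = memberships(tokens)
    # Regroup the context into one fuzzy term set per (text, category) pair.
    rows = {}
    for (text_id, term, category), degree in context.items():
        rows.setdefault((text_id, category), {})[term] = degree
    best = max(rows, key=lambda key: closeness(rows[key], new))
    return best[1]


# Toy data, invented purely for illustration.
training = {
    "d1": (["goal", "match", "team", "goal"], "sports"),
    "d2": (["stock", "market", "bank"], "finance"),
}
context = build_context(training)
label = classify(context, ["team", "goal", "coach"])
```

In this toy run the new text shares "team" and "goal" with `d1` and nothing with `d2`, so the nearest training text is the sports one. A fuller version would compare the new text against triadic concepts rather than individual training texts, as the abstract describes.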
