Research on a Text Mining Algorithm Based on Inverse Document Frequency and Mutual Information

Zhou Ge (周戈)

Abstract

Traditional text classification algorithms treat every feature term as having the same influence on the classification result, which lowers classification accuracy and increases the time complexity of the algorithm. Building on an analysis of the general model of a text classification system, and on a mutual-information-based method for extracting feature terms, this paper proposes a text classification algorithm based on inverse document frequency and mutual information entropy. The algorithm first extracts features from the text sample vectors using the vector space model (VSM). It then extracts a keyword set from the text, selects the keywords, and uses mutual information to represent and compute the relevance between terms and document categories. Finally, it computes the weight of each keyword in the document. Experimental results show that, compared with traditional classification algorithms, the proposed algorithm runs faster, has stronger nonlinear mapping ability, and classifies better in terms of both convergence speed and accuracy.
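The abstract names the ingredients of the method (a VSM representation, mutual information between terms and categories, and a per-document keyword weight) but does not give the formulas. The sketch below is therefore only one plausible reading, not the paper's algorithm: it assumes documents are token lists, uses the standard log(N/df) inverse document frequency, uses pointwise mutual information between term presence and a category, and combines them as tf × idf × max(MI, 0). The function names, the toy corpus, and the way the three factors are multiplied are all illustrative assumptions.

```python
import math

def idf(term, docs):
    """Inverse document frequency log(N / df); assumes the term occurs in at least one document."""
    df = sum(1 for d in docs if term in d)
    return math.log(len(docs) / df)

def mutual_information(term, label, docs, labels):
    """Pointwise mutual information log(P(t, c) / (P(t) * P(c))) from document counts."""
    n = len(docs)
    n_t = sum(1 for d in docs if term in d)            # documents containing the term
    n_c = sum(1 for y in labels if y == label)         # documents in the category
    n_tc = sum(1 for d, y in zip(docs, labels) if term in d and y == label)
    if n_tc == 0:
        return float("-inf")                           # term never co-occurs with the category
    return math.log(n_tc * n / (n_t * n_c))

def keyword_weight(term, doc, docs, labels, label):
    """Assumed combination: tf * idf * max(MI, 0). Not taken from the paper."""
    tf = doc.count(term) / len(doc)
    mi = mutual_information(term, label, docs, labels)
    return tf * idf(term, docs) * max(mi, 0.0)

# Hypothetical toy corpus: each document is a list of tokens.
docs = [
    ["ball", "goal", "team"],
    ["team", "match", "goal"],
    ["stock", "market", "team"],
    ["stock", "price", "market"],
]
labels = ["sports", "sports", "finance", "finance"]

w_goal = keyword_weight("goal", docs[0], docs, labels, "sports")
w_team = keyword_weight("team", docs[0], docs, labels, "sports")
print(w_goal > w_team)
```

In this toy corpus "goal" appears only in sports documents while "team" also appears in a finance document, so under these assumptions "goal" receives the larger weight for the sports category. This is the effect the abstract describes: feature terms no longer influence the classification result equally.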

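Bibliographic record

Source: Application Research of Computers (计算机应用研究), 2012, No. 2, pp. 487-489 (3 pages)
Author: Zhou Ge (周戈)
Affiliation: Chongqing Youth Vocational and Technical College, Chongqing 400712
Original format: PDF
Language of text: Chinese
CLC classification: Information processing
Keywords: text mining; mutual information; vector space model; weight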
