Journal: Applied Soft Computing

Taking advantage of improved resource allocating network and latent semantic feature selection approach for automated text categorization


Abstract

In this study, we propose an improved learning algorithm based on the resource allocating network (RAN) for text categorization. RAN is a promising single-hidden-layer neural network based on radial basis functions. We first use a means clustering-based method to determine the initial centers in the hidden layer. This method can effectively overcome the local-optimum limitation of clustering algorithms. Subsequently, to improve the novelty criteria of RAN, we propose a root mean square (RMS) sliding window method that reduces the influence of undesirable noisy data. Based on further study of the learning process of RAN, we divide it into a preliminary learning phase and a subsequent learning phase. The former initializes the preliminary structure of RAN and reduces the complexity of the network, while the latter refines its learning ability and improves classification accuracy. The resulting compact network structure lowers computational complexity and maintains a high convergence rate. Moreover, a latent semantic feature selection method is used to organize documents. This method reduces the input scale of the network and reveals the latent semantics between features. Extensive experiments on two benchmark datasets demonstrate the superiority of our algorithm over state-of-the-art text categorization algorithms.
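The abstract does not give the formulas for the two ideas it names, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that the latent semantic step can be approximated by a truncated SVD of a term-document matrix, and that the RMS sliding window test is combined with the usual RAN distance and error checks. The names `latent_semantic_features` and `RmsWindowNovelty`, and all thresholds, are hypothetical.

```python
from collections import deque

import numpy as np


def latent_semantic_features(term_doc, k):
    """Project documents onto the top-k latent dimensions via SVD.

    term_doc: (n_terms, n_docs) weighted term-document matrix.
    Returns an (n_docs, k) matrix with one reduced vector per document.
    """
    _, s, vt = np.linalg.svd(term_doc, full_matrices=False)
    # Scale the right singular vectors by the singular values.
    return (np.diag(s[:k]) @ vt[:k]).T


class RmsWindowNovelty:
    """Novelty test for a RAN-style network (assumed form, see lead-in).

    A new hidden unit is suggested only if the input is far from every
    existing center, the current error is large, and the RMS error over
    the last `window` samples is also large. The third check keeps a
    single noisy sample from growing the network on its own.
    """

    def __init__(self, dist_thresh, err_thresh, rms_thresh, window):
        self.dist_thresh = dist_thresh
        self.err_thresh = err_thresh
        self.rms_thresh = rms_thresh
        self.errors = deque(maxlen=window)  # most recent error norms

    def is_novel(self, x, centers, error):
        err_norm = float(np.linalg.norm(error))
        self.errors.append(err_norm)
        rms = float(np.sqrt(np.mean(np.square(list(self.errors)))))
        if len(centers) == 0:
            return True  # an empty network always accepts its first unit
        dists = np.linalg.norm(np.asarray(centers) - np.asarray(x), axis=1)
        return bool(
            dists.min() > self.dist_thresh
            and err_norm > self.err_thresh
            and rms > self.rms_thresh
        )
```

With a window of three samples, one isolated error spike after two well-predicted samples leaves the windowed RMS below the threshold, so no unit is added. A second consecutive spike raises the RMS above it, and the sample is then treated as novel.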
