Asia-Pacific Web Conference; 2005-03-29 to 2005-04-01; Shanghai (CN)

An Incremental Subspace Learning Algorithm to Categorize Large Scale Text Data

Abstract

The dramatic growth in the number and size of online information sources has fueled increasing research interest in the incremental subspace learning problem. In this paper, we propose an incremental supervised subspace learning algorithm called the Incremental Inter-class Scatter (IIS) algorithm. Unlike traditional batch learners, IIS learns from a stream of training data rather than a fixed set. IIS overcomes the inherent problem of some other incremental operations such as Incremental Principal Component Analysis (IPCA) and Incremental Linear Discriminant Analysis (ILDA). Experimental results on synthetic datasets show that IIS performs as well as LDA and is more robust against noise. In addition, experiments on the Reuters Corpus Volume 1 (RCV1) dataset show that IIS outperforms the state-of-the-art IPCA algorithm, a related algorithm, and Information Gain in efficiency and effectiveness, respectively.
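To make the idea of learning from a stream concrete, here is a minimal sketch in plain Python. It is not the paper's IIS update rule, which the abstract does not spell out. It assumes the textbook inter-class scatter matrix S_b = sum_c (n_c / n)(m_c - m)(m_c - m)^T, keeps only running class means and counts as samples arrive, and recovers the leading direction by power iteration. The names `StreamingInterClassScatter` and `leading_direction` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


class StreamingInterClassScatter:
    """Running class means and counts (illustrative; not the paper's IIS rule)."""

    def __init__(self, dim):
        self.dim = dim
        self.total = 0
        self.global_mean = [0.0] * dim
        self.class_count = defaultdict(int)
        self.class_mean = {}

    def update(self, x, label):
        """Fold one labelled sample into the running statistics."""
        self.total += 1
        for j in range(self.dim):
            self.global_mean[j] += (x[j] - self.global_mean[j]) / self.total
        self.class_count[label] += 1
        mean = self.class_mean.setdefault(label, [0.0] * self.dim)
        n_c = self.class_count[label]
        for j in range(self.dim):
            mean[j] += (x[j] - mean[j]) / n_c

    def scatter(self):
        """Rebuild S_b = sum_c (n_c / n)(m_c - m)(m_c - m)^T as a list of rows."""
        s = [[0.0] * self.dim for _ in range(self.dim)]
        for label, mean in self.class_mean.items():
            w = self.class_count[label] / self.total
            d = [mean[j] - self.global_mean[j] for j in range(self.dim)]
            for i in range(self.dim):
                for j in range(self.dim):
                    s[i][j] += w * d[i] * d[j]
        return s


def leading_direction(matrix, iterations=200):
    """Power iteration for the dominant eigenvector of a symmetric PSD matrix."""
    dim = len(matrix)
    v = [1.0 / (i + 1) for i in range(dim)]  # arbitrary non-zero start vector
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(dim)) for i in range(dim)]
        norm = sum(c * c for c in w) ** 0.5
        if norm == 0.0:
            break  # zero matrix: no direction to recover
        v = [c / norm for c in w]
    return v


# Usage: two classes separated along the first axis, fed one sample at a time.
estimator = StreamingInterClassScatter(dim=2)
stream = [([-1.0, 0.5], "a"), ([-1.0, -0.5], "a"),
          ([1.0, 0.5], "b"), ([1.0, -0.5], "b")]
for sample, label in stream:
    estimator.update(sample, label)
direction = leading_direction(estimator.scatter())
```

Only the means and counts are stored, so memory does not grow with the number of samples seen. Rebuilding the full matrix costs O(d^2) per class, which would likely be impractical for high-dimensional text features. A practical incremental method would avoid forming the matrix explicitly.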