An improved boosting algorithm and its application to automated text categorization

Abstract

We describe AdaBoost.MHKR, an improved boosting algorithm, and its application to text categorization. Boosting is a method for supervised learning that has been applied successfully to many different domains and has proven one of the best performers in text categorization exercises so far. Boosting is based on the idea of relying on the collective judgment of a committee of classifiers that are trained sequentially. In training the i-th classifier, special emphasis is placed on the correct categorization of the training documents that have proven harder for the previously trained classifiers. AdaBoost.MHKR is based on the idea of building, at every iteration of the learning phase, not a single classifier but a sub-committee of the K classifiers that, at that iteration, look the most promising. We report the results of systematic experimentation with this method on the standard Reuters-21578 benchmark. These experiments have shown that AdaBoost.MHKR is both more efficient to train and more effective than the original AdaBoost.MHR algorithm.
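The sub-committee idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and the abstract does not give the algorithm's details, so several choices here are assumptions: the task is reduced to a single binary category, weak classifiers are term-presence stumps, "most promising" is read as the largest absolute weighted correlation with the labels, the K selected stumps are combined by a plain average, and the committee weight uses the standard formula for real-valued AdaBoost hypotheses in [-1, 1]. The names `stump`, `committee_output`, `train`, and `predict` are hypothetical.

```python
import math


def stump(doc, term):
    """Weak hypothesis: +1 if the term occurs in the document, else -1."""
    return 1.0 if term in doc else -1.0


def committee_output(committee, doc):
    """Average the signed votes of the stumps in one sub-committee."""
    return sum(sign * stump(doc, term) for term, sign in committee) / len(committee)


def train(docs, labels, rounds=10, k=2, eps=1e-10):
    """Boost sub-committees of the k most promising term stumps.

    docs   : list of sets of terms
    labels : list of +1 / -1 (document belongs / does not belong to the category)
    Returns a list of (alpha, [(term, sign), ...]) pairs, one per round.
    """
    vocab = sorted(set().union(*docs))
    n = len(docs)
    weights = [1.0 / n] * n  # every training document starts equally important
    model = []
    for _ in range(rounds):
        # Weighted correlation of each stump with the labels.
        scored = []
        for term in vocab:
            r = sum(w * y * stump(d, term)
                    for w, y, d in zip(weights, labels, docs))
            scored.append((abs(r), term, 1.0 if r >= 0 else -1.0))
        # Assumption: "most promising" means largest absolute correlation.
        scored.sort(key=lambda s: s[0], reverse=True)
        committee = [(term, sign) for _, term, sign in scored[:k]]
        if not committee:
            break
        outputs = [committee_output(committee, d) for d in docs]
        r = sum(w * y * o for w, y, o in zip(weights, labels, outputs))
        if r <= 0:
            break  # the sub-committee is no better than chance
        r = min(r, 1.0 - eps)  # keep the logarithm finite
        alpha = 0.5 * math.log((1.0 + r) / (1.0 - r))
        model.append((alpha, committee))
        # Put more emphasis on documents the sub-committee got wrong.
        weights = [w * math.exp(-alpha * y * o)
                   for w, y, o in zip(weights, labels, outputs)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model


def predict(model, doc):
    """Sign of the alpha-weighted sum of all sub-committee outputs."""
    score = sum(alpha * committee_output(c, doc) for alpha, c in model)
    return 1 if score > 0 else -1
```

Setting `k=1` in this sketch reduces it to ordinary boosting with one stump per round, which makes it easy to compare the two behaviours on the same toy data. The multi-label treatment that the ".MH" in the algorithm's name refers to is left out here.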