Published in: IEEE Transactions on Knowledge and Data Engineering

Higher Order Naïve Bayes: A Novel Non-IID Approach to Text Classification


Abstract

The underlying assumption in traditional machine learning algorithms is that instances are Independent and Identically Distributed (IID). These critical independence assumptions made in traditional machine learning algorithms prevent them from going beyond instance boundaries to exploit latent relations between features. In this paper, we develop a general approach to supervised learning by leveraging higher order dependencies between features. We introduce a novel Bayesian framework for classification termed Higher Order Naïve Bayes (HONB). Unlike approaches that assume data instances are independent, HONB leverages higher order relations between features across different instances. The approach is validated in the classification domain on widely used benchmark data sets. Results obtained on several benchmark text corpora demonstrate that higher order approaches achieve significant improvements in classification accuracy over the baseline methods, especially when training data is scarce. A complexity analysis also reveals that the space and time complexity of HONB compare favorably with existing approaches.
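To make the phrase "higher order relations between features across different instances" concrete, the following is a minimal sketch of one possible reading, not the HONB algorithm itself, which the abstract does not specify. It assumes a second-order relation links two terms that each co-occur with a shared bridge term in two different documents. The function names `first_order_pairs` and `second_order_pairs` and the toy corpus are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def first_order_pairs(docs):
    """Map each term pair that co-occurs inside a document to the ids of those documents."""
    pairs = {}
    for doc_id, terms in enumerate(docs):
        for a, b in combinations(sorted(set(terms)), 2):
            pairs.setdefault((a, b), set()).add(doc_id)
    return pairs


def second_order_pairs(docs):
    """Return term pairs (a, c) linked through a bridge term b, where a-b and b-c
    co-occur in two different documents (an assumed definition for this sketch)."""
    # neighbors[term][other] = ids of documents in which term and other co-occur
    neighbors = {}
    for (a, b), ids in first_order_pairs(docs).items():
        neighbors.setdefault(a, {})[b] = ids
        neighbors.setdefault(b, {})[a] = ids

    linked_pairs = set()
    for bridge, linked in neighbors.items():
        for a, c in combinations(sorted(linked), 2):
            # The two links must come from distinct documents, so the path
            # crosses an instance boundary instead of staying inside one document.
            if any(i != j for i in linked[a] for j in linked[c]):
                linked_pairs.add((a, c))
    return linked_pairs


# Toy corpus: each document is a list of terms (illustrative data only).
docs = [
    ["bayes", "classifier"],
    ["classifier", "text"],
    ["text", "corpus"],
]

print(sorted(second_order_pairs(docs)))
# → [('bayes', 'text'), ('classifier', 'corpus')]
```

In this toy corpus "bayes" and "text" never appear in the same document, so a model that treats documents as independent sees no relation between them. The sketch relates them through "classifier", which is the kind of cross-instance evidence the abstract says HONB exploits. How such paths would enter the Naïve Bayes parameter estimates is not described in the abstract and is not attempted here.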
