A Bayesian Classifier for Learning from Tensorial Data

Abstract

Traditional machine learning methods characterize data observations by feature vectors, where each entry of a vector denotes a scalar feature value of a data instance. While this representation facilitates the application of conventional machine learning algorithms, in many cases it is not the best way of extracting all useful information from the data observations. In this paper we relax the (often unstated) assumption that the features of data instances must be vectorized, and allow a more natural representation of the data in tensor format. Tensors are multi-mode (also known as multi-way) arrays, of which vectors (i.e., one-mode tensors) and matrices (i.e., two-mode tensors) are special cases. We show that the tensor representation captures useful information that is difficult to provide in the conventional vector format. More importantly, to effectively utilize the rich information contained in tensors, we propose a novel semi-naive Bayesian tensor classification method (which we call Bat) that builds predictive models directly on data in tensor form, instead of on their vectorizations. We apply Bat to supervised learning problems and perform comprehensive experiments on classifying text documents and graphs, which demonstrate (1) the advantage of the tensor representation over conventional feature-vectorization approaches, and (2) the superiority of the proposed Bat tensor classifier over other existing learners.
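
As a minimal illustration of the terminology only (not of the Bat classifier, whose details the abstract does not give), the sketch below builds a one-mode, a two-mode, and a three-mode tensor as nested Python lists and then vectorizes the three-mode one. The helper names `shape_of` and `vectorize` and the sample values are assumptions made for this example.

```python
def shape_of(x):
    """Return the size of each mode of a (rectangular) nested-list tensor."""
    shape = []
    while isinstance(x, list):
        shape.append(len(x))
        x = x[0]
    return tuple(shape)


def vectorize(x):
    """Flatten a nested-list tensor into a one-mode vector, row-major order."""
    if not isinstance(x, list):
        return [x]
    return [value for sub in x for value in vectorize(sub)]


# Hypothetical sample data: a vector, a matrix, and a 2 x 3 x 4 tensor.
vector = [1, 2, 3]                    # one mode
matrix = [[1, 2, 3], [4, 5, 6]]       # two modes
tensor = [[[i * 12 + j * 4 + k for k in range(4)]
           for j in range(3)]
          for i in range(2)]          # three modes

flat = vectorize(tensor)

print(shape_of(vector), shape_of(matrix), shape_of(tensor))
print(shape_of(flat))
```

After vectorization every value is still present, but the entry addressed as `tensor[1][2][3]` is reachable only through a single flat offset, so which mode an index belongs to is no longer explicit in the data structure.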