
A Confident Information First Principle for Parametric Reduction and Model Selection of Boltzmann Machines


Abstract

Typical dimensionality reduction (DR) methods are data-oriented, focusing on directly reducing the number of random variables (or features) while retaining the maximal variations in the high-dimensional data. Targeting unsupervised situations, this paper aims to address the problem from a novel perspective and considers model-oriented dimensionality reduction in parameter spaces of binary multivariate distributions. Specifically, we propose a general parameter reduction criterion, called the Confident-Information-First (CIF) principle, to maximally preserve confident parameters and rule out less confident ones. Formally, the confidence of each parameter can be assessed by its contribution to the expected Fisher information distance within a geometric manifold over the neighbourhood of the underlying real distribution. We then demonstrate two implementations of CIF in different scenarios. First, when there are no observed samples, we revisit the Boltzmann Machine (BM) from a model selection perspective and theoretically show that both the fully visible BM (VBM) and the BM with hidden units can be derived from the general binary multivariate distribution using the CIF principle. This finding helps us uncover and formalize the essential parts of the target density that the BM aims to capture and the non-essential parts that the BM should discard. Second, when observed samples are available, we apply CIF to model selection for the BM, which is in turn made adaptive to the observed samples. The sample-specific CIF is a heuristic method for deciding the priority order of parameters; as shown in a series of density estimation experiments, it can improve search efficiency without degrading the quality of model selection results.
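The central idea of CIF is to rank parameters by how much Fisher information they carry and to keep only the most "confident" ones. The sketch below is a minimal, illustrative approximation of that idea for a fully visible BM: since the BM is an exponential-family model, the Fisher information in the natural parameters is the covariance of the sufficient statistics (the units x_i for the biases and the pairwise products x_i x_j for the couplings), so the empirical variance of each sufficient statistic is used here as a proxy for a parameter's confidence. The function names (`sufficient_stats`, `cif_parameter_ranking`) are hypothetical, and this is not the paper's exact procedure, which is defined via the expected Fisher information distance on a geometric manifold.

```python
import numpy as np

def sufficient_stats(X):
    """Sufficient statistics of a fully visible Boltzmann machine (VBM):
    the individual units x_i (bias terms) and the pairwise products
    x_i * x_j for i < j (coupling terms)."""
    n, d = X.shape
    iu = np.triu_indices(d, k=1)
    pairwise = (X[:, :, None] * X[:, None, :])[:, iu[0], iu[1]]
    return np.hstack([X, pairwise])            # shape: (n, d + d*(d-1)/2)

def cif_parameter_ranking(X):
    """Rank VBM parameters by an empirical proxy for their Fisher
    information: the variance of the corresponding sufficient statistic.
    Higher variance -> more 'confident' parameter under a CIF-style
    criterion (illustrative only, not the paper's exact criterion)."""
    T = sufficient_stats(X)
    fisher_diag = T.var(axis=0)                # diagonal of the empirical covariance
    order = np.argsort(fisher_diag)[::-1]      # most confident parameters first
    return order, fisher_diag

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.integers(0, 2, size=(1000, 8)).astype(float)  # toy binary samples
    order, scores = cif_parameter_ranking(X)
    print("parameter priority order:", order[:10])
```

In a sample-specific setting, such a ranking could serve as the heuristic priority order mentioned in the abstract: parameters with low estimated Fisher information are candidates for removal first during model selection.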
