
Learning Distributed Representations for Statistical Language Modelling and Collaborative Filtering.



Abstract

With the increasing availability of large datasets, machine learning techniques are becoming an increasingly attractive alternative to expert-designed approaches to solving complex problems in domains where data is abundant. In this thesis we introduce several models for large sparse discrete datasets. Our approach, which is based on probabilistic models that use distributed representations to alleviate the effects of data sparsity, is applied to statistical language modelling and collaborative filtering.

We introduce three probabilistic language models that represent words using learned real-valued vectors. Two of the models are based on the Restricted Boltzmann Machine (RBM) architecture, while the third one is a simple deterministic model. We show that the deterministic model outperforms the widely used n-gram models and learns sensible word representations.

To reduce the time complexity of training and making predictions with the deterministic model, we introduce a hierarchical version of the model that can be exponentially faster. The speedup is achieved by structuring the vocabulary as a tree over words and taking advantage of this structure. We propose a simple feature-based algorithm for automatically constructing trees over words from data and show that the resulting models can outperform non-hierarchical neural models as well as the best n-gram models.

We then turn our attention to collaborative filtering and show how RBM models can be used to model the distribution of sparse high-dimensional user rating vectors efficiently, presenting inference and learning algorithms that scale linearly in the number of observed ratings. We also introduce the Probabilistic Matrix Factorization (PMF) model, which is based on the probabilistic formulation of the low-rank matrix approximation problem for partially observed matrices. The two models are then extended to allow conditioning on the identities of the rated items, whether or not the actual rating values are known. Our results on the Netflix Prize dataset show that both the RBM and PMF models outperform online SVD models.
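
The hierarchical language model described above replaces a flat prediction over the whole vocabulary with a sequence of binary decisions along a word's path in a tree, which is where the exponential speedup comes from. The toy sketch below illustrates only that idea; the vocabulary, tree codes, vectors, and dimensionality are invented for the example and are not taken from the thesis.

    import numpy as np

    # Toy binary tree over a 4-word vocabulary. Each word is reached from the
    # root by a sequence of (internal node id, branch) decisions; the words,
    # codes, and dimensionality are illustrative, not from the thesis.
    paths = {
        "the": [(0, 0), (1, 0)],
        "cat": [(0, 0), (1, 1)],
        "sat": [(0, 1), (2, 0)],
        "mat": [(0, 1), (2, 1)],
    }

    dim = 8
    rng = np.random.default_rng(0)
    node_vectors = rng.normal(scale=0.1, size=(3, dim))  # one vector per internal node

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def word_probability(context, word):
        # P(word | context) is a product of binary decisions along the word's
        # path, so the cost is O(tree depth) rather than O(vocabulary size).
        prob = 1.0
        for node, branch in paths[word]:
            p_left = sigmoid(context @ node_vectors[node])
            prob *= p_left if branch == 0 else 1.0 - p_left
        return prob

    context = rng.normal(scale=0.1, size=dim)  # stand-in for a learned context representation
    print(sum(word_probability(context, w) for w in paths))  # ~1.0 over the leaves of the tree

Because each word's probability is a product of at most tree-depth factors, evaluating or training the model costs O(log V) per prediction instead of O(V) for a flat distribution over V words.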
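At the MAP level, the Probabilistic Matrix Factorization model mentioned above reduces to a regularized low-rank factorization fitted only to the observed entries of the rating matrix. A minimal sketch on a toy matrix follows; the data, rank, learning rate, and regularization strength are illustrative assumptions rather than settings from the thesis.

    import numpy as np

    # Toy partially observed rating matrix; zeros mark missing entries.
    R = np.array([[5., 3., 0., 1.],
                  [4., 0., 0., 1.],
                  [1., 1., 0., 5.],
                  [0., 0., 5., 4.]])
    observed = R > 0

    rank, lam, lr = 2, 0.05, 0.01
    rng = np.random.default_rng(0)
    U = rng.normal(scale=0.1, size=(R.shape[0], rank))  # user factors
    V = rng.normal(scale=0.1, size=(R.shape[1], rank))  # item factors

    for _ in range(2000):
        E = observed * (R - U @ V.T)   # reconstruction error on observed ratings only
        U += lr * (E @ V - lam * U)    # gradient step on the regularized objective
        V += lr * (E.T @ U - lam * V)

    print(np.round(U @ V.T, 2))        # predicted ratings, including the missing cells

The quadratic penalties on U and V play the role of zero-mean Gaussian priors on the factor matrices in the probabilistic formulation.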

Bibliographic details

  • Author

    Mnih, Andriy

  • Author affiliation

    University of Toronto (Canada)

  • Degree-granting institution: University of Toronto (Canada)
  • Subject: Artificial Intelligence; Computer Science
  • Degree: Ph.D.
  • Year: 2010
  • Pages: 137 p.
  • Total pages: 137
  • Original format: PDF
  • Language: eng
  • CLC classification:
  • Keywords:

