Venue: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD)

CON-S2V: A Generic Framework for Incorporating Extra-Sentential Context into Sen2Vec



Abstract

We present a novel approach to learning distributed representations of sentences from unlabeled data by modeling both the content and the context of a sentence. The content model learns a sentence representation by predicting the sentence's words. The context model comprises a neighbor-prediction component and a regularizer, which model the distributional and proximity hypotheses, respectively. We propose an online algorithm that trains the model components jointly. We evaluate the models in a setup where contextual information is available. Experimental results on sentence classification, clustering, and ranking tasks show that our model outperforms the best existing models by a wide margin across multiple datasets. Code related to this chapter is available at https://github.com/tksaha/con-s2v/tree/jointlearning. Data related to this chapter are available at https://www.dropbox.com/sh/ruhsi3c0unn0nko/AAAgVnZpojvXx91oQ21WP_MYa?dl=0.
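The abstract sketches a joint objective with three parts: a content loss (predict a sentence's words), a neighbor-prediction loss (distributional hypothesis), and a regularizer that pulls neighboring sentence vectors together (proximity hypothesis), all trained online. The toy NumPy sketch below illustrates that structure only; the class name, logistic losses, and single-sample SGD update are illustrative assumptions, not the authors' implementation (see the linked repository for that).

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class ConS2VSketch:
    """Toy joint objective: content (word prediction) + context
    (neighbor prediction) + a proximity regularizer. Illustrative only."""

    def __init__(self, n_sentences, n_words, dim=16, lam=0.1, lr=0.05):
        self.S = rng.normal(scale=0.1, size=(n_sentences, dim))  # sentence vectors
        self.W = rng.normal(scale=0.1, size=(n_words, dim))      # word vectors
        self.lam, self.lr = lam, lr

    def step(self, s, word, neighbor):
        """One online update for sentence s, one of its words, and one
        neighboring sentence (a positive pair in both losses)."""
        v = self.S[s]
        # Content component: raise p(word | s) under a logistic model.
        g = sigmoid(v @ self.W[word]) - 1.0          # gradient of -log sigmoid
        grad = g * self.W[word]
        self.W[word] = self.W[word] - self.lr * g * v
        # Context component: raise p(neighbor | s) (distributional hypothesis).
        g2 = sigmoid(v @ self.S[neighbor]) - 1.0
        grad += g2 * self.S[neighbor]
        # Regularizer: pull v toward its neighbor (proximity hypothesis).
        grad += self.lam * (v - self.S[neighbor])
        self.S[s] = v - self.lr * grad
```

Running `step` repeatedly over (sentence, word, neighbor) triples drives neighboring sentence vectors toward higher similarity while the word-prediction term keeps each vector tied to its own content; a full trainer would also need negative samples for both logistic terms.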


