Conference: International Conference on Recent Advances in Natural Language Processing

MappSent: a Textual Mapping Approach for Question-to-Question Similarity


Abstract

Since the advent of word embedding methods, the representation of longer pieces of text such as sentences and paragraphs has attracted growing interest, especially for textual similarity tasks. Mikolov et al. (2013a) demonstrated that words and phrases exhibit linear structures that allow words to be meaningfully combined by element-wise addition of their vector representations. More recently, Arora et al. (2017) showed that taking the weighted average of word embedding vectors and removing its projections on their first principal components outperforms sophisticated supervised methods, including RNNs and LSTMs. Inspired by the findings of Mikolov et al. (2013a) and Arora et al. (2017), and by the bilingual word mapping technique presented in Artetxe et al. (2016), we introduce MappSent, a novel approach for textual similarity. Based on a linear sentence embedding representation, its principle is to build a matrix that maps sentences into a joint subspace where similar sets of sentences are pushed closer together. We evaluate our approach on the SemEval 2016/2017 question-to-question similarity task and show that, overall, MappSent achieves competitive results and in most cases outperforms state-of-the-art methods.
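The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several points are assumptions the abstract does not confirm: the weighting scheme `a / (a + p(w))` and the removal of a single principal component follow the usual reading of Arora et al. (2017), and the mapping matrix is learned here as an orthogonal Procrustes solution, one common way to obtain the kind of mapping used for bilingual embeddings. The function names, the `emb` and `freq` dictionaries, and the default `a=1e-3` are all hypothetical.

```python
import numpy as np


def sentence_vectors(sentences, emb, freq, a=1e-3):
    """Weighted average of word vectors per sentence.

    Assumption: each word w gets weight a / (a + p(w)), where p(w) is its
    relative frequency. Out-of-vocabulary words are skipped.
    """
    dim = len(next(iter(emb.values())))
    out = np.zeros((len(sentences), dim))
    for i, sent in enumerate(sentences):
        words = [w for w in sent if w in emb]
        if not words:
            continue  # leave an all-zero row for sentences with no known word
        weights = np.array([a / (a + freq.get(w, 0.0)) for w in words])
        vecs = np.array([emb[w] for w in words])
        out[i] = weights @ vecs / len(words)
    return out


def remove_first_component(X):
    """Subtract each row's projection onto the first right singular vector of X."""
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    pc = vt[0]
    return X - np.outer(X @ pc, pc)


def orthogonal_map(X, Y):
    """Orthogonal matrix W minimising ||X W - Y||_F (Procrustes solution).

    Assumption: row i of X and row i of Y embed a pair of similar sentences.
    """
    u, _, vt = np.linalg.svd(X.T @ Y)
    return u @ vt


def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))
```

Under these assumptions, a pair of questions would be scored by embedding both with `sentence_vectors`, applying `remove_first_component`, multiplying the first question's vector by the learned `W`, and comparing the result with the second vector using `cosine`.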
