International Workshop on Semantic Evaluation

SERGIOJIMENEZ at SemEval-2016 Task 1: Effectively Combining Paraphrase Database, String Matching, WordNet and Word Embedding for Semantic Textual Similarity

Abstract

This paper describes a semantic textual similarity system that participated in Task 1 of SemEval 2016 (monolingual and cross-lingual sub-tasks). The system includes a preprocessing step that simplifies text using PPDB 2.0 and detects negations. Six lexical similarity functions were built from string matching, word embeddings, and synonym-antonym relations in WordNet. These functions are projected to the sentence level by a new method, Polarized Soft Cardinality, which supports negative similarities between words in order to model opposites. We also introduce a novel L2-norm "cardinality" for vector space representations. Using the proposed cardinality measures, the system extracts a set of 660 features from each pair of text snippets, from which a subset of 12 features was selected in a supervised manner. These features are combined either by support vector regression (SVR) or by their arithmetic mean to produce similarity predictions. Our team ranked second in the cross-lingual sub-task and came close to the best official results in the monolingual sub-task.
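
To make the idea of projecting a word-level similarity function to sentence level more concrete, the following is a minimal sketch of classic (non-polarized) soft cardinality, in which each word contributes the inverse of its total similarity to the words of the same sentence. The abstract does not give the Polarized Soft Cardinality formula, so negative similarities are not modeled here. The character-bigram Dice similarity, the exponent `p`, the Dice-style sentence coefficient, and all function names are assumptions chosen for illustration, not the system's actual components.

```python
from __future__ import annotations

from typing import Callable, Sequence


def bigrams(word: str) -> set[str]:
    """Character bigrams of a lowercased word padded with '#' (illustrative choice)."""
    padded = f"#{word.lower()}#"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def dice_similarity(a: str, b: str) -> float:
    """Dice coefficient over character bigrams; returns a value in [0, 1]."""
    ga, gb = bigrams(a), bigrams(b)
    return 2 * len(ga & gb) / (len(ga) + len(gb))


def soft_cardinality(
    words: Sequence[str],
    sim: Callable[[str, str], float] = dice_similarity,
    p: float = 1.0,
) -> float:
    """Classic soft cardinality: similar words count as less than one element each."""
    unique = list(dict.fromkeys(words))  # drop exact duplicates, keep order
    return sum(1.0 / sum(sim(a, b) ** p for b in unique) for a in unique)


def sentence_similarity(
    left: Sequence[str],
    right: Sequence[str],
    sim: Callable[[str, str], float] = dice_similarity,
    p: float = 1.0,
) -> float:
    """Dice-style coefficient built from soft cardinalities (assumed combination)."""
    card_left = soft_cardinality(left, sim, p)
    card_right = soft_cardinality(right, sim, p)
    card_union = soft_cardinality(list(left) + list(right), sim, p)
    card_intersection = card_left + card_right - card_union
    return 2 * card_intersection / (card_left + card_right)


if __name__ == "__main__":
    s1 = "a man is playing a guitar".split()
    s2 = "a man plays the guitar".split()
    print(round(sentence_similarity(s1, s2), 3))
```

In a feature-based system of the kind the abstract describes, coefficients like this one, computed with different lexical similarity functions, would be natural candidates for the feature set that is later combined by SVR or by the arithmetic mean.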
