Modeling and Learning Distributed Word Representation with Metadata for Question Retrieval

Guangyou Zhou; Jimmy Xiangji Huang


Abstract

Community question answering (cQA) has become an important research topic due to the popularity of cQA archives on the Web. This paper addresses the lexical gap problem in question retrieval. Question retrieval in cQA archives aims to find existing questions that are semantically equivalent or relevant to a queried question. However, the lexical gap problem poses a challenge for question retrieval in cQA. In this paper, we propose to model and learn distributed word representations with metadata of category information within cQA pages for question retrieval, using two novel category-powered models. One is a basic category-powered model called MB-NET, and the other is an enhanced category-powered model called ME-NET, which can better learn the distributed word representations and alleviate the lexical gap problem. To deal with the variable number of word representation vectors per question, we employ the Fisher kernel framework to transform them into fixed-length vectors. Experimental results on large-scale English and Chinese cQA data sets show that our proposed approaches significantly outperform state-of-the-art retrieval models for question retrieval in cQA. We further evaluate our approaches in large-scale automatic evaluation experiments, which also show promising and significant performance improvements.
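The Fisher kernel step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common Fisher vector formulation over a diagonal-covariance Gaussian mixture model (gradients with respect to means and standard deviations); the abstract does not give the exact formulation used in the paper, so this may differ from the authors' method. The function name `fisher_vector`, the parameter names, and the randomly generated mixture parameters are all hypothetical; in practice the mixture would be fitted on learned word embeddings.

```python
import numpy as np

def fisher_vector(X, weights, means, sigmas):
    """Aggregate a variable number of word vectors into one fixed-length vector.

    X: (T, D) word vectors of one question (T varies per question).
    weights: (K,) mixture weights; means, sigmas: (K, D) per-component
    means and standard deviations of a diagonal GMM (assumed given).
    Returns a vector of length 2 * K * D, independent of T.
    """
    X = np.atleast_2d(X)
    T, D = X.shape
    # Deviations from each component mean, scaled by sigma: shape (T, K, D).
    diff = (X[:, None, :] - means[None, :, :]) / sigmas[None, :, :]
    # Log density of each word vector under each diagonal Gaussian: (T, K).
    log_prob = (-0.5 * np.sum(diff ** 2, axis=2)
                - np.sum(np.log(sigmas), axis=1)
                - 0.5 * D * np.log(2.0 * np.pi))
    # Posterior responsibilities, computed stably in log space: (T, K).
    log_post = np.log(weights) + log_prob
    log_post -= log_post.max(axis=1, keepdims=True)
    gamma = np.exp(log_post)
    gamma /= gamma.sum(axis=1, keepdims=True)
    # Normalized gradients with respect to means and standard deviations.
    g_mu = np.einsum("tk,tkd->kd", gamma, diff) / (T * np.sqrt(weights)[:, None])
    g_sigma = (np.einsum("tk,tkd->kd", gamma, diff ** 2 - 1.0)
               / (T * np.sqrt(2.0 * weights)[:, None]))
    return np.concatenate([g_mu.ravel(), g_sigma.ravel()])

# Hypothetical setup: K = 3 components over D = 4 dimensional embeddings.
rng = np.random.default_rng(0)
K, D = 3, 4
weights = np.full(K, 1.0 / K)
means = rng.normal(size=(K, D))
sigmas = np.ones((K, D))

short_question = rng.normal(size=(5, D))   # 5 word vectors
long_question = rng.normal(size=(9, D))    # 9 word vectors
fv_short = fisher_vector(short_question, weights, means, sigmas)
fv_long = fisher_vector(long_question, weights, means, sigmas)
print(fv_short.shape, fv_long.shape)
```

Both questions map to vectors of length 2 * K * D regardless of how many words they contain, so they can be compared with an ordinary similarity measure such as the dot product.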


Bibliographic record

Source
IEEE Transactions on Knowledge and Data Engineering | 2017, Issue 6 | pp. 1226-1239 | 14 pages
Authors
Guangyou Zhou; Jimmy Xiangji Huang
Author affiliations

School of Computer, Central China Normal University, Wuhan, Hubei, China;

School of Information Technology, York University, Toronto, ON, Canada

Original format: PDF
Language: English
Keywords
Metadata; Semantics; Context modeling; Knowledge discovery; Computational modeling; Kernel; Aggregates

