Large-Scale Question Tagging via Joint Question-Topic Embedding Learning

Abstract

Recent years have witnessed the flourishing of community-driven question answering (cQA) services, such as Yahoo! Answers and AnswerBag, where people can seek precise information. After 2010, some novel cQA systems, including Quora and Zhihu, gained momentum. Beyond user interaction, these newer systems enable users to label questions with topic tags that highlight the key points conveyed in the questions. In this article, we address the problem of automatically annotating a newly posted question with topic tags that are predefined and preorganized into a directed acyclic graph. To accomplish this task, we present an end-to-end deep interactive embedding model that jointly learns the embeddings of questions and topics by projecting them into the same space for similarity measurement. In particular, we first learn the embeddings of questions and topic tags with two parallel deep models. Within these models, we regularize the embeddings of topic tags by fully exploiting their hierarchical structure, which alleviates the problem of imbalanced topic distribution. Thereafter, we interact each question embedding with the topic tag matrix, i.e., all the topic tag embeddings. Following that, a sigmoid cross-entropy loss is applied to reward the positive question-topic pairs and penalize the negative ones. To validate our model, we have conducted extensive experiments on an unprecedented large-scale social QA dataset obtained from Zhihu.com, and the experimental results demonstrate that our model outperforms several state-of-the-art baselines.
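
The scoring and loss steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the deep encoders are replaced by hand-written toy vectors, the dot product is an assumed similarity measure, and `hierarchy_penalty` is a hypothetical regularizer (squared child-parent distance), since the abstract does not state the exact form used. All function names and the weight `0.1` are illustrative.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def score_topics(question_vec, topic_matrix):
    """Interact one question embedding with every topic embedding.
    Returns one logit per topic (dot product is an assumed similarity)."""
    return [dot(question_vec, t) for t in topic_matrix]

def sigmoid_cross_entropy(logits, labels):
    """Mean sigmoid cross-entropy over all topics, using the numerically
    stable form max(x, 0) - x*y + log(1 + exp(-|x|))."""
    total = 0.0
    for x, y in zip(logits, labels):
        total += max(x, 0.0) - x * y + math.log1p(math.exp(-abs(x)))
    return total / len(logits)

def hierarchy_penalty(topic_matrix, parent_edges):
    """ASSUMPTION: pull each child topic toward its parent in the DAG via
    squared Euclidean distance. The paper's actual regularizer may differ."""
    total = 0.0
    for child, parent in parent_edges:
        total += sum((c - p) ** 2
                     for c, p in zip(topic_matrix[child], topic_matrix[parent]))
    return total

# Toy data standing in for the outputs of the two encoders.
topic_matrix = [
    [1.0, 0.0],    # topic 0: a root topic
    [0.8, 0.2],    # topic 1: child of topic 0
    [-1.0, 0.5],   # topic 2: unrelated topic
]
parent_edges = [(1, 0)]        # (child, parent) pairs from the topic DAG
question_vec = [2.0, 1.0]
labels = [1.0, 1.0, 0.0]       # the question is tagged with topics 0 and 1

logits = score_topics(question_vec, topic_matrix)
loss = (sigmoid_cross_entropy(logits, labels)
        + 0.1 * hierarchy_penalty(topic_matrix, parent_edges))

# At prediction time, topics are ranked by their logits.
ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
print(ranked, loss)
```

Treating each topic as an independent binary decision, as the sigmoid loss does, fits multi-label tagging, where a question may carry several tags at once. A softmax would instead force the topics to compete for a single label.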

Bibliographic details

  • Source
    ACM Transactions on Information Systems | 2020, Issue 2 | 20.1-20.23 | 23 pages
  • Authors

  • Author affiliations

    Shandong Univ Tsingtao Campus 72 Binhai Rd Jimo Qingdao 266237 Shandong Peoples R China;

    Natl Univ Singapore Singapore Singapore;

    Hefei Univ Technol Hefei Anhui Peoples R China;

    Qilu Univ Technol Shandong Acad Sci Jinan Shandong Peoples R China;

  • Indexing: Science Citation Index (SCI); Engineering Index (EI)
  • Original format: PDF
  • Language: English
  • Chinese Library Classification
  • Keywords

    Question tagging; topic hierarchy; CQA; embedding learning;
