Journal: Decision Support Systems

An improvement in the quality of expert finding in community question answering networks

Abstract

Expert finding in Community Question Answering (CQA) networks such as Stack Overflow is a practical task that faces a challenging problem known as the vocabulary gap. A widely used approach to overcoming this problem is the translation model. Unlike prior work, which considers only the relevance of translations to a query, we aim to diversify query translations for better coverage of query topics. In this work, we use clustering to group the translations relevant to a given query into different clusters and then select representatives from each cluster as a set of diverse translations. We propose two new approaches to clustering translations. In the first, mutual information is used primarily as a similarity measure during clustering. In the second, the relevant translations are embedded in a topic space and then clustered in that space. After clustering, we propose two methods, one batch and one sequential, to select a diverse set of translations from the resulting clusters. The batch method selects the most relevant translations from each cluster, in proportion to the relevance of that cluster to the user query. The sequential method iteratively looks for the most diverse set of translations, taking the previously selected ones into account. Finally, to rank users, a regression model is used to learn how expert and non-expert users differ in their use of a set of diverse translations in their documents. Experiments on a large dataset generated from Stack Overflow demonstrate that the proposed methods improve ranking performance over the baselines in expert finding.
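
The two selection strategies described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract gives no formulas, so the sketch assumes that a cluster's relevance is the sum of its members' relevance scores, and it substitutes a greedy relevance-minus-redundancy rule (in the style of maximal marginal relevance) for the sequential method. The names `batch_select`, `sequential_select`, `same_cluster`, the `trade_off` parameter, and the example words and scores are all hypothetical.

```python
from typing import Callable, Dict, List


def batch_select(clusters: Dict[str, List[str]],
                 relevance: Dict[str, float],
                 k: int) -> List[str]:
    """Give each cluster a share of the budget k proportional to its relevance.

    Assumption: cluster relevance is the sum of its members' scores.
    """
    cluster_score = {c: sum(relevance[w] for w in ws) for c, ws in clusters.items()}
    total = sum(cluster_score.values())
    selected: List[str] = []
    for c, words in clusters.items():
        quota = round(k * cluster_score[c] / total) if total > 0 else 0
        ranked = sorted(words, key=lambda w: relevance[w], reverse=True)
        selected.extend(ranked[:quota])
    # Rounding may overshoot the budget, so keep the k most relevant overall.
    return sorted(selected, key=lambda w: relevance[w], reverse=True)[:k]


def sequential_select(candidates: List[str],
                      relevance: Dict[str, float],
                      similarity: Callable[[str, str], float],
                      k: int,
                      trade_off: float = 0.5) -> List[str]:
    """Greedily add the candidate that is relevant yet unlike earlier picks.

    Assumption: an MMR-style score stands in for the paper's criterion.
    """
    selected: List[str] = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        def gain(w: str) -> float:
            # Redundancy is the highest similarity to anything already chosen.
            redundancy = max((similarity(w, s) for s in selected), default=0.0)
            return trade_off * relevance[w] - (1 - trade_off) * redundancy
        best = max(remaining, key=gain)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical translations of a query, already grouped into two clusters.
clusters = {"a": ["java", "jvm", "jdk"], "b": ["thread", "lock"]}
relevance = {"java": 0.9, "jvm": 0.6, "jdk": 0.5, "thread": 0.8, "lock": 0.2}
cluster_of = {w: c for c, ws in clusters.items() for w in ws}


def same_cluster(u: str, v: str) -> float:
    """Toy similarity: 1.0 inside a cluster, 0.0 across clusters."""
    return 1.0 if cluster_of[u] == cluster_of[v] else 0.0


batch = batch_select(clusters, relevance, k=3)
sequential = sequential_select(list(relevance), relevance, same_cluster, k=3)
```

In this toy setting both strategies draw translations from both clusters instead of taking the three highest-scoring words of a single topic, which is the coverage effect the abstract argues for.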
