A Computational Framework for Question Processing in Community Question Answering Services.

Abstract

Community Question Answering (CQA) services, such as Yahoo! Answers and Baidu Zhidao, provide a platform where large numbers of users ask and answer questions to meet their own needs. In recent years, however, a sharp increase in the number of questions raised in these communities has challenged the efficiency of CQA services for question solving and knowledge learning. To help answerers find suitable questions and help askers obtain information more efficiently, this thesis proposes a computational framework for question processing in CQA services.

The framework consists of three components: popularity analysis and prediction, routing, and structuralization. The first component analyzes the factors affecting question popularity and observes that the interaction of users and topics leads to differences in question popularity. Based on these findings, we propose a mutual reinforcement-based label propagation algorithm that predicts question popularity using features of question texts and asker profiles. Empirical results demonstrate that our algorithm is more effective at distinguishing high-popularity questions from low-popularity ones than other state-of-the-art baselines.

The second component aims to route new questions to potential answerers in CQA services. The proposed question routing (QR) framework considers both answerer expertise and answerer availability. To estimate answerer expertise, we propose three models. The first is derived from the query likelihood language model, and the other two use answer quality to refine the first. To estimate answerer availability, we employ an autoregressive model. Experimental results demonstrate that leveraging answer quality can greatly improve the performance of QR. In addition, using similar answerers' answer quality on similar questions provides more accurate expertise estimation and thus better QR performance. Moreover, answerer availability estimation further boosts the performance of QR.

Expertise estimation plays a key role in QR. However, current approaches employ full profiles to estimate the expertise of all answerers, which is ineffective and time-consuming. To address this problem, we construct category-answerer indexes for filtering out irrelevant answerers and develop category-sensitive language models for estimating answerer expertise. Experimental results show that, first, category-answerer indexes produce a much shorter list of relevant answerers to route to, with substantially reduced computational costs; and second, category-sensitive language models obtain more accurate expertise estimates than state-of-the-art baselines.

In the third component, we propose a novel hierarchical entity-based approach to structuralize questions in CQA services. Traditional list-based organization of questions is not effective for content browsing and knowledge learning because of the large volume of documents. To address this problem, we use a large-scale entity repository and construct a three-step framework that structuralizes questions into "cluster entity trees" (CETs). Experimental results show the effectiveness of the framework in constructing CETs. We further evaluate the performance of CETs for knowledge organization from both the user and the system perspective. From the user perspective, our user study demonstrates that users perform significantly better in knowledge learning with CET-based organization than with a list-based approach. From the system perspective, CETs substantially boost the performance of question search through re-ranking.

In summary, this thesis contributes both a conceptual framework and an empirical foundation to question processing in CQA services.
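As a rough illustration of the routing idea described in the abstract, the following Python sketch ranks candidate answerers by a query likelihood score computed over their answering history, after first narrowing the candidates with a category-answerer index. It is a minimal sketch under stated assumptions, not the thesis's implementation: the class and method names (`ExpertiseRouter`, `add_answer`, `route`), the whitespace tokenizer, and the Jelinek-Mercer smoothing with weight `lam` are all hypothetical choices made for this example, and the answer-quality refinement and the autoregressive availability model are omitted.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Naive whitespace tokenizer (an assumption made for this sketch).
    return text.lower().split()


class ExpertiseRouter:
    """Hypothetical router: query likelihood scoring plus a category-answerer index."""

    def __init__(self, lam=0.5):
        self.lam = lam                           # smoothing weight (assumed value)
        self.profiles = defaultdict(Counter)     # answerer -> term counts
        self.collection = Counter()              # background term counts
        self.category_index = defaultdict(set)   # category -> answerers

    def add_answer(self, answerer, category, question_text):
        # Record that `answerer` answered a question in `category`.
        terms = tokenize(question_text)
        self.profiles[answerer].update(terms)
        self.collection.update(terms)
        self.category_index[category].add(answerer)

    def log_likelihood(self, answerer, question_text):
        # log P(question | answerer profile), smoothed with the collection model.
        profile = self.profiles[answerer]
        profile_len = sum(profile.values())
        coll_len = sum(self.collection.values())
        score = 0.0
        for term in tokenize(question_text):
            p_profile = profile[term] / profile_len if profile_len else 0.0
            p_coll = self.collection[term] / coll_len if coll_len else 0.0
            p = (1 - self.lam) * p_profile + self.lam * p_coll
            if p == 0.0:
                # Term never seen anywhere: skipped for every candidate alike,
                # so it does not change the ranking.
                continue
            score += math.log(p)
        return score

    def route(self, question_text, category, k=3):
        # Score only answerers indexed under the question's category.
        candidates = self.category_index.get(category, set())
        ranked = sorted(
            candidates,
            key=lambda a: self.log_likelihood(a, question_text),
            reverse=True,
        )
        return ranked[:k]
```

A caller would feed past question-answerer pairs through `add_answer` and then call `route(new_question, category)`. Because only answerers indexed under the question's category are scored, the candidate list stays short, which mirrors the cost reduction the abstract attributes to category-answerer indexes.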

Bibliographic record

  • Author: Li, Baichuan
  • Author affiliation: The Chinese University of Hong Kong (Hong Kong)
  • Degree-granting institution: The Chinese University of Hong Kong (Hong Kong)
  • Subject: Computer Science
  • Degree: Ph.D.
  • Year: 2014
  • Pages: 181 p.
  • Total pages: 181
  • Original format: PDF
  • Language: English
