Source: Trends and Applications in Knowledge Discovery and Data Mining (conference proceedings)

Identifying Authoritative and Reliable Contents in Community Question Answering with Domain Knowledge

Abstract

Community Question Answering (CQA) has emerged as a popular forum for users to ask and answer questions. Over the last few years, CQA portals such as Yahoo! Answers and Baidu Zhidao have exploded in popularity, and they now provide a viable alternative to general-purpose Web search. The answers submitted to questions on CQA sites compose a valuable knowledge repository, which could be a gold mine for information retrieval as well as text mining. Two important questions in CQA research concern the quality of contents and the reputation of the answerers. Approaches for retrieving relevant and high-quality content have been proposed, but not much work has been done on providing an integrated framework to solve these two problems. Besides, no research work has used both text and link information while leveraging existing ratings of answers and questions. In this paper, we present a novel approach to analyzing questions and answers based on the topic modeling framework with Dirichlet forest priors (LDA-DF). We utilize information obtained from LDA-DF to construct a joint topical and link model to identify authorities and reliable answers on a CQA site. We evaluate our methods on a dataset obtained from Yahoo! Answers. With the new representation of topical structures on CQA datasets, using a limited amount of web resources, we show significant improvements over the state-of-the-art methods LDA-DF, LDA, and HLDA in authority identification and answer ranking.
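
The abstract does not spell out the joint topical and link model, so the following is only a rough illustration of the general idea of combining topic information with link information and existing ratings. It is a topic-biased, PageRank-style iteration over an asker-to-answerer graph, not the paper's LDA-DF model. The function name `topical_authority`, the toy users, the per-user topic distributions (assumed to come from some topic model), and the ratings are all hypothetical.

```python
def topical_authority(edges, user_topics, topic, damping=0.85, iters=50):
    """Topic-biased PageRank-style authority scores (illustrative sketch only).

    edges:       list of (asker, answerer, rating) tuples
    user_topics: dict mapping user -> list of topic probabilities
                 (assumed to be produced by a topic model such as LDA)
    topic:       index of the topic to score authority for
    """
    users = sorted(user_topics)

    # Weight each asker -> answerer edge by the answer's rating and by the
    # answerer's affinity for the chosen topic.
    out = {u: {} for u in users}
    for asker, answerer, rating in edges:
        w = rating * user_topics[answerer][topic]
        out[asker][answerer] = out[asker].get(answerer, 0.0) + w

    # Teleport distribution biased toward users with high topic affinity.
    total = sum(user_topics[u][topic] for u in users)
    teleport = {u: user_topics[u][topic] / total for u in users}

    score = dict(teleport)
    for _ in range(iters):
        nxt = {u: (1 - damping) * teleport[u] for u in users}
        for u in users:
            wsum = sum(out[u].values())
            if wsum == 0:
                # User with no weighted out-links: spread mass via teleport.
                for v in users:
                    nxt[v] += damping * score[u] * teleport[v]
            else:
                for v, w in out[u].items():
                    nxt[v] += damping * score[u] * w / wsum
        score = nxt
    return score


# Toy data (hypothetical): two topics, three users.
user_topics = {"alice": [0.9, 0.1], "bob": [0.2, 0.8], "carol": [0.5, 0.5]}
edges = [("carol", "alice", 5), ("bob", "alice", 4),
         ("carol", "bob", 2), ("alice", "bob", 1)]

scores = topical_authority(edges, user_topics, topic=0)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In this sketch, ratings and topic affinity only reweight the links; a joint model of the kind the abstract describes would presumably learn the topical and link components together rather than chaining them.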