Conference: IEEE International Conference on Big Data Computing Service and Applications
ScholarFinder: Knowledge Embedding Based Recommendations using a Deep Generative Model

Abstract

Bold scientific research tasks today need multi-disciplinary knowledge and interdisciplinary collaborations, which in turn require finding scholars with relevant knowledge in a particular domain. Given the variety of scholars and the diversity of research tasks, finding the appropriate scholar is a critically important and challenging problem for scientific communities. Prior approaches to identifying scholars use supervised learning with fixed or high-level research interest tags. Such approaches make it hard to recognize scholars with specific interests or to track changes in their research interests. Hence, there is a need to investigate suitable methods for quantifying scholars' expertise so that it can be matched to research tasks. In this paper, we propose a novel model, "ScholarFinder", that uses contextual information (abstracts or publications) to embed a scholar's knowledge in an unsupervised manner. With the pre-trained knowledge embeddings, we can then perform machine learning tasks such as classification, visualization, or checking whether a scholar is suitable for a particular research task. Building on the pre-trained embeddings, we also provide a novel negative sampling method to address the lack of negative samples. Using a "follow-the-money" strategy, we apply our model to a large dataset of NSF (National Science Foundation) grant awards collected over the last twenty years, which contains more than 20,000 award records (with project abstracts and names) corresponding to 15,074 scholars who received grants. We evaluate different deep learning models to see how best to use the pre-trained knowledge embeddings, and how our negative sampling method improves model performance. We also compare our model with state-of-the-art baseline models (e.g., XGBoost, DNN), and our results show that the ScholarFinder model outperforms those models in terms of precision, recall, F1-score, and accuracy.
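
The abstract does not specify the embedding model or the negative sampling rule, so the following is only a minimal sketch of the general idea, under stated assumptions. It uses a bag-of-words vector as a stand-in for a learned knowledge embedding, builds a scholar vector from the abstracts of that scholar's awards, and draws negative (scholar, award) pairs from awards the scholar did not receive and that are dissimilar to the scholar's vector. The function names, the `max_sim` threshold, and the toy data are all hypothetical and are not taken from the paper.

```python
import random
from collections import Counter
from math import sqrt

def embed(text):
    """Bag-of-words vector; a stand-in for a learned embedding (assumption)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def scholar_embedding(abstracts):
    """Sum the vectors of all abstracts attributed to one scholar."""
    total = Counter()
    for text in abstracts:
        total.update(embed(text))
    return total

def sample_negatives(positives, scholar_vecs, award_vecs, max_sim=0.2, seed=0):
    """For each positive (scholar, award) pair, pick one award the scholar did
    not receive and whose similarity to the scholar is below max_sim.
    The similarity filter is an illustrative assumption, not the paper's rule."""
    rng = random.Random(seed)
    received = {}
    for scholar, award in positives:
        received.setdefault(scholar, set()).add(award)
    negatives = []
    for scholar, _ in positives:
        candidates = [
            award for award in sorted(award_vecs)
            if award not in received[scholar]
            and cosine(scholar_vecs[scholar], award_vecs[award]) < max_sim
        ]
        if candidates:
            negatives.append((scholar, rng.choice(candidates)))
    return negatives

# Toy data (hypothetical): award abstracts and the scholars who received them.
award_text = {
    "A1": "deep generative model for text embedding",
    "A2": "protein folding dynamics simulation",
    "A3": "generative model for recommendation embedding",
}
positives = [("alice", "A1"), ("alice", "A3"), ("bob", "A2")]

award_vecs = {award: embed(text) for award, text in award_text.items()}
scholar_vecs = {}
for scholar in {s for s, _ in positives}:
    texts = [award_text[a] for s, a in positives if s == scholar]
    scholar_vecs[scholar] = scholar_embedding(texts)

negatives = sample_negatives(positives, scholar_vecs, award_vecs)
```

The resulting positive and negative pairs could then be fed to any binary classifier and scored with precision, recall, F1-score, and accuracy, which are the metrics the abstract reports.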