Conference: International Conference on Intelligent Computing and Signal Processing

Deep Learning Based Search and Recommendation of Literature in Wanfang Knowledge Base



Abstract

Classifying Chinese text is a popular but difficult topic in natural language processing. In this work, we aim to quickly recommend the documents users need by matching their search keywords against a massive literature database. We take the "Wanfang Data Knowledge Service Platform Journal Document User Behavior Log Data" as our research object and build a text classifier based on deep learning techniques. We first clean the user browsing behavior logs and user download behavior logs and extract data from them. Next, we perform Chinese word segmentation and keyword extraction on the document titles, and construct a keyword data set and a document keyword data set based on the document names. Finally, we use a traditional deep learning model and a convolutional neural network model to build text classifiers for training and classification; these serve as the model that recommends literature for the keywords users search. Experimental results show that the constructed model can effectively classify and recommend documents from users' search keywords, and that keywords can be extracted from document names while constructing the model. The proposed model, TEXTCNN V2, achieves an accuracy, precision, and F-1 score of 0.9132, 0.9220, and 0.9154, respectively.
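The title keyword extraction step described in the abstract could be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say which keyword extraction method was used, so TF-IDF scoring is an assumption, and `segment` and `extract_keywords` are hypothetical names. The `segment` placeholder splits on whitespace; real Chinese titles would need a dedicated word segmenter in its place.

```python
import math
from collections import Counter


def segment(title):
    """Hypothetical placeholder tokenizer: lowercase and split on whitespace.

    Chinese titles have no spaces between words, so a real pipeline
    would call a Chinese word segmenter here instead.
    """
    return title.lower().split()


def extract_keywords(titles, top_k=2):
    """Return the top_k tokens of each title ranked by TF-IDF (assumed method)."""
    docs = [segment(t) for t in titles]
    n_docs = len(docs)
    # Document frequency: the number of titles that contain each token.
    df = Counter(tok for doc in docs for tok in set(doc))
    keywords = []
    for doc in docs:
        tf = Counter(doc)
        scores = {
            # Term frequency times a smoothed inverse document frequency.
            tok: (count / len(doc)) * (math.log((1 + n_docs) / (1 + df[tok])) + 1)
            for tok, count in tf.items()
        }
        # Sort by descending score; break ties alphabetically for determinism.
        ranked = sorted(scores, key=lambda tok: (-scores[tok], tok))
        keywords.append(ranked[:top_k])
    return keywords


titles = [
    "deep learning for text classification",
    "deep learning for image recognition",
    "keyword search in literature databases",
]
print(extract_keywords(titles))
```

Tokens that appear in many titles (here "deep", "learning", "for") receive a lower weight, so the tokens that distinguish a title rise to the top. The resulting title and keyword pairs are the kind of data a text classifier could then be trained on.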
