Journal: International Journal of Computational Science and Engineering

Term extraction and correlation analysis based on massive scientific and technical literature

Abstract

Scientific and technical terms are the basic units of knowledge discovery and knowledge organisation. Correlation analysis is one of the important techniques for deep data mining across large, heterogeneous collections of scientific and technical literature. Drawing on freely available digital library resources, this study uses natural language processing to analyse the linguistic characteristics of terms and combines that analysis with statistical measures to extract terms from scientific and technical literature. Using the extracted terms, the paper proposes an improved vector space model (VSM) algorithm for calculating the correlation between different scientific and technical documents. The experimental results indicate that this approach offers a new way to extract terms automatically and to carry out correlation analysis between documents drawn from massive scientific and technical literature. The method outperforms a baseline that uses neither linguistic rules nor mutual information (MI) calculation. Term extraction accuracy is about 73.5%. Compared with the traditional term-based VSM, the correct rate of correlation calculation increases by 12%.
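The abstract names two standard building blocks, MI scoring of candidate terms and VSM-based correlation, without giving formulas. The sketch below shows textbook versions of both, assuming pointwise mutual information over adjacent word pairs and cosine similarity over raw term-frequency vectors. It is not the paper's improved VSM or its linguistic rules; the function names `pmi_scores` and `cosine_similarity` and the sample tokens are illustrative only.

```python
import math
from collections import Counter


def pmi_scores(tokens):
    """Pointwise mutual information (base 2) for each adjacent word pair.

    Assumption: candidate terms are plain bigrams; the paper's linguistic
    filtering rules are not reproduced here.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)
    scores = {}
    for (a, b), count in bigrams.items():
        p_ab = count / n_bi
        p_a = unigrams[a] / n_uni
        p_b = unigrams[b] / n_uni
        scores[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return scores


def cosine_similarity(terms_a, terms_b):
    """Cosine similarity of two documents given as lists of extracted terms.

    Assumption: each document is a raw term-frequency vector (classic VSM),
    with no weighting scheme such as TF-IDF.
    """
    va, vb = Counter(terms_a), Counter(terms_b)
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(v * v for v in va.values()))
    norm_b = math.sqrt(sum(v * v for v in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    tokens = ["data", "mining", "data", "mining", "of", "text"]
    print(pmi_scores(tokens)[("data", "mining")])
    print(cosine_similarity(["data mining", "term extraction"],
                            ["data mining", "vector space model"]))
```

A higher PMI suggests that two words co-occur more often than chance would predict, which is why such a score is commonly used to rank multi-word term candidates. Cosine similarity close to 1 indicates that two documents share most of their extracted terms.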