Conference proceedings: 2010 Proceedings of Technology Management for Global Economic Growth

How to measure the semantic similarities between scientific papers and patents in order to discover uncommercialized research fronts: A case study of solar cells

Abstract

In this paper, we perform a comparative study to measure the semantic similarity between academic papers and patents. Research fronts that do not correspond to any patents may be uncommercialized and may therefore represent opportunities for industry. It is thus important to investigate the relationship between scientific outcomes and industrial technologies. We compare the structure of the citation network of scientific publications with that of patents through citation analysis, measure the similarity between sets of academic papers and sets of patents using natural language processing, and discuss the validity of the results with experts. After the documents (papers/patents) in each layer are categorized by a citation-based method, we compare three semantic similarity measures between a set of academic papers and a set of patents: the Jaccard coefficient, the cosine similarity of tf-idf vectors, and the cosine similarity of log-tf-idf vectors. A case study on solar cells is performed to develop a method for investigating the correspondence between papers and patents. The results show that the cosine similarity of tf-idf vectors is the best of the three measures for discovering this correspondence. The proposed approach enables us to obtain, at least, candidates for unexplored research fronts, where academic research exists but patents do not.
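
The three measures named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact formulas, so several choices here are assumptions: each cluster of papers or patents is flattened into a single list of terms, the idf is smoothed so that terms shared by every cluster keep a nonzero weight, and the "log-tf-idf" variant is taken to mean log(1 + tf) * idf. The function names and the toy term lists are hypothetical and are not data from the study.

```python
import math
from collections import Counter


def jaccard(a_terms, b_terms):
    """Jaccard coefficient between the term sets of two clusters."""
    a, b = set(a_terms), set(b_terms)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def idf_table(clusters):
    """Smoothed inverse document frequency, one 'document' per cluster.

    Assumption: idf(t) = log((1 + N) / (1 + df(t))) + 1, chosen so that a
    term appearing in every cluster still has a positive weight.
    """
    n = len(clusters)
    df = Counter()
    for terms in clusters:
        df.update(set(terms))
    return {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in df.items()}


def tfidf_vector(terms, idf, log_tf=False):
    """Sparse tf-idf vector; log_tf=True uses log(1 + tf) (an assumption)."""
    tf = Counter(terms)
    return {
        t: (math.log(1 + c) if log_tf else c) * idf.get(t, 0.0)
        for t, c in tf.items()
    }


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


if __name__ == "__main__":
    # Toy term lists, invented purely for illustration.
    paper_cluster = ["dye", "sensitized", "cell", "cell", "efficiency", "tio2"]
    patent_cluster = ["cell", "module", "silicon", "wafer", "efficiency"]
    other_cluster = ["organic", "polymer", "cell", "film"]

    idf = idf_table([paper_cluster, patent_cluster, other_cluster])

    print("Jaccard:", jaccard(paper_cluster, patent_cluster))
    print(
        "cosine tf-idf:",
        cosine(tfidf_vector(paper_cluster, idf), tfidf_vector(patent_cluster, idf)),
    )
    print(
        "cosine log-tf-idf:",
        cosine(
            tfidf_vector(paper_cluster, idf, log_tf=True),
            tfidf_vector(patent_cluster, idf, log_tf=True),
        ),
    )
```

Under this reading, a paper cluster whose best similarity to every patent cluster stays low would be a candidate uncommercialized research front. Any threshold for "low" would have to be chosen and validated separately, for example with the expert review the abstract mentions.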