Journal: Telematics and Informatics

Important citation identification using sentiment analysis of in-text citations


Abstract

A citation represents the relationship between the citing document and the cited document. Citations are widely used to measure different aspects of knowledge-based achievement, such as institutional ranking, author ranking, journal impact factor, research grants, and peer judgments. A fair evaluation of research requires both a quantitative and a qualitative assessment of citations. For the qualitative analysis, researchers have tried to classify citations into two classes (i.e., important and non-important), using metadata, content, citation counts, cue words or phrases, sentiment analysis, keywords, and machine learning approaches. However, the state-of-the-art results of this binary classification are inadequate for evaluating the different aspects of researchers and their work. This research therefore proposes an approach to binary classification based on sentiment analysis of in-text citations, which improves on the state-of-the-art results. Different machine learning-based models are evaluated to determine the sentiment of in-text citations. The sentiment results are then used to compute positive, negative, and neutral citation counts. In addition, cosine similarity scores between paper-citation pairs are calculated. These sentiment counts and cosine similarity scores are used as features for binary classification, which is performed with SVM, KLR, and Random Forest. The proposed approach is evaluated and compared with two state-of-the-art approaches on the benchmark datasets. With a Random Forest classification model, it achieves an F-measure of 0.83 on dataset 1, an improvement of 13.6%, and 0.67 on dataset 2, an improvement of 8%.
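The feature construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how texts are vectorized, so raw term counts are assumed here, and the function names (`cosine_similarity`, `citation_features`, `f_measure`) and the sentiment label strings are hypothetical. The sentiment labels are taken as given, as if produced by an upstream sentiment model.

```python
import math
import re
from collections import Counter


def term_counts(text):
    """Lowercase the text and count its alphanumeric tokens (assumed vectorization)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the raw term-count vectors of two texts."""
    a, b = term_counts(text_a), term_counts(text_b)
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction to compare
    return dot / (norm_a * norm_b)


def citation_features(sentiment_labels, citing_text, cited_text):
    """Build one feature vector for a paper-citation pair.

    sentiment_labels: one hypothetical label per in-text citation,
    each "positive", "negative", or "neutral".
    Returns [positive count, negative count, neutral count, cosine score].
    """
    counts = Counter(sentiment_labels)
    return [
        counts["positive"],
        counts["negative"],
        counts["neutral"],
        cosine_similarity(citing_text, cited_text),
    ]


def f_measure(y_true, y_pred, positive=1):
    """F-measure (harmonic mean of precision and recall) for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Vectors produced by `citation_features` could then be passed to any binary classifier, such as a random forest, and the predictions scored with `f_measure` against the important/non-important labels.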
