Important citation identification using sentiment analysis of in-text citations

Aljuaid Hanan; Iftikhar Rimsha; Ahmad Shahbaz; Asif Muhammad; Afzal Muhammad Tanvir

Abstract

A citation represents the relationship between a citing document and the document it cites. Citations are widely used to measure different aspects of knowledge-based achievement, such as institutional ranking, author ranking, journal impact factor, research grants, and peer judgments. A fair evaluation of research requires both a quantitative and a qualitative assessment of citations. For the qualitative analysis, researchers have tried to classify citations into two classes (i.e., important and non-important), using metadata, content, citation counts, cue words or phrases, sentiment analysis, keywords, and machine learning approaches. However, the state-of-the-art results of this binary classification are inadequate for assessing different aspects of researchers and their work. This research therefore proposes a binary classification approach based on sentiment analysis of in-text citations, which improves on the state-of-the-art results. Different machine learning-based models are evaluated to determine the sentiment of in-text citations. The sentiment results are then used to compute positive, negative, and neutral citation counts. In addition, cosine similarity scores between paper-citation pairs are calculated. These sentiment counts and cosine similarity scores are used as features in the binary classification, which is performed with SVM, KLR, and Random Forest. The proposed approach is evaluated and compared with two state-of-the-art approaches on the benchmark dataset. With a Random Forest classification model, it achieves an F-measure of 0.83 on dataset 1, an improvement of 13.6%, and 0.67 on dataset 2, an improvement of 8%.
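The feature construction described in the abstract (sentiment counts plus a cosine similarity score per paper-citation pair) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not say how texts are vectorized, so raw term frequencies are assumed here, and the function names, label strings, and sample texts are all hypothetical. The sentiment labels are assumed to come from an already trained classifier.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity between raw term-frequency vectors of two texts.

    Assumption: plain term counts; the paper may use a different weighting.
    """
    tf_a, tf_b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(tf_a[t] * tf_b[t] for t in tf_a.keys() & tf_b.keys())
    norm_a = math.sqrt(sum(c * c for c in tf_a.values()))
    norm_b = math.sqrt(sum(c * c for c in tf_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction, so report no similarity
    return dot / (norm_a * norm_b)


def build_features(sentiment_labels, citing_text, cited_text):
    """Return [positive count, negative count, neutral count, cosine similarity].

    sentiment_labels holds one (hypothetical) label per in-text citation of
    the cited paper inside the citing paper.
    """
    counts = Counter(sentiment_labels)
    return [
        counts["positive"],
        counts["negative"],
        counts["neutral"],
        cosine_similarity(citing_text, cited_text),
    ]


# Hypothetical paper-citation pair with three in-text citations.
labels = ["positive", "neutral", "positive"]
features = build_features(
    labels,
    "citation sentiment analysis of papers",
    "sentiment analysis of citation contexts",
)
print(features)
```

A vector like this, one per paper-citation pair, is what a binary classifier such as the Random Forest named in the abstract would then be trained on to separate important from non-important citations.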

Bibliographic details

Source
Telematics and Informatics | 2021, Issue 1 | 101492.1-101492.16 | 16 pages
Authors
Aljuaid Hanan; Iftikhar Rimsha; Ahmad Shahbaz; Asif Muhammad; Afzal Muhammad Tanvir;
Author affiliations

Princess Nourah Bint Abdulrahman Univ Coll Comp & Informat Sci Dept Comp Sci Riyadh Saudi Arabia;

Natl Text Univ Dept Comp Sci Faisalabad Pakistan;

Natl Text Univ Dept Comp Sci Faisalabad Pakistan;

Natl Text Univ Dept Comp Sci Faisalabad Pakistan;

Namal Inst Dept Comp Sci Mianwali Pakistan;

Indexed in: Engineering Index (EI), USA;
Original format: PDF
Text language: English
Keywords
Sentiment analysis; Cosine similarity; In-text citation; Linear SVC; Multinomial Naive Bayes; KNN; Logistic regression; Bernoulli NB; Citation classification;

Similar literature

1. Important citation identification by exploiting content and section-wise in-text citation count [J]. Shahzad Nazir, Muhammad Asif, Shahbaz Ahmad. PLoS One. 2020, Issue 3
2. Order matters: Alphabetizing in-text citations biases citation rates [J]. Stevens Jeffrey R., Duque Juan F. Psychonomic Bulletin & Review. 2019, Issue 3
3. Dimensions and Uncertainties of Author Citation Rankings: Lessons Learned From Frequency-Weighted In-Text Citation Counting [J]. Dangzhi Zhao, Andreas Strotmann. Journal of the American Society for Information Science and Technology. 2016, Issue 3
4. Important Citation Identification by Exploiting the Optimal In-text Citation Frequency [C]. Shahzad Nazir, Muhammad Asif, Shahbaz Ahmad. International Conference on Engineering and Emerging Technologies. 2020
5. A quantitative content analysis of in-text citations in choral pedagogy books published between 1989–2009 [D]. Jones, Sarah K. 2010
6. Important citation identification by exploiting content and section-wise in-text citation count [O]. Shahzad Nazir, Muhammad Asif, Shahbaz Ahmad. 2020
7. Science and Technology Citation Analysis is Citation Normalization Realistic [R]. Kostoff, R. N., Martinez, W. L. 2004
