A novel corpus-based computing method for handling critical word-ranking issues: An example of COVID-19 research articles

Abstract

A corpus is a massive body of structured textual data that is stored and processed electronically. It is usually combined with statistics, machine learning algorithms, or artificial intelligence (AI) technologies to explore the semantic relationships between lexical units, and it is beneficial when applied to language learning, information processing, translation, and so forth. In the face of a novel disease such as COVID-19, establishing a medical-specific corpus can improve the efficiency with which frontline medical personnel acquire information, guiding them toward the right approaches for responding to and preventing the disease. To retrieve critical messages from the corpus effectively, handling word-ranking issues appropriately is crucial. However, traditional frequency-based approaches may introduce bias into word ranking because they neither optimize the corpus nor jointly take words' frequency dispersion and concentration criteria into consideration. Thus, this paper develops a novel corpus-based approach that combines corpus software with the Hirsch index (H-index) algorithm to handle both issues simultaneously, making word-ranking processes more accurate. This paper compiled 100 COVID-19-related research articles as an empirical example of the target corpus. To verify the proposed approach, this study compared its results with those of two traditional frequency-based approaches. The results indicate that the proposed approach can refine the corpus and simultaneously compute words' frequency dispersion and concentration criteria when handling word-ranking issues.
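The abstract does not spell out how the H-index is adapted to words, so the following is only a minimal sketch under one plausible assumption: by analogy with the bibliometric definition, a word's H-index is taken to be the largest h such that the word occurs at least h times in at least h documents. The function names `word_h_index` and `rank_words` are hypothetical, and the corpus-refinement step performed with corpus software (for example, removing function words) is not shown.

```python
from collections import Counter


def word_h_index(per_doc_counts):
    """Largest h such that the word occurs >= h times in >= h documents.

    Assumption: this mirrors the bibliometric H-index; the paper's exact
    adaptation is not given in the abstract.
    """
    ranked = sorted(per_doc_counts, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def rank_words(docs):
    """Rank words by H-index, breaking ties by total frequency, then alphabetically.

    docs: list of documents, each a list of already-tokenized words.
    Returns a list of (word, h_index, total_frequency) tuples.
    """
    per_doc = [Counter(tokens) for tokens in docs]
    vocab = set().union(*per_doc)
    rows = []
    for word in vocab:
        counts = [c[word] for c in per_doc if word in c]
        rows.append((word, word_h_index(counts), sum(counts)))
    return sorted(rows, key=lambda r: (-r[1], -r[2], r[0]))


# Toy corpus (invented for illustration, not data from the paper).
docs = [
    ["virus", "virus", "virus", "patient"],
    ["virus", "virus", "patient"],
    ["virus", "virus", "virus"] + ["dose"] * 9,
]
for word, h, total in rank_words(docs):
    print(word, h, total)
```

In this toy corpus, "dose" has the highest raw frequency (9) but all of it is concentrated in a single document, so its H-index is 1; "virus" occurs fewer times in total (8) but is spread across all three documents and receives an H-index of 2. This is the kind of dispersion-versus-concentration distinction that a pure frequency ranking misses.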

Bibliographic record

  • Source
    International Journal of Intelligent Systems | 2021, Issue 7 | pp. 3190-3216 | 27 pages
  • Author affiliations

    Department of Foreign Languages, R.O.C. Military Academy, Kaohsiung, Taiwan; Institute of Education, National Sun Yat-sen University, Kaohsiung, Taiwan;

    Department of Management Sciences, R.O.C. Military Academy, Kaohsiung, Taiwan; Institute of Innovation and Circular Economy, Asia University, Taichung, Taiwan;

  • Indexing information
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification
  • Keywords

    corpus; COVID-19; H-index algorithm; machine learning;

  • Date added to database: 2022-08-19 01:59:49
