Journal: The Electronic Library

Domain-specific readability measures to improve information retrieval in the Persian language



Abstract

Purpose - The degree to which a text is considered readable depends on the capability of the reader. As a result, information retrieval systems risk returning documents that are relevant but unreadable or hard to read for their users. This paper aims to examine the potential use of concept-based readability measures, alongside classic measures, for re-ranking search results in information retrieval systems, specifically in the Persian language.

Design/methodology/approach - Flesch-Dayani, a classic readability measure, was applied together with document scope (DS) and document cohesion (DC), two domain-specific measures, to score documents retrieved from Google (181 documents) and the RICeST database (215 documents) in the field of computer science and information technology (IT). The re-ranked results were compared with rankings of the same documents made by potential users according to readability.

Findings - The results show that subcategories of the computer science and IT field differ in readability and understandability. The study also shows that a hybrid score based on the DS and DC measures can be developed. Of the four scores applied in re-ranking the documents, the list re-ranked by the DSDC score correlates with the participants' re-ranking in both groups.

Practical implications - The findings offer a new option for re-ranking search results according to their difficulty for experts and non-experts in different fields.

Originality/value - The findings and the two-mode re-ranking model proposed in this paper, with its primary focus on domain-specific readability in the Persian language, would help Web search engines and online databases further refine search results so that they retrieve useful texts for users with differing expertise.
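The re-ranking and comparison steps described in the abstract can be illustrated with a minimal sketch. The abstract does not give the DS, DC, or DSDC formulas, nor the correlation statistic used, so everything below is an assumption made for illustration: `hybrid_score` is a hypothetical weighted mean standing in for the DSDC score, Spearman's rank correlation stands in for the unspecified agreement measure, the document ids and score values are invented, and higher scores are assumed to rank first.

```python
from typing import Dict, List


def hybrid_score(ds: float, dc: float, weight: float = 0.5) -> float:
    """Hypothetical DSDC combination: a weighted mean of DS and DC.

    The paper's actual formula is not stated in the abstract.
    """
    return weight * ds + (1.0 - weight) * dc


def rerank(scores: Dict[str, float]) -> List[str]:
    """Return document ids ordered from highest to lowest score."""
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


def spearman_rho(ranking_a: List[str], ranking_b: List[str]) -> float:
    """Spearman rank correlation between two tie-free rankings of the same items."""
    n = len(ranking_a)
    same_items = set(ranking_a) == set(ranking_b)
    if n < 2 or len(set(ranking_a)) != n or len(ranking_b) != n or not same_items:
        raise ValueError("rankings must list the same distinct items (at least two)")
    position_b = {doc_id: i for i, doc_id in enumerate(ranking_b)}
    d_squared = sum((i - position_b[doc_id]) ** 2 for i, doc_id in enumerate(ranking_a))
    return 1.0 - 6.0 * d_squared / (n * (n * n - 1))


# Invented (DS, DC) values for four hypothetical retrieved documents.
measures = {"d1": (0.9, 0.7), "d2": (0.2, 0.4), "d3": (0.6, 0.5), "d4": (0.4, 0.1)}
dsdc = {doc_id: hybrid_score(ds, dc) for doc_id, (ds, dc) in measures.items()}

system_ranking = rerank(dsdc)
user_ranking = ["d1", "d2", "d3", "d4"]  # hypothetical participant ranking
rho = spearman_rho(system_ranking, user_ranking)
print(system_ranking, round(rho, 3))
```

A value of `rho` near 1 would indicate that the score-based order closely matches the participants' order, and a value near -1 that it reverses it. The sketch handles only rankings without ties; tied scores would need an averaged-rank variant.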
