
QAOC: Novel query analysis and ontology-based clustering for data management in Hadoop



Abstract

Query analysis and data storage management are two bottlenecks in information retrieval. Hadoop is a large-scale environment that offers large storage capacity and fast processing, yet it still suffers from both bottlenecks when the number of information requesters is high. This paper addresses them in Hadoop with the Query Analysis and Ontology-based Clustering (QAOC) architecture, which comprises three components: a query manager, a scheduler, and data management. First, the query manager consolidates similar queries, which reduces search time. Next, a neuro-fuzzy scheduler orders the user queries using each query's arrival time, length, and expiry time. On the back end, data management uses a weighted ontology-based clustering method to cluster the data by relevance. Each scheduled query is then searched in an ontology-based balanced binary tree, and the relevant results are ranked with Okapi BM25 and delivered to the user. The QAOC architecture is evaluated on Hadoop 2.7, and the results are compared in terms of execution time, processing speed, and memory consumption.
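The final ranking step named in the abstract, Okapi BM25, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no parameters, so the values `k1=1.5` and `b=0.75`, the whitespace tokenizer, the non-negative IDF variant, and the function names `bm25_scores` and `rank` are all assumptions made for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Return one BM25 score per document.

    Assumptions (not from the paper): whitespace tokenization,
    k1=1.5, b=0.75, and the IDF form ln(1 + (N - n + 0.5) / (n + 0.5)),
    which never goes negative.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    if n_docs == 0:
        return []
    # Average document length; fall back to 1.0 if every document is empty.
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs or 1.0

    # Document frequency: number of documents containing each term.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    query_terms = query.lower().split()
    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query_terms:
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue
            n = doc_freq[term]
            idf = math.log(1 + (n_docs - n + 0.5) / (n + 0.5))
            # Length normalization: longer-than-average documents are penalized.
            norm = tf + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


def rank(query, docs, k1=1.5, b=0.75):
    """Return document indices ordered from most to least relevant."""
    scores = bm25_scores(query, docs, k1=k1, b=b)
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

In a pipeline like the one described, `rank` would presumably be applied only to the candidate documents returned by the tree search rather than to the whole collection.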

Bibliographic details

  • Source
    Future Generation Computer Systems | 2020, Issue 7 | pp. 849-860 | 12 pages
  • Authors

    D. Pradeep; C. Sundar;

  • Author affiliations

    Department of Information Technology, Christian College of Engineering and Technology, Oddanchatram, Tamil Nadu 624619, India;

    Department of Computer Science and Engineering, Christian College of Engineering and Technology, Oddanchatram, Tamil Nadu 624619, India;

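  • Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification
  • Keywords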

    Hadoop; Data management; Pre-processing; Scheduling;

