Journal: Open Journal of Philosophy

Empirical Research on Web Harvesting in the Process of Text and Data Mining in National Libraries of EU Member States


Abstract

Web harvesting and archiving now draw on almost two decades of experience, and both subjects have attracted sustained interest from researchers, technologists, and librarians and information scientists. Web harvesting projects and pilot programs for archiving content found on the Web are becoming priorities for national libraries and cultural heritage organizations in the EU. This paper treats web harvesting as a process for mining data from the Web, and only through the Web (the "pull" function). It reports on research carried out within the funded research project "Web Archiving in Public Libraries and IP Law", which focused on web harvesting and archiving as well as Text and Data Mining (TDM) operations in the national libraries of EU Member States. Web archiving, as an official operation in these libraries, creates web collections and preserves them so that they remain accessible and usable in perpetuity. The research examined various components of web harvesting and archiving through an online survey (qualitative research) addressed to the national libraries of EU Member States. The authors posed seventeen questions, using a questionnaire built with Google Forms, and contacted the libraries by email and follow-up telephone calls to seek their participation. The survey results are based on answers from 22 national libraries of EU Member States. The aim was to examine the participating libraries' Text and Data Mining operations that build on web harvesting and web archiving technologies and operations.
Analysis of the results shows that national libraries count web harvesting among their top priorities: the relevant projects are increasing in number, web collections are growing, and the technological infrastructure and tools for web harvesting are improving. Yet many issues remain unresolved. A significant number of the surveyed libraries consider legal and technical issues the most important to resolve. Access to harvested material is still subject to legal restrictions. Directive 2019/790/EU on Copyright in the Digital Single Market (DSM) creates a favorable legal foundation for deploying web harvesting operations in national libraries of the EU Member States. TDM technologies open up new areas of research. Web harvesting, initially aimed at preservation, is now expanding into unprecedented research on national heritage through state-of-the-art automated TDM processes.
