Empirical Research on Web Harvesting in the Process of Text and Data Mining in National Libraries of EU Member States

Marinos Papadopoulos; Maria Botti; M. A. Paraskevi (Vicky) Ganatsiou; Christos Zampakolas

Abstract

Web harvesting and web archiving now rest on almost two decades of experience, and both have been of prime interest to researchers, technologists, and librarians and information scientists. Web harvesting projects and pilot programs for archiving content found on the Web are becoming priorities for national libraries and cultural heritage organizations in the EU. This paper treats web harvesting as a process of mining data from the Web, and only through the Web (the “pull” function). It elaborates on research carried out within the funded research project “Web Archiving in Public Libraries and IP Law”, which focused on web harvesting and archiving as well as Text and Data Mining (TDM) operations in the national libraries of EU Member States. Web archiving, as an official operation in these libraries, creates web collections and preserves them so that they remain accessible and usable in perpetuity. The research examined various components of web harvesting and archiving through an online survey (qualitative research) addressed to the national libraries of EU Member States. The research team posed seventeen questions, and the survey output comes from the answers of 22 national libraries. The questionnaire was created with Google Forms. The researchers contacted the libraries by email and follow-up telephone calls to seek their participation. The aim was to examine the participating libraries’ Text and Data Mining operations that build on web harvesting and web archiving technologies and operations.
Analysis of the results reveals that national libraries count web harvesting among their top priorities: the relevant projects are increasing in number, the web collections are growing, and the technological infrastructure and tools for web harvesting are improving. Yet many issues remain unresolved. A significant number of the surveyed libraries consider legal and technical issues the most important to resolve, and access to harvested material is still subject to legal restrictions. Directive 2019/790/EU on Copyright in the Digital Single Market (DSM) creates a favorable legal foundation for the deployment of web harvesting operations in national libraries of the EU Member States. TDM technologies open up new areas of research. Web harvesting, initially aimed at preservation, now extends to unprecedented research on national heritage through state-of-the-art automated TDM processes.

Bibliographic details

Source
Open Journal of Philosophy, 2020, Issue 1, 25 pages

Authors
Marinos Papadopoulos; Maria Botti; M. A. Paraskevi (Vicky) Ganatsiou; Christos Zampakolas

Original format: PDF

Keywords
TDM; Web Harvesting; Web Archiving; National Libraries; Survey
