
TicTOCron: an Automatic Solution for Propagating Quality Metadata to Scholarly TOC RSS Feed Metadata

Abstract

Institutions and researchers stand to benefit from wider syndication of, and easier access to, Table of Contents (TOC) RSS (Really Simple Syndication [1]) feeds produced for scholarly journals. However, many journal TOC RSS feeds are currently produced with erroneous, poor or incomplete metadata. This can hamper the usefulness of scholarly current awareness services and cause problems for individual subscribers to those feeds. The ticTOCron software toolkit aims to overcome these problems. It automatically enhances poor, heterogeneous and incomplete metadata found in TOC RSS feeds by making use of a pre-defined "Best Practice" metadata scheme suitable for scholarly journals. In this work we describe the main issues and "bad practices" found in TOC RSS metadata obtained from more than 435 scholarly publishers. We then describe the software solutions implemented in ticTOCron. Some reference is made to the algorithms for generating semantic relations within, between and from the harvested TOCs, and to the mechanisms for propagating "metadata associations" from a previously crawled metadata-rich reference set. However, an effort is made to avoid technical jargon and to replace complex technical descriptions with samples and simple comparisons. The original metadata is converted to a canonical format using the "Best Practices metadata set" for scholarly papers proposed by the ticTOCs Project [2]. We also present the results produced by ticTOCron when it was used to enhance and normalize TOC RSS feeds collected from more than 12,000 journals. Finally, we propose a sustainable and scalable computational model in which the automatic solution is complemented and fine-tuned by a cost-effective human cross-validation process.
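
The conversion of heterogeneous feed metadata to a canonical format can be illustrated with a minimal Python sketch. This is not ticTOCron's implementation, and the abstract does not specify the ticTOCs "Best Practice" scheme: the canonical field names, the tag priority lists, the DOI prefix handling and the sample feed below are all assumptions chosen for illustration. The sketch only maps each RSS item to one record and flags the fields that are still missing, which is the point at which enrichment from a reference set or human review would take over.

```python
import xml.etree.ElementTree as ET

# Namespace prefixes for Dublin Core and PRISM elements in feed items.
NS = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "prism": "http://prismstandard.org/namespaces/basic/2.0/",
}

# Hypothetical canonical fields (NOT the actual ticTOCs scheme), each with
# candidate source tags listed in priority order.
CANONICAL_FIELDS = {
    "title": ["dc:title", "title"],
    "creator": ["dc:creator", "author"],
    "date": ["prism:publicationDate", "dc:date", "pubDate"],
    "doi": ["prism:doi", "dc:identifier"],
    "link": ["link"],
}


def first_text(item, tags):
    """Return the stripped text of the first non-empty candidate tag, else None."""
    for tag in tags:
        element = item.find(tag, NS)
        if element is not None and element.text and element.text.strip():
            return element.text.strip()
    return None


def normalize_item(item):
    """Map one RSS <item> to a canonical record and list the missing fields."""
    record = {name: first_text(item, tags) for name, tags in CANONICAL_FIELDS.items()}
    # Assumed convention: store DOIs without a leading "doi:" prefix.
    if record["doi"] and record["doi"].lower().startswith("doi:"):
        record["doi"] = record["doi"][4:].strip()
    record["missing"] = sorted(k for k, v in record.items() if v is None)
    return record


def normalize_feed(xml_text):
    """Normalize every un-namespaced <item> element found in an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    return [normalize_item(item) for item in root.iter("item")]


# Invented sample feed with untidy metadata: padded title, no date, prefixed DOI.
SAMPLE = """<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Example Journal</title>
    <item>
      <title> An Example Article </title>
      <dc:creator>A. Author</dc:creator>
      <dc:identifier>doi:10.1234/example</dc:identifier>
      <link>http://example.org/article/1</link>
    </item>
  </channel>
</rss>"""

if __name__ == "__main__":
    for record in normalize_feed(SAMPLE):
        print(record)
```

The `missing` list is a design convenience of this sketch: it makes incomplete records easy to route to a later enrichment or validation step.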

Bibliographic details

  • Year: 2009
  • Original format: PDF
  • Language: English
