Journal: Biodiversity Informatics

Approaches to estimating the universe of natural history collections data

Abstract

This contribution explores the problem of recognizing and measuring the universe of specimen-level data held in natural history collections around the world, in the absence of a complete, worldwide census or register. Size estimates seem necessary to plan resource allocation for digitization or data capture, and may help indicate how much vouchered primary biodiversity data (in terms of collections, specimens, or curatorial units) remains to be mobilized. Three general approaches are proposed for further development, and initial estimates are given. Probabilistic models cross data from a set of biodiversity datasets, find commonalities, and estimate the likelihood of entirely unknown data from the fraction of known data missing from specific datasets in the set. Distribution models aim to find the underlying distribution of collections' compositions and thereby infer the hidden portion of those distributions. Finally, case studies compare the digitized data publicly available from known collections with the amount of data known to exist in those collections but not generally available or not digitized. Preliminary estimates range from 1.2 to 2.1 gigaunits (1.2 to 2.1 billion units), of which at most 3% is currently web-accessible through GBIF's mobilization efforts. However, further data and analyses, along with other approaches relying more heavily on surveys, might change the picture and possibly help narrow the estimate. In particular, unknown collections that have not surfaced in the literature are the major source of uncertainty.
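
The probabilistic approach summarized above can be read as akin to capture-recapture estimation: the overlap between independent lists of collections hints at how many collections appear on no list at all. The abstract does not specify the estimator actually used, so the following is only a minimal sketch under stated assumptions: it uses Chapman's bias-corrected two-list estimator, and the function name `estimate_total` and the collection identifiers are hypothetical.

```python
from __future__ import annotations


def estimate_total(list_a: set[str], list_b: set[str]) -> float:
    """Estimate the total number of collections from two overlapping lists.

    Assumption: Chapman's bias-corrected two-list estimator,
    N = (n_a + 1) * (n_b + 1) / (m + 1) - 1,
    where m is the number of collections found on both lists.
    This is an illustration, not the method of the paper.
    """
    n_a = len(list_a)
    n_b = len(list_b)
    m = len(list_a & list_b)  # collections present in both datasets
    return (n_a + 1) * (n_b + 1) / (m + 1) - 1


# Hypothetical identifiers for collections listed in two datasets.
dataset_a = {"c1", "c2", "c3", "c4", "c5", "c6"}
dataset_b = {"c4", "c5", "c6", "c7", "c8", "c9"}

known = len(dataset_a | dataset_b)          # collections seen at least once
total = estimate_total(dataset_a, dataset_b)
unseen = total - known                      # estimated collections on neither list

print(f"known={known}, estimated total={total}, estimated unseen={unseen}")
```

With these made-up lists, 9 collections are known and the estimate is 11.25, which suggests roughly 2 collections on neither list. The estimator assumes the two lists are independent samples of the same population, which real biodiversity datasets may well violate.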
