Royal Society Open Science

Factors influencing taxonomic unevenness in scientific research: a mixed-methods case study of non-human primate genomic sequence data generation

Abstract

Scholars have noted major disparities in the extent of scientific research conducted among taxonomic groups. Such trends may cascade if future scientists gravitate towards study species with more data and resources already available. As new technologies emerge, do research studies employing these technologies continue these disparities? Here, using non-human primates as a case study, we identified disparities in massively parallel genomic sequencing data and conducted interviews with scientists who produced these data to learn their motivations when selecting study species. We tested whether variables including publication history and conservation status were significantly correlated with publicly available sequence data in the NCBI Sequence Read Archive (SRA). Of the 179.6 terabases (Tb) of sequence data in SRA for 519 non-human primate species, 135 Tb (approx. 75%) were from only five species: rhesus macaques, olive baboons, green monkeys, chimpanzees and crab-eating macaques. The strongest predictors of the amount of genomic data were the total number of non-medical publications (linear regression; r² = 0.37; p = 6.15 × 10⁻¹²) and number of medical publications (r² = 0.27; p = 9.27 × 10⁻⁹). In a generalized linear model, the number of non-medical publications (p = 0.00064) and closer phylogenetic distance to humans (p = 0.024) were the most predictive of the amount of genomic sequence data. We interviewed 33 authors of genomic data-producing publications and analysed their responses using grounded theory. Consistent with our quantitative results, authors mentioned their choice of species was motivated by sample accessibility, prior published work and relevance to human medicine. Our mixed-methods approach helped identify and contextualize some of the driving factors behind species-uneven patterns of scientific research, which can now be considered by funding agencies, scientific societies and research teams aiming to align their broader goals with future data generation efforts.
