Journal: Engenharia Agricola

DATA MINING-BASED TECHNIQUE ON SHEEP BREED CERTIFICATION

Abstract

This study aimed to develop a method based on data mining techniques for selecting key single nucleotide polymorphism (SNP) markers for the sheep breeds Crioula, Morada Nova and Santa Ines. We gathered data from the International Sheep Consortium on 72 animals belonging to these breeds; each animal has 49,034 SNP markers. Because the number of attributes (markers) is much greater than the number of observations (animals), we used the LASSO (Least Absolute Shrinkage and Selection Operator), Random Forest and Boosting prediction methods, which incorporate attribute selection, to generate predictive models. The results showed that the predictive models selected the main SNP markers for sheep breed identification. LASSO selected 29 relevant markers, while Random Forest and Boosting selected 27 and 20 major markers, respectively. By intersecting the markers selected by the generated models, we identified a subset of 18 markers with major potential for sheep breed identification.
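
The intersection step described in the abstract can be illustrated with a minimal Python sketch. It assumes each model has already produced a set of selected marker IDs. The marker names and the helper `consensus_markers` are hypothetical placeholders, since the abstract does not list the selected SNPs, and the toy set sizes do not reproduce the reported counts.

```python
# Hypothetical marker IDs per model; the abstract does not name the real SNPs.
selected = {
    "lasso": {"snp_a", "snp_b", "snp_c", "snp_d", "snp_e"},
    "random_forest": {"snp_a", "snp_b", "snp_d", "snp_f"},
    "boosting": {"snp_a", "snp_d", "snp_e", "snp_f"},
}


def consensus_markers(selections):
    """Return the markers chosen by every model, sorted for stable output."""
    marker_sets = [set(markers) for markers in selections.values()]
    if not marker_sets:
        return []
    return sorted(set.intersection(*marker_sets))


consensus = consensus_markers(selected)
print(consensus)
```

Keeping only the markers that all models agree on is a conservative choice: a marker selected by a single method may reflect that method's bias rather than a real breed signal.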