Conference: International Work-Conference on Bioinformatics and Biomedical Engineering

Describing Sequential Association Patterns from Longitudinal Microarray Data Sets in Humans

Abstract

DNA microarray technology provides a powerful vehicle for exploring biological processes on a genomic scale. Machine-learning approaches such as association rule mining (ARM) have proven very effective in extracting biologically relevant associations among genes. Despite its usefulness, a standard ARM approach cannot model time relations among associated genes, even though temporal information is critical for understanding regulatory mechanisms in biological processes. Sequential rule mining (SRM) methods have been proposed instead for mining temporal relations in temporal data. Although successful, existing SRM applications to temporal microarray data have been designed exclusively for in vitro experiments in yeast, and no extension to in vivo data sets has been proposed to date. Unlike in vitro experiments, microarray data derived from humans or animals raise "subject variability" as the main issue to address, so that databases include multiple sequences (one per subject) instead of a single one. A wide variety of SRM approaches could be used to handle these particularities. In this study, we propose an adaptation of one particular SRM method, "CMRules", to extract sequential association rules from temporal gene expression data derived from humans. In addition to the data mining process, we propose validating the extracted rules by integrating the results with external resources of biological knowledge (functional and pathway annotation databases). The data set employed consists of temporal gene expression data collected at three different time points during the course of a dietary intervention in 57 subjects with obesity (available under identifier GSE77962 in the Gene Expression Omnibus repository). The original clinical trial, published by Vink [1], investigated the effects on weight loss of two different dietary interventions (a low-calorie diet or a very low-calorie diet). In conclusion, the proposed method demonstrated a good ability to extract sequential association rules with biological relevance within the context of obesity. Thus, the application of this method could be extended to other longitudinal microarray data sets derived from humans.
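
To make the idea of a sequential rule over multiple subject sequences concrete, the following is a minimal sketch, not the authors' implementation. It assumes the sequential support and sequential confidence definitions usually given for CMRules: a rule X => Y occurs in a sequence if the sequence can be split at some time point so that all items of X appear before the split and all items of Y appear after it. The item names (for example `GENE_A_up`), the toy subjects, and the function names are hypothetical and are not taken from GSE77962.

```python
from typing import FrozenSet, List, Sequence, Tuple

Itemset = FrozenSet[str]             # discretized gene states at one time point
SubjectSequence = Sequence[Itemset]  # one subject: itemsets ordered by time


def contains(seq: SubjectSequence, items: Itemset) -> bool:
    """True if every item appears at some time point of the sequence."""
    return items <= set().union(*seq)


def occurs_before(seq: SubjectSequence, x: Itemset, y: Itemset) -> bool:
    """True if some split point k puts all of x in the first k itemsets
    and all of y in the remaining ones (assumed CMRules-style occurrence)."""
    for k in range(1, len(seq)):
        before = set().union(*seq[:k])
        after = set().union(*seq[k:])
        if x <= before and y <= after:
            return True
    return False


def rule_metrics(
    sequences: List[SubjectSequence], x: Itemset, y: Itemset
) -> Tuple[float, float]:
    """Return (sequential support, sequential confidence) of the rule x => y."""
    n_rule = sum(occurs_before(s, x, y) for s in sequences)
    n_x = sum(contains(s, x) for s in sequences)
    seq_sup = n_rule / len(sequences)
    seq_conf = n_rule / n_x if n_x else 0.0
    return seq_sup, seq_conf


# Hypothetical toy data: four subjects, three time points each.
subjects: List[SubjectSequence] = [
    [frozenset({"GENE_A_up"}), frozenset({"GENE_B_down"}), frozenset({"GENE_C_up"})],
    [frozenset({"GENE_A_up"}), frozenset(), frozenset({"GENE_B_down"})],
    [frozenset({"GENE_B_down"}), frozenset({"GENE_A_up"}), frozenset()],
    [frozenset({"GENE_C_up"}), frozenset(), frozenset()],
]

seq_sup, seq_conf = rule_metrics(
    subjects, frozenset({"GENE_A_up"}), frozenset({"GENE_B_down"})
)
print(seq_sup, seq_conf)
```

In this toy example the rule holds in two of the four subjects, and three subjects contain the antecedent at all, so support and confidence differ. Treating each subject as its own sequence is what lets the measure account for subject variability, rather than relying on a single time series as in the in vitro setting.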