Clustering of mixed data types with application to toxicogenomics.

Abstract

DNA microarray analysis provides unprecedented capabilities for simultaneous measurement of genome-wide alterations in transcription levels. Toxicogenomics bridges gene and protein expression analyses with conventional toxicology to give a global view of the toxic outcomes and mechanistic changes that toxicant exposure and environmental stressors elicit in biological systems. Inherent in toxicogenomics data are systematic error, stochastic variation, and disparate measurement domains and types, all of which complicate the acquisition of significant, meaningful and broad biological interpretations from analysis of the data.

In this dissertation, a classification regimen comprising analysis of replicate data, outlier diagnostics and gene selection procedures was employed to use microarray data for categorization of sub-classes of biological samples exposed to pharmacologic agents. To assess contrasts of centrilobular congestion severity of the rat liver subsequent to exposure to acetaminophen (APAP), microarray data, clinical chemistry evaluations and histopathology observations were integrated in a database and analyzed using mixed linear model approaches.

Finally, the k-prototype algorithm was modified (Modk-prototypes) to the specifications of k-means clustering. Its mixed objective function is the sum of two terms: the squared Euclidean distance, which measures the dissimilarity of samples based on the numeric microarray and clinical chemistry features, and simple matching, which measures the dissimilarity of samples based on the categorical histopathology features. In addition, the objective function included weighting terms for the microarray, clinical chemistry and histopathology domain data in order to computationally integrate the data as well as constrain the clustering of the APAP-treated samples according to similarity of gene expression and toxicological profiles. Simulated annealing optimization of the Modk-prototypes algorithm (SA-Modk-prototypes) was used to validate the clustering of the APAP-treated samples. The clusters were vetted for gene expression and toxicological (VETed) k-prototypes features that discerned clusters from one another. The VETed k-prototypes are shown to be ideal for distinguishing between zero, minimal, and moderate levels of necrosis of the hepatocytes and centrilobular region of the rat liver that are end-point representations of the clusters of APAP-treated samples. (Abstract shortened by UMI.)
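
The mixed objective function described in the abstract can be illustrated with a small sketch. The abstract does not give the exact formula, so everything below is an assumption made for illustration: the three domain keys (`array`, `chem`, `histo`), the way the weights multiply each domain's term, and the function names are hypothetical and are not taken from the dissertation. The sketch only shows the general idea of combining squared Euclidean distance on numeric features with simple matching on categorical features.

```python
# Illustrative sketch only: a weighted mixed dissimilarity in the spirit of
# k-prototypes. Domain names and the weighting scheme are assumptions.

def squared_euclidean(x, y):
    """Sum of squared differences between two numeric feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def simple_matching(x, y):
    """Number of positions at which two categorical vectors disagree."""
    return sum(1 for a, b in zip(x, y) if a != b)


def mixed_dissimilarity(sample, prototype, weights):
    """Weighted sum of per-domain dissimilarities.

    sample and prototype are dicts with numeric lists under 'array' and
    'chem' and a categorical list under 'histo' (hypothetical layout).
    """
    return (
        weights["array"] * squared_euclidean(sample["array"], prototype["array"])
        + weights["chem"] * squared_euclidean(sample["chem"], prototype["chem"])
        + weights["histo"] * simple_matching(sample["histo"], prototype["histo"])
    )


def assign_clusters(samples, prototypes, weights):
    """Assign each sample to the index of its nearest prototype."""
    return [
        min(
            range(len(prototypes)),
            key=lambda k: mixed_dissimilarity(s, prototypes[k], weights),
        )
        for s in samples
    ]


def objective(samples, prototypes, labels, weights):
    """Total within-cluster dissimilarity for a given assignment."""
    return sum(
        mixed_dissimilarity(s, prototypes[k], weights)
        for s, k in zip(samples, labels)
    )
```

In a scheme like this, raising the weight on the histopathology term pulls samples with the same categorical observations into the same cluster, which is one plausible way to constrain clustering by toxicological profile. How the dissertation actually sets or tunes its weighting terms is not stated in the abstract.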

Bibliographic record

  • Author

    Bushel, Pierre Robert

  • Author affiliation

    North Carolina State University

  • Degree-granting institution North Carolina State University
  • Subject Biology Molecular; Biology Bioinformatics; Statistics; Biology Biostatistics
  • Degree Ph.D.
  • Year 2005
  • Pages 233 p.
  • Total pages 233
  • Original format PDF
  • Language English
