Bioinformatics approaches to heterogeneous omic data integration.

Abstract

With the advent of whole-genome sequencing and high-throughput microarray experimental technologies, an important challenge confronting researchers lies in analyzing these large-scale data sets and extracting discernible biological information from them. Existing statistical and bioinformatics approaches have known limitations in extracting the most compact sets of genes that distinguish between tumor subtypes/phenotypes and in integrating heterogeneous, high-dimensional omic data effectively. Powerful approaches and pipelines that facilitate data management, visualization, and integration for biological findings are therefore warranted, and they should be executed without sacrificing statistical rigor or computational efficiency. Multivariate analyses that characterize the complex, hierarchical interplay between multiple genes at various levels would be useful as well. To overcome the known limitations of standard statistical approaches, two specific projects were proposed. First, with the goal of improving prediction accuracy while accounting for sparsity, a non-parametric, iterative algorithm, Splitting Random Forest (SRF), was developed to robustly identify genes that distinguish between subtypes/phenotypes. The manuscript describing this new algorithm has been published in the Journal of Clinical Bioinformatics. Second, I focus on integrating the omic data sequentially, using currently accepted bioinformatics approaches, to derive biological insight into glioma progression by applying known glioblastoma (GBM) expression- and methylation-based subtypes to lower-grade gliomas (LGGs; grade II-III gliomas) while examining copy number changes. The manuscript from this analysis is now under review at a top-tier cancer journal.

Bibliographic record

  • Author

    Guan, Xiaowei

  • Author affiliation

    Case Western Reserve University

  • Degree-granting institution: Case Western Reserve University
  • Discipline: Bioinformatics
  • Degree: Ph.D.
  • Year: 2012
  • Pages: 146 p.
  • Total pages: 146
  • Original format: PDF
  • Language of text: English
