Journal: Biostatistics (Oxford England)

Compound hierarchical correlated beta mixture with an application to cluster mouse transcription factor DNA binding data


Abstract

Modeling correlation structures is a challenge in bioinformatics, especially when dealing with high-throughput genomic data. A compound hierarchical correlated beta mixture (CBM) with an exchangeable correlation structure is proposed to cluster genetic vectors into mixture components. The correlation coefficient is homogeneous within a mixture component and heterogeneous between mixture components. A random CBM brings more flexibility in explaining correlation variations among genetic variables. The Expectation–Maximization (EM) algorithm and the Stochastic Expectation–Maximization (SEM) algorithm are used to estimate the parameters of the CBM. The number of mixture components can be determined using model selection criteria such as AIC, BIC and ICL-BIC. Extensive simulation studies were conducted to compare EM, SEM and the model selection criteria. Simulation results suggest that CBM outperforms the traditional beta mixture model, with lower estimation bias and higher classification accuracy. The proposed method is applied to cluster transcription factor–DNA binding probabilities in mouse genome data from "Probabilistic inference of transcription factor binding from multiple data sources" (PLoS One, 3, e1820). The results reveal distinct clusters of transcription factors when binding to promoter regions of genes in the JAK–STAT, MAPK and two other pathways.
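
The abstract names AIC, BIC and ICL-BIC as criteria for choosing the number of mixture components but does not spell them out. The sketch below uses the standard textbook definitions in the "lower is better" convention; it is not taken from the paper. The function name `model_selection_criteria` and its arguments are hypothetical, and how to count the free parameters of a CBM is left to the caller because the abstract does not say.

```python
import math


def model_selection_criteria(log_lik, n_params, n_obs, resp=None):
    """Standard AIC / BIC / ICL-BIC for a fitted mixture model (lower is better).

    log_lik  : maximized log-likelihood of the fitted model
    n_params : number of free parameters (assumption: supplied by the caller)
    n_obs    : number of observations that were clustered
    resp     : optional n_obs x K posterior membership probabilities
    """
    aic = -2.0 * log_lik + 2.0 * n_params
    bic = -2.0 * log_lik + n_params * math.log(n_obs)
    out = {"AIC": aic, "BIC": bic}
    if resp is not None:
        # Classification entropy: -sum_i sum_k tau_ik * log(tau_ik),
        # skipping zero entries because p * log(p) -> 0 as p -> 0.
        entropy = -sum(p * math.log(p) for row in resp for p in row if p > 0.0)
        # ICL-BIC adds twice the entropy to BIC, penalizing fuzzy assignments.
        out["ICL-BIC"] = bic + 2.0 * entropy
    return out
```

In use, one would fit the model for each candidate number of components, compute these values from each fit, and keep the fit with the smallest value of the chosen criterion.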