
Subset clustering of binary sequences, with an application to genomic abnormality data.



Abstract

This article develops a model-based approach to clustering multivariate binary data, in which the attributes that distinguish a cluster from the rest of the population may depend on the cluster being considered. The clustering approach is based on a multivariate Dirichlet process mixture model, which allows for the estimation of the number of clusters, the cluster memberships, and the cluster-specific parameters in a unified way. Such a clustering approach has applications in the analysis of genomic abnormality data, in which the development of different types of tumors may depend on the presence of certain abnormalities at subsets of locations along the genome. Additionally, such a mixture model provides a nonparametric estimation scheme for dependent sequences of binary data.
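To make the idea of cluster-specific distinguishing attributes concrete, the following is a minimal simulation sketch, not the article's own model specification. It draws cluster memberships from a Chinese restaurant process (the partition distribution induced by a Dirichlet process), then lets each cluster raise the probability of a 1 only on its own randomly chosen subset of positions, with all other positions following a shared background rate. The function names, the `subset_prob` parameter, and the uniform ranges for the probabilities are illustrative assumptions; the sketch generates data only and does not perform the posterior inference the abstract describes.

```python
import random


def sample_crp(n, alpha, rng):
    """Assign n items to clusters via a Chinese restaurant process."""
    labels = []
    counts = []  # counts[k] = number of items already in cluster k
    for _ in range(n):
        # An existing cluster k has weight counts[k]; a new cluster has weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)  # open a new cluster
        else:
            counts[k] += 1
        labels.append(k)
    return labels


def simulate_subset_clusters(n, length, alpha=1.0, subset_prob=0.3, seed=0):
    """Simulate binary sequences whose clusters differ only on cluster-specific subsets.

    All numeric ranges below are illustrative assumptions, not values from the article.
    """
    rng = random.Random(seed)
    labels = sample_crp(n, alpha, rng)
    n_clusters = max(labels) + 1

    # Background probability of an abnormality (a 1) at each position.
    background = [rng.uniform(0.05, 0.2) for _ in range(length)]

    subsets, probs = [], []
    for _ in range(n_clusters):
        # Each cluster picks its own subset of distinguishing positions.
        subset = {j for j in range(length) if rng.random() < subset_prob}
        p = [rng.uniform(0.7, 0.95) if j in subset else background[j]
             for j in range(length)]
        subsets.append(subset)
        probs.append(p)

    # Each sequence is drawn position by position from its cluster's probabilities.
    data = [[1 if rng.random() < probs[z][j] else 0 for j in range(length)]
            for z in labels]
    return data, labels, subsets


if __name__ == "__main__":
    data, labels, subsets = simulate_subset_clusters(n=12, length=15)
    for row, z in zip(data, labels):
        print(z, "".join(map(str, row)))
```

Running the script prints one line per simulated sequence, with the cluster label followed by the binary string. Sequences sharing a label tend to agree on that cluster's subset of positions while looking like background noise elsewhere.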
