CLUSTERING METHODS USING A GRAND CANONICAL ENSEMBLE

Abstract

Methods are disclosed for clustering biological samples and other objects using a grand canonical ensemble. A biological sample is characterized by data attributes from varying sources (e.g. NGS, other types of high-dimensional cytometric data, observed disease state) and of varying data types (e.g. Boolean, continuous, or coded sets) organized as vectors (as many as 10^9) having as many as 10^6, 10^9, or more components. The biological samples or observational data are modeled as particles of a grand canonical ensemble which can be variably distributed among partitions. A pseudo-energy is defined as a measure of inverse similarity between the particles. Minimization of grand canonical ensemble pseudo-energy corresponds to clustering maximally similar particles in each partition, thereby determining clusters of the biological samples. The sample clusters can be used for feature discovery, gene and pathway identification, and development of cell-based therapeutics, or for other purposes. Variations and additional applications are disclosed.
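
The abstract does not spell out the pseudo-energy or the minimization procedure, so the following is only a minimal sketch of the general idea, not the patented method. It assumes squared Euclidean distance as the inverse-similarity measure, a fixed number of partitions whose occupancy is free to vary, and Metropolis-style annealing followed by a downhill quench. The names `sq_dist`, `move_cost`, and `cluster`, and all parameter values, are hypothetical.

```python
import math
import random


def sq_dist(u, v):
    """Squared Euclidean distance, assumed here as the inverse-similarity measure."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def move_cost(points, labels, i, target):
    """Change in total pseudo-energy if particle i moves to partition `target`."""
    others = [j for j in range(len(points)) if j != i]
    gained = sum(sq_dist(points[i], points[j]) for j in others if labels[j] == target)
    lost = sum(sq_dist(points[i], points[j]) for j in others if labels[j] == labels[i])
    return gained - lost


def cluster(points, n_partitions, sweeps=200, t_start=50.0, t_end=0.01, seed=0):
    """Assign each particle to a partition by lowering the summed within-partition
    pseudo-energy (illustrative assumption, not the source's algorithm)."""
    rng = random.Random(seed)
    labels = [rng.randrange(n_partitions) for _ in points]

    # Annealing phase: Metropolis moves under a geometrically cooled temperature.
    for s in range(sweeps):
        t = t_start * (t_end / t_start) ** (s / max(sweeps - 1, 1))
        for i in range(len(points)):
            target = rng.randrange(n_partitions)
            delta = move_cost(points, labels, i, target)
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                labels[i] = target

    # Quench phase: accept only strictly downhill moves until none remain.
    improved = True
    while improved:
        improved = False
        for i in range(len(points)):
            best = min(range(n_partitions),
                       key=lambda k: move_cost(points, labels, i, k))
            if move_cost(points, labels, i, best) < -1e-12:
                labels[i] = best
                improved = True
    return labels


if __name__ == "__main__":
    # Two well-separated toy groups standing in for sample feature vectors.
    samples = [(0.0, 0.0), (0.5, 0.2), (0.1, 0.6),
               (10.0, 10.0), (10.4, 9.8), (9.7, 10.3)]
    print(cluster(samples, n_partitions=2))
```

Because the partition indices are arbitrary, only the grouping is meaningful, not which group receives label 0 or 1. A fuller grand canonical treatment would presumably also let the number of occupied partitions vary under a chemical-potential-like term; this sketch omits that.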
