CLUSTERING METHODS USING A GRAND CANONICAL ENSEMBLE

Abstract

Methods are disclosed for clustering biological samples and other objects using a grand canonical ensemble. A biological sample is characterized by data attributes from varying sources (e.g., NGS, other types of high-dimensional cytometric data, observed disease state) and of varying data types (e.g., Boolean, continuous, or coded sets) organized as vectors (as many as 10⁹) having as many as 10⁶, 10⁹, or more components. The biological samples or observational data are modeled as particles of a grand canonical ensemble which can be variably distributed among partitions. A pseudo-energy is defined as a measure of inverse similarity between the particles. Minimization of grand canonical ensemble pseudo-energy corresponds to clustering maximally similar particles in each partition, thereby determining clusters of the biological samples. The sample clusters can be used for feature discovery, gene and pathway identification, and development of cell-based therapeutics, or for other purposes. Variations and additional applications are disclosed.
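
The idea in the abstract (particles free to move among partitions, with a pseudo-energy that falls as similar particles gather) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the Hamming distance as the inverse-similarity measure, a fixed number of partitions, the sum of within-partition pairwise distances as the pseudo-energy, and a greedy descent as the minimizer. The names `hamming`, `pseudo_energy`, `local_cost`, and `cluster` are hypothetical.

```python
def hamming(a, b):
    """Assumed inverse-similarity measure: number of differing components."""
    return sum(x != y for x, y in zip(a, b))


def pseudo_energy(particles, assignment, distance=hamming):
    """Sum of pairwise distances between particles that share a partition."""
    total = 0
    n = len(particles)
    for i in range(n):
        for j in range(i + 1, n):
            if assignment[i] == assignment[j]:
                total += distance(particles[i], particles[j])
    return total


def local_cost(i, partition, particles, assignment, distance):
    """Energy contributed by particle i if it were placed in `partition`."""
    return sum(
        distance(particles[i], particles[j])
        for j in range(len(particles))
        if j != i and assignment[j] == partition
    )


def cluster(particles, n_partitions, distance=hamming, max_sweeps=100):
    """Greedy descent: move one particle at a time to its cheapest partition.

    Partition occupancy is not fixed; a partition may gain or lose particles
    freely. Each accepted move strictly lowers the pseudo-energy, so the
    loop terminates at a local minimum.
    """
    assignment = [0] * len(particles)  # start with every particle together
    for _ in range(max_sweeps):
        moved = False
        for i in range(len(particles)):
            costs = [
                local_cost(i, p, particles, assignment, distance)
                for p in range(n_partitions)
            ]
            best = min(range(n_partitions), key=costs.__getitem__)
            if costs[best] < costs[assignment[i]]:
                assignment[i] = best
                moved = True
        if not moved:
            break
    return assignment


# Toy data (invented for this sketch): six Boolean attribute vectors,
# interleaved from two visibly different groups.
samples = [
    (1, 1, 1, 1, 0, 0, 0, 0),
    (0, 0, 0, 0, 1, 1, 1, 1),
    (1, 1, 1, 0, 0, 0, 0, 0),
    (0, 0, 0, 0, 0, 1, 1, 1),
    (1, 1, 1, 1, 1, 0, 0, 0),
    (0, 0, 0, 1, 1, 1, 1, 1),
]
labels = cluster(samples, n_partitions=2)
print(labels, pseudo_energy(samples, labels))
```

On this toy input the even-indexed and odd-indexed samples end up in separate partitions. A greedy descent can stall in a local minimum on harder data, and the pairwise loops scale quadratically, so this sketch says nothing about how the disclosed methods handle vectors with 10⁶ or more components.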

Bibliographic data

Publication number: US2020311384A1

Patent type:
Publication date: 2020-10-01

Original format: PDF
Applicant/Assignee: THE UNITED STATES OF AMERICA AS REPRESENTED BY THE SECRETARY DEPARTMENT OF HEALTH AND HUMAN SERVIC;

Application number: US201816764557
Inventors: ELAINE ELLEN THOMPSON; VAHAN SIMONYAN; MALCOLM MOOS JR.

Filing date: 2018-11-15
Classification: G06K9; G06T7
Country: US
Date added to database: 2022-08-21 11:22:28
