Journal: Cancer Informatics (U.S. National Institutes of Health literature)

Phenotype Classification Using Moment Features of Single-Cell Data



Abstract

Features for standard expression microarray and RNA-Seq classification are expression averages over collections of cells. Single-cell measurement instead provides expression values for individual cells in a collection of cells from a particular tissue sample, so it can yield feature vectors consisting of higher-order and mixed moments. This article demonstrates the advantage of using these expression moments in cancer-related classification. We use synthetic data generated from 2 real networks, the mammalian cell cycle network and a melanoma-related pathway network, and real single-cell data generated via fluorescent protein reporters from 2 cell lines, HT-29 and HCT-116. The network models are hidden binary regulatory networks with Gaussian observations. The steady-state distributions of both the original and mutated networks are found, and data are drawn from these for moment-based classification using the mean, variance, skewness, and mixed moments. For the synthetic data, classification improves as we move from the mean alone to the mean, variance, and skewness, and then to these plus the mixed moments. Comparisons are done with 3, 4, or 5 features, using feature selection, and sample size effects are considered. For the real data, only 1 gene is observed at a time, so only the mean, variance, and skewness are considered; the analysis is done for 2 genes, EGFR and ERBB2, and results improve when the higher-order moments are used as features.
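
As a rough illustration of the feature construction described in the abstract, the sketch below summarizes one sample of single-cell measurements as a vector of moments. It is not the authors' implementation: the function name `moment_features` is hypothetical, the input is assumed to be a cells-by-genes expression matrix, and the mixed moments are assumed here to be the pairwise second-order central moments (covariances), which may differ from the mixed moments used in the article.

```python
import numpy as np
from itertools import combinations


def moment_features(cells):
    """Summarize one sample (n_cells x n_genes expression matrix) as a feature vector.

    Hypothetical helper, not taken from the article. Returns the per-gene mean,
    variance, and skewness, followed by the mixed second-order central moment
    (covariance) of every gene pair.
    """
    x = np.asarray(cells, dtype=float)
    if x.ndim == 1:
        x = x[:, None]  # a single observed gene, as in the real-data setting
    mean = x.mean(axis=0)
    centered = x - mean
    var = (centered ** 2).mean(axis=0)
    # Skewness as the standardized third central moment; set to 0 when variance is 0.
    m3 = (centered ** 3).mean(axis=0)
    skew = np.divide(m3, var ** 1.5, out=np.zeros_like(m3), where=var > 0)
    # Assumption: mixed moments are pairwise covariances between genes.
    mixed = [(centered[:, i] * centered[:, j]).mean()
             for i, j in combinations(range(x.shape[1]), 2)]
    return np.concatenate([mean, var, skew, np.array(mixed, dtype=float)])


# Usage: 500 simulated cells, 2 genes -> 2 means, 2 variances, 2 skewness values, 1 mixed moment.
rng = np.random.default_rng(0)
features = moment_features(rng.normal(size=(500, 2)))
print(features.shape)  # (7,)
```

With one gene per sample, the vector reduces to mean, variance, and skewness, matching the real-data setting; with several genes, the mixed-moment entries are appended. Any standard classifier could then be trained on these per-sample vectors.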
