Journal: Journal of Computational Biology

Resampling-Based Similarity Measures for High-Dimensional Data



Abstract

An important issue in classification is the assessment of sample similarity. This is nontrivial in high-dimensional or megavariate datasets, that is, datasets comprising simultaneous measurements on thousands of features, many of which carry little or no information regarding consistent sample differences. Conventional similarity measures do not work particularly well for such data. As an alternative, we propose a distance measure that is based on a refiltering process: at each step of the process a random subset of features is selected and a cluster analysis is performed using only this subset; the relative frequency with which a pair of samples clusters together across several such random subsets forms the similarity measure. The features chosen at any step may be completely random or enriched by awarding the more informative features a higher chance of selection; this enrichment turns out to be particularly effective. We use actual datasets from the burgeoning genomics literature to demonstra...

DOI: 10.1089/cmb.2014.0195
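The procedure described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not name the clustering method, the subset size, or how feature informativeness is scored, so the plain k-means routine, the square-root default for the subset size, the weighted-draw scheme (Efraimidis-Spirakis keys), and all function and parameter names here are hypothetical choices.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length rows."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_labels(rows, k, rng, iters=20):
    """Plain Lloyd's k-means; stands in for the unspecified cluster analysis."""
    centers = [list(r) for r in rng.sample(rows, k)]
    labels = [0] * len(rows)
    for _ in range(iters):
        # Assign every sample to its nearest center.
        labels = [min(range(k), key=lambda c: sq_dist(r, centers[c])) for r in rows]
        # Move each center to the mean of its members (empty clusters stay put).
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def draw_features(p, m, weights, rng):
    """Pick m distinct feature indices, uniformly or weighted (enriched)."""
    if weights is None:
        return rng.sample(range(p), m)
    # Assumed enrichment scheme: weighted sampling without replacement,
    # keeping the m largest keys u ** (1 / w) over features with w > 0.
    keyed = [(rng.random() ** (1.0 / w), j) for j, w in enumerate(weights) if w > 0]
    keyed.sort(reverse=True)
    return [j for _, j in keyed[:m]]

def resampling_similarity(data, k=2, n_rounds=200, m=None, weights=None, seed=0):
    """Fraction of rounds in which each pair of samples lands in one cluster."""
    rng = random.Random(seed)
    n, p = len(data), len(data[0])
    m = m or max(1, round(p ** 0.5))  # assumed default subset size
    together = [[0] * n for _ in range(n)]
    for _ in range(n_rounds):
        feats = draw_features(p, m, weights, rng)
        sub = [[row[j] for j in feats] for row in data]
        labels = kmeans_labels(sub, k, rng)
        for a in range(n):
            for b in range(n):
                if labels[a] == labels[b]:
                    together[a][b] += 1
    return [[count / n_rounds for count in row] for row in together]

# Toy data (made up for illustration): 6 samples in 2 groups, where only
# feature 0 separates the groups and the other 19 features are noise.
toy_rng = random.Random(1)
data = [[(5.0 if i >= 3 else 0.0) + toy_rng.gauss(0, 0.1)]
        + [toy_rng.gauss(0, 1) for _ in range(19)] for i in range(6)]

def variance(col):
    mean = sum(col) / len(col)
    return sum((x - mean) ** 2 for x in col) / len(col)

# Hypothetical informativeness score: per-feature variance.
weights = [variance(col) for col in zip(*data)]

plain = resampling_similarity(data)                      # uniform feature draws
enriched = resampling_similarity(data, weights=weights)  # enriched feature draws
```

Each entry of the returned matrix is a relative co-clustering frequency between 0 and 1, and subtracting it from 1 gives one possible distance. Comparing `plain` with `enriched` on data of this kind is one way to explore the effect of enrichment that the abstract reports.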
