Journal: Computational Statistics & Data Analysis

Identifying outliers using multiple kernel canonical correlation analysis with application to imaging genetics



Abstract

Identifying significant outliers or atypical objects in multimodal datasets is an essential and challenging problem in biomedical research. This problem is addressed using the influence function of multiple kernel canonical correlation analysis. First, the influence function (IF) of the kernel mean element, the kernel covariance operator, the kernel cross-covariance operator, and kernel canonical correlation analysis (kernel CCA) are studied. Second, an IF of multiple kernel CCA is proposed, which can be applied to multimodal datasets. Third, a visualization method is proposed to detect influential observations in multiple sources of data, based on the IF of kernel CCA and multiple kernel CCA. Finally, to validate the method, experiments are performed on both synthesized and imaging genetics data (e.g., SNP, fMRI, and DNA methylation). To examine the outliers, both the stem-and-leaf display and a distribution-based technique are used. The performance of the proposed approach is illustrated on 116 candidate regions of interest (ROIs) from the fMRI data of a schizophrenia study, with the goal of identifying significant ROIs. The proposed method and two state-of-the-art statistical methods identified 8, 34, and 10 ROIs, respectively. Based on an online database, the brain mappings of the 7 ROIs selected in common indicate the irregular brain regions susceptible to schizophrenia. The results demonstrate that the proposed method is capable of analyzing outliers and the influence of observations, and is applicable to many other biomedical datasets, which are often high-dimensional and multimodal. (C) 2018 Elsevier B.V. All rights reserved.
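To make the first step concrete, the sketch below illustrates the simplest of the quantities named in the abstract, the influence function of the kernel mean element. It is an illustration only, not the authors' implementation. It relies on the standard fact that the mean is a linear functional, so the IF at a point z is k(·, z) minus the kernel mean, and its squared RKHS norm can be estimated from the Gram matrix as K_ii − 2·mean_j K_ij + mean_jl K_jl. The Gaussian kernel, the bandwidth `sigma`, the function names, and the synthetic data are all assumptions chosen for the example.

```python
import numpy as np


def gaussian_gram(X, sigma):
    """Gram matrix of the Gaussian kernel exp(-||x - y||^2 / (2 sigma^2))."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    np.maximum(d2, 0.0, out=d2)  # clip tiny negatives caused by rounding
    return np.exp(-d2 / (2.0 * sigma**2))


def kernel_mean_influence(X, sigma=1.0):
    """Squared RKHS norm of the empirical IF of the kernel mean element.

    For observation i this is K_ii - 2 * mean_j K_ij + mean_{j,l} K_jl,
    i.e. the i-th diagonal entry of the centered Gram matrix.
    """
    K = gaussian_gram(X, sigma)
    return np.diag(K) - 2.0 * K.mean(axis=1) + K.mean()


# Hypothetical synthetic data: 50 Gaussian points with one planted outlier.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
X[7] = [8.0, 8.0]  # planted atypical observation (index chosen arbitrarily)

scores = kernel_mean_influence(X, sigma=1.0)
most_influential = int(np.argmax(scores))
print("most influential observation:", most_influential)
```

Observations with large scores lie far from the bulk of the data in feature space. The IFs of kernel CCA and multiple kernel CCA studied in the paper are more involved and are not reproduced here.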
