Privacy Preserving Principal Component Analysis Clustering for Distributed Heterogeneous Gene Expression Datasets

Xin Li

Abstract

In this paper, we present approaches to perform principal component analysis (PCA) clustering on distributed heterogeneous genomic datasets with privacy protection. The approaches allow data providers to collaborate to identify gene profiles from a global viewpoint while protecting the sensitive genomic data from possible privacy leaks. We further develop a framework for privacy-preserving PCA-based gene clustering, which includes two types of participants: data providers and a trusted central site (TCS). Two methodologies are employed: Collective PCA (C-PCA) and Repeating PCA (R-PCA). The C-PCA approach requires local sites to transmit a sample of the original data to the TCS and can be applied to any heterogeneous datasets. The R-PCA approach requires all local sites to have the same or a similar number of columns, but releases no original data. Experiments on five independent genomic datasets show that both the C-PCA and R-PCA approaches maintain very good accuracy compared with the centralized scenario.
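The abstract gives no algorithmic detail, so the following is only a minimal sketch of how a sample-based scheme in the spirit of C-PCA could work on vertically partitioned data (each site holds different columns for the same genes). It is not the paper's actual protocol. The function names (`sample_rows`, `tcs_fit`, `site_partial_scores`), the use of a shared row sample, and the step of summing per-site partial scores at the TCS are all assumptions made for illustration.

```python
import numpy as np

def sample_rows(n_rows, n_sample, seed=0):
    """Pick the row indices that every site agrees to share (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    return np.sort(rng.choice(n_rows, size=n_sample, replace=False))

def tcs_fit(sample_blocks, n_components):
    """TCS side (assumed): PCA on the column-wise concatenation of the site samples.

    Returns, per site, the column means and the block of loadings
    that belongs to that site's columns.
    """
    sample = np.hstack(sample_blocks)
    mean = sample.mean(axis=0)
    # Rows of vt are the principal directions of the centered sample.
    _, _, vt = np.linalg.svd(sample - mean, full_matrices=False)
    loadings = vt[:n_components].T            # shape: (total_columns, k)
    widths = [block.shape[1] for block in sample_blocks]
    splits = np.cumsum(widths)[:-1]
    return np.split(mean, splits), np.split(loadings, splits, axis=0)

def site_partial_scores(local_data, local_mean, local_loadings):
    """Site side (assumed): project local columns onto the local loading block."""
    return (local_data - local_mean) @ local_loadings

# Synthetic stand-in data: two sites, same 200 genes, different columns.
rng = np.random.default_rng(1)
site_a = rng.normal(size=(200, 5))
site_b = rng.normal(size=(200, 3))

idx = sample_rows(200, 50)
means, loads = tcs_fit([site_a[idx], site_b[idx]], n_components=2)

# Summing the partial scores gives the projection of the full (never pooled)
# data onto the principal directions estimated from the shared sample.
scores = sum(
    site_partial_scores(x, m, l)
    for x, m, l in zip([site_a, site_b], means, loads)
)
```

The resulting `scores` matrix could then be passed to any clustering routine. Only the sampled rows leave each site in raw form in this sketch; whether the partial scores themselves are acceptable to release is a privacy question the sketch does not address.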


Bibliographic record

Source: International journal of computational models and algorithms in medicine, 2011, Issue 4, 34 pages
Author: Xin Li
Original format: PDF
Language: English
Chinese Library Classification: Computing technology, computer technology
Keywords: Clustering; Gene Profiling; Human Genomic Data; Principal Component Analysis; Privacy Preserving Data Mining; Vertical Partitioning

