Robust Rough-Fuzzy C-Means Algorithm: Design and Applications in Coding and Non-coding RNA Expression Data Clustering

Pradipta Maji; Sushmita Paul

Abstract

Cluster analysis divides a given data set into a set of clusters such that objects in the same cluster are as similar as possible and objects in different clusters are as dissimilar as possible. Against this background, different rough-fuzzy clustering algorithms have been shown to be successful at finding overlapping and vaguely defined clusters. However, the crisp lower approximation of a cluster in existing rough-fuzzy clustering algorithms is usually assumed to be spherical in shape, which restricts their ability to find clusters of arbitrary shape. In this regard, this paper presents a new rough-fuzzy clustering algorithm, termed robust rough-fuzzy c-means. Each cluster in the proposed algorithm is represented by a set of three parameters, namely a cluster prototype, a possibilistic fuzzy lower approximation, and a probabilistic fuzzy boundary. The possibilistic lower approximation helps in discovering clusters of various shapes. The cluster prototype depends on the weighted average of the possibilistic lower approximation and the probabilistic boundary. The proposed algorithm is robust in the sense that it can find overlapping and vaguely defined clusters of arbitrary shape in a noisy environment. An efficient method, based on Pearson's correlation coefficient, is presented to select the initial prototypes of the different clusters. A method based on a cluster validity index is also introduced to identify optimum values of the parameters of both the initialization method and the proposed clustering algorithm. The effectiveness of the proposed algorithm, along with a comparison with other clustering algorithms, is demonstrated on synthetic data sets as well as coding and non-coding RNA expression data sets using several cluster validity indices.
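
The abstract states that the cluster prototype depends on a weighted average of the possibilistic lower approximation and the probabilistic boundary, but it gives no formula. The Python sketch below shows one plausible form of such an update for a single cluster. Everything in it is an assumption made for illustration and is not taken from the abstract: the function name `update_prototype`, the weight `w`, the fuzzifiers `m1` and `m2`, and the use of membership-weighted centroids for the two regions.

```python
import numpy as np

def update_prototype(X, lower_mask, boundary_mask, nu, mu, w=0.9, m1=2.0, m2=2.0):
    """Hypothetical prototype update for one cluster.

    X             : (n, d) array of objects
    lower_mask    : (n,) bool, objects in the (possibilistic) lower approximation
    boundary_mask : (n,) bool, objects in the (probabilistic) boundary
    nu, mu        : (n,) possibilistic / probabilistic memberships to this cluster
    w             : assumed relative weight of the lower approximation
    m1, m2        : assumed fuzzifiers for probabilistic / possibilistic memberships
    """
    def weighted_centroid(points, weights):
        # Membership-weighted mean of the given points.
        return (weights[:, None] * points).sum(axis=0) / weights.sum()

    has_lower = bool(lower_mask.any())
    has_boundary = bool(boundary_mask.any())
    if has_lower:
        c_lower = weighted_centroid(X[lower_mask], nu[lower_mask] ** m2)
    if has_boundary:
        c_boundary = weighted_centroid(X[boundary_mask], mu[boundary_mask] ** m1)

    if has_lower and has_boundary:
        # Weighted average of the two regional centroids.
        return w * c_lower + (1.0 - w) * c_boundary
    if has_lower:
        return c_lower
    if has_boundary:
        return c_boundary
    raise ValueError("cluster has neither lower-approximation nor boundary objects")

# Toy data: two core objects and one boundary object on a line.
X = np.array([[0.0, 0.0], [2.0, 0.0], [10.0, 0.0]])
lower = np.array([True, True, False])
boundary = np.array([False, False, True])
nu = np.array([1.0, 1.0, 0.1])
mu = np.array([0.9, 0.9, 0.5])
v = update_prototype(X, lower, boundary, nu, mu, w=0.9)
```

With a weight `w` close to 1, the prototype stays near the core objects, and boundary objects (which may belong to several clusters) pull it only slightly. This is the intuition the sketch is meant to convey, not a statement of the paper's exact update rule.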

Bibliographic record

Source: Fundamenta Informaticae, 2013, Issue 2, pp. 153-174 (22 pages)
Authors: Pradipta Maji; Sushmita Paul
Affiliation: Machine Intelligence Unit, Indian Statistical Institute, 203 B. T. Road, Kolkata, 700 108, India
Indexed in: Science Citation Index (SCI); Engineering Index (EI)
Original format: PDF
Language: English
Keywords: Pattern recognition; Clustering; Rough sets; Fuzzy sets; Rough-fuzzy clustering

