Journal: Bioinformatics

An unsupervised hierarchical dynamic self-organizing approach to cancer class discovery and marker gene identification in microarray data.

Abstract

MOTIVATION: Current Self-Organizing Map (SOM) approaches to gene expression pattern clustering require the user to predefine the expected number of clusters. Hierarchical clustering methods used in this area do not provide a unique partitioning of the data. We describe an unsupervised dynamic hierarchical self-organizing approach, which suggests an appropriate number of clusters, to perform class discovery and marker gene identification in microarray data. In the process of class discovery, the proposed algorithm identifies corresponding sets of predictor genes that best distinguish one class from the other classes. The approach integrates the merits of hierarchical clustering with the robustness against noise known from self-organizing approaches.

RESULTS: Applied to DNA microarray data sets of two types of cancer, the proposed algorithm demonstrated its ability to produce the most suitable number of clusters. Further, the corresponding marker genes identified through the unsupervised algorithm also have a strong biological relationship to the specific cancer class. Tested on leukemia microarray data containing three leukemia types, the algorithm determined three major clusters and one minor cluster. Prediction models built for the four clusters indicate that the prediction strength for the smaller cluster is generally low, so it is labelled an uncertain cluster. Further analysis shows that the uncertain cluster can be subdivided, and that the subdivisions are related to two of the original clusters. A second test, using colon cancer microarray data, automatically derived two clusters, which is consistent with the number of classes in the data (cancerous and normal).

AVAILABILITY: Java software implementing the dynamic SOM tree algorithm is available upon request for academic use.

SUPPLEMENTARY INFORMATION: A comparison of rectangular and hexagonal topologies for GSOM is available from http://www.mame.mu.oz.au/mechatronics/journalinfo/Hsu2003supp.pdf
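
The results above rely on a per-cluster prediction strength, but the abstract does not say how it is computed. As an illustration only, the sketch below uses a weighted-voting margin, a definition commonly used in microarray class prediction. The function name `prediction_strength` and the signed per-gene vote input are assumptions of this sketch and are not taken from the paper.

```python
def prediction_strength(votes):
    """Return (winning_class, strength) from signed per-gene votes.

    Assumption of this sketch: a positive vote favours class 1 and a
    negative vote favours class 2. Strength is the vote margin
    (V_win - V_lose) / (V_win + V_lose), which lies in [0, 1].
    """
    v_pos = sum(v for v in votes if v > 0)    # total votes for class 1
    v_neg = sum(-v for v in votes if v < 0)   # total votes for class 2
    total = v_pos + v_neg
    if total == 0:
        # No gene cast a vote, so no class can be predicted.
        return None, 0.0
    winner = 1 if v_pos >= v_neg else 2
    return winner, abs(v_pos - v_neg) / total
```

Under this definition, a sample whose votes are nearly balanced gets a strength close to 0, which is the kind of low-confidence outcome that could lead a cluster to be labelled uncertain.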
