Journal: BMC Bioinformatics

Phenotype Recognition with Combined Features and Random Subspace Classifier Ensemble

Abstract

Background: Automated, image-based high-content screening is a fundamental tool for discovery in biological science. Modern robotic fluorescence microscopes can capture thousands of images from massively parallel experiments such as RNA interference (RNAi) or small-molecule screens. Efficient computational methods are therefore required for automatic cellular phenotype identification that can handle large image data sets. In this paper we investigate an efficient method for extracting quantitative features from images by combining second-order statistics, or Haralick features, with the curvelet transform. A random subspace based classifier ensemble with the multilayer perceptron (MLP) as the base classifier is then used for classification. Haralick features estimate image properties related to second-order statistics based on the grey level co-occurrence matrix (GLCM), which has been used extensively in image processing applications. The curvelet transform gives a sparser representation of the image than the wavelet transform, offering a description with higher time-frequency resolution and a high degree of directionality and anisotropy, which is particularly appropriate for images rich in edges and curves. Combining Haralick features with curvelet transform features can further increase classification accuracy by exploiting their complementary information. We also investigate the applicability of the random subspace (RS) ensemble method to phenotype classification based on microscopy images: each base classifier is trained on an RS-sampled subset of the original feature set, and the ensemble assigns a class label by majority voting.

Results: Experiments on phenotype recognition with three benchmark image sets, HeLa, CHO and RNAi, show the effectiveness of the proposed approach. The combined feature yields higher classification accuracy than any individual feature. The ensemble model performs better than its component neural networks. For the HeLa, CHO and RNAi image sets, the random subspace ensembles achieve classification rates of 91.20%, 98.86% and 91.03% respectively, compared with the published results of 84%, 93% and 82% from WND-CHARM, a multi-purpose image classifier that applies wavelet transforms and other feature extraction methods. We also investigated the estimation of ensemble parameters and found that feature subsets of relatively medium dimensionality and a small ensemble size bring a satisfactory performance improvement.

Conclusions: The multiscale and multidirectional character of the curvelet transform suits the description of microscopy images very well. It is empirically demonstrated that curvelet-based features are clearly preferable to wavelet-based features for bioimage description. The random subspace ensemble of MLPs performs much better than a number of commonly applied multi-class classifiers in the investigated application of phenotype recognition.
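The random subspace scheme described in the abstract (each base classifier sees a random subset of the features, and the ensemble labels a sample by majority vote) can be illustrated with a minimal sketch. This is not the authors' implementation: to keep the example dependency-free, a simple nearest-centroid classifier stands in for the MLP base classifier, and the class names, parameters (`n_members`, `subspace_size`, `seed`) and toy data are all hypothetical.

```python
import random
from collections import Counter


class NearestCentroid:
    """Stand-in base classifier (assumption: the paper uses an MLP here)."""

    def fit(self, X, y):
        sums, counts = {}, Counter(y)
        for row, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(row))
            for j, value in enumerate(row):
                acc[j] += value
        # Per-class mean of every feature.
        self.centroids = {
            label: [s / counts[label] for s in acc] for label, acc in sums.items()
        }
        return self

    def predict_one(self, row):
        def sq_dist(label):
            return sum((a - b) ** 2 for a, b in zip(row, self.centroids[label]))

        return min(self.centroids, key=sq_dist)


class RandomSubspaceEnsemble:
    """Train each member on a random feature subset; predict by majority vote."""

    def __init__(self, n_members=5, subspace_size=2, seed=0):
        self.n_members = n_members
        self.subspace_size = subspace_size
        self.rng = random.Random(seed)
        self.members = []  # list of (feature_indices, fitted_classifier)

    def fit(self, X, y):
        n_features = len(X[0])
        self.members = []
        for _ in range(self.n_members):
            # Sample feature indices without replacement for this member.
            idx = sorted(self.rng.sample(range(n_features), self.subspace_size))
            X_sub = [[row[j] for j in idx] for row in X]
            self.members.append((idx, NearestCentroid().fit(X_sub, y)))
        return self

    def predict_one(self, row):
        votes = Counter(
            clf.predict_one([row[j] for j in idx]) for idx, clf in self.members
        )
        # Ties are broken in favour of the label that was voted for first.
        return votes.most_common(1)[0][0]


# Toy data (hypothetical): two well-separated classes with four features.
X = [
    [0.0, 0.1, 0.2, 0.1],
    [0.2, 0.0, 0.1, 0.3],
    [5.0, 5.1, 4.9, 5.2],
    [5.2, 4.8, 5.0, 5.1],
]
y = ["a", "a", "b", "b"]

ensemble = RandomSubspaceEnsemble(n_members=5, subspace_size=2, seed=0).fit(X, y)
pred_low = ensemble.predict_one([0.1, 0.1, 0.1, 0.1])
pred_high = ensemble.predict_one([5.0, 5.0, 5.0, 5.0])
```

In this sketch `subspace_size` and `n_members` correspond to the two ensemble parameters the abstract discusses: the dimensionality of the feature subsets and the ensemble size.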
