Journal: Nucleosides, Nucleotides and Nucleic Acids

Bacterial classification with convolutional neural networks based on different data reduction layers


Abstract

High-accuracy classification of DNA sequences with Convolutional Neural Networks (CNNs) requires an efficient sequence representation that accelerates similarity comparison between DNA sequences. CNNs can also be improved by avoiding the dimensionality problem associated with multi-layer CNN features. This paper presents a new approach to classifying bacterial DNA sequences based on a custom layer. A CNN is used with the Frequency Chaos Game Representation (FCGR) of DNA. FCGR is adopted as the sequence representation, with a suitable choice of the word length k for which the frequency of occurrence of k-length words in the DNA sequences is counted. FCGR maps a DNA sequence to an image of the gene sequence, and this image displays both local and global patterns. A pre-trained CNN is built for image classification. First, convolutional layers convert the image into feature maps. This is sometimes followed by a down-sampling operation, performed by pooling layers, that reduces the spatial size of the feature maps and removes redundant spatial information. Instead of the pooling layers, a Random Projection (RP) with an activation function is suggested; it preserves the variety of the data while introducing some randomness. Feature reduction is achieved while keeping high accuracy in classifying bacteria at taxonomic levels. The simulation results show that the proposed RP-based CNN offers a trade-off between accuracy score and processing time.
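
The FCGR step described in the abstract can be illustrated with a minimal sketch. The abstract gives no implementation details, so the function name `fcgr`, the corner assignment for A, C, G and T, the skipping of ambiguous bases, and the normalization to relative frequencies are all assumptions made for illustration, not the authors' method.

```python
import numpy as np

# Assumed corner convention (x, y) for the four nucleotides; papers differ on this.
CORNER = {"A": (0, 0), "C": (0, 1), "G": (1, 1), "T": (1, 0)}


def fcgr(seq, k):
    """Return a 2^k x 2^k matrix of relative k-mer frequencies (hypothetical helper)."""
    size = 2 ** k
    grid = np.zeros((size, size), dtype=np.float64)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # Assumption: k-mers containing ambiguous bases (e.g. N) are skipped.
        if any(base not in CORNER for base in kmer):
            continue
        x = y = 0
        # The last base selects the coarsest quadrant, as in the iterated
        # chaos game map, so the k-mer is read in reverse.
        for base in reversed(kmer):
            bx, by = CORNER[base]
            x = (x << 1) | bx
            y = (y << 1) | by
        grid[y, x] += 1
    total = grid.sum()
    # Assumption: counts are normalized so that the image sums to 1.
    return grid / total if total > 0 else grid


image = fcgr("ACGTACGTTGCA", k=2)  # 4 x 4 image for 2-mers
```

Each distinct k-mer maps to exactly one cell, so the resulting matrix can be fed to a CNN as a single-channel image whose side length grows as 2^k.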
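
The idea of replacing a pooling layer with a random projection followed by an activation can be sketched as below. The abstract does not specify the projection distribution, the output dimension, or the activation, so the Gaussian matrix, the ReLU, and the names `make_projection` and `rp_layer` are assumptions chosen for illustration only.

```python
import numpy as np


def make_projection(in_dim, out_dim, seed=0):
    """Fixed (untrained) Gaussian projection matrix; distribution is an assumption."""
    rng = np.random.default_rng(seed)
    # Scaling by 1/sqrt(out_dim) roughly preserves vector norms on average.
    return rng.normal(0.0, 1.0 / np.sqrt(out_dim), size=(in_dim, out_dim))


def rp_layer(feature_maps, proj):
    """Flatten feature maps, project them to fewer dimensions, then apply ReLU.

    feature_maps: array of shape (batch, channels, height, width).
    proj: array of shape (channels * height * width, out_dim).
    """
    batch = feature_maps.shape[0]
    flat = feature_maps.reshape(batch, -1)
    # Assumption: ReLU is used as the activation after the projection.
    return np.maximum(flat @ proj, 0.0)


# Hypothetical sizes: 8 feature maps of 4 x 4 reduced to 32 features.
features = np.random.default_rng(1).random((2, 8, 4, 4))
proj = make_projection(in_dim=8 * 4 * 4, out_dim=32)
reduced = rp_layer(features, proj)  # shape (2, 32)
```

Unlike max or average pooling, which reduce each spatial neighborhood separately, this sketch mixes all positions and channels through one fixed random matrix, so the amount of reduction is set directly by `out_dim`.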
