Conference: 2017 International Conference on Computer and Drone Applications

Evaluation of convolutionary neural networks modeling of DNA sequences using ordinal versus one-hot encoding method

Abstract

The convolutional neural network (CNN) is a popular choice for supervised DNA motif prediction because of its excellent performance. To use a CNN, the input DNA sequences must be encoded as numerical values and represented as either vectors or multi-dimensional matrices. This paper evaluates a simple and more compact ordinal encoding method against the popular one-hot encoding for DNA sequences. We compared the performance of both encoding methods using three sets of datasets enriched with DNA motifs. We found that ordinal encoding performs comparably to the one-hot method but with a significant reduction in training time. In addition, the performance of one-hot encoding was fairly consistent across the datasets, but it required a suitable CNN configuration to perform well. Ordinal encoding with a matrix representation performed best on some of the evaluated datasets. This study implies that the performance of a CNN for DNA motif discovery depends on a suitable design of the sequence encoding and representation. The good performance of the ordinal encoding method demonstrates that there is still room for improvement in the one-hot encoding method.
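
The two encodings compared in the abstract can be illustrated with a minimal sketch. The abstract does not state the numeric values assigned to each base or how the matrix representation is built, so the mapping A=0.25, C=0.50, G=0.75, T=1.00, the zero value for unknown bases, and the row-wise reshape in ordinal_matrix are all assumptions made for illustration, as are the function names.

```python
import numpy as np

# Assumed ordinal values; the abstract does not specify them.
ORDINAL = {"A": 0.25, "C": 0.50, "G": 0.75, "T": 1.00}
BASES = "ACGT"


def ordinal_encode(seq):
    """Return a length-L float vector, one value per base.

    Bases outside A/C/G/T (for example N) map to 0.0 (an assumption).
    """
    return np.array([ORDINAL.get(b, 0.0) for b in seq.upper()], dtype=np.float32)


def one_hot_encode(seq):
    """Return an L x 4 matrix with a single 1 per row.

    Bases outside A/C/G/T become an all-zero row (an assumption).
    """
    out = np.zeros((len(seq), 4), dtype=np.float32)
    for i, b in enumerate(seq.upper()):
        j = BASES.find(b)
        if j >= 0:
            out[i, j] = 1.0
    return out


def ordinal_matrix(seq, width):
    """Hypothetical matrix representation of the ordinal vector.

    Zero-pads the tail so the length is a multiple of `width`,
    then reshapes row by row.
    """
    vec = ordinal_encode(seq)
    pad = (-len(vec)) % width
    return np.pad(vec, (0, pad)).reshape(-1, width)
```

For a sequence of length L, one_hot_encode stores 4L values while ordinal_encode stores L, which is the compactness the abstract refers to. Whether this alone accounts for the reported reduction in training time is not stated in the abstract.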
