Venue: IEEE International Conference on Bioinformatics and Biomedicine

Long short-term memory recurrent neural networks for antibacterial peptide identification

Abstract

Antimicrobial peptides are short amino acid sequences with antibacterial, antifungal, and antiviral properties. Antibacterial peptides could form a new class of antibiotics to help combat bacterial antibiotic resistance. Most machine learning methods applied to identifying antimicrobial peptides have used features representing the presence or absence of certain periodic patterns in the amino acid sequence. This requires considering different periodicities for each feature and leads to a large number of features, many of which are likely irrelevant to the classification task at hand. Also, because the peptides vary in length, it is difficult to construct a feature vector of identical, fixed length that represents all the sequences. Recurrent neural networks provide a simple way around both problems. In this work we extract a feature vector by applying bidirectional Long Short-Term Memory (LSTM) recurrent neural networks to features representing the individual amino acids within each sequence. The LSTM network iterates recurrently along both directions of each amino acid sequence and extracts two fixed-length vectors whose concatenation yields a fixed-length vector representation of the sequence. Because this is done while the network is trained on the classification task, the extracted representation is more likely to be relevant for distinguishing between the classes. This work demonstrates the LSTM approach to classifying antibacterial peptides and compares it to a Random Forest classifier and a k-nearest neighbor classifier.
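
The idea of reading a peptide in both directions and concatenating the two final hidden states can be illustrated with a minimal sketch. This is not the authors' implementation: the one-hot residue encoding, the hidden size of 8, the random untrained weights, the example sequences, and every function name below are assumptions made for illustration, and a real model would learn the weights during training on the classification task.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues (assumed alphabet)

def one_hot(residue):
    # Assumed per-residue feature: a one-hot vector over the alphabet.
    vec = [0.0] * len(AMINO_ACIDS)
    vec[AMINO_ACIDS.index(residue)] = 1.0
    return vec

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def init_lstm(input_size, hidden_size, rng):
    # One (weights, bias) pair per gate: input, forget, candidate, output.
    # Each weight matrix acts on the concatenation [x_t, h_{t-1}].
    def matrix():
        return [[rng.uniform(-0.1, 0.1) for _ in range(input_size + hidden_size)]
                for _ in range(hidden_size)]
    return {gate: (matrix(), [0.0] * hidden_size) for gate in "ifgo"}

def affine(weights_bias, z):
    weights, bias = weights_bias
    return [sum(w * v for w, v in zip(row, z)) + b
            for row, b in zip(weights, bias)]

def lstm_final_state(params, inputs, hidden_size):
    # Run a standard LSTM cell over the sequence; return the last hidden state.
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    for x in inputs:
        z = x + h  # list concatenation of input and previous hidden state
        i = [sigmoid(a) for a in affine(params["i"], z)]
        f = [sigmoid(a) for a in affine(params["f"], z)]
        g = [math.tanh(a) for a in affine(params["g"], z)]
        o = [sigmoid(a) for a in affine(params["o"], z)]
        c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h

def encode_peptide(sequence, fwd, bwd, hidden_size):
    # Forward pass reads left to right, backward pass reads right to left;
    # the two final hidden states are concatenated into one vector.
    inputs = [one_hot(r) for r in sequence]
    return (lstm_final_state(fwd, inputs, hidden_size)
            + lstm_final_state(bwd, inputs[::-1], hidden_size))

HIDDEN = 8  # assumed hidden size
rng = random.Random(0)
FWD = init_lstm(len(AMINO_ACIDS), HIDDEN, rng)
BWD = init_lstm(len(AMINO_ACIDS), HIDDEN, rng)

# Two illustrative sequences of different lengths (10 and 23 residues).
short_vec = encode_peptide("GIGKFLHSAK", FWD, BWD, HIDDEN)
long_vec = encode_peptide("GIGKFLHSAKKFGKAFVGEIMNS", FWD, BWD, HIDDEN)
print(len(short_vec), len(long_vec))  # 16 16
```

Both sequences map to a vector of length 2 × `HIDDEN` regardless of peptide length, which is the property the abstract relies on. In practice a classification layer would sit on top of this vector and the whole network would be trained end to end.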