Hierarchical bi-directional attention-based RNNs for supporting document classification on protein–protein interactions affected by genetic mutations

Abstract

In this paper, we describe a hierarchical bi-directional attention-based Recurrent Neural Network (RNN) as a reusable sequence encoder architecture, which is used as both sentence and document encoder for document classification. The sequence encoder is composed of two bi-directional RNNs equipped with an attention mechanism that identifies and captures the most important elements (words or sentences) in a document, followed by a dense layer for the classification task. Our approach utilizes the hierarchical nature of documents, which are composed of sequences of sentences, which in turn are composed of sequences of words. In our model, we use word embeddings to project the words to a low-dimensional vector space. We leverage word embeddings trained on PubMed for initializing the embedding layer of our network. We apply this model to biomedical literature, specifically to paper abstracts published in PubMed. We argue that the title of the paper usually contains important information that is more salient than a typical sentence in the abstract. For this reason, we propose a shortcut connection that integrates the title vector representation directly into the final feature representation of the document. We concatenate the sentence vector that represents the title and the vectors of the abstract into the document feature vector used as input to the task classifier. With this system we participated in the Document Triage Task of the BioCreative VI Precision Medicine Track and achieved 0.6289 Precision, 0.7656 Recall and 0.6906 F1-score, with Precision and F1-score ranking first among the participating systems.
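The attention pooling and the title shortcut described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the bi-directional RNN encoders and the dense classifier are omitted, the attention score is simplified to a dot product with a context vector, and the names `attention_pool`, `encode_document`, `word_ctx` and `sent_ctx` are hypothetical. It only shows how word vectors might be pooled into sentence vectors, sentence vectors into an abstract vector, and the title vector concatenated to the result.

```python
import math


def attention_pool(vectors, context):
    """Softmax-weighted average of `vectors`.

    Assumption: each vector is scored by its dot product with `context`;
    the paper's attention layer may use a different scoring function.
    """
    scores = [sum(v_i * c_i for v_i, c_i in zip(v, context)) for v in vectors]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(vectors[0])
    pooled = [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]
    return pooled, weights


def encode_document(title_words, abstract_sentences, word_ctx, sent_ctx):
    """Build a document feature vector with a title shortcut.

    Inputs are already-encoded word vectors (in the paper these would come
    from a bi-directional RNN over PubMed word embeddings; omitted here).
    """
    # Word level: pool the words of each sentence into one sentence vector.
    title_vec, _ = attention_pool(title_words, word_ctx)
    sent_vecs = [attention_pool(words, word_ctx)[0] for words in abstract_sentences]
    # Sentence level: pool the abstract's sentence vectors into one vector.
    abstract_vec, sent_weights = attention_pool(sent_vecs, sent_ctx)
    # Shortcut: concatenate the title vector with the abstract vector.
    return title_vec + abstract_vec, sent_weights
```

With two-dimensional word vectors, the resulting feature vector has four dimensions: two from the title and two from the abstract. The sentence weights sum to one and indicate how much each abstract sentence contributes.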

Bibliographic details

Journal: Database: The Journal of Biological Databases and Curation
Authors: Aris Fergadis; Christos Baziotis; Dimitris Pappas; Haris Papageorgiou; Alexandros Potamianos
Year (volume), issue: 2018 (2018), -1
Year: 2018
Page: bay076
Total pages: 10
Original format: PDF
Chinese Library Classification: Biology

Similar literature

1. Depot- and obesity-related differences in adipogenesis [J]. Julie Lessard, André Tchernof. Clinical Lipidology. 2012, No. 5
Abstract: Adipocyte hypertrophy and hyperplasia are known to facilitate lipid storage in adipose tissues by increasing adipocyte cell size and number, respectively. Adipogenesis is the process resulting in adipose tissue hyperplasia. Although depot-specific differences and obesity-related modulation of adipocyte size are well documented, available data on adipogenesis and adipose tissue hyperplasia are less conclusive. Most studies support a reduction of adipogenesis in the obese state. Preadipocytes of the subcutaneous fat depot appear to be more responsive to adipogenic stimulation compared with those from visceral fat compartments in most studies. A number of studies support the notion that adipose tissue expansion through hyperplasia reduces ectopic lipid excess and obesity-related complications. Several genetic variants have been identified in the genes coding for adipogenesis-regulating proteins. While some of these variants have been clearly associated with the phenotypes of obesity and obesity-related alterations, available data highlight the importance of considering gene–gene and gene–diet interactions.
2. Document classification for mining host pathogen protein-protein interactions [J]. Lanlan Yin, Guixian Xu, Manabu Torii. Artificial Intelligence in Medicine. 2010, No. 3
3. Suppression and synthetic-lethal genetic relationships of gpsB mutations indicate that GpsB mediates protein phosphorylation and penicillin-binding protein interactions in Streptococcus pneumoniae D39 [J]. Rued Britta E., Zheng Jiaqi J., Mura Andrea. Molecular Microbiology. 2017, No. 6
4. Document Classification for Mining Host Pathogen Protein-Protein Interactions [C]. Xu Guixian, Yin Lanlan, Torii Manabu. IEEE International Conference on Bioinformatics and Biomedicine. 2008
5. Structural and functional analysis of bone morphogenetic proteins: Crystal structure of bone morphogenetic protein-9, binding studies with pro-domain and receptors, and mutational studies in Drosophila decapentaplegic [D]. Brown, Monica Anne. 2008
6. Document triage for identifying protein–protein interactions affected by mutations: a neural network ensemble approach [O]. Ling Luo, Zhihao Yang, Hongfei Lin. 2018
7. Hierarchical bi-directional attention-based RNNs for supporting document classification on protein–protein interactions affected by genetic mutations [O]. Aris Fergadis, Christos Baziotis, Dimitris Pappas. 2018
