Journal: Knowledge-Based Systems

DECAB-LSTM: Deep Contextualized Attentional Bidirectional LSTM for cancer hallmark classification


Abstract

The large number of online scientific publications on cancer research makes large-scale data mining possible. The hallmarks, or characteristics, of cancer can be used to distinguish cancerous cells from normal cells. It is therefore necessary to organize and categorize the large body of scientific articles into the corresponding hallmarks by predicting whether or not they contain the information of interest. In the past, many studies employed traditional machine learning methods characterized by feature engineering. Deep learning-based methods have achieved state-of-the-art performance in a wide range of Natural Language Processing (NLP) tasks. However, only a limited number of works focus on deep learning techniques for cancer hallmark text classification. To advance this task, we propose a novel neural architecture, DEep Contextualized Attentional Bidirectional LSTM (DECAB-LSTM), which learns to attend to the valuable information in a sentence by introducing a contextual attention mechanism. We also investigated the effect of a good word embedding on cancer hallmark text classification. We trained our model on a benchmark dataset and reported accuracy, F-score, and AUC. Compared with several baselines, including logistic regression, Support Vector Machines, Convolutional Neural Networks, and fastText, the proposed model achieved state-of-the-art performance, demonstrating its potential for empirical application to cancer research. (C) 2020 Elsevier B.V. All rights reserved.
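The abstract does not spell out the attention mechanism, so the following is only a minimal sketch of generic attention pooling over per-token hidden states, not the DECAB-LSTM architecture itself. It assumes each token already has a hidden vector (for example, concatenated forward and backward LSTM outputs) and that a context vector has been learned elsewhere. The names `attention_pool`, `hidden_states`, and `context` are hypothetical, and the learned projection that such layers usually apply before the `tanh` is omitted for brevity.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, context):
    """Pool per-token vectors into one sentence vector.

    hidden_states: list of T vectors, each a list of d floats
                   (assumed to come from a bidirectional LSTM).
    context:       hypothetical learned context vector of length d.
    Returns (weights, pooled): T attention weights and a d-dim vector.
    """
    # Score each token by the dot product of tanh(h) with the context vector.
    scores = [
        sum(math.tanh(h_j) * c_j for h_j, c_j in zip(h, context))
        for h in hidden_states
    ]
    weights = softmax(scores)
    # Weighted sum of the hidden states, one dimension at a time.
    dim = len(context)
    pooled = [
        sum(w * h[j] for w, h in zip(weights, hidden_states))
        for j in range(dim)
    ]
    return weights, pooled


# Toy input: three tokens with 2-dimensional hidden states.
hidden_states = [[0.0, 0.0], [2.0, 2.0], [0.0, 0.0]]
context = [1.0, 1.0]
weights, pooled = attention_pool(hidden_states, context)
```

In this toy input the second token aligns best with the context vector, so it receives the largest weight and dominates the pooled sentence vector that a classifier would then consume.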
