Journal: BioMed Research International

METSP: A Maximum-Entropy Classifier Based Text Mining Tool for Transporter-Substrate Identification with Semistructured Text


Abstract

The substrates of a transporter are useful not only for inferring the transporter's function but also for discovering compound-compound interactions and reconstructing metabolic pathways. Although a large amount of data has accumulated with the development of new technologies such as in vitro transporter assays, the search for transporter substrates is far from complete. In this article, we introduce METSP, a maximum-entropy classifier designed to retrieve transporter-substrate pairs (TSPs) from semistructured text. Based on high-quality annotations from UniProt, METSP achieves high precision and recall in cross-validation experiments. When applied to 182,829 human transporter annotation sentences in UniProt, METSP identifies 3942 sentences containing transporter and compound information. Finally, 1547 high-confidence human TSPs are identified for further manual curation, of which 58.37% involve novel substrates not annotated in public transporter databases. METSP is the first efficient tool to extract TSPs from semistructured annotation text in UniProt. It can help determine the precise substrates and drugs of transporters, thus facilitating drug-target prediction, metabolic network reconstruction, and literature classification.
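To make the idea of a maximum-entropy sentence classifier concrete, here is a minimal sketch in Python. It is not the METSP implementation. The bag-of-words features, the training loop, the function names (`featurize`, `train_maxent`, `predict_proba`), and the toy sentences are all assumptions chosen for illustration, and none of the sentences are taken from UniProt. In the binary case a maximum-entropy model is equivalent to logistic regression, which is what the sketch fits.

```python
import math
import re
from collections import defaultdict

# Hypothetical toy sentences written for this sketch (NOT UniProt data).
# Label 1: sentence mentions a transporter-substrate relation; 0: it does not.
TOY_EXAMPLES = [
    ("Mediates the uptake of glucose across the plasma membrane", 1),
    ("Sodium-dependent transporter of citrate", 1),
    ("Transports L-glutamate into synaptic vesicles", 1),
    ("Belongs to the major facilitator superfamily", 0),
    ("Expressed in liver and kidney", 0),
    ("Interacts with the cytoskeleton via its C-terminal domain", 0),
]

BIAS = "__bias__"  # cannot collide with a token: the regex below excludes "_"


def featurize(sentence):
    """Return lowercased bag-of-words features (an assumed, simplistic feature set)."""
    return set(re.findall(r"[a-z0-9\-]+", sentence.lower()))


def score(weights, feats):
    """Linear score: sum of the weights of active features plus the bias."""
    return sum(weights.get(f, 0.0) for f in feats) + weights.get(BIAS, 0.0)


def train_maxent(examples, epochs=200, lr=0.5):
    """Fit a binary maximum-entropy (logistic regression) model.

    Uses plain stochastic gradient ascent on the log-likelihood,
    with no regularization, to keep the sketch short.
    """
    weights = defaultdict(float)
    for _ in range(epochs):
        for sentence, label in examples:
            feats = featurize(sentence)
            prob = 1.0 / (1.0 + math.exp(-score(weights, feats)))
            error = label - prob  # gradient of the log-likelihood w.r.t. the score
            for f in feats:
                weights[f] += lr * error
            weights[BIAS] += lr * error
    return weights


def predict_proba(weights, sentence):
    """Probability that the sentence describes a transporter-substrate relation."""
    return 1.0 / (1.0 + math.exp(-score(weights, featurize(sentence))))


if __name__ == "__main__":
    model = train_maxent(TOY_EXAMPLES)
    for text, label in TOY_EXAMPLES:
        print(label, round(predict_proba(model, text), 3), text)
```

A new sentence can be scored with `predict_proba(model, "Mediates the uptake of citrate")`. A realistic system would need far more training data, richer features (for example, recognized compound and protein names), and regularization, none of which this sketch attempts.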
