Journal: Combinatorial Chemistry & High Throughput Screening

MegaMiner: A Tool for Lead Identification Through Text Mining Using Chemoinformatics Tools and Cloud Computing Environment

Abstract

Virtual screening is an indispensable tool for coping with the massive amount of data generated by high-throughput omics technologies. To increase the automation of the virtual screening process, a robust portal named MegaMiner has been built on a cloud computing platform: the user submits a text query and directly accesses the proposed lead molecules along with their drug-like, lead-like, and docking scores. Textual representation of chemical structures is fraught with ambiguity in the absence of a global identifier. We used a combination of statistical models, a chemical dictionary, and regular expressions to build a disease-specific dictionary. To demonstrate the effectiveness of this approach, a case study on malaria was carried out. MegaMiner gave better results than other text mining search engines, as established by F-score analysis. The single query term 'malaria' in the portlet retrieved related PubMed records, protein classes, drug classes, and 8000 scaffolds, which were internally processed and filtered to suggest new molecules as potential antimalarials. The results were validated by docking the virtual molecules into relevant protein targets. It is hoped that MegaMiner will serve as an indispensable tool not only for identifying hidden relationships between various biological and chemical entities but also for building better corpora and ontologies.
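
The abstract gives no implementation details, so the following is only a minimal sketch of two ideas it names: combining a chemical dictionary with a regular expression to tag chemical mentions, and scoring the result with an F-score. The dictionary entries, the regex pattern, and the function names `tag_chemicals` and `f_score` are illustrative assumptions, not MegaMiner's actual components, and the statistical-model step is omitted.

```python
import re

# Hypothetical toy dictionary of drug names (illustrative only, not from the paper).
CHEMICAL_DICTIONARY = {"chloroquine", "artemisinin", "quinine"}

# Assumed pattern for locant-prefixed names such as "4-aminoquinoline".
LOCANT_PATTERN = re.compile(r"\b\d+(?:,\d+)*-[A-Za-z]+\b")


def tag_chemicals(text):
    """Return lowercase candidate chemical mentions found by dictionary or regex."""
    lowered = text.lower()
    tokens = set(re.findall(r"[a-z]+", lowered))
    dictionary_hits = tokens & CHEMICAL_DICTIONARY
    regex_hits = {m.group(0) for m in LOCANT_PATTERN.finditer(lowered)}
    return dictionary_hits | regex_hits


def f_score(predicted, gold, beta=1.0):
    """F-beta score between a predicted set and a gold-standard set."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


if __name__ == "__main__":
    sentence = "4-aminoquinoline derivatives and chloroquine were tested."
    found = tag_chemicals(sentence)
    print(sorted(found))
    print(f_score(found, {"chloroquine", "4-aminoquinoline"}))
```

With `beta=1.0` the score is the harmonic mean of precision and recall, the usual form of an F-score comparison between text mining systems; the abstract does not say which variant the authors used.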
