Journal: Journal of Applied Statistics

A classification updating procedure motivated by high-content screening data



Abstract

The current paradigm for the identification of candidate drugs within the pharmaceutical industry typically involves the use of high-throughput screens. High-content screening (HCS) is the term given to the process of using an imaging platform to screen large numbers of compounds for some desirable biological activity. Classification methods have important applications in HCS experiments, where they are used to predict which compounds have the potential to be developed into new drugs. In this paper, a new classification method is proposed for batches of compounds where the rule is updated sequentially using information from the classification of previous batches. This methodology accounts for the possibility that the training data are not a representative sample of the test data and that the underlying group distributions may change as new compounds are analysed. This technique is illustrated on an example data set using linear discriminant analysis, k-nearest neighbour and random forest classifiers. Random forests are shown to be superior to the other classifiers and are further improved by the additional updating algorithm in terms of an increase in the number of true positives as well as a decrease in the number of false positives.
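The abstract does not spell out the updating algorithm, so the following is only a minimal sketch of the general idea of refitting a classification rule batch by batch. It assumes a simple nearest-centroid rule as a stand-in (not one of the classifiers used in the paper) and assumes that each classified batch is added to the fitting data with its predicted labels. All function names (`fit`, `predict`, `classify_batches`) and the toy data are hypothetical.

```python
from statistics import fmean


def centroid(rows):
    """Column-wise mean of a non-empty list of equal-length feature vectors."""
    return [fmean(col) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fit(X, y):
    """Nearest-centroid rule (illustrative stand-in): one centroid per class."""
    groups = {}
    for row, label in zip(X, y):
        groups.setdefault(label, []).append(row)
    return {label: centroid(rows) for label, rows in groups.items()}


def predict(model, X):
    """Assign each row to the class whose centroid is closest."""
    return [min(model, key=lambda c: sq_dist(model[c], row)) for row in X]


def classify_batches(X_train, y_train, batches):
    """Classify batches in order, refitting before each batch on the training
    data plus all earlier batches with their predicted labels (an assumption,
    not the paper's stated procedure)."""
    X_all, y_all = list(X_train), list(y_train)
    predictions = []
    for batch in batches:
        model = fit(X_all, y_all)
        labels = predict(model, batch)
        predictions.append(labels)
        X_all.extend(batch)
        y_all.extend(labels)
    return predictions


# Toy one-feature data: later batches sit higher than the training data.
X_train = [[0.0], [1.0], [4.0], [5.0]]
y_train = ["inactive", "inactive", "active", "active"]
batches = [[[2.0], [6.0], [7.0]], [[3.0], [8.0]]]

updated = classify_batches(X_train, y_train, batches)
static = predict(fit(X_train, y_train), batches[1])
print(updated)  # [['inactive', 'active', 'active'], ['inactive', 'active']]
print(static)   # ['active', 'active']
```

In this toy example the compound at 3.0 is labelled active by the rule fitted on the training data alone, but inactive once the rule has been refitted after the first batch, which shows how sequential updating can change decisions when the group distributions shift. Whether such a change is an improvement depends on the data; the sketch makes no claim about that.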
