Journal: BMC Bioinformatics

KAOS: a new automated computational method for the identification of overexpressed genes

Abstract

Background: Kinase over-expression and activation as a consequence of gene amplification or gene fusion events is a well-known mechanism of tumorigenesis. The search for novel rearrangements of kinases or other druggable genes may contribute to understanding the biology of carcinogenesis and lead to the identification of new candidate targets for drug discovery. However, this requires the ability to query large datasets to identify rare events occurring in very small fractions (1–3%) of different tumor subtypes. This task differs from what conventional tools normally do, which is to find genes differentially expressed between two experimental conditions.

Results: We propose a computational method for the automatic identification of genes that are selectively over-expressed in a very small fraction of samples within a specific tissue. The method does not require a healthy counterpart or a reference sample for the analysis and can therefore also be applied to transcriptional data generated from cell lines. In our implementation, the tool can use gene-expression data from microarray experiments as well as data generated by RNA-Seq technologies.

Conclusions: The method was implemented as a publicly available, user-friendly tool called KAOS (Kinase Automatic Outliers Search). The tool enables the automatic execution of iterative searches for the identification of extreme outliers and the graphical visualization of the results. Filters can be applied to select the most significant outliers. The performance of the tool was evaluated using a synthetic dataset and compared with state-of-the-art tools. KAOS performs particularly well in detecting genes that are overexpressed in only a few samples, or when an extreme outlier stands out against a highly variable expression background. To validate the method on real case studies, we used publicly available tumor cell line microarray data and were able to identify genes that are known to be overexpressed in specific samples, as well as novel ones.
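
The abstract does not describe the algorithm KAOS uses, so the following is only a generic sketch of the kind of task it addresses: flagging genes whose expression is extreme in a very small fraction of samples, with no reference sample. It uses a robust z-score based on the median and the median absolute deviation (MAD), which is a common outlier heuristic and is not claimed to be the KAOS method. The function name `find_outlier_samples` and the parameters `z_cutoff` and `max_fraction` are hypothetical and chosen for illustration.

```python
from statistics import median


def find_outlier_samples(expression, z_cutoff=5.0, max_fraction=0.03):
    """Return {gene: [sample indices]} for genes with a few extreme high samples.

    expression: dict mapping gene name -> list of expression values, one per sample.
    z_cutoff and max_fraction are illustrative assumptions, not KAOS parameters.
    """
    hits = {}
    for gene, values in expression.items():
        med = median(values)
        # Median absolute deviation: a spread estimate that a few outliers barely move.
        mad = median(abs(v - med) for v in values)
        if mad == 0:
            continue  # no measurable spread, so a robust z-score is undefined
        # 1.4826 rescales the MAD to be comparable to a standard deviation
        # when the bulk of the data is roughly normal.
        scale = 1.4826 * mad
        outliers = [i for i, v in enumerate(values) if (v - med) / scale > z_cutoff]
        # Keep the gene only if the high outliers are rare (at least one is allowed).
        allowed = max(1, int(max_fraction * len(values)))
        if outliers and len(outliers) <= allowed:
            hits[gene] = outliers
    return hits


if __name__ == "__main__":
    # Toy data: gene "A" has one extreme sample; gene "B" is high in half the samples.
    gene_a = [5.0 + (i % 5) * 0.1 for i in range(100)]
    gene_a[7] = 12.0
    gene_b = [5.0 if i % 2 == 0 else 10.0 for i in range(100)]
    print(find_outlier_samples({"A": gene_a, "B": gene_b}))
```

Because the score is computed within each gene across samples, no healthy counterpart is needed, which matches the setting described in the abstract. A gene that is high in many samples, like "B" in the toy data, is not reported, since its median and spread absorb the high values.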
