Journal: Methods: A Companion to Methods in Enzymology

DEGPACK: A web package using a non-parametric and information theoretic algorithm to identify differentially expressed genes in multiclass RNA-seq samples


Abstract

Gene expression across the whole cell can be measured routinely with microarray technologies or, more recently, with sequencing technologies. With either technology, identifying differentially expressed genes (DEGs) among multiple phenotypes is the first step toward understanding the differences between phenotypes. Many methods have therefore been developed for detecting DEGs between two groups; for example, the t-test and relative entropy are used to detect the difference between two probability distributions. When more than two phenotypes are considered, these methods are not applicable, and other methods such as the ANOVA F-test and the Kruskal-Wallis test are used to find DEGs in multiclass data. However, the ANOVA F-test assumes a normal distribution and is not designed to identify DEGs whose expression is distinct in each phenotype. The Kruskal-Wallis test, a non-parametric method, is more robust but is still sensitive to outliers. In this paper, we propose a non-parametric, information-theoretic approach for identifying DEGs. Our method identified DEGs effectively and was shown to be less sensitive to outliers on two data sets: a three-class drought-resistant rice data set and a three-class breast cancer data set. In extensive experiments with simulated and real data, our method outperformed existing tools in the accuracy of characterizing phenotypes using DEGs. (C) 2014 Elsevier Inc. All rights reserved.
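To make the comparison of the two multiclass baselines concrete, here is a minimal sketch of the ANOVA F statistic and the Kruskal-Wallis H statistic applied to one hypothetical gene measured in three classes. It is not the DEGPACK algorithm, which the abstract does not specify. The function names and expression values are invented for illustration, and only the raw statistics are computed (no p-values, no tie correction for H).

```python
from collections import defaultdict


def anova_f(groups):
    """One-way ANOVA F statistic: between-group over within-group variance."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic on pooled ranks (midranks for ties,
    without the tie-correction factor)."""
    pooled = sorted(x for g in groups for x in g)
    n = len(pooled)
    positions = defaultdict(list)
    for i, x in enumerate(pooled, start=1):
        positions[x].append(i)
    rank = {x: sum(p) / len(p) for x, p in positions.items()}
    rank_term = sum(sum(rank[x] for x in g) ** 2 / len(g) for g in groups)
    return 12.0 / (n * (n + 1)) * rank_term - 3 * (n + 1)


# Hypothetical expression values for one gene in three phenotypes.
clean = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
# Same gene, with one extreme value in the lowest-expressed class.
with_outlier = [[1, 2, 3, 1200], [5, 6, 7, 8], [9, 10, 11, 12]]

f_clean, f_outlier = anova_f(clean), anova_f(with_outlier)
h_clean, h_outlier = kruskal_wallis_h(clean), kruskal_wallis_h(with_outlier)

print(f"ANOVA F:          clean={f_clean:.3f}  outlier={f_outlier:.3f}")
print(f"Kruskal-Wallis H: clean={h_clean:.3f}  outlier={h_outlier:.3f}")
```

In this toy case a single extreme value lowers both statistics. The rank-based H is bounded in how far it can move, but it still changes, which is consistent with the abstract's remark that Kruskal-Wallis is more robust yet still affected by outliers.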
