Journal: Bioinformatics

Identification of regulatory elements using a feature selection method.

Abstract

Motivation: Many methods have been described to identify regulatory motifs in the transcription control regions of genes that exhibit similar patterns of gene expression across a variety of experimental conditions. Here we focus on a single experimental condition and use gene expression data to identify sequence motifs associated with genes that are activated under that condition. We use a linear model with two-way interactions to model gene expression as a function of sequence features (words) present in presumptive transcription control regions. The most relevant features are selected by a feature selection method called stepwise selection with Monte Carlo cross-validation. We apply this method to a publicly available dataset of the yeast Saccharomyces cerevisiae, focusing on the 800 base pairs immediately upstream of each gene's translation start site (the upstream control region, UCR).

Results: We successfully identify regulatory motifs that are known to be active under the experimental conditions analyzed, and find additional significant sequences that may represent novel regulatory motifs. We also discuss a complementary method that uses gene expression data from a single microarray experiment and allows averaging over a variety of experimental conditions, as an alternative to motif-finding methods that act on clusters of co-expressed genes.

Availability: The software is available upon request from the first author or may be downloaded from http://www.stat.berkeley.edu/~sunduz.

Contact: keles
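The modelling idea in the abstract (word counts and their two-way interactions as regressors, with the model size chosen by Monte Carlo cross-validation) can be illustrated with a small sketch. This is not the authors' software. The function names, the forward-only greedy search, the 30% validation fraction, the number of random splits and the synthetic demo data are all assumptions made for illustration.

```python
import itertools
import numpy as np

def count_words(seq, words):
    """Count (possibly overlapping) occurrences of each word in one sequence."""
    return [sum(seq.startswith(w, i) for i in range(len(seq))) for w in words]

def design_matrix(seqs, words):
    """Word-count main effects plus all two-way interaction (product) columns."""
    main = np.array([count_words(s, words) for s in seqs], dtype=float)
    pairs = list(itertools.combinations(range(len(words)), 2))
    cols = [main] + [main[:, [i]] * main[:, [j]] for i, j in pairs]
    names = list(words) + [f"{words[i]}:{words[j]}" for i, j in pairs]
    return np.hstack(cols), names

def fit(X, y, cols):
    """Least-squares fit with intercept; returns (residual sum of squares, beta)."""
    A = np.column_stack([np.ones(len(y))] + [X[:, c] for c in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(np.sum((y - A @ beta) ** 2)), beta

def forward_path(X, y, max_size):
    """Greedy forward selection (assumed variant); nested models of size 0..max_size."""
    chosen, path = [], [[]]
    for _ in range(min(max_size, X.shape[1])):
        rest = [c for c in range(X.shape[1]) if c not in chosen]
        best = min(rest, key=lambda c: fit(X, y, chosen + [c])[0])
        chosen = chosen + [best]
        path.append(chosen)
    return path

def mccv_select(X, y, max_size=3, n_splits=20, val_frac=0.3, seed=0):
    """Pick the model size with the lowest mean validation error over random splits."""
    rng = np.random.default_rng(seed)
    n = len(y)
    max_size = min(max_size, X.shape[1])
    n_val = max(1, int(val_frac * n))
    err = np.zeros(max_size + 1)
    for _ in range(n_splits):
        perm = rng.permutation(n)
        val, tr = perm[:n_val], perm[n_val:]
        for k, cols in enumerate(forward_path(X[tr], y[tr], max_size)):
            _, beta = fit(X[tr], y[tr], cols)
            A = np.column_stack([np.ones(n_val)] + [X[val][:, c] for c in cols])
            err[k] += np.mean((y[val] - A @ beta) ** 2)
    best_size = int(np.argmin(err))
    # Refit the search on all genes at the cross-validated size.
    return forward_path(X, y, best_size)[best_size], err / n_splits

if __name__ == "__main__":
    # Synthetic demo: random "upstream" sequences and a made-up expression signal.
    rng = np.random.default_rng(42)
    seqs = ["".join(rng.choice(list("ACGT"), size=200)) for _ in range(60)]
    words = ["ACG", "TTA"]  # hypothetical candidate words
    X, names = design_matrix(seqs, words)
    y = 1.5 * X[:, 0] + rng.normal(scale=0.5, size=len(seqs))
    cols, cv_err = mccv_select(X, y)
    print([names[c] for c in cols], cv_err)
```

In this sketch each random split reruns the forward search on the training genes only, so the validation error for a given model size reflects the whole selection procedure and not one fixed set of words. With real data, `seqs` would hold the 800-base-pair upstream regions and `y` the expression measurements from the single experiment of interest.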