Journal: EURASIP Journal on Bioinformatics and Systems Biology

Identification of CpG islands in DNA sequences using statistically optimal null filters



Abstract

CpG dinucleotide clusters, also referred to as CpG islands (CGIs), are usually located in the promoter regions of genes in a deoxyribonucleic acid (DNA) sequence. CGIs play a crucial role in gene expression and cell differentiation; as such, they are commonly used as gene markers. Earlier CGI identification methods used the high CpG dinucleotide content of CGIs as the characteristic measure for locating them. Some more recent methods exploit the fact that nucleotide G is more likely to follow nucleotide C inside a CGI than outside one. These methods use the difference in transition probabilities between successive nucleotides to distinguish a CGI from a non-CGI. However, the transition probabilities vary with the data being analyzed, and several different sets have been reported in the literature, sometimes leading to contradictory results. In this article, we propose a new and efficient scheme for identifying CGIs using statistically optimal null filters. We formulate a new, unambiguous CGI identification characteristic for reliably and efficiently identifying CGIs in a given DNA sequence. The proposed scheme combines the maximum signal-to-noise ratio and least-squares optimization criteria to estimate the CGI identification characteristic in the DNA sequence. The scheme was tested on a number of DNA sequences taken from human chromosomes 21 and 22 and proved to be highly reliable and efficient in identifying CGIs.
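
To make the two families of earlier measures mentioned in the abstract concrete, the following is a minimal Python sketch of the quantities they rely on: GC content, the observed/expected CpG ratio, and the empirical probability that G follows C. It is not the null-filter scheme proposed in the article, whose details are not given here. The function names and the toy sequences are hypothetical and chosen only for illustration.

```python
from collections import Counter


def gc_content(seq: str) -> float:
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CG * N) / (#C * #G)."""
    seq = seq.upper()
    n_c, n_g = seq.count("C"), seq.count("G")
    if n_c == 0 or n_g == 0:
        return 0.0
    # "CG" cannot overlap with itself, so str.count gives the exact number.
    n_cg = seq.count("CG")
    return n_cg * len(seq) / (n_c * n_g)


def prob_g_after_c(seq: str) -> float:
    """Empirical transition probability P(next base is G | current base is C)."""
    seq = seq.upper()
    followers = Counter(seq[i + 1] for i in range(len(seq) - 1) if seq[i] == "C")
    total = sum(followers.values())
    return followers["G"] / total if total else 0.0


# Hypothetical toy sequences, far shorter than any real CGI.
cpg_rich = "CGCGCGCGCG"
cpg_poor = "CATCATCAGT"

for name, s in (("rich", cpg_rich), ("poor", cpg_poor)):
    print(name, gc_content(s), cpg_obs_exp(s), prob_g_after_c(s))
```

In practice such measures are evaluated over a sliding window along the sequence and compared against thresholds. A commonly cited rule of thumb requires a region of at least 200 bp with GC content above 50% and an observed/expected CpG ratio above 0.6. The transition-probability methods instead estimate a full table of probabilities from training data, which is where the data dependence noted in the abstract arises.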
