Genomic approaches to the study of splicing in Plasmodium falciparum and other organisms using high throughput sequencing.

Abstract

In the last five years, high throughput sequencing has revolutionized biological research. The ability to quickly generate millions of short sequence reads enables studies that would have been inconceivable even 10 years ago. This work focuses on RNA-Seq, the application of high throughput sequencing to an organism's transcriptome. We describe a method of library preparation that improves sequence coverage, a new algorithm for detecting splice junctions in RNA-Seq datasets, and finally, the application of these techniques to the study of splicing in Plasmodium falciparum.

The long march is a technique for Solexa library preparation that increases contig length and target sequence coverage. The long march incorporates a Type IIS restriction enzyme site into the sequencing primer adapter. Each round of marching cuts off the initial part of the read and ligates a new adapter downstream, creating overlapping reads. Validation on P. falciparum genomic samples and human hepatitis B virus-positive samples showed increases of 39% and 42%, respectively, in the number of bases covered.

Next we developed an algorithm to detect spliced reads crossing exon-exon junctions in RNA-Seq datasets. Our algorithm uses an unbiased approach, relying only on the read dataset and a reference genome, and detects both canonical and noncanonical splice junctions. It works by dividing reads in half for initial seeding in the reference genome, then using an HMM, trained on the input data, to determine the optimal splice position. Our algorithm provides a score for each splice junction, which allows researchers to tune the false positive rate to the requirements of their experiment. When tested on publicly available datasets for Arabidopsis thaliana, Plasmodium falciparum, and Homo sapiens, this approach identifies more splice junctions than currently available algorithms, without a reduction in specificity.

Finally, our library preparation technique and splice detection algorithm were used to study splicing in P. falciparum. Both our data and publicly available datasets were used to identify splicing events in the blood stages of the parasite. We confirmed 6,678 previously known introns and identified 977 novel introns with canonical splice edges. In addition, we detected 310 alternative splicing events as well as splicing events antisense to known transcripts.
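The half-read seeding idea in the abstract can be illustrated with a small sketch. This is not the thesis's algorithm: the trained HMM is replaced here by an exhaustive scan over exact-match split points, only the left half of a forward-strand read is used as the seed, and mismatches are not tolerated. The names `find_all` and `candidate_junctions`, the parameters `min_intron`, `max_intron`, and `min_anchor`, and the toy sequences are all hypothetical and chosen for illustration only.

```python
def find_all(text, pattern, start=0, end=None):
    """Return every start index of an exact match of pattern in text[start:end]."""
    end = len(text) if end is None else end
    hits = []
    i = text.find(pattern, start, end)
    while i != -1:
        hits.append(i)
        i = text.find(pattern, i + 1, end)
    return hits


def candidate_junctions(read, genome, min_intron=10, max_intron=1000, min_anchor=4):
    """Seed with the left half of the read, then scan split points.

    Returns tuples (split, donor, acceptor, canonical), where split is the
    offset in the read, donor is the first intron base, and acceptor is the
    first base of the downstream exon (0-based genome coordinates).
    Assumption: exact matches only; a real tool would score mismatches.
    """
    half = len(read) // 2
    candidates = []
    for seed_pos in find_all(genome, read[:half]):
        for split in range(half, len(read) - min_anchor + 1):
            # Stop extending once the read prefix no longer matches the genome.
            if genome[seed_pos:seed_pos + split] != read[:split]:
                break
            donor = seed_pos + split
            suffix = read[split:]
            lo = donor + min_intron
            hi = min(len(genome), donor + max_intron + len(suffix))
            for acceptor in find_all(genome, suffix, lo, hi):
                intron = genome[donor:acceptor]
                canonical = intron.startswith("GT") and intron.endswith("AG")
                candidates.append((split, donor, acceptor, canonical))
    # List canonical GT..AG junctions first (False sorts before True).
    candidates.sort(key=lambda c: not c[3])
    return candidates


# Toy example (made-up sequences): two exons separated by a GT..AG intron.
exon1 = "ATGGCTAGCTTCGATCCGTA"
intron = "GT" + "AAATTTTATATATTTTAAAT" + "AG"
exon2 = "CCGGATTCAGGCTTACGATC"
genome = "TTTT" + exon1 + intron + exon2 + "TTTT"
read = exon1[-12:] + exon2[:8]  # a 20-base read spanning the junction

print(candidate_junctions(read, genome))
```

In this toy case the left half of the read seeds inside the first exon, and the scan reports a single junction whose intron begins with GT and ends with AG. The score-based tuning of the false positive rate described in the abstract is not modeled here.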

Bibliographic details

  • Author

    Dimon, Michelle.

  • Author affiliation

    University of California, San Francisco.

  • Degree-granting institution University of California, San Francisco.
  • Subject Biology Bioinformatics.
  • Degree Ph.D.
  • Year 2010
  • Pages 156 p.
  • Total pages 156
  • Original format PDF
  • Language of text eng