Conference: IEEE International Conference on Big Data and Cloud Computing

A Cloud-Assisted Application over Apache Spark for Investigating Epigenetic Markers on DNA Genome Sequences


Abstract

CpG islands (CGIs) are an important epigenetic marker for identifying biological elements and processes in mammalian genomes. They play significant roles in DNA methylation, gene regulation, epigenetic inheritance, gene mutation, chromosome inactivation and nucleosome retention. Investigating CpG islands and their structures is a demanding task because the structures are unknown and the number of possible patterns is exponential. In this paper, we design and implement an ad hoc application that combines the merits of the Apache Spark platform and the Spark programming paradigm with the particular properties of DNA genome sequences for CpG island investigation. A novel CpG box model and a Markov model are developed, primarily for redefining and investigating the CpG island. These models also map readily onto Spark-based cloud platforms, which can greatly accelerate the analysis. Two types of evaluation are performed: one of accuracy and one of computing performance. This paper describes the application and how it assists processing and genomic analysis in epigenetic studies.
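The abstract does not define the CpG box model or the Markov model, so the following is not the authors' method. For orientation only, here is a minimal sketch of the conventional sliding-window criterion for CpG island candidates (Gardiner-Garden and Frommer: GC content above 50% and an observed/expected CpG ratio above 0.6 over a window of at least 200 bp). The function names `cpg_stats` and `candidate_windows` and their parameters are hypothetical, chosen for illustration.

```python
from typing import Iterator, Tuple


def cpg_stats(window: str) -> Tuple[float, float]:
    """Return (GC fraction, observed/expected CpG ratio) for one window."""
    w = window.upper()
    n = len(w)
    c, g = w.count("C"), w.count("G")
    cpg = w.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c + g) / n if n else 0.0
    # Expected CpG count if C and G occurred independently is c * g / n,
    # so observed/expected simplifies to cpg * n / (c * g).
    ratio = (cpg * n) / (c * g) if c and g else 0.0
    return gc_fraction, ratio


def candidate_windows(
    seq: str,
    size: int = 200,       # assumed window length in base pairs
    step: int = 1,         # assumed stride between windows
    min_gc: float = 0.5,   # conventional GC-content threshold
    min_ratio: float = 0.6,  # conventional observed/expected threshold
) -> Iterator[Tuple[int, float, float]]:
    """Yield (start, gc_fraction, ratio) for windows passing both thresholds."""
    for start in range(0, len(seq) - size + 1, step):
        gc, ratio = cpg_stats(seq[start:start + size])
        if gc > min_gc and ratio > min_ratio:
            yield start, gc, ratio
```

Each window is scored independently of the others. That independence is the kind of property that lets such a scan be split across the workers of a Spark-style cluster, although how the paper partitions its sequences is not stated in the abstract.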
