Journal: International journal of knowledge discovery in bioinformatics

Genome Sequence Analysis in Distributed Computing using Spark


Abstract

The integration of computer science with bioscience has led to a new field, computational biology, which has created an opportunity to speed up the analysis of biological data. DNA sequence analysis, especially finding the base pairs, helps identify the order of nucleotides present in all living beings; it also supports forensic DNA profiling and parentage testing. Sequence analysis has been a challenging task in computational biology because of the large volumes of data involved and the computational resources they demand. Combining a distributed file system with distributed computation of tasks is one possible solution to this problem. In this paper, the authors use Spark, an engine for large-scale data processing, to analyze DNA sequences and extract the base pairs, and they attempt to improve base-pair extraction with refined algorithms.
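
The abstract does not describe the extraction algorithm itself, so the following is only a minimal sketch of the general map-and-reduce pattern that a distributed engine such as Spark applies to this kind of task. It is written in plain Python rather than against the Spark API, and it makes several assumptions that are not from the paper: the sequence is a string over A, C, G, T; fixed-size chunks stand in for data partitions; and the function names (`split_into_chunks`, `count_bases`, `merge_counts`, `base_counts`) are hypothetical.

```python
from collections import Counter
from functools import reduce


def split_into_chunks(sequence, chunk_size):
    """Split a sequence into fixed-size chunks (a local stand-in for partitions)."""
    return [sequence[i:i + chunk_size] for i in range(0, len(sequence), chunk_size)]


def count_bases(chunk):
    """Map step: count the A, C, G and T characters in one chunk.

    Any other character (for example N for an unknown base) is ignored.
    """
    return Counter(base for base in chunk.upper() if base in "ACGT")


def merge_counts(left, right):
    """Reduce step: combine two partial counts into one."""
    return left + right


def base_counts(sequence, chunk_size=4):
    """Count bases chunk by chunk, then merge the partial results."""
    partials = map(count_bases, split_into_chunks(sequence, chunk_size))
    return reduce(merge_counts, partials, Counter())


# Illustrative input only; not data from the paper.
counts = base_counts("ACGTTGCAAN")
total_bases = sum(counts.values())
```

Because each chunk is counted independently and the partial counts are merged with an associative operation, the same structure could in principle be distributed across worker nodes; whether the authors structure their computation this way is not stated in the abstract.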