Journal of Biomolecular Techniques: JBT

XSQ: A New Binary Output Format of 5500/5500XL Systems



Abstract

The XSQ (eXtensible SeQuence) format was designed to accommodate different run types (standard SOLiD sequencing and Exact Call Chemistry (ECC) runs), to simplify the workflow for indexed samples, and to support new data types, such as ECC, on the new 5500 sequencing instruments. ECC data cannot be stored in the existing formats (e.g., csfasta and QV.qual) because there is more than one call per position. The existing formats also suffer from large file sizes, high I/O demand, and no pairing of reads. The XSQ format uses information packing to reduce file size by 60%, which lowers storage needs and shortens transfer times. It also integrates the pairing of reads, so mapping and pairing can happen together and much more quickly, rather than mapping the two reads of a pair separately and merging them afterwards. An XSQ file lets users reassign indexes very efficiently, and such a reassignment has minimal impact on downstream analysis: the indexed samples do not all have to be reanalyzed. The hierarchical data format also provides a basic level of partitioning and indexing within the file, so that subsets of the data can be retrieved without reading through the entire file.
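
The abstract does not spell out how the information packing works. As a rough illustration of the general idea only, the sketch below assumes a hypothetical layout in which each read position stores a 2-bit call and a 6-bit quality value in a single byte, instead of separate text entries in a sequence file and a quality file. The bit widths and the function names (`pack_position`, `unpack_position`, `pack_read`, `unpack_read`) are assumptions made for this example and are not taken from the XSQ specification.

```python
from typing import List, Sequence, Tuple

# Hypothetical layout, assumed for illustration only:
# low 2 bits = call (0-3), high 6 bits = quality value (0-63).
CALL_BITS = 2
QV_BITS = 6
CALL_MASK = (1 << CALL_BITS) - 1  # 0b11
QV_MAX = (1 << QV_BITS) - 1       # 63


def pack_position(call: int, qv: int) -> int:
    """Pack one call and its quality value into a single byte value."""
    if not 0 <= call <= CALL_MASK:
        raise ValueError(f"call must be in 0..{CALL_MASK}, got {call}")
    if not 0 <= qv <= QV_MAX:
        raise ValueError(f"quality must be in 0..{QV_MAX}, got {qv}")
    return (qv << CALL_BITS) | call


def unpack_position(value: int) -> Tuple[int, int]:
    """Recover (call, quality) from one packed byte value."""
    return value & CALL_MASK, value >> CALL_BITS


def pack_read(calls: Sequence[int], qvs: Sequence[int]) -> bytes:
    """Pack a whole read: one byte per position."""
    if len(calls) != len(qvs):
        raise ValueError("calls and qvs must have the same length")
    return bytes(pack_position(c, q) for c, q in zip(calls, qvs))


def unpack_read(data: bytes) -> Tuple[List[int], List[int]]:
    """Unpack a read back into parallel lists of calls and qualities."""
    calls = [b & CALL_MASK for b in data]
    qvs = [b >> CALL_BITS for b in data]
    return calls, qvs
```

Under this assumed layout a read of N positions occupies exactly N bytes, and the calls and their qualities stay together in one record. The 60% reduction reported in the abstract refers to the real XSQ format, not to this sketch.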
