Journal: BioData Mining

SEQdata-BEACON: a comprehensive database of sequencing performance and statistical tools for performance evaluation and yield simulation in BGISEQ-500



Abstract

The BGISEQ-500 sequencing platform is based on DNBSEQ technology and provides high throughput at low cost. The sequencer has been widely used in many areas of scientific and clinical research. A better understanding of its sequencing process and performance is essential for stabilizing the process, interpreting sequencing results accurately and solving sequencing problems efficiently. To address these needs, a comprehensive database, SEQdata-BEACON, was constructed to accumulate run performance data from the BGISEQ-500. A total of 60 BGISEQ-500 instruments in the BGI-Wuhan lab were used to collect sequencing performance data. Lanes from paired-end 100 (PE100) sequencing with a 10 bp barcode were chosen, and each lane was assigned a unique entry number as its identification number (ID). From November 2018 to April 2019, 2236 entries were recorded in the database, containing 65 metrics covering sample, yield, quality, machine state and supplies information. Using a correlation matrix, 52 numerical metrics were clustered into three groups representing yield-quality, machine state and sequencing calibration. The distributions of the metrics also revealed patterns and offered clues for further explanation or analysis of the sequencing process. Using data from all 200 cycles, a linear regression model simulated the final output well. Moreover, a prediction of the final yield could be provided as early as the 15th cycle; the R² values of the 200th-cycle and 15th-cycle models were 0.97 and 0.81, respectively. When run on test sets obtained from May 2019, the model predicted yield with an R² of 0.96. These results indicate that the simulation model was reliable and effective.
Data sources, statistical findings and application tools provide a constantly updated reference for BGISEQ-500 users to comprehensively understand DNBSEQ technology, solve sequencing problems and optimize run performance. These resources are available on our website http://seqBEACON.genomics.cn:443/home.html.
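
To make the yield-simulation idea concrete, the following is a minimal sketch of fitting a linear regression that predicts final yield from an early-cycle measurement and scoring it with R² on a held-out set. It is an illustration only, not the authors' model: the data are synthetic, the single predictor (`early`) is a hypothetical stand-in for whatever metrics are available by an early cycle, and the helper names `fit_line`, `r_squared` and `make_lanes` are invented for this example.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def r_squared(xs, ys, a, b):
    """Coefficient of determination of the line y = a + b * x on (xs, ys)."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def make_lanes(n, seed):
    """Generate synthetic lanes (NOT real BGISEQ-500 data).

    Each lane has a hypothetical early-cycle metric and a final yield
    that depends on it linearly, plus Gaussian noise.
    """
    rng = random.Random(seed)
    early, final = [], []
    for _ in range(n):
        x = rng.uniform(50.0, 100.0)
        y = 5.0 + 1.2 * x + rng.gauss(0.0, 3.0)
        early.append(x)
        final.append(y)
    return early, final


# Fit on a "training" period, then evaluate on a separate "test" period.
train_x, train_y = make_lanes(200, seed=1)
test_x, test_y = make_lanes(50, seed=2)

a, b = fit_line(train_x, train_y)
train_r2 = r_squared(train_x, train_y, a, b)
test_r2 = r_squared(test_x, test_y, a, b)

print(f"intercept={a:.2f} slope={b:.2f}")
print(f"train R^2={train_r2:.3f} test R^2={test_r2:.3f}")
```

Evaluating on lanes that were not used for fitting mirrors the abstract's use of a later test set; a real model would presumably use several metrics per lane rather than a single predictor.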
