An important step in data processing for four-dye fluorescence-based DNA sequencing is crosstalk filtering. Typically, a matrix M, which is a function of the fluorophores employed and of the fluorescence detection system, is used in the multicomponent analysis. In the deconvolution process the matrix is applied directly to the raw signal, under the assumption that the crosstalk is linear. This requires the signal to be aligned to the baseline first, and the various techniques used for aligning the raw data have the negative effect of further distorting the signal. We present an algorithm for crosstalk removal that operates on the variation of the raw signal (instead of the signal itself), which makes it possible to remove the crosstalk before baseline adjustment. In addition, we propose a supplementary filtering step to account for the fact that the crosstalk is in reality nonlinear. This second step is based on a matrix T and accounts for the influence on each signal of the derivatives of the other three. The overall result is that less information is lost through filtering, so more of the information contained in the raw data is preserved. Consequently, we consider that data processed with the proposed algorithm can allow better accuracy in base calling.
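The two steps described above can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not state: "variation" is read as the first difference between consecutive samples, the baseline is taken to change slowly enough that its difference is negligible, and the second step is modeled as subtracting T applied to the derivatives, with a zero diagonal in T. The function names `remove_crosstalk_from_differences` and `derivative_correction` are hypothetical.

```python
import numpy as np


def remove_crosstalk_from_differences(raw, M):
    """Unmix four channels using sample-to-sample differences.

    Assumed model (not from the source): raw = M @ dyes + baseline,
    where the baseline varies slowly, so diff(raw) ~= M @ diff(dyes).
    raw has shape (4, n); M has shape (4, 4).
    """
    raw = np.asarray(raw, dtype=float)
    d_raw = np.diff(raw, axis=1)               # shape (4, n - 1)
    d_dyes = np.linalg.solve(M, d_raw)         # apply M^-1 to the differences
    # Integrate the unmixed differences; each channel is recovered
    # only up to a constant offset, which is set to zero at sample 0.
    start = np.zeros((raw.shape[0], 1))
    dyes = np.concatenate([start, np.cumsum(d_dyes, axis=1)], axis=1)
    return dyes, d_dyes


def derivative_correction(dyes, d_dyes, T):
    """Hypothetical second step: remove the contribution that the
    derivatives of the other three channels make to each channel."""
    T = np.array(T, dtype=float)               # copy, so the caller's T is untouched
    np.fill_diagonal(T, 0.0)                   # a channel does not correct itself
    pad = np.zeros((dyes.shape[0], 1))
    d_full = np.concatenate([pad, d_dyes], axis=1)  # align to n samples
    return dyes - T @ d_full
```

Because only differences are unmixed, a constant offset in any raw channel drops out before M is inverted, which is why this step can run ahead of baseline adjustment. The recovered traces still carry an arbitrary constant per channel, and any drift in the baseline that is not negligible between samples would leak into the result.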