Motivation: High-throughput measurements of mRNA abundances from microarrays involve several stages of preprocessing. At each stage, a user has access to a large number of algorithms, with no universally agreed guidance on which of these to use. We show that binary representations of gene expression, retaining only information on whether a gene is expressed or not, reduce the variability in results caused by algorithmic choice, while also improving the quality of inference drawn from microarray studies.