Journal of Physical Chemistry B, Vol. 113, No. 9, 2874-2895, 2009
Nonspecific Hybridization Scaling of Microarray Expression Estimates: A Physicochemical Approach for Chip-to-Chip Normalization
The problem of inferring accurate quantitative estimates of transcript abundances from gene expression microarray data is addressed. Particular attention is paid to correcting chip-to-chip variations, which arise mainly from unwanted nonspecific background hybridization, so that transcript abundances are measured on a common scale. This study verifies and generalizes a model of the mutual dependence between nonspecific background hybridization and the sensitivity of the specific signal using an approach based on the physical chemistry of surface hybridization. We have analyzed GeneChip oligonucleotide microarray data taken from a set of five benchmark experiments including dilution, Latin Square, and "Golden spike" designs. Our analysis concentrates on the important effect of changes in the unwanted nonspecific background inherent in the technology due to changes in total RNA target concentration and/or composition. We find that incremental changes in nonspecific background entail incremental changes of opposite sign in the effective specific binding constant. This effect, which we refer to as the "up-down" effect, results from the subtle interplay of competing interactions between the probes and specific and nonspecific targets at the chip surface and in bulk solution. We propose special rules for proper normalization of expression values that take the up-down effect into account. In particular, normalization must level the expression values of invariant, expressed probes. Existing heuristic normalization techniques that do not exclude absent probes, that level intensities instead of expression values, and/or that use low-variance criteria for identifying invariant sets of probes lead to biased results. Strengths and pitfalls of selected normalization methods are discussed.
We also find that the extent of the up-down effect is modified if RNA targets are replaced by DNA targets, in that microarray sensitivity and specificity are improved via a decrease in nonspecific background, which effectively amplifies specific binding. The results emphasize the importance of physicochemical approaches for improving heuristic normalization algorithms to proceed toward quantitative microarray data analysis.
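The normalization rule stated in the abstract (level the expression values of invariant, expressed probes, and leave absent probes out of the scaling) can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify. The function name `level_invariant_expressed` and its inputs are hypothetical, and the sketch assumes that expression estimates, present/absent calls, and the invariant probe set have already been obtained by some other means.

```python
import numpy as np

def level_invariant_expressed(expr, present, invariant):
    """Scale each chip so invariant, expressed probes share a common level.

    expr      : (n_probes, n_chips) positive expression estimates
                (background-corrected values, not raw intensities)
    present   : (n_probes, n_chips) bool, True where a probe is called expressed
    invariant : (n_probes,) bool, True for probes assumed invariant across chips
    """
    expr = np.asarray(expr, dtype=float)
    present = np.asarray(present, dtype=bool)
    invariant = np.asarray(invariant, dtype=bool)

    # Use only probes that are invariant AND expressed on every chip,
    # so absent probes never enter the scaling.
    use = invariant & present.all(axis=1)
    if not use.any():
        raise ValueError("no invariant expressed probes to level")

    # Per-chip level: mean of log expression over the selected probes.
    chip_level = np.log(expr[use]).mean(axis=0)

    # Shift every chip to the average level across chips (log scale).
    shift = chip_level.mean() - chip_level
    return expr * np.exp(shift)


# Hypothetical example: chip 2 reads twice as high for the two expressed probes;
# the third probe is absent on both chips and is excluded from the scaling.
expr = np.array([[10.0, 20.0], [100.0, 200.0], [5.0, 5.0]])
present = np.array([[True, True], [True, True], [False, False]])
invariant = np.array([True, True, True])
scaled = level_invariant_expressed(expr, present, invariant)
```

The scaling is done on the log scale so that a constant chip-wise factor becomes an additive shift. A median could be substituted for the mean if robustness to outliers is a concern.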