Link to Pubmed [PMID] – 21626533
Link to DOI – 10.1002/bimj.201000189
Biom J 2011 Jul; 53(4): 673-88
Extreme values in predictors often strongly affect the results of statistical analyses in high-dimensional settings. Although they frequently occur with most high-throughput techniques, the problem is often ignored in the literature. We suggest using a very simple transformation, previously proposed in a different context by Royston and Sauerbrei, as an intermediate step between array preprocessing and high-level statistical analysis. This straightforward univariate transformation identifies extreme values in continuous features and can thus be used as a diagnostic tool for outliers. The use of the transformation and its effects are demonstrated for diverse univariate and multivariate statistical analyses using nine publicly available microarray data sets.