Link to Pubmed [PMID] – 21626533
Biom J 2011 Jul;53(4):673-88
Extreme values in predictors often strongly affect the results of statistical analyses in high-dimensional settings. Although they frequently occur with most high-throughput techniques, the problem is often ignored in the literature. We suggest using a very simple transformation, previously proposed in a different context by Royston and Sauerbrei, as an intermediary step between array preprocessing and high-level statistical analysis. This straightforward univariate transformation identifies extreme values in continuous features and can thus be used as a diagnostic tool for outliers. The use of the transformation and its effects are demonstrated for diverse univariate and multivariate statistical analyses using nine publicly available microarray data sets.
https://www.ncbi.nlm.nih.gov/pubmed/21626533