Link to Pubmed [PMID] – 26351271
Link to DOI – 10.1093/bioinformatics/btv529
Bioinformatics 2016 Jan; 32(1): 85-95
Recent large-scale omics initiatives have catalogued the somatic alterations of cancer cell line panels along with their pharmacological response to hundreds of compounds. In this study, we have explored these data to advance computational approaches that enable more effective and targeted use of current and future anticancer therapeutics.

We modelled the 50% growth inhibition bioassay end-point (GI50) of 17,142 compounds screened against 59 cancer cell lines from the NCI60 panel (941,831 data-points, matrix 93.08% complete) by integrating the chemical and biological (cell line) information. We determine that the protein, gene transcript and miRNA abundance provide the highest predictive signal when modelling the GI50 endpoint, which significantly outperformed the DNA copy-number variation or exome sequencing data (Tukey's Honestly Significant Difference, P < 0.05). We demonstrate that, within the limits of the data, our approach exhibits the ability to both interpolate and extrapolate compound bioactivities to new cell lines and tissues and, although to a lesser extent, to dissimilar compounds. Moreover, our approach outperforms previous models generated on the GDSC dataset. Finally, we determine that in the cases investigated in more detail, the predicted drug-pathway associations and growth inhibition patterns are mostly consistent with the experimental data, which also suggests the possibility of identifying genomic markers of drug sensitivity for novel compounds on novel cell lines.

Contact: terez@pasteur.fr; ab454@ac.cam.uk

Supplementary data are available at Bioinformatics online.