The GDSC1000 collection comprises well over 1,000 human tumour cell lines. This panel represents the spectrum of common and rare types of adult and childhood cancers of epithelial, mesenchymal and haematopoietic origin. Cell lines have been categorised based on therapeutically relevant tissue descriptions (GDSC descriptions 1 and 2), as well as using the TCGA tumour type descriptions. All cell lines and associated meta-data are identified and linked by a unique COSMIC ID.
Cell lines were sourced from commercial vendors and occasionally academic collaborators. Cells were grown in RPMI or DMEM/F12 medium supplemented with 5% or 10% FBS and penicillin/streptomycin, and maintained at 37°C in a humidified atmosphere at 5% CO2. Cell lines were propagated in these two media in order to minimize the potential effect of varying the media on sensitivity to therapeutic compounds in our assay, and to facilitate high-throughput screening.
To exclude cross-contaminated or synonymous lines, a panel of 92 SNPs was profiled for each cell line (Sequenom, San Diego, CA) and a pair-wise comparison score calculated for in-house identity checking. In addition, we have confirmed the identity of our cancer cell line set against those provided by the repositories, where possible. Each of the cell lines within our core set has been tested using a panel of 16 STRs (AmpFLSTR Identifiler KIT, ABI), which includes the 9 currently used by most of the cell line repositories (ATCC, Riken, JCRB and DSMZ). STR or SNP datasets for each cell line can be accessed through the cancer cell line pages of the COSMIC database (http://cancer.sanger.ac.uk/cell_lines#).
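The pair-wise comparison score is not defined in detail here. As a minimal sketch, assuming the score is simply the fraction of SNPs with identical genotype calls among those called in both lines, identity checking could look like the following. The function names, the 0.9 flagging threshold and the example genotypes are hypothetical and are not taken from the GDSC pipeline.

```python
from itertools import combinations


def identity_score(profile_a, profile_b):
    """Fraction of SNPs with identical calls, over SNPs called in both profiles."""
    shared = [
        snp for snp in profile_a
        if snp in profile_b
        and profile_a[snp] is not None
        and profile_b[snp] is not None
    ]
    if not shared:
        return 0.0
    matches = sum(1 for snp in shared if profile_a[snp] == profile_b[snp])
    return matches / len(shared)


def flag_similar_pairs(profiles, threshold=0.9):
    """Return (line_a, line_b, score) for every pair scoring at or above threshold."""
    flagged = []
    for (name_a, prof_a), (name_b, prof_b) in combinations(sorted(profiles.items()), 2):
        score = identity_score(prof_a, prof_b)
        if score >= threshold:
            flagged.append((name_a, name_b, score))
    return flagged


# Hypothetical genotype calls; None marks a failed call.
profiles = {
    "line_A": {"rs1": "AA", "rs2": "AG", "rs3": "GG", "rs4": "CT"},
    "line_B": {"rs1": "AA", "rs2": "AG", "rs3": "GG", "rs4": "CT"},
    "line_C": {"rs1": "AG", "rs2": "GG", "rs3": "GG", "rs4": None},
}
similar_pairs = flag_similar_pairs(profiles)
```

Pairs returned by `flag_similar_pairs` would be candidates for cross-contaminated or synonymous lines and would still need confirmation, for example by STR profiling.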
A complete list of cell lines and associated data is available from the downloads page.
Cell lines have been comprehensively genetically characterised including:
    1. Whole exome sequencing (Agilent SureSelectXT Human All Exon 50Mb bait set)
    2. Gene expression (Affymetrix Human Genome U219 Array)
    3. Copy number alterations (Affymetrix SNP6.0 Array)
    4. DNA methylation (Illumina Human Methylation 450 Array)
    5. Gene fusions (targeted PCR sequencing or split probe FISH analysis)
    6. Microsatellite Instability (markers BAT25, BAT26, D5S346, D2S123 and D17S250)
Further information is available from the Download page. All datasets are available from the GDSC website, COSMIC database or the appropriate repository (ArrayExpress, GEO, EGA).
Compounds are provided by industry, academic collaborators or sourced from commercial vendors. The range of concentrations selected for each compound is based on in vitro data of concentrations inhibiting relevant kinase activity and cell viability, as well as clinical data indicating peak and trough plasma concentrations in human subjects.
Cells are seeded at an optimised density in medium with 5% or 10% FBS and 1% penicillin/streptomycin. The optimal cell number for each cell line is determined to ensure that it is in growth phase at the end of the assay and to maximise the dynamic range of endpoint measurements. Cells are treated with a dose titration of each compound 24 hours after plating, except for lines screened at MGH, where drugging occurs on the same day as plating. Following drugging, plates are returned to the incubator for assay at a 72-hour time point. Cell viability is determined using either a DNA dye (Syto60) or a metabolic assay (Resazurin or CellTiter-Glo). All screening plates are subject to stringent quality control measures.
The GDSC datasets reflect different experimental setups employed by the project since its inception. GDSC1 is an expansion of the original dataset available from this website and published by Iorio et al. (Cell 2016). GDSC2 has been screened using improved equipment and procedures (see below). Many experiments from GDSC1 have been repeated in GDSC2 and we would recommend, where duplicate IC50s exist, using the result from GDSC2. Raw and fitted data, and ANOVA results are available for GDSC1 and GDSC2 on the downloads page.
GDSC1
The GDSC1 dataset was generated jointly by the Wellcome Sanger Institute and Massachusetts General Hospital between 2009 and 2015 using a matched set of cancer cell lines (the GDSC1000).
Compounds were stored in aliquots at -80°C and were subjected to a maximum of 5 freeze-thaw cycles.
Cells were seeded in 96-well or 384-well plates and compound dose titrations were delivered using tip based liquid handling apparatus. Cell viability was measured using either Syto60 or Resazurin. Drug treatments in this dataset used two formats:
GDSC2
GDSC2 has been generated at the Wellcome Sanger Institute since 2015 following improvements to the screen design and assay.
Compounds are stored in Storage Pods (Roylan Developments) providing a moisture-free, low oxygen environment, and protection from UV damage.
Cells are seeded in 1536-well plates and an Echo555 Acoustic Dispenser (Labcyte) used to deliver compound doses. Promega CellTiter-Glo is used to measure cell viability at the assay endpoint. Drug treatments use a standard dose response format:
Datasets are analysed independently. Raw viability readouts are processed using the R package gdscIC50. Viability data are normalized per plate using available negative and positive controls:
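The exact normalisation formula is not reproduced on this page. A commonly used form, assumed here purely for illustration, rescales each treated well so that the mean negative control (untreated cells) maps to 1 and the mean positive control (no viable cells) maps to 0. The sketch below is in Python rather than R, and `normalise_plate` is a hypothetical helper, not a function of the gdscIC50 package.

```python
from statistics import mean


def normalise_plate(treated, negative_controls, positive_controls):
    """Rescale raw readouts so negative controls map to 1 and positive controls to 0."""
    neg = mean(negative_controls)  # untreated wells: full viability
    pos = mean(positive_controls)  # background wells: no viable cells
    if neg == pos:
        raise ValueError("controls have identical means; plate cannot be normalised")
    return [(value - pos) / (neg - pos) for value in treated]


# Hypothetical raw intensities from one plate.
viability = normalise_plate(
    treated=[1000.0, 550.0, 100.0],
    negative_controls=[1000.0, 1000.0],
    positive_controls=[100.0, 100.0],
)
```

Normalising per plate, rather than across the whole screen, keeps plate-to-plate differences in signal intensity out of the fitted curves.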
Dose-response curves are fitted using the non-linear mixed effects model of Vis et al., incorporated in the gdscIC50 package. All available replicates for an experimental combination (cell line + compound) are used to fit each curve and obtain IC50 and AUC estimates (previous editions of GDSC data have fitted a single dose response). Biomarker discovery uses the GDSCTools python package of Cokelaer et al. to run ANOVA for each dataset independently.
Curve fitting
Fluorescence intensity data from screening plates for each dose-response curve are fitted using a multilevel mixed effect model (PubMed ID: 27180993). Viability across the concentration dilution series is assumed to follow a sigmoid, the classical S-shaped dose-response curve. This function is fitted to all of the cell line-compound combinations screened. In the multilevel mixed effect model used here, two parameters describe the sigmoidal curve: a shape parameter and a position parameter. Instead of fitting each dose-response series in isolation, the complete set of cell line-compound combinations screened is fitted simultaneously. The shape parameter varies only across cell lines, while the position parameter varies across cell lines and compounds. This is a faithful and efficient representation of the data and, most importantly, it borrows strength from all observations, which in turn allows for more accurate IC50 estimates.
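To make the two-parameter curve concrete, here is a minimal sketch of a sigmoid with one position and one shape parameter. It assumes viability runs from 1 to 0 on a log-concentration axis, so that the position parameter coincides with the log IC50. The parameterisation, the function names and the example values are illustrative assumptions; the sketch evaluates a curve and does not perform the mixed effect fit itself.

```python
import math


def viability(log_conc, position, shape):
    """Two-parameter sigmoid: equals 0.5 at log_conc == position; shape > 0 sets steepness."""
    return 1.0 / (1.0 + math.exp((log_conc - position) / shape))


def mean_viability(position, shape, log_min, log_max, steps=200):
    """Trapezoidal area under the curve over the tested range, divided by the range width."""
    width = (log_max - log_min) / steps
    xs = [log_min + i * width for i in range(steps + 1)]
    ys = [viability(x, position, shape) for x in xs]
    area = sum((ys[i] + ys[i + 1]) / 2 * width for i in range(steps))
    return area / (log_max - log_min)


# Hypothetical parameters: one shape per cell line, one position per
# (cell line, compound) pair, mirroring the structure described above.
shape_by_line = {"line_A": 1.0}
position_by_pair = {("line_A", "drug_X"): 0.0, ("line_A", "drug_Y"): 1.5}

log_ic50_x = position_by_pair[("line_A", "drug_X")]
auc_x = mean_viability(log_ic50_x, shape_by_line["line_A"], log_min=-2.0, log_max=2.0)
```

Because both compounds tested on `line_A` share one shape value, data from every compound inform that parameter, which is the "borrowing strength" idea in its simplest form.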
To identify genomic features associated with drug response an analysis of variance (ANOVA) is used to correlate drug response (IC50 values) with genomic alterations in cancer cells including point mutations, recurrently copy number altered chromosomal segments and selected cancer gene re-arrangements (see below for details).
A pan-cancer analysis was performed using all cell lines for which drug response data were available, as well as cancer-specific analyses for each cancer type where sufficient data were available.
Below are some guidelines that may be useful when interpreting the data from these analyses:
To guide our statistical analyses we have built a comprehensive map of the oncogenic aberrations in >11,000 human tumors using publicly available data from TCGA, ICGC and other studies. This map includes: 1) genes whose mutation patterns in whole exome sequencing (WES) data are consistent with positive selection; and 2) focal recurrently aberrant copy number segments from SNP6 array profiles (RACSs). We identified cancer functional events by combining data across all tumors (pan-cancer) as well as for each cancer type (cancer-specific).
Driver mutations in cancer genes were detected by combining the outputs of three algorithms: MutSigCV, OncodriveFM and OncodriveCLUST (Gonzalez-Perez and Lopez-Bigas, 2012; Lawrence et al., 2013; Rubio-Perez et al., 2015; Tamborero et al., 2013a). Furthermore, we mined the COSMIC database to identify recurrent, and therefore likely oncogenic, gain- or loss-of-function variants within these cancer genes. The detection of RACSs was performed using the ADMIRE algorithm (Chapman et al., 2011; Mok et al., 2009; Shaw et al., 2013; van Dyk et al., 2013). RACSs were filtered to require each segment to include at least one, but no more than 100, protein coding or antisense genes.
The set of clinically relevant features identified from patient tumours was utilised for subsequent downstream ANOVA analysis to identify cancer features associated with drug response in cancer cell lines.
We perform an analysis of variance (ANOVA) to associate drug sensitivity with individual genomic features. A drug-response vector consisting of n IC50 values from treatment of n cell lines was constructed for each drug. The model was linear (no interaction terms), with the dependent variable represented by the described vector and factors including tissue type (for the pan-cancer analysis only), microsatellite instability status (for the cancer types with positive samples for this feature) and the status of a cancer feature. For the pan-cancer analysis, the union of all the cancer-specific features was used. Only cancer features occurring in at least 3 cell lines were considered, and features with an identical pattern of positive occurrence were merged together, resulting in a final set of 667 (individual or combined) features across 988 cell lines (screened against at least 1 drug). In order to include as many cell lines as possible in the pan-cancer analysis (even those not matching a TCGA type), values of the tissue factor were determined from the GDSCdescription_1 label, whereas for the cancer-specific analyses only cell lines with a matching TCGA label were used. The tissue factors corresponding to ‘digestive_system’ and ‘urogenital_system’ were further sub-classified using the more specific GDSCdescription_2 label. For all the tested gene-drug associations, effect sizes relative to the pooled standard deviation (quantified through Cohen’s d), effect sizes relative to the individual standard deviations (quantified through two different Glass deltas, for the feature-positive and the feature-negative populations respectively), p-values and all the other statistical scores were obtained from the fitted models. A Benjamini–Hochberg multiple testing correction was finally applied to the resulting p-values (correcting together all those obtained in the pan-cancer analysis and, on a cancer type basis, those obtained in a given cancer-specific analysis).
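The effect-size scores named above have standard textbook definitions, sketched below for a single feature and drug. This illustrates only the effect sizes, not the ANOVA model with its tissue and microsatellite instability factors, and it is not the GDSCTools implementation. The sign convention (feature-positive minus feature-negative) and the example IC50 values are assumptions.

```python
import math
from statistics import mean, stdev


def cohens_d(positive, negative):
    """Difference in means divided by the pooled sample standard deviation."""
    n_pos, n_neg = len(positive), len(negative)
    pooled_var = (
        (n_pos - 1) * stdev(positive) ** 2 + (n_neg - 1) * stdev(negative) ** 2
    ) / (n_pos + n_neg - 2)
    return (mean(positive) - mean(negative)) / math.sqrt(pooled_var)


def glass_deltas(positive, negative):
    """Difference in means divided by each group's own standard deviation."""
    diff = mean(positive) - mean(negative)
    return diff / stdev(positive), diff / stdev(negative)


# Hypothetical log IC50 values for feature-positive and feature-negative lines.
ic50_positive = [1.0, 2.0, 3.0]
ic50_negative = [3.0, 4.0, 5.0, 6.0, 7.0]

d = cohens_d(ic50_positive, ic50_negative)
delta_pos, delta_neg = glass_deltas(ic50_positive, ic50_negative)
```

Reporting both Glass deltas alongside Cohen's d is useful when the two populations have very different spreads, since the pooled standard deviation can then mask an effect that is large relative to one group.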
A p-value threshold of 10^-3 and a false discovery rate threshold equal to 25% were finally used to call significant associations across all the performed analyses.
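As a minimal sketch of how these two thresholds combine, the following applies the standard Benjamini-Hochberg adjustment to a set of p-values and then calls an association significant only if it passes both cut-offs. Strict inequalities and expressing the 25% FDR as the fraction 0.25 are assumptions, and the p-values are invented for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def call_significant(p_values, p_threshold=1e-3, fdr_threshold=0.25):
    """Flag associations passing both the p-value and the FDR threshold."""
    q_values = benjamini_hochberg(p_values)
    return [p < p_threshold and q < fdr_threshold for p, q in zip(p_values, q_values)]


# Hypothetical p-values from one analysis, corrected together.
p_values = [0.0001, 0.0005, 0.02, 0.5]
q_values = benjamini_hochberg(p_values)
calls = call_significant(p_values)
```

Following the description above, the correction would be applied once over all pan-cancer p-values and separately within each cancer-specific analysis.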
Concerns around the identity of cancer cell lines used in scientific research have been increasing over several years and were the topic of a recent editorial in Nature (pubmed 19225471: -, 2009). Cross-contamination has even been shown to be present in such widely used and supposedly well characterised groups of cell lines as the NCI60 set. For instance, the NCI60 cell lines OVCAR-8 and NCI-ADR-RES have been shown to be over 97% identical using the Affymetrix SNP6.0 array in this laboratory. This result has been confirmed by multiple laboratories around the world and was recently reported in the scientific literature (pubmed 16504380: Liscovitch and Ravid, 2007). Two other such pairings are also present within the NCI60 series of lines: both M14/MDA-MB-435 (pubmed 17004106: Rae et al, 2007) and U251MG/SNB-19 have identities over 94% when compared using the SNP6 array.
Many of the cell line repositories are now providing short tandem repeat (STR) profiles of the lines they hold allowing identity of lines within the scientific community to be confirmed by a simple assay. We are currently confirming the identity of our cancer cell line set against those provided by the repositories, where possible. Each of the cell lines within our core set is being tested using a panel of 16 STRs (AmpFLSTR Identifiler KIT, ABI), which includes the 9 currently used by most of the cell line repositories (ATCC, Riken, JCRB and DSMZ). We are also providing a single nucleotide polymorphism (SNP) profile based on a panel of 63 SNPs assayed using the Sequenom Genetic Analyser which we use for in-house identity checking whenever a cell line is propagated.
The provision of STR profiles by the cell line repositories and of our in-house cell lines is ongoing and will be updated when appropriate.
Prior to accessing the STR or SNP datasets a Data Access Agreement must be completed: http://www.sanger.ac.uk/genetics/CGP/Archive
The username and password provided can be used to download the STR and SNP profiles for each cell line at the CGP Data Archive: http://www.sanger.ac.uk/research/projects/cancergenome/archive/#t_cl
Nature 2009;457;7232;935-6
PUBMED: 19225471; DOI: 10.1038/457935b
Division of Hematology/Oncology, Department of Internal Medicine, University of Michigan Medical Center, 1150 West Medical Center Drive, Med Sci I, Room 5323, Ann Arbor, MI 48109-0612, USA. jimmyrae@umich.edu
Background: The tissue of origin of the cell line MDA-MB-435 has been a matter of debate since analysis of DNA microarray data led Ross et al. (2000, Nat Genet 24(3):227-235) to suggest they might be of melanocyte origin due to their similarity to melanoma cell lines. We have previously shown that MDA-MB-435 cells maintained in multiple laboratories are of common origin to those used by Ross et al. and concluded that MDA-MB-435 cells are not a representative model for breast cancer. We could not determine, however, whether the melanoma-like properties of the MDA-MB-435 cell line are the result of misclassification or due to transdifferentiation to a melanoma-like phenotype.
Methods: We used karyotype, comparative genomic hybridization (CGH), and microsatellite polymorphism analyses, combined with bioinformatics analysis of gene expression and single nucleotide polymorphism (SNP) data, to test the hypothesis that the MDA-MB-435 cell line is derived from the melanoma cell line M14.
Results: We show that the MDA-MB-435 and M14 cell lines are essentially identical with respect to cytogenetic characteristics as well as gene expression patterns and that the minor differences found can be explained by phenotypic and genotypic clonal drift.
Conclusions: All currently available stocks of MDA-MB-435 cells are derived from the M14 melanoma cell line and can no longer be considered a model of breast cancer. These cells are still a valuable system for the study of cancer metastasis and the extensive literature using these cells since 1982 represent a valuable new resource for the melanoma research community.
Funded by: NIGMS NIH HHS: U-O1 GM61373
Breast cancer research and treatment 2007;104;1;13-9
PUBMED: 17004106; DOI: 10.1007/s10549-006-9392-8
Department of Biological Regulation, Weizmann Institute of Science, P.O.B. 26 Rehovot 76100, Israel. moti.liscovitch@weizmann.ac.il
Multidrug-resistant MCF-7 breast adenocarcinoma cells (originally named MCF-7/AdrR cells and later re-designated NCI/ADR-RES) have served as an important and widely used research tool during the last two decades. However, the real identity of these cells has been in doubt since 1998 and has since been debated. The origin of NCI/ADR-RES cells has now been revealed by SNP and karyotypic analyses, carried out at the Sanger Institute and the NCI, respectively. The results of these analyses, recently posted on the Web, show that NCI/ADR-RES cells are derived from OVCAR-8 ovarian adenocarcinoma cells. The case of NCI/ADR-RES cells highlights a wide-spread problem of cell line cross-contamination and misidentification. Fortunately, this is a tractable problem that can be avoided by scrupulous genotyping of cell stocks and adoption of a few simple rules in cell culture practice.
Cancer letters 2007;245;1-2;350-2
PUBMED: 16504380; DOI: 10.1016/j.canlet.2006.01.013
A volcano plot is used to visualise the correlation of drug sensitivity data with genetic events calculated using an ANOVA.
We use two different types of volcano plots to represent our data. Gene specific volcano plots represent the effect of a mutated gene (e.g. BRAF) on the responses to all drugs analysed. A drug-specific volcano plot represents how genomic changes influence response to a specific drug (e.g. BRAF inhibitor PLX4720).
In each volcano plot three pieces of data are represented:
European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge CB10 1SA, UK; Wellcome Trust Sanger Institute, Wellcome Genome Campus, Cambridge CB10 1SA, UK.
Systematic studies of cancer genomes have provided unprecedented insights into the molecular nature of cancer. Using this information to guide the development and application of therapies in the clinic is challenging. Here, we report how cancer-driven alterations identified in 11,289 tumors from 29 tissues (integrating somatic mutations, copy number alterations, DNA methylation, and gene expression) can be mapped onto 1,001 molecularly annotated human cancer cell lines and correlated with sensitivity to 265 drugs. We find that cell lines faithfully recapitulate oncogenic alterations identified in tumors, find that many of these associate with drug sensitivity/resistance, and highlight the importance of tissue lineage in mediating drug response. Logic-based modeling uncovers combinations of alterations that sensitize to drugs, while machine learning demonstrates the relative importance of different data types in predicting drug response. Our analysis and datasets are rich resources to link genotypes with cellular phenotypes and to identify therapeutic options for selected cancer sub-populations.
Funded by: Cancer Research UK; European Research Council: 268626; Marie Curie; NCI NIH HHS: U24 CA143835; Wellcome Trust: 086375, 102696
Cell 2016;166;3;740-754
PUBMED: 27397505; PMC: 4967469; DOI: 10.1016/j.cell.2016.06.017
Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton CB10 1SA, UK.
Alterations in cancer genomes strongly influence clinical responses to treatment and in many instances are potent biomarkers for response to drugs. The Genomics of Drug Sensitivity in Cancer (GDSC) database (www.cancerRxgene.org) is the largest public resource for information on drug sensitivity in cancer cells and molecular markers of drug response. Data are freely available without restriction. GDSC currently contains drug sensitivity data for almost 75 000 experiments, describing response to 138 anticancer drugs across almost 700 cancer cell lines. To identify molecular markers of drug response, cell line drug sensitivity data are integrated with large genomic datasets obtained from the Catalogue of Somatic Mutations in Cancer database, including information on somatic mutations in cancer genes, gene amplification and deletion, tissue type and transcriptional data. Analysis of GDSC data is through a web portal focused on identifying molecular biomarkers of drug sensitivity based on queries of specific anticancer drugs or cancer genes. Graphical representations of the data are used throughout with links to related resources and all datasets are fully downloadable. GDSC provides a unique resource incorporating large drug sensitivity and genomic datasets to facilitate the discovery of new therapeutic biomarkers for cancer therapies.
Funded by: Cancer Research UK; NIGMS NIH HHS: T32 GM071340; Wellcome Trust: 086357
Nucleic acids research 2013;41;Database issue;D955-61
PUBMED: 23180760; PMC: 3531057; DOI: 10.1093/nar/gks1111
Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton CB10 1SA, UK.
Clinical responses to anticancer therapies are often restricted to a subset of patients. In some cases, mutated cancer genes are potent biomarkers for responses to targeted agents. Here, to uncover new biomarkers of sensitivity and resistance to cancer therapeutics, we screened a panel of several hundred cancer cell lines--which represent much of the tissue-type and genetic diversity of human cancers--with 130 drugs under clinical and preclinical investigation. In aggregate, we found that mutated cancer genes were associated with cellular response to most currently available cancer drugs. Classic oncogene addiction paradigms were modified by additional tissue-specific or expression biomarkers, and some frequently mutated genes were associated with sensitivity to a broad range of therapeutic agents. Unexpected relationships were revealed, including the marked sensitivity of Ewing's sarcoma cells harbouring the EWS (also known as EWSR1)-FLI1 gene translocation to poly(ADP-ribose) polymerase (PARP) inhibitors. By linking drug activity to the functional complexity of cancer genomes, systematic pharmacogenomic profiling in cancer cell lines provides a powerful biomarker discovery platform to guide rational cancer therapeutic strategies.
Funded by: Howard Hughes Medical Institute; NHGRI NIH HHS: 1U54HG006097-01; NIGMS NIH HHS: P41GM079575-02; Wellcome Trust: 086357
Nature 2012;483;7391;570-5
PUBMED: 22460902; PMC: 3349233; DOI: 10.1038/nature11005
The Broad Institute of Harvard and MIT, Cambridge, Massachusetts 02142, USA.
The systematic translation of cancer genomic data into knowledge of tumour biology and therapeutic possibilities remains challenging. Such efforts should be greatly aided by robust preclinical model systems that reflect the genomic diversity of human cancers and for which detailed genetic and pharmacological annotation is available. Here we describe the Cancer Cell Line Encyclopedia (CCLE): a compilation of gene expression, chromosomal copy number and massively parallel sequencing data from 947 human cancer cell lines. When coupled with pharmacological profiles for 24 anticancer drugs across 479 of the cell lines, this collection allowed identification of genetic, lineage, and gene-expression-based predictors of drug sensitivity. In addition to known predictors, we found that plasma cell lineage correlated with sensitivity to IGF1 receptor inhibitors; AHR expression was associated with MEK inhibitor efficacy in NRAS-mutant lines; and SLFN11 expression predicted sensitivity to topoisomerase inhibitors. Together, our results indicate that large, annotated cell-line collections may help to enable preclinical stratification schemata for anticancer agents. The generation of genetic predictions of drug response in the preclinical setting and their incorporation into cancer clinical trial design could speed the emergence of 'personalized' therapeutic regimens.
Funded by: NCI NIH HHS: R33 CA126674, R33 CA126674-04, R33 CA155554, R33 CA155554-02; NIH HHS: DP2 OD002750, DP2 OD002750-01
Nature 2012;483;7391;603-7
PUBMED: 22460905; PMC: 3320027; DOI: 10.1038/nature11003
Department of Bioinformatics and Computational Biology, Genentech Inc., 1 DNA Way, South San Francisco, California 94080, USA.
The use of large-scale genomic and drug response screening of cancer cell lines depends crucially on the reproducibility of results. Here we consider two previously published screens, plus a later critique of these studies. Using independent data, we show that consistency is achievable, and provide a systematic description of the best laboratory and analysis practices for future studies.
Nature 2016;533;7603;333-7
PUBMED: 27193678; DOI: 10.1038/nature17987
Large cancer cell line collections broadly capture the genomic diversity of human cancers and provide valuable insight into anti-cancer drug response. Here we show substantial agreement and biological consilience between drug sensitivity measurements and their associated genomic predictors from two publicly available large-scale pharmacogenomics resources: The Cancer Cell Line Encyclopedia and the Genomics of Drug Sensitivity in Cancer databases.
Funded by: Cancer Research UK: A16629; NHGRI NIH HHS: 1U54HG006097-01; Wellcome Trust: 086357, 102696
Nature 2015;528;7580;84-7
PUBMED: 26570998; DOI: 10.1038/nature15736
Center for the Science of Therapeutics, Broad Institute, Cambridge, Massachusetts.
Unlabelled: Identifying genetic alterations that prime a cancer cell to respond to a particular therapeutic agent can facilitate the development of precision cancer medicines. Cancer cell-line (CCL) profiling of small-molecule sensitivity has emerged as an unbiased method to assess the relationships between genetic or cellular features of CCLs and small-molecule response. Here, we developed annotated cluster multidimensional enrichment analysis to explore the associations between groups of small molecules and groups of CCLs in a new, quantitative sensitivity dataset. This analysis reveals insights into small-molecule mechanisms of action, and genomic features that associate with CCL response to small-molecule treatment. We are able to recapitulate known relationships between FDA-approved therapies and cancer dependencies and to uncover new relationships, including for KRAS-mutant cancers and neuroblastoma. To enable the cancer community to explore these data, and to generate novel hypotheses, we created an updated version of the Cancer Therapeutic Response Portal (CTRP v2).
Significance: We present the largest CCL sensitivity dataset yet available, and an analysis method integrating information from multiple CCLs and multiple small molecules to identify CCL response predictors robustly. We updated the CTRP to enable the cancer research community to leverage these data and analyses.
Funded by: Howard Hughes Medical Institute; NCI NIH HHS: RC2 CA148399, U01 CA176152, U01CA176152
Cancer discovery 2015;5;11;1210-23
PUBMED: 26482930; PMC: 4631646; DOI: 10.1158/2159-8290.CD-15-0235
The Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA.
The high rate of clinical response to protein-kinase-targeting drugs matched to cancer patients with specific genomic alterations has prompted efforts to use cancer cell line (CCL) profiling to identify additional biomarkers of small-molecule sensitivities. We have quantitatively measured the sensitivity of 242 genomically characterized CCLs to an Informer Set of 354 small molecules that target many nodes in cell circuitry, uncovering protein dependencies that: (1) associate with specific cancer-genomic alterations and (2) can be targeted by small molecules. We have created the Cancer Therapeutics Response Portal (http://www.broadinstitute.org/ctrp) to enable users to correlate genetic features to sensitivity in individual lineages and control for confounding factors of CCL profiling. We report a candidate dependency, associating activating mutations in the oncogene β-catenin with sensitivity to the Bcl-2 family antagonist, navitoclax. The resource can be used to develop novel therapeutic hypotheses and to accelerate discovery of drugs matched to patients by their cancer genotype and lineage.
Funded by: NCI NIH HHS: K08 CA148887, R01 CA097061, R01 CA161061, RC2 CA148399, RC2-CA148399, U54 CA112962
Cell 2013;154;5;1151-61
PUBMED: 23993102; PMC: 3954635; DOI: 10.1016/j.cell.2013.08.003
Center for Molecular Therapeutics, Massachusetts General Hospital Cancer Center and Harvard Medical School, 149 13th Street, Charlestown, MA 02129, USA.
Efforts to discover new cancer drugs and predict their clinical activity are limited by the fact that laboratory models to test drug efficacy do not faithfully recapitulate this complex disease. One important model system for evaluating candidate anticancer agents is human tumour-derived cell lines. Although cultured cancer cells can exhibit distinct properties compared with their naturally growing counterparts, recent technologies that facilitate the parallel analysis of large panels of such lines, together with genomic technologies that define their genetic constitution, have revitalized efforts to use cancer cell lines to assess the clinical utility of new investigational cancer drugs and to discover predictive biomarkers.
Nature reviews. Cancer 2010;10;4;241-53
PUBMED: 20300105; DOI: 10.1038/nrc2820
Center for Molecular Therapeutics, Massachusetts General Hospital Cancer Center and Harvard Medical School, Charlestown, Massachusetts, USA.
Human cancer cell lines that can be propagated and manipulated in culture have proven to be excellent models for studying many aspects of gene function in cancer. In addition, they can provide a powerful system for assessing the molecular determinants of sensitivity to anticancer drugs. They have also been used in recent studies to identify genomic alterations and gene expression patterns that provide important insights into the genetic features that distinguish the properties of tumor cells associated with similar histologies. We have established a large repository of human tumor cell lines (>1000) corresponding to a wide variety of tumor types, and we have developed a methodology for profiling the collection for sensitivity to putative anticancer compounds. The rationale for examining tumor cell lines on this relatively large scale reflects accumulating evidence indicating that there is substantial genetic heterogeneity among human tumor cells-even those derived from tumors of similar histologies. Thus, to develop an accurate picture of the molecular determinants of tumorigenesis and response to therapy, it is essential to study the nature of such heterogeneity in a relatively large sample set. Here, we describe the methodologies used to conduct such screens and we describe a "proof-of-concept" screen using the EGFR kinase inhibitor, erlotinib (Tarceva), with a panel of lung cancer lines to demonstrate a correlation between EGFR mutations and drug sensitivity.
Methods in enzymology 2008;438;331-41
PUBMED: 18413259; DOI: 10.1016/S0076-6879(07)38023-3
Center for Molecular Therapeutics, Massachusetts General Hospital Cancer Center and Harvard Medical School, 149 13th Street, Charlestown, MA 02129, USA.
Kinase inhibitors constitute an important new class of cancer drugs, whose selective efficacy is largely determined by underlying tumor cell genetics. We established a high-throughput platform to profile 500 cell lines derived from diverse epithelial cancers for sensitivity to 14 kinase inhibitors. Most inhibitors were ineffective against unselected cell lines but exhibited dramatic cell killing of small nonoverlapping subsets. Cells with exquisite sensitivity to EGFR, HER2, MET, or BRAF kinase inhibitors were marked by activating mutations or amplification of the drug target. Although most cell lines recapitulated known tumor-associated genotypes, the screen revealed low-frequency drug-sensitizing genotypes in tumor types not previously associated with drug susceptibility. Furthermore, comparing drugs thought to target the same kinase revealed striking differences, predictive of clinical efficacy. Genetically defined cancer subsets, irrespective of tissue type, predict response to kinase inhibitors, and provide an important preclinical model to guide early clinical applications of novel targeted inhibitors.
Funded by: NCI NIH HHS: R01 CA115830
Proceedings of the National Academy of Sciences of the United States of America 2007;104;50;19936-41
PUBMED: 18077425; PMC: 2148401; DOI: 10.1073/pnas.0707498104
Title | Journal | PubMed ID |
---|---|---|
A novel heterogeneous network-based method for drug response prediction in cancer cell lines. | Sci Rep | PMID: 29463808 |
PharmacoDB: an integrative database for mining in vitro anticancer drug screening studies | Nucleic Acids Res. | PMCID: PMC5753377 |
EWS/FLI Confers Tumor Cell Synthetic Lethality to CDK12 Inhibition in Ewing Sarcoma | Cancer Cell | PMID: 29358035 |
Unearthing new genomic markers of drug response by improved measurement of discriminative power. | BMC Med Genomics | PMID: 29409485 |
DeSigN: connecting gene expression with therapeutics for drug repurposing and development | BMC Genomics | PMID: 28198666 |
Discordancy Partitioning for Validating Potentially Inconsistent Pharmacogenomic Studies | Sci Rep | PMID: 29123200 |
A tool for discovering drug sensitivity and gene expression associations in cancer cells | PLOS one | PMCID: PMC5409143 |
Systematic assessment of multi-gene predictors of pan-cancer cell line sensitivity to drugs exploiting gene expression data | F1000Research | PMID: 28299173 |
Intra- and interspecies gene expression models for predicting drug response in canine osteosarcoma | BMC Bioinformatics | PMCID: PMC4759767 |
Reproducible pharmacogenomic profiling of cancer cell line panels | Nature | PMID: 27193678 |
Pharmacogenomic agreement between two cancer cell line data sets | Nature | PMID: 26570998 |
CATTLE (CAncer treatment treasury with linked evidence): An integrated knowledge base for personalized oncology research and practice | CPT Pharmacometrics Syst Pharmacol | PMID: 28296354 |
Precision and recall oncology: combining multiple gene mutations for improved identification of drug-sensitive tumours | Oncotarget | PMID: 29228590 |
Suppression of 19S proteasome subunits marks emergence of an altered cell state in diverse cancers | Proc Natl Acad Sci | PMID: 28028240 |
Pharmacoproteomic characterisation of human colon and rectal cancer | Mol Syst Biol. | PMID: 29101300 |
Pharmaco-genomic investigations of organo-iridium anticancer complexes reveal novel mechanism of action | Metallomics | PMID: 29131211 |
Colorectal Cancer Cell Line Proteomes Are Representative of Primary Tumors and Predict Drug Sensitivity | Gastroenterology | PMID: 28625833 |
Integrated genomic analysis of recurrence-associated small non-coding RNAs in oesophageal cancer | Gut | PMID: 27507904 |
Drug Sensitivity Assays of Human Cancer Organoid Cultures | Methods Mol Biol | PMID: 27628132 |
Acquired savolitinib resistance in non-small cell lung cancer arises via multiple mechanisms that converge on MET-independent mTOR and MYC activation | Oncotarget | PMID: 27472392 |
Cancer biomarker discovery is improved by accounting for variability in general levels of drug sensitivity in pre-clinical models | Genome Biol. | PMID: 27654937 |
Identifying anti-cancer drug response related genes using an integrative analysis of transcriptomic and genomic variations with cell line-based drug perturbations. | Oncotarget | PMCID: PMC4891048 |
Integrating Domain Specific Knowledge and Network Analysis to Predict Drug Sensitivity of Cancer Cell Lines. | PloS One | PMID: 27607242 |
Oncogenic KRAS triggers MAPK-dependent errors in mitosis and MYC-dependent sensitivity to anti-mitotic agents | Scientific Reports | PMID: 27412232 |
Logic models to predict continuous outputs based on binary inputs with an application to personalized cancer therapy | Scientific Reports | PMID: 27876821 |
Consistency in large pharmacogenomic studies | Nature | PMID: 27905415 |
Drug response consistency in CCLE and CGP | Nature | PMID: 27905419 |
Consistency in drug response profiling | Nature | PMID: 27905421 |
Cancer biomarker discovery is improved by accounting for variability in general levels of drug sensitivity in pre-clinical models | Genome Biology | PMCID: PMC5031330 |
PharmacoGx: an R package for analysis of large pharmacogenomic datasets | Bioinformatics | PMID: 26656004 |
A Vulnerability of a Subset of Colon Cancers with Potential Clinical Utility | Cell | PMID: 27058664 |
HER2+ Cancer Cell Dependence on PI3K vs. MAPK Signaling Axes is determined by Expression of EGFR, ERBB3 and CDKN1B | PLoS Comput Biol | PMID: 27035903 |
Integrating heterogeneous drug sensitivity data from cancer pharmacogenomic studies. | Oncotarget | PMID: 27322211 |
The tandem duplicator phenotype as a distinct genomic configuration in cancer | PNAS PLUS | PMID: 27071093 |
Assessment of pharmacogenomic agreement | F1000Research | PMID: 27408686 |
Multilevel models improve precision and speed of IC50 estimates | Pharmacogenomics | PMID: 27180993 |
Identification of differential PI3K pathway target dependencies in T-cell acute lymphoblastic leukemia through a large cancer cell panel screen | Oncotarget | PMID: 26989080 |
Exploitation of the Apoptosis-Primed State of MYCN-Amplified Neuroblastoma to Develop a Potent and Specific Targeted Therapy Combination | Cancer Cell | PMID: 26859456 |
Integration of genomic, transcriptomic and proteomic data identifies two biologically distinct subtypes of invasive lobular breast cancer | Sci Rep | PMID: 26729235 |
Prediction of cancer cell sensitivity to natural products based on genomic and chemical properties | PeerJ | PMID: 26644976 |
From drug response to target addiction scoring in cancer cell models | Disease Models & Mechanisms | PMID: 26438695 |
Predicting Anticancer Drug Responses Using a Dual-Layer Integrated Cell Line-Drug Network Model | PLOS one | PMCID: PMC4587957 |
Optimal Drug Prediction From Personal Genomics profiles | IEEE Journal of Biomedical and Health Informatics | PMID: 25781964 |
High selectivity of PI3Kβ inhibitors in SETD2-mutated renal clear cell carcinoma | J BUON | PMID: 26537074 |
Compromising the 19S proteasome complex protects cells from reduced flux through the proteasome | eLife | PMCID: PMC4551903 |
Integrated Analysis of Transcriptome in Cancer Patient-Derived Xenografts | PLOS one | PMID: 25951608 |
PI3Kb Inhibitor TGX221 Selectively Inhibits Renal Cell Carcinoma Cells with Both VHL and SETD2 mutations and Links Multiple Pathways | Scientific Reports | PMID: 25853938 |
Characterization of the Tyrosine Kinase-Regulated Proteome in Breast Cancer by Combined use of RNA interference (RNAi) and Stable Isotope Labeling with Amino Acids in Cell Culture (SILAC) Quantitative Proteomics | Molecular & Cellular Proteomics | PMID: 26089344 |
Loss of MLH1 confers resistance to PI3Kβ inhibitors in renal clear cell carcinoma with SETD2 mutation | Tumour Biol | PMID: 25528216 |
Denoising perturbation signatures reveals an actionable AKT-signaling gene module underlying a poor clinical outcome in endocrine treated ER+ breast cancer | Genome Biology | PMID: 25886003 |
Cell Index Database (CELLX): a web tool for cancer precision medicine | Pac Symp Biocomput. | PMID: 25592564 |
A comprehensive transcriptional portrait of human cancer cell lines | Nat Biotechnol. | PMID: 25485619 |
Predicting Response to Histone Deacetylase Inhibitors Using High-Throughput Genomics | J Natl Cancer Inst | PMCID: PMC4643634 |
Using drug response data to identify molecular effectors, and molecular "omic" data to identify candidate drugs in cancer. | Hum Genet. | PMID: 25213708 |
Assessment of ABT-263 activity across a cancer cell line collection leads to a potent combination therapy for small-cell lung cancer | Proc Natl Acad Sci | PMID: 25737542 |
CDK4/6 inhibitor suppresses gastric cancer with CDKN2A mutation | Int J Clin Exp Med. | PMID: 26380006 |
Improved large-scale prediction of growth inhibition patterns using the NCI60 cancer cell line panel | Bioinformatics. | PMID: 26351271 |
Co-active receptor tyrosine kinases mitigate the effect of FGFR inhibitors in FGFR1-amplified lung cancers with low FGFR1 protein expression | Oncogene. | PMID: 26549034 |
Recursive Random Lasso (RRLasso) for Identifying Anti-Cancer Drug Targets | PLoS One. | PMID: 26544691 |
Data Mining Approaches for Genomic Biomarker Development: Applications Using Drug Screening Data from the Cancer Genome Project and the Cancer Cell Line Encyclopedia | PLoS One | PMID: 26132924 |
LIM kinase inhibitors disrupt mitotic microtubule organization and impair tumor cell proliferation | Oncotarget | PMID: 26540348 |
A Semi-Supervised Approach for Refining Transcriptional Signatures of Drug Response and Repositioning Predictions | PLoS ONE | PMID: 26452147 |
Designing of promiscuous inhibitors against pancreatic cancer cell lines | Scientific Reports | PMID: 24728108 |
Modeling RAS phenotype in colorectal cancer uncovers novel molecular traits of RAS dependency and improves prediction of response to targeted agents in patients | Clin Cancer Res. | PMID: 24170544 |
Clinical drug response can be predicted using baseline gene expression levels and in vitro drug sensitivity in cell lines. | Genome Biol. | PMID: 24580837 |
The REST gene signature predicts drug sensitivity in neuroblastoma cell lines and is significantly associated with neuroblastoma tumor stage | Int J Mol Sci. | PMCID: PMC4139778 |
Disruption of CRAF-Mediated MEK Activation Is Required for Effective MEK Inhibition in KRAS Mutant Tumors | Cancer Cell | PMID: 24746704 |
A community effort to assess and improve drug sensitivity prediction algorithms | Nature Biotechnology | PMID: 24880487 |
The evolving role of cancer cell line-based screens to define the impact of cancer genomes on drug response | Curr Opin Genet Dev. | PMID: 24607840 |
Inconsistency in large pharmacogenomic studies | Nature | PMID: 24284626 |
Targeting MYCN in neuroblastoma by BET bromodomain inhibition | Cancer Discov. | PMID: 26631615 |
Genomics of Drug Sensitivity in Cancer (GDSC): a resource for therapeutic biomarker discovery in cancer cells | Nucleic Acids Res. | PMID: 23180760 |
Machine learning prediction of cancer cell sensitivity to drugs based on genomic and chemical properties | PLoS One. | PMID: 23646105 |
VS-5584, a novel and highly selective PI3K/mTOR kinase inhibitor for the treatment of cancer | Mol Cancer Ther. | PMID: 23270925 |
Mcl-1 and FBW7 control a dominant survival pathway underlying HDAC and Bcl-2 inhibitor synergy in squamous cell carcinoma | Cancer Discov. | PMID: 23274910 |
Systematic identification of genomic markers of drug sensitivity in cancer cells | Nature | PMID: 22460902 |
MED12 controls the response to multiple cancer drugs through regulation of TGF-β receptor signaling | Cell | PMID: 23178117 |
Subtypes of primary colorectal tumors correlate with response to targeted treatment in colorectal cell lines | BMC Med Genomics | PMID: 23272949 |
Integrative and Personalized QSAR Analysis in Cancer by Kernelized Bayesian Matrix Factorization | | PMID: 25046554 |
Systematic Assessment of analytical methods for drug sensitivity predictions from cancer cell line data | Sage Bionetworks | PMID: 24297534 |
For questions regarding data, analyses and results please contact us by email at: cancerrxgene@sanger.ac.uk.
We are committed to working with collaborators to extend the scope of our research. We currently collaborate with more than 30 organisations from academia, biotech and the pharmaceutical industry. To start a discussion about a potential collaboration, please email us at: GDSCscreening@sanger.ac.uk.
Interested in receiving 'Genomics of Drug Sensitivity in Cancer' news and release information? Then sign up for the Translation-announce mailing list.
This site is hosted by the Wellcome Sanger Institute.