
TCGA's Pan-Cancer Efforts and Expansion to Include Whole Genome Sequence

Carolyn Hutter, Ph.D.

Program Director of the Division of Genomic Medicine at the National Human Genome Research Institute (NHGRI)

In 2013, TCGA’s ‘Pan-Cancer’ analysis on over 5,000 cases from 12 tumor projects (see figure) was featured in Nature Genetics with a complementary focus website, which presented over 15 papers and 5 thematic threads. The threads highlight key findings for mutational drivers, network models, exposures and pathogens, data discovery and future directions.


TCGA is currently expanding efforts to characterize commonalities, differences, and emergent themes across cancer types in collaboration with the International Cancer Genome Consortium (ICGC) through the Pan-Cancer Analysis of Whole Genomes (PAWG) project. The goal is to analyze the genomes, including genome-wide sequence data, of approximately 2,000 pairs of tumor and normal samples, and to integrate those results with clinical and other molecular data on the same cases. The genomic sequence data will be available to the research community through the TCGA Data Portal, CGHub, and the ICGC Data Repository. Investigators around the globe will lead analyses in a number of scientific areas, including integration of transcriptome and genome analyses, patterns of structural variation, novel somatic mutation-calling methods, evolution and heterogeneity, and germline cancer genome variation.

Figure 1: Integrated data set for comparing and contrasting multiple tumor types. The Cancer Genome Atlas Research Network, Weinstein, J.N., Collisson, E.A., Mills, G.B., Shaw, K.M., Ozenberger, B.A., Ellrott, K., Shmulevich, I., Sander, C., and Stuart, J.M. (2013) The Cancer Genome Atlas Pan-Cancer analysis project. Nat Genet. doi:10.1038/ng.2764. Read the full article.


The TCGA/ICGC PAWG will capitalize on existing TCGA data and infrastructure, and will incorporate information from other NIH-funded projects, such as the Encyclopedia of DNA Elements (ENCODE), the Genotype-Tissue Expression (GTEx) Program and the Roadmap Epigenomics Program. As with other TCGA Pan-Cancer efforts to date, this work represents a significant effort and underscores the importance of team science. Using integrative approaches, investigators will be better able to distinguish the signal from the noise and focus on functionally relevant genomic alterations, pathways and mechanisms. However, whole genome analysis also poses a number of key challenges and research needs, such as improved approaches for computing on petabytes of data, more robust standards for cross-project mutation calling, and more effective methods for analyzing and interpreting non-coding variation.


Overall, combining whole genome sequence analysis and comprehensive genomic characterization in this coordinated cross-cancer analysis will enhance our knowledge of cancer genomics and biology. Such work will move TCGA closer to its goal of improving our ability to diagnose, treat and prevent cancer. Furthermore, the advances in this project will extend beyond cancer research, as the improved capabilities in whole genome sequence analysis and interpretation will be applicable to studies of other diseases and of biology in general.




08 Mar, 2024
The aims of our case-control study were (1) to develop an automated 3-dimensional (3D) Convolutional Neural Network (CNN) for detection of pancreatic ductal adenocarcinoma (PDA) on diagnostic computed tomography scans (CTs), (2) to evaluate its generalizability on multi-institutional public data sets, (3) to assess its utility as a potential screening tool using a simulated cohort with high pretest probability, and (4) to assess its ability to detect visually occult preinvasive cancer on prediagnostic CTs.
08 Mar, 2024
Cancer Mutations Converge on a Collection of Protein Assemblies to Predict Resistance to Replication Stress
08 Mar, 2024
International cancer registries make real-world genomic and clinical data available, but their joint analysis remains a challenge. AACR Project GENIE, an international cancer registry collecting data from 19 cancer centers, makes data from >130,000 patients publicly available through the cBioPortal for Cancer Genomics (https://genie.cbioportal.org). For 25,000 patients, additional real-world longitudinal clinical data, including treatment and outcome data, are being collected by the AACR Project GENIE Biopharma Collaborative using the PRISSMM data curation model. Several thousand of these cases are now also available in cBioPortal. We have significantly enhanced the functionalities of cBioPortal to support the visualization and analysis of this rich clinico-genomic linked dataset, as well as datasets generated by other centers and consortia. Examples of these enhancements include (i) visualization of the longitudinal clinical and genomic data at the patient level, including timelines for diagnoses, treatments, and outcomes; (ii) the ability to select samples based on treatment status, facilitating a comparison of molecular and clinical attributes between samples before and after a specific treatment; and (iii) survival analysis estimates based on individual treatment regimens received. Together, these features provide cBioPortal users with a toolkit to interactively investigate complex clinico-genomic data to generate hypotheses and make discoveries about the impact of specific genomic variants on prognosis and therapeutic sensitivities in cancer.
08 Mar, 2024
The majority of disease-associated variants identified through genome-wide association studies are located outside of protein-coding regions. Prioritizing candidate regulatory variants and gene targets to identify potential biological mechanisms for further functional experiments can be challenging. To address this challenge, we developed FORGEdb, a standalone and web-based tool that integrates multiple datasets, delivering information on associated regulatory elements, transcription factor binding sites, and target genes for over 37 million variants. FORGEdb scores provide researchers with a quantitative assessment of the relative importance of each variant for targeted functional experiments.
By Bo Zhang 08 Mar, 2024
Cancer is a leading cause of morbidity and mortality worldwide. While progress has been made in the diagnosis, prognosis, and treatment of cancer patients, individualized and data-driven care remains a challenge. Artificial intelligence (AI), which is used to predict and automate many cancers, has emerged as a promising option for improving healthcare accuracy and patient outcomes. AI applications in oncology include risk assessment, early diagnosis, patient prognosis estimation, and treatment selection based on deep knowledge. Machine learning (ML), a subset of AI that enables computers to learn from training data, has been highly effective at predicting various types of cancer, including breast, brain, lung, liver, and prostate cancer. In fact, AI and ML have demonstrated greater accuracy in predicting cancer than clinicians. These technologies also have the potential to improve the diagnosis, prognosis, and quality of life of patients with various illnesses, not just cancer. Therefore, it is important to improve current AI and ML technologies and to develop new programs to benefit patients. This article examines the use of AI and ML algorithms in cancer prediction, including their current applications, limitations, and future prospects. Lead Author: Bo Zhang
By Claudio Luchini 08 Mar, 2024
Artificial intelligence (AI) is concretely reshaping the landscape and horizons of oncology, opening important new opportunities for improving the management of cancer patients. Analysing the AI-based devices that have already obtained official approval from the Food and Drug Administration (FDA), here we show that cancer diagnostics is the oncology-related area in which AI has already entered clinical practice with the largest impact. Furthermore, breast, lung and prostate cancers are the specific cancer types that are now benefiting most from AI-based devices. The future perspectives of AI in oncology are discussed: the creation of multidisciplinary platforms, the comprehension of the importance of all neoplasms, including rare tumours, and continuous support for guaranteeing its growth are at this time the most important challenges for finalising the ‘AI-revolution’ in oncology. First Author: Claudio Luchini
By Panayiotis Petousis, PhD 08 Mar, 2024
In the United States, end-stage kidney disease (ESKD) is responsible for high mortality and significant healthcare costs, with the number of cases sharply increasing in the past 2 decades. In this study, we aimed to reduce these impacts by developing an ESKD model for predicting its occurrence in a 2-year period. Lead Author: Panayiotis Petousis, PhD
By Evan D. Muse 08 Mar, 2024
Transforming the cardiometabolic disease landscape: Multimodal AI-powered approaches in prevention and management Lead Author: Evan D. Muse
By Danielle S. Bitterman, MD 08 Mar, 2024
Current status and future applications of LLMs for cancer symptom management and call for cross-disciplinary collaboration that centers the needs of patients and caregivers. Lead Author: Danielle S. Bitterman, MD
01 Mar, 2023
New computer systems aim to peer inside our heads, and to help us fix what they find there

In the nineteen-sixties, Joseph Weizenbaum, a computer scientist at M.I.T., created a computer program called Eliza. It was designed to simulate Rogerian therapy, in which the patient directs the conversation and the therapist often repeats her language back to her:

User: My boyfriend made me come here.
Eliza: Your boyfriend made you come here.
User: He says I’m depressed much of the time.
Eliza: I am sorry to hear you are depressed.
User: It’s true. I am unhappy.
Eliza: Do you think coming here will help you not to be unhappy?

Weizenbaum made Eliza as satire. He doubted that computers could simulate meaningful human interaction. He was alarmed, therefore, when many people who tried the program found it both useful and captivating. His own secretary asked him to leave the room so that she could spend time alone with Eliza. Worse, doctors saw it as a potentially transformative tool. “Several hundred patients an hour could be handled by a computer system designed for this purpose,” three psychiatrists wrote in The Journal of Nervous and Mental Disease, in 1966. “The human therapist, involved in the design and operation of this system, would not be replaced, but would become a much more efficient man since his efforts would no longer be limited to the one-to-one patient-therapist ratio as now exists.”

Weizenbaum became an outspoken critic of artificial intelligence. “But the genie was out of the bottle,” Brian Christian, who chronicled the episode in his book “The Most Human Human,” told me. A few years later, a Stanford psychiatrist named Kenneth Colby created Parry, a program that attempted to simulate the language of a person with paranoid schizophrenia, to train students before they cared for real patients. Psychiatrists given transcripts of therapy sessions often couldn’t tell the difference between Parry and humans; in this narrow sense, the chatbot passed the Turing test. In 1972, Parry and Eliza met up for a therapy session:

Parry: Do you know anything about bookies?
Eliza: What makes you think I know anything about bookies?
Parry: A bookie didn’t pay me off once.
Eliza: What does that suggest to you?
Parry: I beat up the bookie who did not pay off, and I thought he might use his friends in the underworld to get even with me.

Over time, programmers developed Jabberwacky, Dr. Sbaitso, and ALICE (the Artificial Linguistic Internet Computer Entity). Exchanges with these chatbots were often engaging, sometimes comical, and occasionally nonsensical. But the idea that computers could serve as human confidants, expanding therapy’s reach beyond the limits of its overworked practitioners, persisted through the decades.

In 2017, Alison Darcy, a clinical research psychologist at Stanford, founded Woebot, a company that provides automated mental-health support through a smartphone app. Its approach is based on cognitive behavioral therapy, or C.B.T., a treatment that aims to change patterns in people’s thinking. The app uses a form of artificial intelligence called natural language processing to interpret what users say, guiding them through sequences of pre-written responses that spur them to consider how their minds could work differently.

When Darcy was in graduate school, she treated dozens of hospitalized patients using C.B.T.; many experienced striking improvements but relapsed after they left the hospital. C.B.T. is “best done in small quantities over and over and over again,” she told me. In the analog world, that sort of consistent, ongoing care is hard to find: more than half of U.S. counties don’t have a single psychiatrist, and, last year, a survey conducted by the American Psychological Association found that sixty per cent of mental-health practitioners don’t have openings for new patients. “No therapist can be there with you all day, every day,” Darcy said. Although the company employs only about a hundred people, it has counseled nearly a million and a half, the majority of whom live in areas with a shortage of mental-health providers.

Link to original article on The New Yorker