
NCI Cancer Genomics Cloud Pilots

CURRENT NEEDS IN CANCER RESEARCH


The challenges posed by the need to disseminate, manage, and interpret large, multi-scale data pervade efforts to advance understanding of cancer biology and apply that knowledge in the clinic. For several years, the volume of data routinely generated by high-throughput research technologies has grown exponentially. The storage, transmission, and analysis of these data have become too costly for individual laboratories and most small to medium research organizations to support. For optimal progress to occur, access to large, valuable data collections and advanced computational capacity must be readily available to the widest possible audience.

On April 7, 2013, Dr. Harold Varmus and other members of the Institute's senior leadership issued a letter to NCI grantees seeking input on these and other computational challenges they encounter on an almost daily basis. Dr. Varmus stated that the NCI, as part of its ongoing investigations into next-generation computational capabilities to serve the research community, has begun exploring the possibility of creating one or more public "cancer knowledge clouds" in which data repositories would be co-located with advanced computing resources, thereby enabling researchers to bring their analytical tools and methods to the data. Reactions to this informal request for information were generally positive, with respondents focusing on six general themes: data access; computing capacity and infrastructure; data interoperability; training; usability; and governance.

Based in part on this information, Dr. George Komatsoulis, then interim director of the Center for Biomedical Informatics and Information Technology (CBIIT), which administers the National Cancer Informatics Program (NCIP), led the creation of a concept document describing a project to develop up to three cancer genomics cloud pilots for review by the cancer-research community. Dr. Komatsoulis presented the concept (time reference 05:58:00) at a joint meeting of the NCI Board of Scientific Advisors (BSA) and the National Cancer Advisory Board (NCAB) on June 24, 2013, where it received unanimous approval.

Soon after the concept was approved, NCI issued a Research and Development Sources Sought Notice providing a synopsis of requirements and asking respondents to submit capability statements. The deadline for submissions was July 24, 2013.

While beginning the procurement process, the NCIP also established an IdeaScale site on August 8, 2013, to allow the community to contribute critical use cases that the cloud pilots will need to support. For more detailed information, consult the official Request for Information: IdeaScale for Cancer Genomics Cloud Pilots on FedBizOpps. A recent posting on the NCI Biomedical Informatics Blog explains the rationale behind the decision to use IdeaScale.

The IdeaScale site will be closed to additional input when the Broad Agency Announcement (BAA) is released. It will remain viewable, however, as a reference for potential offerors.



THE CONTRACTING AND AWARD PROCESS


In preparation for the BAA, the NCI posted a pre-solicitation notice on FedBizOpps on November 25, 2013, announcing an online pre-proposal conference.

The BAA is the specific contract mechanism that will support development of the cancer genomics cloud pilots. The project will go through three phases:

  • Design
  • Implementation
  • Evaluation

The organizations selected to develop the clouds will be expected to collaborate with each other and with the groups managing the NCI Center for Cancer Genomics (CCG) Data Coordinating Center. NCIP activities are being conducted in concert with the CCG Data Coordinating Center, which will provide an authoritative public data set for use in the cloud pilots. Interchange among the organizations involved will help ensure adherence to a common set of data elements and vocabularies among the cloud pilots in support of operations that may span cloud implementations.


Areas of Focus


The research and development activities sponsored by the NCIP span four areas: Cancer Biology and Genomics, Clinical and Translational Research, Computational Genomics Research, and Semantic Infrastructure and Interoperability.


CANCER BIOLOGY AND GENOMICS


Multi-dimensional characterization data sets that compare tumor and normal tissue at the molecular level are providing unprecedented detail about the molecular alterations that lead to cancer. The ability to manage and analyze these data, and to integrate the results with the corresponding biological and clinical information, is providing new directions for developing treatment strategies that target the specific molecular changes in a patient’s disease. To provide support, CBIIT informatics priorities in the areas of cancer biology and genomics include

  • Management and analysis of primary data sets for cancer biology and genomics
  • Aggregation of translational research data and annotations
  • Dissemination and support for cancer biology and genomics data, tools, and standards
  • Application of computational methods in support of knowledge discovery


CLINICAL AND TRANSLATIONAL RESEARCH


The Clinical and Translational Research domain provides targeted bioinformatics capabilities that facilitate interoperability, collaboration, and integration across applications. These applications are also designed to ease clinical-trial reporting burdens through consolidated reporting mechanisms and harmonized, implementable data standards for next-generation study designs. Specific project areas include the following:

  • Clinical Research Management, Adaptive Trial Management, and Enterprise Trial Reporting comprise centralized research subject management, centralized study data management (including accrual data), adverse-event reporting, next-generation clinical trials management, and enterprise portfolio management.
  • Clinical Research Data Modeling and Case Report Form Harmonization provide the semantic tools needed to integrate data and interoperate across multiple systems and entities at NCI and in the extramural and pharmaceutical communities.
  • Imaging and Biospecimen Data Management systems address the need to manage large sets of clinical and preclinical research data in the form of diagnostic images and biospecimens. Both types of data sets require associating multiple metadata elements with individual images and specimens, both within and across studies, so that researchers can share and manage not only research information but also valuable specimens that may be reused beyond the initial research.
  • Clinical Research Regulatory Management addresses NCI’s desire to streamline the management of regulatory requirements in order to accelerate the initiation of clinical trials research.
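The cross-study metadata association described above can be sketched with a minimal, hypothetical data model. The names `Specimen` and `specimens_for_study` are illustrative assumptions, not part of any NCI system:

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Specimen:
    """A biospecimen (or image) record linked to one or more studies."""
    specimen_id: str
    study_ids: List[str] = field(default_factory=list)      # may span studies
    metadata: Dict[str, str] = field(default_factory=dict)  # element name -> value

def specimens_for_study(specimens: List[Specimen], study_id: str) -> List[Specimen]:
    """Return all specimens associated with a given study."""
    return [s for s in specimens if study_id in s.study_ids]

# A specimen collected in one study and reused in another.
s1 = Specimen("SP-001", ["STUDY-A", "STUDY-B"], {"tissue_type": "tumor"})
s2 = Specimen("SP-002", ["STUDY-A"], {"tissue_type": "normal"})

print([s.specimen_id for s in specimens_for_study([s1, s2], "STUDY-B")])  # ['SP-001']
```

The key design point is that a specimen carries its own list of study associations, so reuse beyond the initial research does not require duplicating the record.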


COMPUTATIONAL GENOMICS RESEARCH


Current biomedical informatics technologies permit the genome-wide generation of multidimensional molecular data sets that researchers can use to assess copy number alterations, nucleotide substitutions, insertions or deletions, rearrangements, and epigenetic changes. Furthermore, next-generation sequencing (NGS) provides researchers with complete gene and genome sequences.
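As a minimal illustration of some variant categories named above, the following sketch classifies a REF/ALT allele pair in the style of a VCF record. The helper `classify_variant` is a hypothetical example, not part of any NCI tool:

```python
# Hypothetical sketch: coarse classification of simple sequence variants
# from REF/ALT allele pairs, as found in VCF-style records.

def classify_variant(ref: str, alt: str) -> str:
    """Return a coarse variant class for a REF/ALT allele pair."""
    if len(ref) == 1 and len(alt) == 1:
        return "substitution"   # single-nucleotide variant
    if len(alt) > len(ref):
        return "insertion"
    if len(alt) < len(ref):
        return "deletion"
    return "complex"            # equal-length multi-base change

print(classify_variant("A", "G"))     # substitution
print(classify_variant("A", "ATTG"))  # insertion
print(classify_variant("CTT", "C"))   # deletion
```

Real pipelines handle many additional cases (rearrangements, copy number alterations, epigenetic marks); this sketch covers only the small-variant classes.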

The CBIIT Computational Genomics Research Group (CCRG) creates analytical methods and applications designed to integrate, display, and interpret such diverse, systems-wide data sets. The goal is to translate genetic and genomic observations into insights concerning the biology of human cancers, including disease etiology. As part of its collaborative development efforts, the CCRG has provided tools, analytical capacity, and bioinformatics support to researchers participating in The Cancer Genome Atlas (TCGA) and Therapeutically Applicable Research to Generate Effective Treatments (TARGET) projects as well as to investigators in the intramural NCI community.


SEMANTIC INFRASTRUCTURE AND INTEROPERABILITY


Through a semantic infrastructure and an interoperability framework, CBIIT supports multidisciplinary science by enabling data integration across different specialties and institutions. The semantic infrastructure provides standard vocabularies, common data elements, clinical case-report forms, data models, and definitions. The interoperability framework follows a widely used approach of employing services that are independent of any particular information-technology platform.
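As a rough sketch of what a common data element with a controlled vocabulary might look like, consider the following. The `CommonDataElement` class and the sample element are illustrative assumptions; actual NCI data elements are managed in repositories such as the caDSR:

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class CommonDataElement:
    """A named data field with a definition and a controlled value set."""
    name: str
    definition: str
    permissible_values: Tuple[str, ...]

    def is_valid(self, value: str) -> bool:
        """Check a value against the element's permissible values."""
        return value in self.permissible_values

# Hypothetical element with a controlled vocabulary of grade codes.
tumor_grade = CommonDataElement(
    name="Tumor Grade",
    definition="Histologic grade of the tumor",
    permissible_values=("G1", "G2", "G3", "G4", "GX"),
)

print(tumor_grade.is_valid("G2"))    # True
print(tumor_grade.is_valid("high"))  # False
```

Sharing such element definitions across systems is what allows data recorded at different institutions to be compared without per-pair translation.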

