2023-07-11 NHLBI BioData Catalyst Ecosystem Release Notes

Introduction

The 2023-07-11 release marks the fourteenth release for the NHLBI BioData Catalyst® (BDC) ecosystem. This release includes several new features, e.g., Faceted Search in BDC Powered by Seven Bridges (BDC-Seven Bridges), along with documentation to help new users get started on the ecosystem, e.g., updated WDL documentation in BDC Powered by Terra (BDC-Terra). This release also includes enhanced support for discovering what datasets are available via BDC Powered by Gen3. Please find more detail on the new features and user support materials in the sections below.

The 2023-07-11 data releases include the addition of various research projects related to COVID-19, lung development, platelet transfusion refractoriness, sickle cell anemia, asthma, pregnancy outcomes, and family health studies. Please refer to the Data Releases section below for information on upcoming data releases. A list of currently available data can be viewed on the Data page of the BDC website.

Significant new features

Faceted Search in BDC-Seven Bridges: Version 1 of Faceted Search has been deployed for all users on BDC-Seven Bridges. This feature enables users to query or filter any BDC-ingested data by facet, finding files and forming groups of files based on characteristics such as authorization status, study accession number, and type of data. With the release of v1 Faceted Search, users can more easily find data that is relevant to their research. Faceted Search is currently available for 10 datasets and will be expanded to all hosted datasets in the following quarter. The Faceted Search feature can be found under the Data drop-down menu.
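To illustrate what filtering "in a faceted way" means, here is a minimal sketch in plain Python. It does not use the BDC-Seven Bridges platform or its API. The file records, field names (`study`, `data_type`, `authorized`), and helper functions are all hypothetical placeholders chosen to mirror the facets mentioned above.

```python
from collections import Counter

# Hypothetical file records. Field names and values are placeholders,
# not real BDC data or the platform's actual metadata schema.
FILES = [
    {"name": "a.cram", "study": "study-A", "data_type": "genomic", "authorized": True},
    {"name": "b.vcf.gz", "study": "study-A", "data_type": "genomic", "authorized": False},
    {"name": "c.csv", "study": "study-B", "data_type": "phenotypic", "authorized": True},
]


def facet_filter(files, **selected):
    """Keep files whose value for every selected facet is in the allowed set."""
    return [
        f for f in files
        if all(f.get(facet) in allowed for facet, allowed in selected.items())
    ]


def facet_counts(files, facet):
    """Count how many files fall under each value of one facet."""
    return Counter(f.get(facet) for f in files)


# Select authorized genomic files, then summarize the result by study.
hits = facet_filter(FILES, authorized={True}, data_type={"genomic"})
print([f["name"] for f in hits])
print(facet_counts(hits, "study"))
```

Each facet narrows the result independently, and the per-facet counts show how many files remain under each value, which is what lets a user form groups of files step by step.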

BDC-Gen3 Metadata Updated to Bring in Data from the dbGaP FHIR Database: BDC-Gen3’s Discovery Page (and the underlying BDC-Gen3 Source of Truth Metadata API) allows unauthenticated users to discover what datasets are available in BDC. Fast Healthcare Interoperability Resources (FHIR) is a Health Level Seven International (HL7) specification for healthcare interoperability. Last quarter, BDC-Gen3 worked to consume the new metadata from the dbGaP FHIR Server (as part of the officially defined data ingestion process). This quarter, BDC-Gen3’s Data Ingestion Pipeline has been updated to load FHIR metadata with every new data release. The loaded metadata is available to all clients and users through BDC-Gen3’s Metadata API and is viewable on BDC-Gen3’s Discovery Page.

New and Improved Genomic Filtering on BDC Powered by PIC-SURE (BDC-PIC-SURE): The Genomic Filtering modal on BDC-PIC-SURE has been updated to more accurately represent the relatedness between the various filtering fields. This includes the revamped “Variant consequence calculated” field, which includes different levels of severity and their associated consequences. Additionally, the “Selected Genomic Filters” section now more explicitly summarizes the filter criteria being applied.

Edit Queries Built in BDC-PIC-SURE Using the API: Researchers who created a cohort in BDC-PIC-SURE’s user interface can now edit that query’s parameters using Python or R code via the BDC-PIC-SURE API. This gives researchers more flexibility to refine or change their cohort after export and eliminates the need to return to the user interface.
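The idea of editing an exported query's parameters in code can be sketched as follows. This is only an illustration in plain Python: it does not call the BDC-PIC-SURE API, and the query layout, key names (`category_filters`, `numeric_filters`), variable paths, and the `with_numeric_range` helper are hypothetical placeholders rather than the actual BDC-PIC-SURE query schema. The official "0_Export_from_UI" examples remain the reference for real usage.

```python
import copy
import json

# Hypothetical exported query. These key names and variable paths are
# placeholders for illustration, not the real BDC-PIC-SURE query format.
exported_query = {
    "category_filters": {"example/sex": ["Female"]},
    "numeric_filters": {"example/age": {"min": 40, "max": 60}},
}


def with_numeric_range(query, variable, minimum, maximum):
    """Return a copy of the query with one numeric filter set to a new range."""
    if minimum > maximum:
        raise ValueError("minimum must not exceed maximum")
    edited = copy.deepcopy(query)  # leave the exported query untouched
    edited.setdefault("numeric_filters", {})[variable] = {
        "min": minimum,
        "max": maximum,
    }
    return edited


# Shift the age range of the cohort without going back to the user interface.
edited_query = with_numeric_range(exported_query, "example/age", 50, 70)
print(json.dumps(edited_query, indent=2))
```

Working on a copy keeps the query exported from the user interface available for comparison with the edited cohort definition.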

New user support materials and documentation

Updated WDL documentation in BDC-Terra: Based on user feedback, Terra documentation has been expanded and updated to include: a new wdl-docs GitHub repository with a section dedicated to resources created by the WDL community; a new wdl-docs website to host the documentation from that repository; updates to all existing WDL syntax documentation to match the WDL 1.0 spec; and 17 new articles, comprising 11 cookbook-style documents that teach users about specific use cases and provide example workflows and 6 best practices documents that help users understand some of the grayer areas of coding in WDL. The documents are now available on the new wdl-docs GitHub repository.

New Code in “0_Export_from_UI” BDC-PIC-SURE API Examples: The example code now shows how to use the BDC-PIC-SURE API to edit the query parameters of a cohort built in the BDC-PIC-SURE user interface. These examples are available in both Python and R, in both Jupyter and RStudio.

Data Releases

The table below lists the studies included in the 2023-07-11 data release. The Q2 data release included research projects related to COVID-19, lung development, platelet transfusion refractoriness, sickle cell anemia, asthma, pregnancy outcomes, and family health studies. These include two studies from the Accelerating COVID-19 Therapeutic Interventions and Vaccines initiative (ACTIV4a and ACTIV4c), a study on lung development (LungMAP), and a study of Eculizumab for platelet transfusion refractoriness in patients with severe thrombocytopenia (DIR-Eculizumab). The remaining studies cover the use of hydroxyurea in children with sickle cell anemia (BABYHUG), the genetic epidemiology of asthma in Costa Rica (CRA), nulliparous pregnancy outcomes (nuMoM2b), the Multicenter Study of Hydroxyurea (MSH), and the Cleveland Family Study (CFS). The data is now available for access across the entire ecosystem.

| Study Name | phs I.D. # | Acronym | New to BioData Catalyst | New study version |
| --- | --- | --- | --- | --- |
| Accelerating COVID-19 Therapeutic Interventions and Vaccines 4 ACUTE (ACTIV4a) v1.0, v1.1 | phs002694.v3.p1.c1 | COVID19-ACTIV4A_GRU | No | Yes |
| COVID-19 Post-hospital Thrombosis Prevention Study (ACTIV4c) | phs003063.v1.p1.c1 | COVID19-ACTIV4C_GRU | No | Yes |
| Molecular Atlas of Lung Development (LungMAP) | phs001961.v2.p1.c1 | LungMAP-MALD_GRU | Yes | Yes |
| Complement Inhibition Using Eculizumab to Overcome Platelet Transfusion Refractoriness in Patients with Severe Thrombocytopenia (DIR-Eculizumab) | phs003212.v1.p1.c1 | DIR-Eculizumab_GRU | Yes | Yes |
| Hydroxyurea to Prevent Organ Damage in Children with Sickle Cell Anemia (BABYHUG) | phs002415.v1.p1.c1 | BioLINCC-BabyHug_DS-SCD-IRB-RD | No | No |
| The Genetic Epidemiology of Asthma in Costa Rica (CRA) | phs000988.v4.p1.c1 | topmed-CRA_DS-ASTHMA-IRB-MDS-RD | No | Yes |
| Nulliparous Pregnancy Outcomes Study: Monitoring Mothers-to-Be (nuMoM2b) | phs002808.v1.p1.c1 | topmed-NuMom2B_GRU-IRB | Yes | Yes |
| Multicenter Study of Hydroxyurea (MSH) | phs002348.v1.p1.c1 | BioLINCC-MSH_GRU | No | No |
| The Cleveland Family Study (NSRR-CFS) | phs002715.v1.p1.c1 | NSRR-NSRR-CFS_DS-HLBS-IRB-NPU | No | No |
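The phs identifiers in the table above follow the dbGaP accession layout: a study number, then a study version (`v`), a participant set (`p`), and a consent group (`c`). As a convenience for anyone handling these identifiers in scripts, here is a minimal sketch of a parser. The `PhsAccession` type and `parse_phs` function are illustrative names, not part of any BDC tool, and the pattern assumes the six-digit study number seen in this document.

```python
import re
from typing import NamedTuple, Optional


class PhsAccession(NamedTuple):
    study: str                    # e.g. "phs002694"
    version: int                  # study version (the "v" part)
    participant_set: int          # participant set (the "p" part)
    consent_group: Optional[int]  # consent group (the "c" part), if present


# Assumes the form phs######.v#.p#[.c#] used throughout these release notes.
_PHS_PATTERN = re.compile(r"^(phs\d{6})\.v(\d+)\.p(\d+)(?:\.c(\d+))?$")


def parse_phs(accession: str) -> PhsAccession:
    """Split a phs accession string into its numbered components."""
    match = _PHS_PATTERN.match(accession.strip())
    if match is None:
        raise ValueError(f"not a recognized phs accession: {accession!r}")
    study, version, participant_set, consent = match.groups()
    return PhsAccession(
        study,
        int(version),
        int(participant_set),
        int(consent) if consent is not None else None,
    )


parsed = parse_phs("phs002694.v3.p1.c1")
print(parsed.study, parsed.version, parsed.participant_set, parsed.consent_group)
# prints: phs002694 3 1 1
```

Parsing the components separately makes it easy to group rows by study, since one study can appear several times with different consent groups.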

Planned Upcoming Data Releases

| Study Name | phs I.D. # | Acronym | New to BioData Catalyst | New study version |
| --- | --- | --- | --- | --- |
| NHLBI TOPMed: Boston Early-Onset COPD Study in the TOPMed Program (EOCOPD) | phs000946.v5.p1.c1 | topmed-EOCOPD_DS-CS-RD | No | Yes |
| NHLBI TOPMed: The Cleveland Family Study (CFS) | phs000954.v4.p2.c1 | topmed-CFS_DS-HLBS-IRB-NPU | No | Yes |
| NHLBI TOPMed: The Jackson Heart Study (JHS) | phs000964.v5.p1.c1 | topmed-JHS_HMB-IRB-NPU | No | Yes |
| NHLBI TOPMed: The Jackson Heart Study (JHS) | phs000964.v5.p1.c2 | topmed-JHS_DS-FDO-IRB-NPU | No | Yes |
| NHLBI TOPMed: The Jackson Heart Study (JHS) | phs000964.v5.p1.c3 | topmed-JHS_HMB-IRB | No | Yes |
| NHLBI TOPMed: The Jackson Heart Study (JHS) | phs000964.v5.p1.c4 | topmed-JHS_DS-FDO-IRB | No | Yes |
| NHLBI TOPMed: Whole Genome Sequencing and Related Phenotypes in the Framingham Heart Study (FHS) | phs000974.v5.p3.c1 | topmed-FHS_HMB-IRB-MDS | No | Yes |
| NHLBI TOPMed: Whole Genome Sequencing and Related Phenotypes in the Framingham Heart Study (FHS) | phs000974.v5.p3.c2 | topmed-FHS_HMB-IRB-NPU-MDS | No | Yes |
| NHLBI TOPMed: Heart and Vascular Health Study (HVH) | phs000993.v5.p2.c1 | topmed-HVH_HMB-IRB-MDS | No | Yes |
| NHLBI TOPMed: Heart and Vascular Health Study (HVH) | phs000993.v5.p2.c2 | topmed-HVH_DS-CVD-IRB-MDS | No | Yes |
| NHLBI TOPMed: The Vanderbilt AF Ablation Registry (VAFAR) | phs000997.v5.p2.c1 | topmed-VAFAR_HMB-IRB | No | Yes |
| NHLBI TOPMed: The Vanderbilt Atrial Fibrillation Registry (VU) | phs001032.v6.p2.c1 | topmed-VU_AF_GRU-IRB | No | Yes |
| NHLBI TOPMed: The Genetics and Epidemiology of Asthma in Barbados (BAGS) | phs001143.v4.p1.c1 | topmed-BAGS_GRU-IRB | No | Yes |
| NHLBI TOPMed: Cleveland Clinic Atrial Fibrillation Study (CCAF) | phs001189.v4.p1.c1 | topmed-CCAF_AF_GRU-IRB | No | Yes |
| NHLBI TOPMed: Cardiovascular Health Study (CHS) | phs001368.v3.p2.c1 | topmed-CHS_HMB-MDS | No | Yes |
| NHLBI TOPMed: Cardiovascular Health Study (CHS) | phs001368.v3.p2.c2 | topmed-CHS_HMB-NPU-MDS | No | Yes |
| NHLBI TOPMed: Cardiovascular Health Study (CHS) | phs001368.v3.p2.c3 | topmed-CHS_DS-CVD-MDS | Yes | Yes |
| NHLBI TOPMed: Cardiovascular Health Study (CHS) | phs001368.v3.p2.c4 | topmed-CHS_DS-CVD-NPU-MDS | No | Yes |
| NHLBI TOPMed: Diabetes Heart Study (DHS) African American Coronary Artery Calcification (AACAC) | phs001412.v3.p1.c1 | topmed-AACAC_HMB-IRB-COL-NPU | No | Yes |
| NHLBI TOPMed: Diabetes Heart Study (DHS) African American Coronary Artery Calcification (AACAC) | phs001412.v3.p1.c2 | topmed-AACAC_DS-DHD-IRB-COL-NPU | No | Yes |
| NHLBI TOPMed: MESA and MESA Family AA-CAC (MESA) | phs001416.v2.p1.c1 | topmed-MESA_HMB | No | Yes |
| NHLBI TOPMed: MESA and MESA Family AA-CAC (MESA) | phs001416.v2.p1.c2 | topmed-MESA_HMB-NPU | No | Yes |
| Clinical-trial of COVID-19 Convalescent Plasma in Outpatients (C3PO) | phs002752.v1.p1.c1 | COVID19-C3PO_GRU | No | Yes |
| Collaborative Cohort of Cohorts for COVID-19 Research (C4R): Genetic Epidemiology of COPD Study (COPDGene) | phs002910.v1.p1.c1 | COVID19-C4R_COPDGene_HMB | Yes | Yes |
| Collaborative Cohort of Cohorts for COVID-19 Research (C4R): Genetic Epidemiology of COPD Study (COPDGene) | phs002910.v1.p1.c2 | COVID19-C4R_COPDGene_DS-CS | Yes | Yes |
| Collaborative Cohort of Cohorts for COVID-19 Research (C4R): Atherosclerosis Risk in Communities Study (ARIC) | phs002988.v1.p1.c1 | COVID19-C4R_ARIC_HMB-IRB | Yes | Yes |
| Collaborative Cohort of Cohorts for COVID-19 Research (C4R): Framingham Heart Study (FHS) | phs002911.v1.p1.c1 | COVID19-C4R_FHS_HMB-IRB-MDS | Yes | Yes |
| Collaborative Cohort of Cohorts for COVID-19 Research (C4R): Framingham Heart Study (FHS) | phs002911.v1.p1.c2 | COVID19-C4R_FHS_HMB-IRB-NPU-MDS | Yes | Yes |
| Collaborative Cohort of Cohorts for COVID-19 Research (C4R): Severe Asthma Research Program (SARP) | phs002913.v1.p1.c1 | COVID19-C4R_GRU-PUB-NPU | Yes | Yes |
| Collaborative Cohort of Cohorts for COVID-19 Research (C4R): Severe Asthma Research Program (SARP) | phs002913.v1.p1.c2 | COVID19-C4R_GRU-PUB | Yes | Yes |
| Collaborative Cohort of Cohorts for COVID-19 Research (C4R): Severe Asthma Research Program (SARP) | phs002913.v1.p1.c3 | COVID19-C4R_DS-AAI-PUB-NPU | Yes | Yes |
| Collaborative Cohort of Cohorts for COVID-19 Research (C4R): Severe Asthma Research Program (SARP) | phs002913.v1.p1.c4 | COVID19-C4R_DS-AAI-PUB | Yes | Yes |
| Multi-Ethnic Study of Atherosclerosis (BioLINCC) | phs003288.v1.p1.c1 | BioLINCC-MESA_HMB | Yes | Yes |
| Multi-Ethnic Study of Atherosclerosis (BioLINCC) | phs003288.v1.p1.c2 | BioLINCC-MESA_HMB-NPU | Yes | Yes |

For detailed platform release notes please consult the following resources:

BDC-Gen3 release notes
BDC-Terra release notes
BDC-Seven Bridges release notes
BDC-PIC-SURE release notes
BDC-Dockstore release notes
