2024-07-02 NHLBI BioData Catalyst Ecosystem Release Notes

Introduction

The 2024-07-02 release marks the 18th release for the NHLBI BioData Catalyst® (BDC) ecosystem. This release includes several new features (e.g., an expanded workflow cost estimator, cascading authorization from parent to child studies, and DOIs at the dataset level). Please find more detail on the new features and user support materials in the sections below.

The 2024-07-02 data releases add research on atrial fibrillation, asthma, sickle cell disease, atherosclerosis, and more. For more information, refer to the Data Releases section below and the Data page on the BDC website.

Significant new features

Fixed Interoperability on BioData Catalyst Powered By Seven Bridges (BDC-Seven Bridges): BDC-Seven Bridges has completed an update to its interoperability functionality. The initial release of the project-based data download restriction functionality inadvertently interfered with DRS data interoperability between BDC-Seven Bridges and other ecosystems such as CAVATICA. This unintentionally re-siloed data on those systems, which runs counter to the overarching NIH data ecosystem goal of making data available to users across NIH institute and system boundaries.
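For readers unfamiliar with DRS (the GA4GH Data Repository Service standard), the sketch below shows how a hostname-based DRS URI maps to the HTTPS endpoint that a receiving platform would call to fetch object metadata. It illustrates the standard only and is not the BDC-Seven Bridges implementation; the hostname and object ID are placeholders, and `drs_uri_to_url` is a hypothetical helper name.

```python
from urllib.parse import quote, urlsplit


def drs_uri_to_url(drs_uri: str) -> str:
    """Map a hostname-based DRS URI to its DRS v1 object endpoint."""
    parts = urlsplit(drs_uri)
    if parts.scheme != "drs" or not parts.netloc:
        raise ValueError(f"not a hostname-based DRS URI: {drs_uri!r}")
    object_id = parts.path.lstrip("/")
    if not object_id:
        raise ValueError(f"DRS URI has no object ID: {drs_uri!r}")
    # drs://<host>/<id> corresponds to https://<host>/ga4gh/drs/v1/objects/<id>
    return f"https://{parts.netloc}/ga4gh/drs/v1/objects/{quote(object_id, safe='')}"


# Placeholder host and object ID, for illustration only.
example_url = drs_uri_to_url("drs://drs.example.org/314159")
```

Because the endpoint is derived from the URI alone, any platform that honors the standard can locate the object without knowing which system hosts it, which is what cross-ecosystem interoperability depends on.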

Workflow Cost Estimator Expansion: The cost estimator, which lets users estimate analysis costs before running a workflow, has been expanded to three new workflows on BDC-Seven Bridges: 1) Cyrius, a tool to genotype CYP2D6 from WGS BAM or CRAM files, 2) kallisto quant, a tool to quantify RNA-seq data, and 3) BEDTools Coverage, a tool that computes both the depth and breadth of coverage of features in file B on the features in file A, useful for comparing WGS files. Users can filter tools based on whether the interactive cost estimator is available. See here for documentation.

Support Cascading Authorization from dbGaP Parent to Child Studies: Gen3 has updated the authorization process in BDC so that a researcher with access to a dbGaP parent study automatically gains access to relevant child studies. Previously, the authorization process in BDC expected dbGaP to explicitly grant access to the parent study and to each of its associated substudies individually. Since dbGaP did not provide explicit access for child studies, users could not access them without requesting additional authorization manually. With support for cascading authorization from parent to child study, a researcher with access to a dbGaP parent study also gains access to relevant child studies in BDC, eliminating the manual authorization step.
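The cascading behavior described above can be sketched as a small graph expansion. This is a minimal illustration under stated assumptions, not the Gen3 implementation: the study identifiers and the `PARENT_TO_CHILDREN` mapping are hypothetical placeholders, and the sketch ignores consent groups and any other criteria that determine which child studies count as "relevant".

```python
from typing import Dict, Iterable, Set

# Hypothetical parent -> child mapping; these are placeholder names,
# not real dbGaP accessions or relationships.
PARENT_TO_CHILDREN: Dict[str, Set[str]] = {
    "phs_parent_A": {"phs_child_A1", "phs_child_A2"},
    "phs_parent_B": {"phs_child_B1"},
}


def cascade_authorizations(
    granted: Iterable[str],
    parent_to_children: Dict[str, Set[str]],
) -> Set[str]:
    """Return the explicitly granted studies plus all studies below them."""
    accessible: Set[str] = set()
    pending = list(granted)
    while pending:
        study = pending.pop()
        if study in accessible:
            continue  # already processed; also guards against cycles
        accessible.add(study)
        # Access to a parent cascades to its children.
        pending.extend(parent_to_children.get(study, ()))
    return accessible


# A user explicitly granted only the parent study also reaches its children.
user_access = cascade_authorizations(["phs_parent_A"], PARENT_TO_CHILDREN)
```

Note that the cascade runs in one direction only: access to a child study does not imply access to its parent.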

Implementation of DOIs at Dataset Level: A digital object identifier (DOI) is a persistent identifier or handle used to identify objects uniquely, standardized by the International Organization for Standardization (ISO). In BDC, DOIs have been created and made available at the dataset level so that each dataset has a persistent identifier in a standard format. The DOIs are available via the Gen3 discovery page as well as the API. DataCite was used as the registration service. Going forward, every BDC dataset will have a DOI minted as part of the data ingestion process. For users, dataset-level DOIs promote research reproducibility and data FAIRness.
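As a rough illustration of how a dataset DOI can be used once it is known, the sketch below builds the public resolver link and the DataCite REST API metadata link for a DOI. The DOI shown is a placeholder, not a real BDC dataset DOI; `doi_urls` is a hypothetical helper; and the validation pattern is a common heuristic rather than a full statement of DOI syntax.

```python
import re
from urllib.parse import quote

# Heuristic shape check: "10.<registrant digits>/<suffix>".
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def doi_urls(doi: str) -> dict:
    """Return the resolver URL and DataCite metadata URL for a DOI."""
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"does not look like a DOI: {doi!r}")
    encoded = quote(doi, safe="/")  # keep the prefix/suffix slash readable
    return {
        "resolver": f"https://doi.org/{encoded}",
        "datacite_metadata": f"https://api.datacite.org/dois/{encoded}",
    }


# Placeholder DOI for illustration only.
links = doi_urls("10.1234/example-dataset")
```

Citing the resolver link in a publication lets readers reach the dataset's landing page even if the underlying location changes, which is the reproducibility benefit described above.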

View Stigmatizing Variables in PIC-SURE Open Access: Researchers can now view all variables, including stigmatizing variables, that are relevant to their search. Though these variables are not filterable in Open Access to protect participant data, this allows researchers to better understand what information is present in BDC. For more information about stigmatizing variables, please visit the publicly available GitHub repository.

Data Releases

The table below highlights which studies were included in the 2024-07-02 data release.

The latest release includes studies from NHLBI TOPMed projects such as Partners HealthCare Biobank, Novel Risk Factors for the Development of Atrial Fibrillation in Women, and the Study of Asthma Phenotypes and Pharmacogenomic Interactions by Race-Ethnicity (SAPPHIRE). New versions of studies like Walk-PHaSST Sickle Cell Disease, the Malmo Preventive Project, and the Johns Hopkins University School of Medicine Atrial Fibrillation Genetics Study are also featured. Additionally, the release includes updates to studies like Outcome Modifying Genes in Sickle Cell Disease (OMG) and the Vanderbilt University BioVU Atrial Fibrillation Genetics Study. The Collaborative Cohort of Cohorts for COVID-19 Research (C4R) and NIH RECOVER projects are also part of this release, including studies from the Hispanic Community Health Study/Study of Latinos and the Multi-Ethnic Study of Atherosclerosis.

The data is now available for access across the entire ecosystem.

| Study Name | phs I.D. # | Acronym | New to BioData Catalyst | New study version |
| --- | --- | --- | --- | --- |
| NHLBI TOPMed: Partners HealthCare Biobank | phs001024.v6.p1.c1 | topmed-PARTNERS_HMB | No | Yes |
| NHLBI TOPMed: Novel Risk Factors for the Development of Atrial Fibrillation in Women | phs001040.v6.p1.c1 | topmed-WGHS_HMB | No | Yes |
| NHLBI TOPMed: Study of Asthma Phenotypes and Pharmacogenomic Interactions by Race-Ethnicity (SAPPHIRE) | phs001467.v2.p2.c1 | topmed-SAPPHIRE_asthma_HMB-COL | No | Yes |
| NHLBI TOPMed: Walk-PHaSST Sickle Cell Disease (SCD) | phs001514.v2.p1.c1 | topmed-Walk_PHaSST_SCD_HMB-IRB-PUB-COL-NPU-MDS-GSO | No | Yes |
| NHLBI TOPMed: Walk-PHaSST Sickle Cell Disease (SCD) | phs001514.v2.p1.c2 | topmed-Walk_PHaSST_SCD_DS-SCD-IRB-PUB-COL-NPU-MDS-RDN | No | Yes |
| NHLBI TOPMed - NHGRI CCDG: Malmo Preventive Project (MPP) | phs001544.v3.p1.c1 | topmed-MPP_HMB-NPU-MDS | No | Yes |
| NHLBI TOPMed - NHGRI CCDG: The Johns Hopkins University School of Medicine Atrial Fibrillation Genetics Study | phs001598.v3.p1.c1 | topmed-JHU_AF_HMB-NPU-MDS | No | Yes |
| NHLBI TOPMed: Outcome Modifying Genes in Sickle Cell Disease (OMG) | phs001608.v2.p1.c1 | topmed-OMG_SCD_DS-SCD-IRB-PUB-COL-MDS-RD | No | Yes |
| NHLBI TOPMed - NHGRI CCDG: The Vanderbilt University BioVU Atrial Fibrillation Genetics Study | phs001624.v3.p2.c1 | topmed-BioVU_AF_HMB-GSO | No | Yes |
| NHLBI TOPMed: Genetic Causes of Complex Pediatric Disorders - Asthma (GCPD-A) | phs001661.v3.p1.c1 | topmed-GCPD-A_DS-ASTHMA-GSO | No | Yes |
| NHLBI TOPMed: Lung Tissue Research Consortium (LTRC) | phs001662.v2.p1.c2 | topmed-LTRC_HMB-MDS | No | Yes |
| NHLBI TOPMed: Pulmonary Hypertension and the Hypoxic Response in SCD (PUSH) | phs001682.v2.p1.c1 | topmed-PUSH_SCD_DS-SCD-IRB-PUB-COL | No | Yes |
| NHLBI TOPMed - NHGRI CCDG: Groningen Genetics of Atrial Fibrillation (GGAF) Study | phs001725.v2.p1.c1 | topmed-GGAF_GRU | No | Yes |
| NHLBI TOPMed: Childhood Asthma Management Program (CAMP) | phs001726.v2.p1.c1 | topmed-CAMP_DS-AST-COPD | No | Yes |
| NHLBI TOPMed: Best ADd-on Therapy Giving Effective Response (BADGER) | phs001728.v3.p1.c2 | topmed-CARE_BADGER_DS-ASTHMA-IRB-COL | No | Yes |
| NHLBI TOPMed: Characterizing the Response to a Leukotriene Receptor Antagonist and an Inhaled Corticosteroid (CLIC) | phs001729.v3.p1.c2 | topmed-CARE_CLIC_DS-ASTHMA-IRB-COL | No | Yes |
| NHLBI TOPMed: Pediatric Asthma Controller Trial (PACT) | phs001730.v2.p1.c2 | topmed-CARE_PACT_DS-ASTHMA-IRB-COL | No | Yes |
| NHLBI TOPMed: TReating Children to Prevent EXacerbations of Asthma (TREXA) | phs001732.v2.p1.c2 | topmed-CARE_TREXA_DS-ASTHMA-IRB-COL | No | Yes |
| Collaborative Cohort of Cohorts for COVID-19 Research (C4R): Hispanic Community Health Study/Study of Latinos (HCHS/SOL) | phs002908.v1.p1.c1 | COVID19-C4R_HCHS_SOL_HMB-NPU | Yes | Yes |
| Collaborative Cohort of Cohorts for COVID-19 Research (C4R): Hispanic Community Health Study/Study of Latinos (HCHS/SOL) | phs002908.v1.p1.c2 | COVID19-C4R_HCHS_SOL_HMB | Yes | Yes |
| Collaborative Cohort of Cohorts for COVID-19 Research (C4R): Multi-Ethnic Study of Atherosclerosis (MESA) | phs003017.v1.p1.c1 | COVID19-C4R_MESA_HMB | Yes | Yes |
| Collaborative Cohort of Cohorts for COVID-19 Research (C4R): Multi-Ethnic Study of Atherosclerosis (MESA) | phs003017.v1.p1.c2 | COVID19-C4R_MESA_HMB-NPU | Yes | Yes |
| NIH RECOVER: A Multi-Site Observational Study of Post-Acute Sequelae of SARS-CoV-2 Infection in Adults | phs003463.v2.p2.c1 | RECOVER-RC-Adult_GRU | No | Yes |
| Heart Failure Network: Functional Impact of GLP-1 for Heart Failure Treatment (HFN FIGHT-BioLINCC) | phs003542.v1.p1.c1 | BioLINCC_BL_HFN-FIGHT_GRU | No | Yes |
| Action to Control Cardiovascular Risk in Diabetes (ACCORD-BioLINCC) | phs003551.v1.p1.c1 | BioLINCC-BL_ACCORD_GRU | No | Yes |
| Action to Control Cardiovascular Risk in Diabetes (ACCORD - Imaging) | phs003562.v2.p1.c1 | imaging-ACCORD_GRU | No | Yes |
| Systolic Blood Pressure Intervention Trial (SPRINT-Imaging) | phs003566.v2.p1.c1 | imaging-SPRINT_GRU | No | Yes |
| Framingham Heart Study-Cohort (FHS-Cohort) - Imaging | phs003593.v1.p1.c1 | Imaging-img_FHS_HMB-IRB-MDS | No | Yes |
| Framingham Heart Study-Cohort (FHS-Cohort) - Imaging | phs003593.v1.p1.c2 | Imaging-img_FHS_HMB-IRB-NPU-MDS | No | Yes |
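The phs identifiers listed above follow the dbGaP accession layout of study number, study version (v), participant set (p), and consent group (c). When working with several of these programmatically, a small parser can help, for example to group consent groups under one study. The following is a minimal sketch; `PhsAccession` and `parse_phs` are hypothetical names, and the six-digit study number is an assumption based on the accessions in this release.

```python
import re
from typing import NamedTuple


class PhsAccession(NamedTuple):
    study: str            # e.g. "phs001024"
    version: int          # the number after "v"
    participant_set: int  # the number after "p"
    consent_group: int    # the number after "c"


_PHS_PATTERN = re.compile(r"^(phs\d{6})\.v(\d+)\.p(\d+)\.c(\d+)$")


def parse_phs(accession: str) -> PhsAccession:
    """Split an accession such as 'phs001024.v6.p1.c1' into its parts."""
    match = _PHS_PATTERN.match(accession.strip())
    if match is None:
        raise ValueError(f"unrecognized phs accession: {accession!r}")
    study, version, participant_set, consent_group = match.groups()
    return PhsAccession(study, int(version), int(participant_set), int(consent_group))


# Two consent groups of the same study share every field but the last.
first = parse_phs("phs003593.v1.p1.c1")
second = parse_phs("phs003593.v1.p1.c2")
same_study = first.study == second.study
```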

Planned Upcoming Data Releases

| Study Name | phs I.D. # | Acronym | New to BioData Catalyst | New study version |
| --- | --- | --- | --- | --- |
| NHLBI TOPMed: Pharmacogenomics of Hydroxyurea in Sickle Cell Disease (PharmHU) | phs001466.v2.p1.c1 | topmed-pharmHU_HMB | No | Yes |
| NHLBI TOPMed: Pharmacogenomics of Hydroxyurea in Sickle Cell Disease (PharmHU) | phs001466.v2.p1.c2 | topmed-pharmHU_DS-SCD-RD | No | Yes |
| NHLBI TOPMed: Pharmacogenomics of Hydroxyurea in Sickle Cell Disease (PharmHU) | phs001466.v2.p1.c3 | topmed-pharmHU_DS-SCD | No | Yes |
| NHLBI TOPMed: Partners HealthCare Biobank | phs001024.v6.p1.c1 | topmed-PARTNERS_HMB | No | Yes |
| NHLBI TOPMed - NHGRI CCDG: The Vanderbilt University BioVU Atrial Fibrillation Genetics Study | phs001624.v3.p2.c1 | topmed-BioVU_AF_HMB-GSO | No | Yes |
| NHLBI TOPMed: Novel Risk Factors for the Development of Atrial Fibrillation in Women | phs001040.v6.p1.c1 | topmed-WGHS_HMB | No | Yes |
| NHLBI TOPMed - NHGRI CCDG: The Johns Hopkins University School of Medicine Atrial Fibrillation Genetics Study | phs001598.v3.p1.c1 | topmed-JHU_AF_HMB-NPU-MDS | No | Yes |
| NHLBI TOPMed - NHGRI CCDG: Malmo Preventive Project (MPP) | phs001544.v3.p1.c1 | topmed-MPP_HMB-NPU-MDS | No | Yes |
| NHLBI TOPMed: Pathways to Immunologically Mediated Asthma (PIMA) | phs001727.v3.p1.c2 | topmed-PIMA_DS-ASTHMA-IRB-COL | No | Yes |
| NHLBI TOPMed: Characterizing the Response to a Leukotriene Receptor Antagonist and an Inhaled Corticosteroid (CLIC) | phs001729.v3.p1.c2 | topmed-CARE_CLIC_DS-ASTHMA-IRB-COL | No | Yes |
| NHLBI TOPMed: Best ADd-on Therapy Giving Effective Response (BADGER) | phs001728.v3.p1.c2 | topmed-CARE_BADGER_DS-ASTHMA-IRB-COL | No | Yes |
| Guiding Evidence Based Therapy Using Biomarker Intensified Treatment in Heart Failure (GUIDE-IT-BioLINCC) | phs003621.v1.p1.c1 | BioLINCC-BL_GUIDE-IT_GRU | Yes | Yes |
| Heart Failure: A Controlled Trial Investigating Outcomes of Exercise Training (HF-ACTION-BioLINCC) | phs003599.v1.p1.c1 | BioLINCC-BL_HF-ACTION_HMB | Yes | Yes |
| Heart Failure: A Controlled Trial Investigating Outcomes of Exercise Training (HF-ACTION-BioLINCC) | phs003599.v1.p1.c2 | BioLINCC-BL_HF-ACTION_HMB-NPU | Yes | Yes |
| Sleep Heart Health Study (SHHS-BioLINCC) | phs003637.v1.p1.c1 | BioLINCC-BL_SHHS_HMB-MDS | Yes | Yes |

For detailed platform release notes, please consult the following resources:

BDC-Gen3 release notes
BDC-Terra release notes
BDC-Seven Bridges release notes
BDC-PIC-SURE release notes
