The 2021-01-15 release is the fourth release of the NHLBI BioData Catalyst ecosystem. It includes several new features (e.g., CWL workflows to create dataset-specific files needed for GWAS) along with documentation and tutorials to help new users get started on the system. It also includes enhanced support for CWL tools for post-GWAS analysis and a CWL tool for BCFtools Merge and Filter. More detail on the new features and user support materials is provided in the sections below.
The 2021-01-15 data release adds both TOPMed studies and the ORCHID study, conducted by the Prevention and Early Treatment of Acute Lung Injury (PETAL) Clinical Trials Network of NHLBI. Multi-sample VCFs, CRAMs, and unharmonized clinical files were added for 27 TOPMed studies new to BioData Catalyst. Additionally, 7 TOPMed studies previously hosted on BioData Catalyst were updated to the latest study versions. These updates include new CRAMs, unharmonized clinical files, and multi-sample VCFs for Freeze 8. For each study and consent group, VCF files are available on a per-chromosome basis and in an un-tarred format. For the ORCHID study, the associated clinical files were added.
Please refer to the Data Release section below for more information as well as the Data page on the BioData Catalyst website.
CWL workflows to create dataset-specific files needed for GWAS: Users can now find the following CWL workflows for creating dataset-specific files needed for GWAS in the Seven Bridges Public Apps Gallery:
LD Pruning - Filter variants based on linkage disequilibrium measures
KING robust and KING IBDseg - Estimate kinship coefficients
PC-AiR - Perform principal components analysis
PC-Relate - Estimate genetic relatedness
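To make the idea behind the LD Pruning step concrete, here is a minimal sketch of greedy pruning on a toy genotype matrix. It is not the implementation used by the Seven Bridges workflow: the function names, the `threshold` parameter, and the toy data are assumptions made for illustration, and the sketch ignores the sliding genomic window that production tools typically apply.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype dosage vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a monomorphic variant carries no LD information
    return cov * cov / (var_x * var_y)


def ld_prune(variants, threshold):
    """Greedy pruning: keep a variant only if its r^2 with every
    already-kept variant is at or below the threshold."""
    kept = []  # list of (variant_id, dosages) pairs retained so far
    for variant_id, dosages in variants:
        if all(r_squared(dosages, other) <= threshold for _, other in kept):
            kept.append((variant_id, dosages))
    return [variant_id for variant_id, _ in kept]


# Toy data (hypothetical): alternate-allele dosages (0, 1, 2) for six samples.
variants = [
    ("var1", [0, 1, 2, 0, 1, 2]),
    ("var2", [0, 1, 2, 0, 1, 2]),  # identical to var1, so in perfect LD with it
    ("var3", [2, 0, 1, 1, 2, 0]),
]

pruned = ld_prune(variants, threshold=0.5)
print(pruned)
```

In this toy example `var2` is dropped because it duplicates `var1`, while `var3` is only weakly correlated with `var1` and is retained. The pruned variant set is what downstream steps such as kinship estimation and principal components analysis would typically consume.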
CWL tools for post-GWAS analysis: Users can now find the following CWL tools for post-GWAS analysis in the Seven Bridges Public Apps Gallery:
SBG Loci Snapshoter - Generate screenshots of specific regions of aligned files provided as inputs
LocusZoom - Standalone tool for generating static locus zoom plots. Users can make annotated Manhattan plots on specific regions from association files generated with the GENESIS association workflows.
CWL tool for BCFtools Merge and Filter: Users can now find a CWL tool for BCFtools Merge and Filter in the Seven Bridges Public Apps Gallery. This tool merges multiple VCF/BCF files from non-overlapping sample sets into one multi-sample file and filters out any monomorphic variants. It is useful when working with input files that contain monomorphic variants, such as the TOPMed datasets.
Genetic Association Testing Using the GENESIS Workflows tutorial: Seven Bridges updated this tutorial to show how to perform an association test with the GENESIS workflows on TOPMed Freeze 8 multi-sample VCF data. Previous versions of this tutorial used TOPMed Freeze 5 data. Version 1.1 of this tutorial can be downloaded as a PDF from the Tutorials page of the BioData Catalyst GitBook.
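The merge-then-filter logic described above can be illustrated with a small sketch. This is a toy model in plain Python, not the BCFtools tool or its CWL wrapper: the data layout (a dictionary of sites mapping to per-sample genotype tuples), the function names, and the sample data are all assumptions chosen for illustration. Unlike a real merge, samples absent at a site are simply left out rather than filled with missing genotypes.

```python
def merge_sample_sets(*sample_sets):
    """Merge per-site genotype calls from non-overlapping sample sets.

    Each sample set maps site -> {sample_id: (allele, allele)}.
    """
    merged = {}
    for sample_set in sample_sets:
        for site, calls in sample_set.items():
            site_calls = merged.setdefault(site, {})
            overlap = site_calls.keys() & calls.keys()
            if overlap:
                raise ValueError(f"sample sets overlap at {site}: {sorted(overlap)}")
            site_calls.update(calls)
    return merged


def is_monomorphic(calls):
    """True when all called alleles at a site are identical (None = missing)."""
    alleles = {a for genotype in calls.values() for a in genotype if a is not None}
    return len(alleles) <= 1


def drop_monomorphic(merged):
    """Keep only sites where at least two different alleles are observed."""
    return {site: calls for site, calls in merged.items() if not is_monomorphic(calls)}


# Hypothetical input: two studies with disjoint samples (0 = ref, 1 = alt).
study_a = {
    ("chr1", 100): {"A1": (0, 0), "A2": (0, 0)},
    ("chr1", 200): {"A1": (0, 1), "A2": (0, 0)},
}
study_b = {
    ("chr1", 100): {"B1": (0, 0)},
    ("chr1", 200): {"B1": (0, 0)},
    ("chr1", 300): {"B1": (1, 1)},
}

merged = merge_sample_sets(study_a, study_b)
polymorphic = drop_monomorphic(merged)
print(sorted(polymorphic))
```

Here only the site at position 200 survives, because it is the only one where more than one allele is observed across the combined samples. Filtering after the merge matters: a site that is monomorphic within each input file can still be polymorphic once the sample sets are combined.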
ORCHID Clinical Trial Statistical Analysis Reproduction: NHLBI BioData Catalyst has made data available to authorized investigators for the study titled PETAL Network: Outcomes Related to COVID-19 Treated With Hydroxychloroquine Among Inpatients With Symptomatic Disease (ORCHID) Trial, phs002299.v1.p1. The data come from the multi-center, double-blinded, randomized clinical trial conducted to assess the efficacy of hydroxychloroquine in the treatment of COVID-19. Results were published in JAMA on November 9th, 2020 (paper available here). This notebook enables anybody with authorized credentials to reproduce the ORCHID clinical trial results by showing how to 1) access the data using the PIC-SURE API and 2) reproduce the results of this study using the open-source R programming language. The notebook is available in the Seven Bridges Public Project under PIC-SURE API, or through the PIC-SURE GitHub.
The table below lists the studies included in the 2021-01-15 data release, which covers both TOPMed studies and the Outcomes Related to COVID-19 Treated With Hydroxychloroquine Among Inpatients With Symptomatic Disease (ORCHID) study, conducted by the Prevention and Early Treatment of Acute Lung Injury (PETAL) Clinical Trials Network of NHLBI. Multi-sample VCFs, CRAMs, and unharmonized clinical files were added for 27 TOPMed studies new to BioData Catalyst. Additionally, 7 TOPMed studies previously hosted on BioData Catalyst were updated to the latest study versions. These updates include new CRAMs, unharmonized clinical files, and multi-sample VCFs for Freeze 8. For each study and consent group, VCF files are available on a per-chromosome basis and in an un-tarred format. For the ORCHID study, the associated clinical files were added. The data are now available for access across the entire ecosystem.
Gen3 release notes
PIC-SURE release notes
Study Name
phs I.D. #
Acronym
New to BioData Catalyst
New study version
NHLBI TOPMed: Genome-wide Association Study of Adiposity in Samoans
phs000972
SAS
NHLBI TOPMed: The Genetics and Epidemiology of Asthma in Barbados
phs001143
BAGS
Yes
NHLBI TOPMed: Rare Variants for Hypertension in Taiwan Chinese (THRV)
phs001387
THRV
Yes
NHLBI TOPMed: Pharmacogenomics of Hydroxyurea in Sickle Cell Disease (PharmHU)
phs001466
pharmHU
Yes
NHLBI TOPMed: Study of Asthma Phenotypes and Pharmacogenomic Interactions by Race-Ethnicity (SAPPHIRE)
phs001467
SAPPHIRE_asthma
Yes
NHLBI TOPMed: MyLifeOurFuture (MLOF) Hemophilia Study
phs001515
MLOF
Yes
NHLBI TOPMed: Diabetes Heart Study (DHS) African American Coronary Artery Calcification (AA CAC)
phs001412
AACAC
Yes
NHLBI TOPMed: Novel Risk Factors for the Development of Atrial Fibrillation in Women
phs001040
WGHS
Yes
NHLBI TOPMed: The Vanderbilt Atrial Fibrillation Registry (VU_AF)
phs001032
VU_AF
NHLBI TOPMed: The Genetic Epidemiology of Asthma in Costa Rica
phs000988
CRA
Yes
NHLBI TOPMed - NHGRI CCDG: MGH Atrial Fibrillation Study
phs001062
MGH_AF
Yes
NHLBI TOPMed: Australian Familial Atrial Fibrillation Study
phs001435
AustralianFamilialAF
Yes
NHLBI TOPMed: African American Sarcoidosis Genetics Resource
phs001207
Sarcoidosis
Yes
NHLBI TOPMed: CHS Gene-Air Pollution Interactions in Asthma (GAP)
phs001602
ChildrensHS_GAP
Yes
NHLBI TOPMed: CHS (Effects of Air Pollution on the Development of Obesity in Children)
phs001604
ChildrensHS_MetaAir
Yes
NHLBI TOPMed - NHGRI CCDG: AFLMU
phs001543
AFLMU
Yes
NHLBI TOPMed - NHGRI CCDG: Malmo Preventive Project (MPP)
phs001544
MPP
Yes
NHLBI TOPMed - NHGRI CCDG: Intermountain INSPIRE Registry
phs001545
INSPIRE_AF
Yes
NHLBI TOPMed: Texas Cardiac Arrhythmia Institute - DECAF Study
phs001546
DECAF
Yes
NHLBI TOPMed: Early-onset Atrial Fibrillation in the Estonian Biobank
phs001606
EGCUT
Yes
NHLBI TOPMed: CHS Integrative Genomics and Environmental Research of Asthma (IGERA)
phs001603
ChildrensHS_IGERA
Yes
NHLBI TOPMed: Pulmonary Fibrosis Whole Genome Sequencing
phs001607
IPF
Yes
NHLBI TOPMed - NHGRI CCDG: The GENetics in Atrial Fibrillation (GENAF) Study
phs001547
GENAF
Yes
NHLBI TOPMed: Chicago Initiative to Raise Asthma Health Equity (CHIRAH)
phs001605
CHIRAH
Yes
NHLBI TOPMed: Outcome Modifying Genes in Sickle Cell Disease (OMG)
phs001608
OMG_SCD
Yes
NHLBI TOPMed - NHGRI CCDG: Vanderbilt University BioVU Atrial Fibrillation Genetics Study
phs001624
BioVU_AF
Yes
NHLBI TOPMed: Lung Tissue Research Consortium (LTRC)
phs001662
LTRC
Yes
NHLBI TOPMed CCDG: Groningen Atrial Fibrillation (GGAF) Study
phs001725
GGAF
Yes
NHLBI TOPMed: Pathways to Immunologically Mediated Asthma (PIMA)
phs001727
PIMA
Yes
NHLBI TOPMed: Best ADd-on Therapy Giving Effective Response (BADGER)
phs001728
CARE_BADGER
Yes
NHLBI TOPMed: Characterizing the Response to a Leukotriene Receptor Antagonist and an Inhaled Corticosteroid (CLIC)
phs001729
CARE_CLIC
Yes
NHLBI TOPMed: Pediatric Asthma Controller Trial (PACT)
phs001730
CARE_PACT
Yes
NHLBI TOPMed: TReating Children to Prevent EXacerbations of Asthma (TREXA)
phs001732
CARE_TREXA
Yes
PETAL Network: Outcomes Related to COVID-19 Treated With Hydroxychloroquine Among Inpatients With Symptomatic Disease (ORCHID) Trial
phs002299
ORCHID
Yes