2021-01-15 BioData Catalyst Ecosystem Release Notes
Introduction
The 2021-01-15 release is the fourth release of the NHLBI BioData Catalyst ecosystem. It includes several new features (e.g., CWL workflows that create the dataset-specific files needed for GWAS) along with documentation and tutorials to help new users get started on the system. It also adds CWL tools for post-GWAS analysis and a CWL tool for BCFtools Merge and Filter. The sections below describe the new features and user support materials in more detail.
The 2021-01-15 data release adds TOPMed studies and the ORCHID study, conducted by the PETAL Clinical Trials Network of NHLBI. Multi-sample VCFs, CRAMs, and unharmonized clinical files were added for 27 TOPMed studies new to BioData Catalyst. Additionally, 7 TOPMed studies previously hosted on BioData Catalyst were updated to the latest study versions. These updates include new CRAMs, unharmonized clinical files, and multi-sample VCFs for Freeze 8. For each study and consent group, VCF files are available per chromosome and in untarred format. For the ORCHID study, the associated clinical files were added.
For more information, please refer to the Data Release section below and to the Data page on the BioData Catalyst website.
Significant new features
CWL workflows to create dataset-specific files needed for GWAS: The following CWL workflows for creating the dataset-specific files needed for GWAS are now available in the Seven Bridges Public Apps Gallery:
LD Pruning - Filter variants based on linkage disequilibrium measures
KING robust and KING IBDseg - Estimate kinship coefficients
PC-AiR - Perform principal components analysis
PC-Relate - Estimate genetic relatedness
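To give a sense of what the LD Pruning step does conceptually, the following is a minimal sketch in Python. It is not the implementation used by the Seven Bridges workflow: the function names, the toy dosage data, and the 0.1 threshold are all assumptions made for illustration, and real tools typically compare variants only within a sliding window rather than across the whole dataset.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype dosage vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a variant with no variation carries no LD information
    return cov * cov / (var_x * var_y)


def ld_prune(dosages, threshold=0.1):
    """Greedily keep variants whose r^2 with every kept variant is <= threshold.

    `dosages` maps a variant ID to a list of alternate-allele counts (0, 1, 2),
    one entry per sample. The threshold value is an illustrative assumption.
    """
    kept = []
    for variant_id, dosage in dosages.items():
        if all(r_squared(dosage, dosages[k]) <= threshold for k in kept):
            kept.append(variant_id)
    return kept


# Hypothetical toy data: rs2 duplicates rs1, rs3 is uncorrelated with rs1.
toy = {
    "rs1": [0, 1, 2, 0, 1, 2],
    "rs2": [0, 1, 2, 0, 1, 2],
    "rs3": [1, 0, 1, 1, 2, 1],
}
pruned = ld_prune(toy)
print(pruned)
```

In this toy example rs2 is dropped because it is in complete LD with rs1, while rs3 is retained.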
CWL tools for post-GWAS analysis: The following CWL tools for post-GWAS analysis are now available in the Seven Bridges Public Apps Gallery:
SBG Loci Snapshoter - Generate screenshots of specific regions of aligned files provided as inputs
LocusZoom - Standalone tool for generating static locus zoom plots. Users can make annotated Manhattan plots on specific regions from association files generated with the GENESIS association workflows.
CWL tool for BCFtools Merge and Filter: A CWL tool for BCFtools Merge and Filter is now available in the Seven Bridges Public Apps Gallery. This tool merges multiple VCF/BCF files from non-overlapping sample sets into one multi-sample file and filters out any monomorphic variants. It is useful when working with input files that contain monomorphic variants, such as the TOPMed datasets.
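The merge-then-filter logic described above can be illustrated with a small, self-contained Python sketch. This is a conceptual toy, not the BCFtools-based tool itself: the data layout (variant key mapped to per-sample genotype strings), the function names, and the sample IDs are all hypothetical, and the sketch assumes every input contains the same variants.

```python
def merge_sample_sets(*sample_sets):
    """Merge per-variant genotype maps from non-overlapping sample sets."""
    merged = {}
    for sample_set in sample_sets:
        for variant, genotypes in sample_set.items():
            target = merged.setdefault(variant, {})
            overlap = target.keys() & genotypes.keys()
            if overlap:
                # The inputs are expected to share no samples.
                raise ValueError(f"overlapping samples: {sorted(overlap)}")
            target.update(genotypes)
    return merged


def is_monomorphic(genotypes):
    """True if all called alleles at the site are identical (or none are called)."""
    alleles = set()
    for gt in genotypes.values():
        for allele in gt.replace("|", "/").split("/"):
            if allele != ".":  # skip missing calls
                alleles.add(allele)
    return len(alleles) <= 1


def drop_monomorphic(merged):
    """Keep only variants that show more than one allele across all samples."""
    return {v: g for v, g in merged.items() if not is_monomorphic(g)}


# Hypothetical inputs: two cohorts with different samples, same variants.
cohort_a = {
    "chr1:100:A:G": {"A1": "0/0", "A2": "0/1"},
    "chr1:200:C:T": {"A1": "0/0", "A2": "0/0"},
}
cohort_b = {
    "chr1:100:A:G": {"B1": "1/1"},
    "chr1:200:C:T": {"B1": "0/0"},
}
filtered = drop_monomorphic(merge_sample_sets(cohort_a, cohort_b))
print(sorted(filtered))
```

Here the second variant is removed because every sample in the merged set is homozygous reference, which is the situation that arises when a subset of samples is drawn from a larger joint call set.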
New user support materials and documentation
Genetic Association Testing Using the GENESIS Workflows tutorial: Seven Bridges updated this tutorial to show how to perform an association test with the GENESIS workflows on TOPMed Freeze 8 multi-sample VCF data. Previous versions of this tutorial used TOPMed Freeze 5 data. Version 1.1 of this tutorial can be downloaded as a PDF from the Tutorials page of the BioData Catalyst GitBook.
ORCHID Clinical Trial Statistical Analysis Reproduction: NHLBI BioData Catalyst has made data available to authorized investigators for the study titled PETAL Network: Outcomes Related to COVID-19 Treated With Hydroxychloroquine Among Inpatients With Symptomatic Disease (ORCHID) Trial, phs002299.v1.p1. The study is a multi-center, double-blind, randomized clinical trial conducted to assess the efficacy of hydroxychloroquine in the treatment of COVID-19. Results were published in JAMA on November 9, 2020. This notebook enables anyone with authorized credentials to reproduce the ORCHID clinical trial results by showing how to 1) access the data using the PIC-SURE API and 2) reproduce the results of the study using the open-source R programming language. The notebook is available in the Seven Bridges Public Project under PIC-SURE API, or through the PIC-SURE GitHub.
Data release
The table below highlights the studies included in the 2021-01-15 data release. The release includes both TOPMed studies and the Outcomes Related to COVID-19 Treated With Hydroxychloroquine Among Inpatients With Symptomatic Disease (ORCHID) study, conducted by the PETAL Clinical Trials Network of NHLBI. Multi-sample VCFs, CRAMs, and unharmonized clinical files were added for 27 TOPMed studies new to BioData Catalyst. Additionally, 7 TOPMed studies previously hosted on BioData Catalyst were updated to the latest study versions. These updates include new CRAMs, unharmonized clinical files, and multi-sample VCFs for Freeze 8. For each study and consent group, VCF files are available per chromosome and in untarred format. For the ORCHID study, the associated clinical files were added. The data are now available across the entire ecosystem.
For detailed platform release notes, please consult the following resources:
Gen3 release notes
PIC-SURE release notes