2021-04-02 BioData Catalyst Ecosystem Release Notes


Introduction

The 2021-04-02 release marks the fifth release for the NHLBI BioData Catalyst ecosystem. This release includes several new features (e.g., CWL tools for QC pipelines) along with documentation and tutorials to help new users get started on the system. This release also includes enhanced support for searching across documentation. Please find more detail on the new features and user support materials in the sections below.

The 2021-04-02 data release includes updates of CRAMs and unharmonized clinical files for 6 TOPMed studies previously hosted on BioData Catalyst. For each study and consent group, VCF files are available on a per chromosome basis and in an un-tarred format.

Please refer to the Data Releases section below, as well as the Data page on the BioData Catalyst website, for more information.

Significant new features

Documentation Search: BioData Catalyst users can now use Documentation Search to search across various types of documentation over the entire ecosystem. Favorite results can be saved in a folder and revisited later.

CWL tools for QC pipelines: Users can now find the following CWL tools for quality control of GWAS data in the Seven Bridges Public Apps Gallery:

  • Heterozygosity by sample - This UW-GAC tool calculates heterozygosity by sample.

  • Pedigree Check - This UW-GAC workflow checks expected relationships specified in a pedigree file against empirical kinship values from KING or PC-Relate.
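
To make the heterozygosity QC metric concrete, here is a minimal Python sketch of what "heterozygosity by sample" generally means: the fraction of a sample's non-missing genotype calls that are heterozygous. It is not the UW-GAC tool's implementation. The genotype coding (0, 1, 2 alternate-allele counts, `None` for missing), the function name, and the sample names are all assumptions made for illustration.

```python
from typing import Dict, List, Optional

# Assumed coding: 0 = homozygous reference, 1 = heterozygous,
# 2 = homozygous alternate, None = missing call.
Genotype = Optional[int]


def heterozygosity_by_sample(
    genotypes: Dict[str, List[Genotype]]
) -> Dict[str, Optional[float]]:
    """Return, per sample, the fraction of non-missing calls that are heterozygous."""
    result: Dict[str, Optional[float]] = {}
    for sample, calls in genotypes.items():
        called = [g for g in calls if g is not None]
        if not called:
            # No usable calls: report None rather than dividing by zero.
            result[sample] = None
            continue
        het_count = sum(1 for g in called if g == 1)
        result[sample] = het_count / len(called)
    return result


# Hypothetical toy data: three samples, four variants each.
example = {
    "sample_A": [0, 1, 1, 2],
    "sample_B": [0, 0, None, 1],
    "sample_C": [None, None, None, None],
}
rates = heterozygosity_by_sample(example)
```

Samples whose rate is far from that of the rest of the cohort are typically flagged for review, since unusually high or low heterozygosity can indicate contamination or other sample-quality problems.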

Import files from Kids First Data Resource Center: Users can now access datasets from the Kids First Data Resource portal directly from BioData Catalyst Powered by Seven Bridges using DRS links. Users must have dbGaP approvals for the Kids First datasets in order to access the dataset on BioData Catalyst. In addition, users can import DRS links from open access datasets available via DRS servers.
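
For readers unfamiliar with DRS links, the sketch below shows how a hostname-based DRS URI maps to the HTTPS endpoint defined by the GA4GH Data Repository Service (DRS) v1 specification. It only illustrates the URI convention and does not describe how Seven Bridges performs the import. The function name and the host `drs.example.org` are hypothetical. Compact-identifier DRS URIs (of the form `drs://prefix:accession`) need a separate resolver, so this sketch rejects them.

```python
from urllib.parse import quote, urlparse


def drs_uri_to_url(drs_uri: str) -> str:
    """Map a hostname-based DRS URI to its DRS v1 object-metadata URL."""
    parsed = urlparse(drs_uri)
    if parsed.scheme != "drs":
        raise ValueError(f"not a DRS URI: {drs_uri!r}")
    host = parsed.netloc
    object_id = parsed.path.lstrip("/")
    if not host or not object_id:
        # Compact-identifier URIs have no path component and land here.
        raise ValueError(f"expected drs://<hostname>/<id>, got {drs_uri!r}")
    # Percent-encode the ID so it is safe as a single path segment.
    return f"https://{host}/ga4gh/drs/v1/objects/{quote(object_id, safe='')}"


# Hypothetical host and object ID, for illustration only.
url = drs_uri_to_url("drs://drs.example.org/314159")
```

A client would then issue an authenticated GET request to that URL to retrieve the object's metadata and access methods. Controlled-access data still requires the dbGaP approvals described above.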

PIC-SURE Data Access Dashboard: Users on PIC-SURE can now see a list of studies with data available in PIC-SURE. The Data Access Dashboard shows the study name, identifier, and the number of variables/samples present. Additionally, users can see whether they have access to a study, or follow a link to learn more about the study and request access to studies they are not yet authorized to use.

Query annotations for all SNVs and dbSNP INDELs in the Annotation Explorer: Users on Seven Bridges can now use the Annotation Explorer to interactively aggregate and filter all SNVs (over 8 billion variants) and publicly available INDELs from dbSNP using ~700 annotations. Variant grouping files can be created from the results and exported to a workspace for use in rare variant association testing. This database is available to all authenticated users of BioData Catalyst. See here for more information about how to use the Annotation Explorer.

New user support materials and documentation

Best Practices for Secure and FAIR workflows were published on Dockstore to help users developing containers and descriptor files for their bioinformatics pipelines.

The Gitbook guide to self-service onboarding to Terra has been revamped.

Created and published all materials from the Fellows onboarding session (Webinar 1), including a recording of the session, all materials used, and instructions. Additional webinars will be posted in the coming weeks.

Launched a 3-part video tutorial series on workflows, which helps users, particularly Fellows who are new to the platform, gain more insight into how best to use workflows for their data analysis:

  • Part 1 - How to run a pre-configured workflow

  • Part 2 - How to configure and run a workflow from scratch

  • Part 3 - How to run downstream analysis (on the data that resulted from your workflow)

Published a blog post walking researchers through how they can leverage free cloud credits from Google Cloud in Terra. Published a related blog post covering additional funding sources for researchers’ cloud costs, highlighting Google EDU, which provides up to $10,000 in coupons for supported research projects, and the NIH STRIDES initiative. Further, Terra added a new support documentation article covering how the call caching feature in Cromwell can help users save time and money.

Started a new blog post series focused on highlighting papers that may be of interest to the BDCatalyst community. This first post covers a review paper about workflow systems from C. Titus Brown’s lab at UC Davis.

Published a blog post officially announcing that RStudio is available in Terra; it includes a new video tutorial for getting up and running.

Uploaded a new video tutorial demonstrating the use of Terra for viral genomics by guiding the user through the COVID-19 workspace.

Published a blog post introducing a new feature for task-level checkpointing in workflows. This makes it possible to save intermediate outputs for a task and resume work from that point if the task gets interrupted. Full documentation of this checkpoint feature can be found here.

Uploaded a video on Broad’s BioIT 2020 Talk proposing a cross-domain, common data model built specifically to facilitate search and reuse.

Data Releases

The table below highlights which studies were included in the 2021-04-02 data release. CRAMs and unharmonized clinical files were updated for 6 TOPMed studies previously hosted on BioData Catalyst. For each study and consent group, VCF files are available on a per chromosome basis and in an un-tarred format. The data is now available for access across the entire ecosystem.

| Study Name | phs I.D. # | Acronym | New to BioData Catalyst | New study version |
| --- | --- | --- | --- | --- |
| Framingham Cohort | phs000007 | FHS | False | 30 |
| Genetic Epidemiology Network of Salt Sensitivity (GenSalt) | phs000784 | GenSalt | False | 3 |
| Atherosclerosis Risk in Communities (ARIC) Cohort | phs000280 | ARIC | False | 7 |
| Genes-Environments and Admixture in Latino Asthmatics (GALA II) Study | phs001180 | GALAII | False | 2 |
| Cardiovascular Health Study (CHS) Cohort | phs000287 | CHS | False | 7 |
| Women's Health Initiative Clinical Trial and Observational Study | phs000200 | WHI | False | 12 |

For detailed platform release notes, please consult the following resources:

  • Gen3 release notes

  • PIC-SURE release notes

  • Terra release notes

  • Seven Bridges release notes

  • Dockstore release notes