This forum is a great place to find and post questions about Docker files, workflow languages, Dockstore features, and workflow learning resources. The user base includes CWL, WDL, Nextflow, and Galaxy workflow authors and users.
How to use Dockstore workflows in our cloud partner platforms
Using the NHLBI BioData Catalyst ecosystem, you can launch workflows from Dockstore in both of our partner analysis platforms, Terra and SevenBridges. It is important to know that these platforms use different workflow languages: Terra uses WDL and SevenBridges uses CWL.
When you open any WDL or CWL workflow in Dockstore, you will see the option to "Launch with NHLBI BioData Catalyst":
If you selected a CWL workflow, this workflow will launch in BioData Catalyst Powered by SevenBridges.
If you selected a WDL workflow, this workflow will launch in BioData Catalyst Powered by Terra.
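The routing described above can be summarised in a few lines of Python. This is only an illustrative sketch of the rule stated in the text (CWL goes to SevenBridges, WDL goes to Terra); the names `PLATFORM_BY_LANGUAGE` and `launch_platform` are hypothetical and are not part of Dockstore or either platform.

```python
# Mapping taken from the text above: SevenBridges runs CWL, Terra runs WDL.
# The constant and function names are hypothetical, for illustration only.
PLATFORM_BY_LANGUAGE = {
    "CWL": "BioData Catalyst Powered by SevenBridges",
    "WDL": "BioData Catalyst Powered by Terra",
}


def launch_platform(descriptor_language: str) -> str:
    """Return the partner platform a workflow in the given language launches in."""
    key = descriptor_language.strip().upper()
    try:
        return PLATFORM_BY_LANGUAGE[key]
    except KeyError:
        # Languages without a listed partner platform are rejected explicitly.
        raise ValueError(
            f"No BioData Catalyst partner platform is listed for {descriptor_language!r}"
        ) from None


print(launch_platform("wdl"))
```

A workflow written in a language that neither platform is described as supporting here (for example Nextflow) raises a `ValueError` in this sketch rather than guessing a destination.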
Technologies for reproducible analysis in the cloud
Docker is a fantastic tool for creating lightweight containers to run your tools. It gives you a fast, VM-like environment for Linux where you can automatically install dependencies, make configurations, and set up your tool exactly the way you want, just as you would on a “normal” Linux host. You can then quickly and easily share these Docker images with the world using registries like Quay.io (indexed by Dockstore), Docker Hub, and GitLab.
Learn how to create a Docker image
There are multiple workflow languages currently available to use with Docker technology. In the BioData Catalyst ecosystem, SevenBridges uses CWL and Terra uses WDL. To learn more about how these languages compare and differ, read Dockstore's documentation on tools and workflows.
Once you have picked what language works best for you, prepare your pipeline for analysis in the cloud with these tutorials aimed at bioinformaticians:
Learn how to create a tool in Common Workflow Language (CWL)
Learn how to create a tool in Workflow Description Language (WDL)
Dockstore’s integration with BioData Catalyst lets researchers easily launch reproducible tools and workflows in secure workspace environments for use with sensitive data. This privilege to work with sensitive data requires assurances of safe software.
We believe we can enhance the security and reliability of tools and workflows through open, community-driven best practices that exemplify the FAIR (Findable, Accessible, Interoperable, Reusable) guiding principles. We have established a best practices framework for secure and FAIR workflows published in Dockstore. We ask that users try to implement these practices for all workflows they develop.
Dockstore offers faceted search, which allows for flexible querying of tools and workflows. Tabs are used to split up the results between tools and workflows. You can search for basic terms/phrases, filter using facets (like CWL vs WDL), and also use advanced search queries. Learn more.
You can also search curated workflows in Dockstore's page.
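To make the idea of faceted search concrete, here is a minimal Python sketch that combines a basic search term with facet filters. It is not Dockstore's implementation or API: the `Entry` class, the `faceted_search` function, and the catalogue entries are all hypothetical and were invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    name: str
    entry_type: str  # "tool" or "workflow", mirroring the two result tabs
    language: str    # descriptor language facet, e.g. "CWL" or "WDL"


def faceted_search(entries, term="", **facets):
    """Keep entries whose name contains `term` and whose fields match every facet."""
    term = term.lower()
    return [
        e for e in entries
        if term in e.name.lower()
        and all(getattr(e, field) == value for field, value in facets.items())
    ]


# Hypothetical catalogue entries, invented for this example.
catalogue = [
    Entry("example-aligner", "workflow", "WDL"),
    Entry("example-aligner", "workflow", "CWL"),
    Entry("example-variant-caller", "tool", "CWL"),
]

cwl_workflows = faceted_search(
    catalogue, term="align", entry_type="workflow", language="CWL"
)
print([e.name for e in cwl_workflows])
```

Each facet narrows the result set independently, so dropping the `language` facet in this sketch would return both the CWL and the WDL version of the example workflow.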
Organizations are landing pages for collaborations, institutions, consortiums, companies, etc. that allow users to showcase tools and workflows. This is achieved through the creation of collections, which are groupings of related tools and workflows. Learn more about organizations, including how your research group can create your own organization to share your work with the community.
Dockstore Organizations relevant to BioData Catalyst users:
Here, you can find a suite of analysis tools we have developed with researchers that are aimed at the BioData Catalyst community. Examples include workflows for performing GWAS and Structural Variant Calling. Many of these collections also point users to tutorials where you can launch these workflows in our partner platforms and run an analysis.
These workflows are based on pipelines the University of Michigan developed to perform alignment and variant calling on TOPMed data. If you're bringing your own data to BioData Catalyst to compare with TOPMed data, these may be helpful resources.
"An app store for bioinformatics workflows"
Dockstore is an open platform used by the GA4GH for sharing Docker-based tools described with the Common Workflow Language (CWL), the Workflow Description Language (WDL), or Nextflow (NFL). Dockerized workflows come packaged with all of their requirements, meaning you spend less time searching the web for obscure installation errors and more time doing research.
Dockstore is aimed at scientific use cases, and we hope this helps users find helpful resources more quickly. Our documentation is also created with researchers in mind: we work to distill down information about the technologies we use to the relevant points to get users started quickly.
This section highlights the documentation relevant to BioData Catalyst users. If you are brand new to Dockstore, we suggest reviewing the Getting Started Guide. Our entire suite of documentation is available here.
Our mission is to catalyze open, reproducible research in the cloud
We hope Dockstore provides a reference implementation for tool sharing in the sciences. Dockstore is essentially a living and evolving proof of concept designed as a starting point for two activities that we hope will result in community standards within the GA4GH:
a best practices guide for describing tools in Docker containers with CWL/WDL/Nextflow
a minimal web service standard for registering, searching and describing CWL/WDL-annotated Docker containers that can be federated and indexed by multiple websites
We plan on expanding the Dockstore in several ways over the coming months. Please see our issues page for details and discussions.
To help Dockstore grow, we encourage users to publish their tools and workflows on Dockstore so that they can be used by the greater scientific community. Here is how to get started:
Register your tool or workflow on Dockstore
Create an Organization, invite your collaborators, and promote your work in collections