# Discovering Data Using Gen3

## Login to the *BDC-Gen3* Platform <a href="#login-to-nhlbi-biodata-catalyst-gen3" id="login-to-nhlbi-biodata-catalyst-gen3"></a>

To navigate and access data available on the Gen3 platform, start by visiting the [login page](https://gen3.biodatacatalyst.nhlbi.nih.gov/login). You will need an eRA Commons account as well as access permissions through the [Database of Genotypes and Phenotypes (dbGaP)](https://www.ncbi.nlm.nih.gov/gap/). If you are a researcher, log in by selecting **NIH Login** and using your [eRA Commons account](https://public.era.nih.gov/commons/public/login.do). BDC consortia developers can log in using their Google accounts. Make sure to use the login method that is associated with access to your projects.

![](/files/-M_RTIr5tYe8eHlCCCYx)

Once logged in, your username will appear in the upper right-hand corner of the page. You will also see a display with aggregate statistics for the total number of subjects, studies, aliquots and files available within the BDC platform.

> **NOTE**: These numbers may differ from those displayed in the dbGaP records as they include TOPMed studies as well as the associated parent studies.

![Post-login view of the BDC-Gen3 front page.](/files/-MN_W3soMQMCqDxp24t3)

## Types of Hosted Data <a href="#types-of-hosted-data" id="types-of-hosted-data"></a>

### Phenotypic <a href="#phenotypic" id="phenotypic"></a>

#### DCC Harmonized clinical data <a href="#dcc-harmonized-clinical-data" id="dcc-harmonized-clinical-data"></a>

A number of clinical variables have been harmonized by the [Data Coordinating Center (DCC)](https://www.nhlbiwgs.org/group/dcc) in order to facilitate cross-study analysis. Faceted search over the DCC Harmonized Variables is available via the [Exploration](/biodata-catalyst-documentation/written-documentation/explore-available-data/gen3-discovering-data/exploration.md) page, under the "Data" tab.

#### Unharmonized clinical data <a href="#unharmonized-clinical-data" id="unharmonized-clinical-data"></a>

Unharmonized clinical files are also available on the Gen3 platform and contain all of the raw phenotypic information for the hosted studies. Unlike the DCC Harmonized Variables, these files can be found and searched under the "[Files](/biodata-catalyst-documentation/written-documentation/explore-available-data/gen3-discovering-data/exploration.md#the-files-tab)" tab in the [Exploration](/biodata-catalyst-documentation/written-documentation/explore-available-data/gen3-discovering-data/exploration.md) page.

### Genomic <a href="#genomic" id="genomic"></a>

The Gen3 platform hosts genomic data provided by the [Trans-Omics for Precision Medicine](https://www.nhlbiwgs.org/) (TOPMed) program and the [1000 Genomes Project](https://www.internationalgenome.org/), plus synthetic tutorial data from Terra. At present, these projects include CRAM and VCF files together with their respective index files. For TOPMed projects specifically, each project contains at least one multi-sample VCF that comprises all subjects within the consent group. CRAM and VCF files are provided per individual, whereas multi-sample VCFs are provided per study consent group.
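As a rough illustration of how data files pair with index files, the Python sketch below maps a data file name to the index file name it would conventionally have (`.crai` for CRAM, `.tbi` for a bgzipped VCF). The file names and the helper `expected_index_name` are hypothetical, and the index files actually hosted on the platform may follow a different naming scheme (for example, `.csi` indexes for VCFs).

```python
# Conventional index suffixes for common genomic file types.
# Assumption: the platform's actual index file names may differ.
INDEX_SUFFIX = {
    ".cram": ".crai",
    ".vcf.gz": ".tbi",  # a .csi index is also common for bgzipped VCFs
}


def expected_index_name(filename):
    """Return the conventional index file name for a CRAM or bgzipped VCF.

    Returns None when the file type is not recognized.
    """
    for data_suffix, index_suffix in INDEX_SUFFIX.items():
        if filename.endswith(data_suffix):
            return filename + index_suffix
    return None


# Hypothetical file names, used only for illustration.
for name in ["sample01.cram", "sample01.vcf.gz", "phenotypes.tsv"]:
    print(name, "->", expected_index_name(name))
```

A helper like this can be useful when checking that every data file selected for download has its matching index file alongside it.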

All files are available under the "Files" tab in the [Exploration](/biodata-catalyst-documentation/written-documentation/explore-available-data/gen3-discovering-data/exploration.md#the-files-tab) page. More detailed information on currently hosted data on the Gen3 platform can be found [here](/biodata-catalyst-documentation/written-documentation/explore-available-data/gen3-discovering-data/parent-study-versus-topmed-study.md).

## Gen3 Pages <a href="#gen3-pages" id="gen3-pages"></a>

The *BDC-Gen3* platform contains five pages, described below:

* [**Dictionary**](/biodata-catalyst-documentation/written-documentation/explore-available-data/gen3-discovering-data/dictionary.md)**:** An interactive data dictionary display that details the contents and relationships between clinical and biospecimen data
* [**Exploration**](/biodata-catalyst-documentation/written-documentation/explore-available-data/gen3-discovering-data/exploration.md)**:** A faceted-filter tool for creating custom cohorts
* [**Query**](/biodata-catalyst-documentation/written-documentation/explore-available-data/gen3-discovering-data/query.md)**:** The GraphQL query tool to retrieve specific data within the graph model
* [**Workspace**](/biodata-catalyst-documentation/written-documentation/explore-available-data/gen3-discovering-data/workspace.md)**:** The launch page for Gen3 workspaces that includes Jupyter Notebooks and RStudio
* [**Profile**](/biodata-catalyst-documentation/written-documentation/explore-available-data/gen3-discovering-data/profile.md)**:** The information page for each user, displaying access and the location for credential file downloads

![The BDC-Gen3 Pages.](/files/-M1R96b3LGl4RkECCXWt)
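The **Query** page listed above accepts GraphQL queries against the graph model. As a minimal sketch of what such a query looks like, the Python snippet below assembles a query string and wraps it in a JSON body. The node name `subject`, the fields `submitter_id` and `project_id`, and the `first` argument are assumptions for illustration; check the Dictionary page for the nodes and properties that are actually defined. The helper `build_graphql_payload` is hypothetical and not part of any Gen3 library.

```python
import json


def build_graphql_payload(node, fields, first=10):
    """Build a JSON request body for a simple GraphQL selection.

    node   -- name of the graph node to query (assumed, e.g. "subject")
    fields -- list of property names to return for each record
    first  -- maximum number of records to request
    """
    selection = " ".join(fields)
    query = f"{{ {node}(first: {first}) {{ {selection} }} }}"
    return json.dumps({"query": query})


# Hypothetical node and field names, used only for illustration.
payload = build_graphql_payload("subject", ["submitter_id", "project_id"], first=5)
print(payload)
```

The query text inside the payload can be pasted into the Query page to try it interactively before using it in a script.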

