
Forest Dussault, Software Engineer - Nov 01, 2022

Processing deep sequencing data with the Gencove platform

Overview

At Gencove, our goal is to provide an end-to-end solution that makes sample processing, sequencing, analysis, and data delivery reliable for our partners. By combining low-pass sequencing and imputation on a sophisticated technical stack, we process high volumes of samples every day at a reduced cost for our customers.

While low-pass sequencing followed by imputation is a powerful tool, use cases such as de novo discovery of variants or detection of known rare variants require sequencing at higher depths of coverage. To serve these use cases and to help move towards Gencove’s vision of ubiquitous sequencing, we’re happy to announce support for our deep whole genome sequencing (WGS) pipeline!

Gencove can facilitate a cost-effective approach to identifying rare variants in your samples through the WGS pipeline. We are happy to accommodate existing high-coverage FASTQ datasets (e.g. >20X), or to provide the lab support needed for an end-to-end process.
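If you are unsure whether an existing dataset clears a depth threshold such as 20X, mean coverage can be approximated as total sequenced bases divided by genome size. The sketch below is an illustration only and is not part of the Gencove platform; the function names and the approximate human genome size of 3.1 Gbp are assumptions made for this example.

```python
import math
from typing import Iterable

# Assumption: approximate haploid human genome size in base pairs.
HUMAN_GENOME_SIZE_BP = 3_100_000_000


def estimate_coverage(read_lengths: Iterable[int],
                      genome_size_bp: int = HUMAN_GENOME_SIZE_BP) -> float:
    """Approximate mean depth: total sequenced bases / genome size."""
    if genome_size_bp <= 0:
        raise ValueError("genome_size_bp must be positive")
    return sum(read_lengths) / genome_size_bp


def reads_needed(target_coverage: float,
                 read_length: int,
                 genome_size_bp: int = HUMAN_GENOME_SIZE_BP) -> int:
    """Number of reads of a fixed length needed to reach a target mean depth."""
    if read_length <= 0:
        raise ValueError("read_length must be positive")
    return math.ceil(target_coverage * genome_size_bp / read_length)


if __name__ == "__main__":
    # Toy example: twenty 150 bp reads over a 100 bp "genome".
    print(estimate_coverage([150] * 20, genome_size_bp=100))
    # Reads of 150 bp needed for 20X over the assumed human genome size.
    print(reads_needed(20, 150))
```

This estimate ignores duplicates, unmapped reads, and uneven coverage, so the depth actually achieved after alignment will typically be somewhat lower.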

Usage

Our web platform, CLI, and API make it easy to work with your data. Running the pipeline is as simple as creating a project on the Gencove system with the relevant WGS configuration. The example below shows this configuration for humans; we are able to support other species as well.

Next, assign your samples to the newly created project. This will automatically spin up the cloud resources required to run your samples through the deep sequencing pipeline.

In just a few hours, your results will be ready to retrieve through the web platform or the CLI.

Pipeline

The WGS pipeline calls variants against a reference genome. It performs basic FASTQ validation, aligns reads, calls variants, and finally estimates a genome evenness metric, which represents how much of the total genome was captured by sequencing.
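To give a sense of what basic FASTQ validation can involve, here is a minimal sketch. It is not Gencove's implementation; the function name `iter_fastq_records` is hypothetical, and the checks assume the common four-line-per-record FASTQ layout (header starting with `@`, sequence, separator starting with `+`, quality string of the same length as the sequence).

```python
from typing import Iterable, Iterator, List, Tuple


def iter_fastq_records(lines: Iterable[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (header, sequence, quality); raise ValueError on a malformed record.

    Assumes four lines per record (multi-line FASTQ is not handled).
    """
    buffer: List[str] = []
    for line_number, line in enumerate(lines, start=1):
        buffer.append(line.rstrip("\r\n"))
        if len(buffer) < 4:
            continue
        header, sequence, separator, quality = buffer
        start = line_number - 3  # line where this record began
        if not header.startswith("@"):
            raise ValueError(f"line {start}: header must start with '@'")
        if not separator.startswith("+"):
            raise ValueError(f"line {start + 2}: separator must start with '+'")
        if not sequence:
            raise ValueError(f"line {start + 1}: empty sequence")
        if len(sequence) != len(quality):
            raise ValueError(
                f"line {start}: sequence and quality lengths differ"
            )
        buffer = []
        yield header, sequence, quality
    if buffer:
        # Leftover lines mean the file ended in the middle of a record.
        raise ValueError("truncated record at end of input")


if __name__ == "__main__":
    example = ["@read1\n", "ACGT\n", "+\n", "IIII\n"]
    for record in iter_fastq_records(example):
        print(record)
```

Because the function is a generator over any iterable of lines, it can be pointed at an open file handle (for example one returned by `gzip.open(path, "rt")`) without loading the whole file into memory.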

Outputs from the pipeline made available on the Gencove platform include an alignment file (BAM), filtered and unfiltered variant call files (VCF), and an estimate of genome evenness (TXT).
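The exact definition of the evenness estimate is not spelled out here. As a loose illustration of the general idea of "how much of the genome was captured", the sketch below computes breadth of coverage, the fraction of positions sequenced to at least a minimum depth. This is a related, commonly used quantity and an assumption on our part for the example, not necessarily the metric the pipeline reports; the function name and the `min_depth` parameter are hypothetical.

```python
from typing import Sequence


def breadth_of_coverage(depths: Sequence[int], min_depth: int = 1) -> float:
    """Fraction of positions whose depth is at least min_depth.

    `depths` holds one per-base depth value per reference position.
    """
    if len(depths) == 0:
        raise ValueError("depths must not be empty")
    covered = sum(1 for depth in depths if depth >= min_depth)
    return covered / len(depths)


if __name__ == "__main__":
    # Toy per-base depths for an 8-position reference.
    toy_depths = [0, 12, 25, 30, 0, 22, 18, 40]
    print(breadth_of_coverage(toy_depths))                # positions with any coverage
    print(breadth_of_coverage(toy_depths, min_depth=20))  # positions at >= 20X
```

In practice, per-base depths would be derived from the alignment file (BAM) with a depth-reporting tool rather than written out by hand.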

Access to the WGS pipeline is available upon request. If you’d like to know more or have suggestions, please reach out to us!