
Jeremy Li, Director of Data Science - Jul 26, 2023

Gencove platform update offers phased genotypes, improved imputation of sex chromosomes and faster processing times

Keeping pace with the latest bioinformatics tooling is critical to navigating the dynamic landscape of genomics. We are pleased to announce several optimizations and improvements to our standard pipeline offerings that reduce processing time and resource usage.

Phased Genotypes to Support Haplotypic Analyses

With this update, we now deliver phased genotypes as a matter of course. This unlocks downstream analyses that require haplotypic (rather than just genotypic) information, without the need to run an additional statistical phasing step on the genotypes we deliver.
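As an illustration of what phasing adds, here is a minimal Python sketch that assumes the genotypes are read from VCF-style GT strings, where "|" separates phased alleles and "/" separates unphased ones. The function names parse_gt and haplotypes are hypothetical helpers written for this example and are not part of any Gencove API.

```python
from typing import List, Optional, Tuple


def parse_gt(gt: str) -> Tuple[List[Optional[int]], bool]:
    """Parse a VCF GT string such as '0|1', '0/1', '1', or '.'.

    Returns the allele indices (None for a missing allele) and whether
    the call uses the phased separator '|'.
    """
    phased = "|" in gt
    tokens = gt.replace("|", "/").split("/")
    alleles = [None if t == "." else int(t) for t in tokens]
    return alleles, phased


def haplotypes(gts: List[str]) -> Tuple[List[Optional[int]], List[Optional[int]]]:
    """Split one sample's phased diploid calls into its two haplotypes.

    Hypothetical helper: raises ValueError if any site is unphased or
    not diploid, since allele order carries no meaning in that case.
    """
    hap_a: List[Optional[int]] = []
    hap_b: List[Optional[int]] = []
    for gt in gts:
        alleles, phased = parse_gt(gt)
        if not phased or len(alleles) != 2:
            raise ValueError(f"not a phased diploid genotype: {gt!r}")
        hap_a.append(alleles[0])
        hap_b.append(alleles[1])
    return hap_a, hap_b
```

With phased calls such as "0|1", "1|0", "1|1" at three consecutive sites, haplotypes returns the allele sequence carried on each chromosome copy. The same calls written with "/" would only say which alleles are present at each site, not which copy carries them.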

Improved Imputation of Sex Chromosomes

In addition, we have been gradually rolling out a number of significant improvements to our human imputation pipelines, including ploidy-aware imputation and the inclusion of the Y chromosome and the mitochondrial genome, on both builds 37 and 38 of the human reference genome.

Ploidy-aware imputation means that the inferred genetic sex of a sample, which we always compute in our human pipelines, dictates the ploidy at which the sex chromosomes are imputed. For inferred males, haploid X and Y genotypes are imputed. For inferred females, diploid X genotypes are imputed and Y genotypes are reported as missing. This simplifies downstream analysis by automatically delivering variant calls at the ploidy implied by the individual’s sex.
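The rule described above can be written down as a small lookup. The following Python sketch is an illustration only, not Gencove's implementation: the function name sex_chrom_ploidy, the "male"/"female" labels, and the use of 0 to stand for "reported as missing" are all assumptions made for this example, and it ignores special cases such as the pseudoautosomal regions.

```python
def sex_chrom_ploidy(chrom: str, inferred_sex: str) -> int:
    """Return the ploidy at which a sex chromosome would be imputed.

    Hypothetical helper encoding the rule from the text:
      inferred male   -> haploid X (1), haploid Y (1)
      inferred female -> diploid X (2), Y missing (0 here, by assumption)
    """
    # Accept both 'X' and 'chrX' style contig names.
    name = chrom[3:] if chrom.lower().startswith("chr") else chrom
    name = name.upper()
    sex = inferred_sex.lower()

    if name not in ("X", "Y"):
        raise ValueError(f"not a sex chromosome: {chrom!r}")
    if sex not in ("male", "female"):
        raise ValueError(f"unknown inferred sex: {inferred_sex!r}")

    if sex == "male":
        return 1
    return 2 if name == "X" else 0
```

A downstream script could use a check like this to confirm that the number of alleles in each sex-chromosome genotype matches what it expects for the sample's inferred sex.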

Underlying Algorithm Update for Speed Improvements

GLIMPSE2, an update to the GLIMPSE1 imputation algorithm that we regularly use for genotype imputation from low-pass sequence data, was recently released. It offers a substantial runtime speedup thanks to a number of optimizations. We have therefore made GLIMPSE2 the default algorithm for our imputation pipelines, so users can expect even faster turnaround times for sample processing.

These improvements significantly reduce the amount of work users of the Gencove platform need to do to prepare the imputed calls for downstream analyses, while also shortening the time to result. The updates are now live on Gencove’s platform as publicly available pipelines called “Human low-pass GRCh37 v4.0” and “Human low-pass GRCh38 v4.1” for builds 37 and 38 of the reference genome, respectively.