
Caitlin M Stewart, Assay Development Scientist & Matthew Gibson, Senior Data Scientist - Feb 05, 2024

A comparison between low-cost library preparation kits for low coverage sequencing

As sequencing costs continue to drop, the upstream (library preparation) and downstream (data analysis & management) pieces of next-generation sequencing are becoming more important. The costs associated with library preparation have remained constant, so finding cost-saving modifications to this step is now a priority, especially at Gencove in our mission towards ubiquitous sequencing.

With a growing number of commercial library preparation kits available, choosing the right kit for the job is critical, but systematic comparisons to guide that choice have been lacking. To address this gap, we experimentally examined the performance of several library preparation kits in the context of low coverage whole genome sequencing (lcWGS) with imputation, and built a resource for scientists choosing a library preparation kit for their next sequencing project.

Our results show that miniaturizing library prep protocols provides extensive cost savings without sacrificing imputation performance in lcWGS. Integrated DNA Technologies (IDT), Roche, and Illumina kits can all be successfully miniaturized. Because performance was approximately equivalent between kits, the choice of kit comes down to whether you need full-length adapters for a PCR-free workflow, need the shortest library preparation time, or plan to use long-read sequencing.

Our analysis provides a resource for scientists and decision-makers choosing between methods for lcWGS where cost matters.

Study Design

Ninety-six human samples were run through miniaturized versions of the IDT, Roche, and Illumina kits, as well as the full-sized IDT kit. Miniaturized versions of the Roche and Illumina kits have previously been compared to their full-sized versions [1,2]. Libraries were sequenced on an Illumina NextSeq 2000, aligned to the human reference genome GRCh38, and imputed against a state-of-the-art human reference panel, HGDP1KG. The HGDP1KG panel is a combined resource incorporating both the New York Genome Center's 1KG panel (NYGC1KG) and the Human Genome Diversity Project (HGDP). It features 4,091 samples and covers 76.4 million variants. Leave-One-Out (LOO) concordance (a measure of similarity between the imputed and truth genotypes) was calculated, and library statistics, cost, and other factors were compared.
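To make the concordance idea concrete, here is a minimal Python sketch. It is an illustration only, not the pipeline used in the study: it assumes genotypes are encoded as alternate-allele counts (0, 1, or 2), uses `None` for a missing call, and treats concordance as the simple fraction of comparable sites where the imputed and truth genotypes agree. The function name and encoding are hypothetical.

```python
from typing import Optional, Sequence


def genotype_concordance(
    imputed: Sequence[Optional[int]],
    truth: Sequence[Optional[int]],
) -> float:
    """Fraction of comparable sites where imputed and truth genotypes match.

    Assumption: genotypes are alternate-allele counts (0, 1, 2) and None
    marks a missing call. Sites missing in either input are skipped.
    """
    if len(imputed) != len(truth):
        raise ValueError("imputed and truth must cover the same sites")

    compared = 0
    matches = 0
    for imp, tru in zip(imputed, truth):
        if imp is None or tru is None:
            continue  # cannot compare a missing genotype
        compared += 1
        if imp == tru:
            matches += 1

    if compared == 0:
        raise ValueError("no comparable sites")
    return matches / compared


# Toy example: 5 comparable sites, 4 of which agree.
truth_calls = [0, 1, 2, 0, 1, None]
imputed_calls = [0, 1, 2, 1, 1, 2]
print(genotype_concordance(imputed_calls, truth_calls))  # 0.8
```

In a leave-one-out setting, the truth genotypes would come from a panel sample that is held out of the reference panel before imputation. That step is outside the scope of this sketch.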

Findings

Miniaturization of the IDT kit was successful. All library prep methods showed high LOO concordance. Duplication rate was slightly higher for the IDT kits, but this did not affect performance. Effective coverage (a metric describing the fraction of the reference panel covered by at least one low-pass read) was slightly lower for the IDT miniaturized kit, likely due to over-fragmentation of the DNA, which can easily be solved by decreasing the fragmentation time.
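As a rough illustration of effective coverage as defined above, the Python sketch below counts the fraction of panel sites overlapped by at least one read. It rests on simplifying assumptions that are not from the study: a single chromosome, reads given as half-open `(start, end)` intervals, and panel sites given as integer positions. The function names are hypothetical.

```python
from bisect import bisect_right
from typing import List, Sequence, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on one chromosome


def merge_intervals(reads: Sequence[Interval]) -> List[Interval]:
    """Merge overlapping or touching read intervals into disjoint blocks."""
    merged: List[Interval] = []
    for start, end in sorted(reads):
        if merged and start <= merged[-1][1]:
            # Extend the previous block instead of starting a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def effective_coverage(panel_sites: Sequence[int], reads: Sequence[Interval]) -> float:
    """Fraction of panel sites covered by at least one read."""
    if not panel_sites:
        raise ValueError("panel_sites must not be empty")

    merged = merge_intervals(reads)
    starts = [start for start, _ in merged]

    covered = 0
    for pos in panel_sites:
        # Find the last merged block that starts at or before this site.
        i = bisect_right(starts, pos) - 1
        if i >= 0 and pos < merged[i][1]:
            covered += 1
    return covered / len(panel_sites)


# Toy example: three of the four panel sites fall inside a read.
sites = [100, 250, 400, 900]
toy_reads = [(50, 200), (150, 300), (850, 1000)]
print(effective_coverage(sites, toy_reads))  # 0.75
```

Under this definition, reads that pile up on the same sites or fall between panel sites add nothing to the metric, so sequencing more bases does not raise effective coverage unless new panel sites are reached.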

Figure 1: Histograms of performance metrics across the tested kits: Illumina miniaturized, IDT, IDT miniaturized, and Roche miniaturized. Note: y-axes are not shared between panels.

In terms of lab operations, the Illumina miniaturized kit was the fastest to complete, although it required the most steps on the liquid handler (Agilent BRAVO) and thus more hands-on time. Miniaturizing the IDT kit saves >$15 per sample, bringing its cost in line with the other miniaturized kits.

| Kit | Time (hours) | Cost per sample | Agilent BRAVO steps |
|---|---|---|---|
| Roche mini | 3 | <$5 | 3 |
| Illumina mini | 2 | <$5 | 5 |
| IDT | 3 | >$20 | N/A^ |
| IDT mini | 3 | <$5 | 3 |

Table 1: Summary of lab operations considerations for each kit.

The Roche miniaturized kit and the IDT kits can be used with full-length adapters (useful for PCR-free workflows), and the IDT kits can be adapted to decrease fragmentation if you plan to run on a long-read sequencer.

Conclusions

If you want the lowest cost:
Miniaturization significantly reduces costs across all library prep kits, with the study demonstrating an 83.3% reduction in reagent usage for the miniaturized IDT kit compared to full-sized reactions. This approach leverages smaller reaction volumes without compromising library quality or yield, providing a cost-effective solution for large-scale genomic studies.
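For a sense of scale, an 83.3% reduction means the miniaturized reaction uses roughly one sixth of the reagents of a full-sized one. The short Python sketch below works through that arithmetic and the per-sample savings bound implied by Table 1. The 60 and 10 unit volumes are hypothetical placeholders chosen only to give a 6:1 ratio, not the study's actual reaction volumes. The $20 and $5 figures are the bounds from the table (full-sized IDT >$20, miniaturized <$5), so the result is a lower bound on savings.

```python
def reagent_reduction(full_volume: float, mini_volume: float) -> float:
    """Fractional reduction in reagent use when scaling a reaction down."""
    if full_volume <= 0 or mini_volume < 0:
        raise ValueError("volumes must be positive")
    return 1 - mini_volume / full_volume


def minimum_savings(full_cost_floor: float, mini_cost_ceiling: float, n_samples: int) -> float:
    """Lower bound on total savings, given a floor on the full-sized cost
    and a ceiling on the miniaturized cost (both per sample)."""
    return (full_cost_floor - mini_cost_ceiling) * n_samples


# Hypothetical volumes with a 6:1 ratio (not taken from the study).
print(round(reagent_reduction(60, 10) * 100, 1))  # 83.3

# Table 1 bounds: full-sized IDT > $20, miniaturized < $5, for one 96-sample plate.
print(minimum_savings(20, 5, 96))  # 1440, so savings exceed $1,440 per plate
```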

If you want your library ready the fastest:
The Illumina kit stands out for its rapid turnaround, requiring approximately 2 hours from start to finish. This efficiency is attributed to its tagmentation-based process, which combines fragmentation and adapter tagging into a single step.

If you have a PCR-free workflow:
For PCR-free workflows, the Roche and IDT kits are appropriate because they are compatible with full-length adapters. This feature is particularly relevant for applications sensitive to PCR amplification bias or those that require direct sequencing of native DNA fragments, such as certain epigenetic analyses or studies focusing on DNA modifications.

If you're running on a long-read sequencer:
Adaptations to the IDT kits can reduce fragmentation, making them more suitable for long-read sequencing. This adaptation involves adjusting the fragmentation using a decelerator (available from IDT) to produce larger fragment sizes.

Choosing the right library preparation kit is a critical step in getting your sequencing project off the ground. The end-to-end Gencove platform for genetic data generation, analysis, and management allows us to work closely with our customers to streamline and accelerate the journey from populations and samples to actionable insights and solutions. This study offers the first systematic comparison of library preparation kits for low-pass sequencing and provides a reference for choosing between them.

Read more in our bioRxiv paper.

References

1. Pillay, S. et al. Evaluation of miniaturized Illumina DNA preparation protocols for SARS-CoV-2 whole genome sequencing. PLoS One (2023) 18(4): e0283219.

2. Li, J. H., Mazur, C. A., Berisa, T. & Pickrell, J. K. Low-pass sequencing increases the power of GWAS and decreases measurement error of polygenic risk scores compared to genotyping arrays. Genome Research (2021) 31(4): 529-537.