Tomaz Berisa, CTO - Oct 22, 2020

Sample metadata

Users can now assign arbitrary JSON-formatted metadata to Gencove samples for downstream use.

We’ve been receiving consistent feedback from users saying they would like to store additional non-genomic data together with genomic data for their Gencove samples. These requests can be grouped into two broad clusters according to the nature of the data being stored:

  • Phenotypic data about the individual represented by the sample
  • Technical processing data
    • Batch information
    • Alternative or auxiliary identifiers

Common to these requests was the ability to search for samples by their metadata.

A fundamental question of genetics: "How does the genome (G) influence phenotypes (P)?"

Since a fundamental question of genetics is "How does the genome influence phenotypes?", we were keenly aware that connecting genomic data with non-genomic data is crucial, and we wanted to get this feature right. The solution needed to be flexible enough to support diverse data types across a wide variety of use cases, yet simple enough to stay easy to use.

We settled on JSON (JavaScript Object Notation) as the data format. It is a widespread and open data interchange format that supports common data types. The first Google result for JSON describes it best.

JSON can be as simple as a string "string" or a number 123, but it shines with structured data like dictionaries (objects in JSON terminology) and lists (arrays in JSON terminology).
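To make this concrete, here is a minimal sketch in Python of what structured sample metadata could look like, using only the standard json module. The field names, identifiers, and the matches helper are invented for illustration and are not part of the Gencove platform or CLI; the filter at the end runs locally and only mimics the idea of searching samples by metadata.

```python
import json

# Hypothetical metadata for one sample. Every field name and value here is
# an assumption made up for this example.
metadata = {
    "phenotype": {"height_cm": 172.5, "coat_color": "black"},
    "batch": "2020-10-A",
    "alternative_ids": ["LAB-0042", "FREEZER-7B"],
}

# Serialize the dictionary to a JSON string, then parse it back.
encoded = json.dumps(metadata, sort_keys=True)
decoded = json.loads(encoded)


def matches(sample_metadata, key, value):
    """Return True if a top-level key holds exactly the given value."""
    return sample_metadata.get(key) == value


# A local, in-memory stand-in for searching samples by metadata.
samples = {
    "sample-1": decoded,
    "sample-2": {"batch": "2020-10-B", "alternative_ids": []},
}
batch_a = [name for name, meta in samples.items()
           if matches(meta, "batch", "2020-10-A")]

print(encoded)
print(batch_a)
```

Nested objects and arrays survive the round trip unchanged, which is what lets one format hold both phenotypic data and technical processing data.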

The Gencove platform now enables users to easily assign, retrieve, and search sample metadata with the newest version of the Gencove CLI (v2.0.26), while updates to the Gencove web dashboard are on the way.

Assign, retrieve, and search sample metadata on the Gencove platform

We’re quite excited about our work on downstream applications that build on metadata, so stay tuned and, as always, let us know what you think.