Global initiative aims to identify disease-linked genome variations
A newly formed international science collaboration that aims to identify disease-linked variations in the human genome is calling for dataset contributions.
The goal of the Consortium for Long Read Sequencing (CoLoRS) is to create a publicly available database of long-read human genome sequences.
Work on entries for the database is expected to start this year, and the consortium has invited investigators with "raw or summary level" human genome datasets to contribute.
CoLoRS describes itself as an open coalition of international researchers focused on creating a comprehensive database of frequency information for all classes of human variation identified using long-read human whole-genome sequencing.
Long-read sequencing accesses regions of the genome inaccessible to other technologies and is capable of detecting up to 15,000 more structural variants and 300,000 more small variants.
CoLoRS plans to complement existing databases, help improve the discovery of pathogenic variation, and advance the understanding of rare disease, for which more than half of cases remain unexplained after short-read genome sequencing.
Edd Lee, Director of Human Genomics Segment Marketing at sequencing solutions developer PacBio, which has played a lead role in driving the consortium, described the initiative as a "much-needed resource for the genomics research community".
"Population frequency is a key tool for interpreting genetic variation. CoLoRS will extend this tool to the variation uniquely detected by HiFi sequencing, particularly structural variants, tandem repeats, and small variants in regions of the genome that are difficult to sequence using other technologies," added Lee.
CoLoRS' global founder members representing research hospitals, universities, and laboratories will provide datasets for the initial set of genomes.
Data will be accessible via the National Human Genome Research Institute's (NHGRI) Analysis, Visualization and Informatics Lab-space (AnVIL), a cloud-based genomic data sharing and analysis platform.
Supporting funds have been provided by the US National Institutes of Health Office of Data Science Strategy and NHGRI.
Michael Schatz, Bloomberg Distinguished Professor at Johns Hopkins University, USA, said the new database would mean "we will finally be able to consider all types of variation across the entire human genome."