The Genome Reference Consortium (GRC) is an international collective of academic and research institutes with expertise in genome mapping, sequencing, and informatics, formed to improve the representation of reference genomes. When the human reference was first described, it was clear that some regions could not be closed with existing technology. The main reason for improving the reference assemblies is that they are the cornerstones upon which all whole-genome studies are based.
When the Human Genome Project was completed in 2003, many thought that meant the entire human genome had been fully sequenced. In reality there were sections of the sequence that remained unfinished, containing gaps, missing information and misaligned or misrepresented regions. To improve the sequence quality and accuracy of the human assembly, the Genome Reference Consortium (GRC) was created.
The Genome Reference Consortium (GRC) was founded in 2007 to improve the reference genome assemblies of human, mouse and zebrafish. One of the first tasks was to modernize the assembly model so that complex variation within a species can be captured and represented. The GRC also guarantees INSDC submission and long-term maintenance of all assemblies it produces. This is achieved through genome analysis, additional sequencing, and the collection of other data, for instance optical mapping.
Initially the focus was on the human and mouse reference genomes, but in mid-to-late 2010 full maintenance and improvement of the zebrafish genome sequence was also added to the GRC. The goal of the consortium is to correct the small number of regions in the reference that are currently misrepresented, to close as many remaining gaps as possible, and to produce alternative assemblies of structurally variant loci when necessary.
The GRC is a concerted effort that interacts with many groups in the scientific community; however, the primary member institutes are:
- The Wellcome Sanger Institute
- The McDonnell Genome Institute at Washington University
- The European Bioinformatics Institute
- The National Center for Biotechnology Information
Some of the problems associated with sequencing the human genome include hard-to-sequence repetitive or variable regions, inaccurately assembled areas, and regions where no sequence exists at all. The GRC's collective expertise in genome mapping, sequencing and informatics is helping to correct such issues. The GRC is closing remaining gaps and providing alternative assemblies, where needed, to reflect an improved understanding of genomic structure and variation.
The GRC produces two types of assembly updates:
(1) Major releases, in which chromosome coordinates are changed
(2) Minor releases, in which chromosome coordinates do not change and updates are provided as standalone patch scaffold sequences
The consortium is open to collaboration. It encourages scientists worldwide to report sequencing issues and inaccuracies they encounter, which are then logged in the GRC's issue-tracking systems.
As of September 2019, the major assembly releases for human, mouse, zebrafish, and chicken are GRCh38, GRCm38, GRCz11, and GRCg6a respectively. Major assembly releases do not follow a fixed cycle; between them, however, there are "minor" assembly updates in the form of genome patches, which either correct errors in the assembly or add alternate loci.
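The practical consequence of the major/minor distinction is that coordinates can be compared directly only between releases of the same major assembly. The sketch below illustrates this by parsing assembly names. It assumes the common `.pN` suffix for patch releases (for example `GRCh38.p13`), which this text does not state, and the helper names `parse_assembly` and `shares_coordinates` are hypothetical, not part of any GRC tool.

```python
import re
from typing import NamedTuple

# Assumed naming pattern (not defined in the text above):
# "GRC" + species letter + major version (digits, optional letter as in "6a")
# + optional ".pN" patch level.
_NAME = re.compile(r"^GRC([a-z])(\d+[a-z]?)(?:\.p(\d+))?$")

# Species letters taken from the assembly names listed above.
SPECIES = {"h": "human", "m": "mouse", "z": "zebrafish", "g": "chicken"}


class Assembly(NamedTuple):
    species: str
    major: str
    patch: int  # 0 means the unpatched major release


def parse_assembly(name: str) -> Assembly:
    """Split a name such as 'GRCh38.p13' into species, major version, patch."""
    match = _NAME.match(name)
    if match is None or match.group(1) not in SPECIES:
        raise ValueError(f"unrecognised assembly name: {name!r}")
    letter, major, patch = match.groups()
    return Assembly(SPECIES[letter], major, int(patch or 0))


def shares_coordinates(a: str, b: str) -> bool:
    """True when both names are the same major release of the same species.

    Minor (patch) releases leave chromosome coordinates unchanged, so only
    the species and major version need to match.
    """
    x, y = parse_assembly(a), parse_assembly(b)
    return (x.species, x.major) == (y.species, y.major)
```

For instance, `shares_coordinates("GRCh38", "GRCh38.p13")` is true because a patch release leaves chromosome coordinates unchanged, whereas `shares_coordinates("GRCh37", "GRCh38")` is false because a major release changes them.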