In bioinformatics, a genome browser is a graphical interface for displaying information from a biological database of genomic data. Genome browsers enable researchers to visualize and browse entire genomes together with annotated data, including gene prediction and structure, proteins, expression, regulation, variation, and comparative analysis. The annotated data usually come from multiple, diverse sources. Genome browsers differ from ordinary biological databases in that they display data graphically, with genome coordinates on one axis and annotations or space-filling graphics that show analyses of the genes, such as their frequency or expression profiles.
A large number of genome browsers are available, many of them free and with databases accessible online. Among the best known are the UCSC Genome Browser, the Ensembl Genome Browser and NCBI’s Genome Data Viewer. These browsers support multiple genomes, whereas other genome browsers are specific to particular species. Genome browsers may provide a summary of data from genomic databases and a comparative assessment of genetic sequences across multiple species, and they allow the data to be visualized in many ways to facilitate assessment and interpretation of these complex data. In browsers that show two coordinate axes, one at the top and one at the bottom, the two displays are independent, and each can be scrolled horizontally or zoomed in and out to compare inter-chromosomal and intra-chromosomal interactions visually.
Scientists often need to visualize and analyze their own in-house data together with publicly available data, and to combine multiple sources of public data in their own analyses. The scientific community actively shares generated data with the public by creating tracks or track hubs, such as those collated in the Track Hub Registry.
One of the key challenges for genome browsers is the visualization of diverse types of data generated by different technologies and of the interactions and relationships between different elements from different genomic regions.
Web-based genome browsers can be classified into general genome browsers with multiple species and species-specific genome browsers.
The first type comprises the multiple-species genome browsers implemented in, among others, the UCSC genome database, the Ensembl project, the NCBI Map Viewer website, and the Phytozome and Gramene platforms. These genome browsers integrate sequence and annotations for dozens of organisms and further promote cross-species comparative analysis. Most of them contain abundant annotations, covering gene models, transcript evidence, expression profiles, regulatory data, genomic conservation, etc. Each set of pre-computed annotation data is called a track. The essence of a genome browser is to stack multiple tracks along the Y-axis under a shared genomic coordinate axis, so that users can easily examine where the annotation data agree or differ and judge the function or other features of a genomic region.
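The track-stacking idea described above can be illustrated with a minimal Python sketch. It is not taken from any particular genome browser: the `Feature` class, the track names and the coordinates are all hypothetical, and intervals are assumed to be 0-based and half-open. Each track is a list of features, and every track is drawn as one row of text against the same horizontal coordinate axis.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """One annotated interval (hypothetical model; 0-based, half-open)."""
    chrom: str
    start: int
    end: int
    name: str


def features_in_view(tracks, chrom, view_start, view_end):
    """Return, for each track, the features overlapping the viewing window."""
    return {
        track_name: [
            f for f in features
            if f.chrom == chrom and f.start < view_end and f.end > view_start
        ]
        for track_name, features in tracks.items()
    }


def render_text(tracks, chrom, view_start, view_end, width=60):
    """Draw every track as one text row; all rows share the same x-axis."""
    scale = (view_end - view_start) / width  # bases per character column
    label_width = max((len(name) for name in tracks), default=0)
    rows = []
    visible = features_in_view(tracks, chrom, view_start, view_end)
    for track_name, features in visible.items():
        row = ["."] * width
        for f in features:
            first = max(0, math.floor((f.start - view_start) / scale))
            last = min(width, math.ceil((f.end - view_start) / scale))
            # Always paint at least one column so short features stay visible.
            for col in range(first, max(last, first + 1)):
                row[col] = "#"
        rows.append(f"{track_name:<{label_width}} " + "".join(row))
    return rows


# Hypothetical example data: two tracks over an invented chromosome.
EXAMPLE_TRACKS = {
    "genes": [
        Feature("chr1", 100, 200, "geneA"),
        Feature("chr1", 500, 600, "geneB"),
        Feature("chr2", 100, 200, "geneC"),
    ],
    "variants": [Feature("chr1", 150, 151, "variant1")],
}

if __name__ == "__main__":
    for line in render_text(EXAMPLE_TRACKS, "chr1", 0, 300, width=30):
        print(line)
```

Because both rows are drawn against the same window, a reader can see at a glance that the single variant falls inside the first gene. Real browsers add indexing, zoom-dependent rendering and many track types that this sketch leaves out.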
The second type comprises the species-specific genome browsers, which mainly focus on one model organism and may offer more annotations for that species. The Generic Model Organism Database (GMOD) project collects dozens of open-source software tools for creating and managing genomic databases, and the GBrowse framework is one of its most popular tools. Most of these species-specific genome browsers, such as those of MGI, FlyBase, WormBase, SGD and TAIR, are built on the GBrowse framework.
A genome browser is a computer program that displays genomic data in a user-friendly manner. It typically takes very large files, such as whole-genome FASTA files, and displays them so that users can make sense of the information they contain. In most cases, genome browsers are designed to integrate different types of data stored in different file formats. For example, annotation files, which record the locations of genes in the genome, can also be loaded into a genome browser so that the gene locations can be inspected visually. This matters because results are easier to interpret when they are seen in their genomic context rather than in isolation.
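As a rough illustration of combining a sequence file with an annotation file, the following Python sketch reads a toy FASTA record and a toy annotation table, then looks up the sequence under each annotated gene. The annotation is assumed to use the tab-separated BED layout (chromosome, start, end, name, with 0-based half-open coordinates); the function names and all of the data are invented for the example and do not describe how any specific browser loads files.

```python
import io


def read_fasta(handle):
    """Parse FASTA text into a dict mapping sequence name to sequence."""
    sequences = {}
    name = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                sequences[name] = "".join(chunks)
            # Use the first word of the header line as the sequence name.
            name = line[1:].split()[0]
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        sequences[name] = "".join(chunks)
    return sequences


def read_bed(handle):
    """Parse BED-style lines into (chrom, start, end, name) tuples."""
    genes = []
    for line in handle:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end, name = line.rstrip("\n").split("\t")[:4]
        genes.append((chrom, int(start), int(end), name))
    return genes


def gene_sequences(sequences, genes):
    """Return the sequence under each annotated gene, keyed by gene name."""
    return {
        name: sequences[chrom][start:end]
        for chrom, start, end, name in genes
        if chrom in sequences
    }


# Invented toy inputs standing in for real (much larger) files.
FASTA_TEXT = ">chr1 toy sequence\nACGTACGTAC\nGGGTTTAAAC\n"
BED_TEXT = "chr1\t2\t6\tgeneA\nchr1\t10\t16\tgeneB\n"

if __name__ == "__main__":
    genome = read_fasta(io.StringIO(FASTA_TEXT))
    annotation = read_bed(io.StringIO(BED_TEXT))
    for gene, seq in gene_sequences(genome, annotation).items():
        print(gene, seq)
```

Whole-genome files are far too large to hold in memory this way; production tools generally rely on indexed formats so that only the region on screen has to be read.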