The Gene Expression Omnibus (GEO) database is an international public data repository that archives and freely distributes high-throughput gene expression and other functional genomics data sets.
GEO supports MIAME-compliant data submissions. Array- and sequence-based data are accepted. Tools are provided to help users query and download experiments and curated gene expression profiles.
The three main goals of GEO are to:
- Provide a robust, versatile database in which to efficiently store high-throughput functional genomic data
- Offer simple submission procedures and formats that support complete and well-annotated data deposits from the research community
- Provide user-friendly mechanisms that allow users to query, locate, review and download studies and gene expression profiles of interest
The GEO DataSets database stores original submitter-supplied records (Series, Samples and Platforms) as well as curated DataSets. Curated DataSets form the basis of GEO’s advanced data display and analysis features, including tools for identifying differences in gene expression levels and for viewing cluster heatmaps. GEO Profiles are derived from GEO DataSets. Not all original submitter-supplied records have been assembled into curated DataSets yet.
The GEO DataSets database can be searched using many different attributes including keywords, organism, DataSet type and authors. Examples and full details about how to search for GEO DataSets of interest are provided in the Querying GEO DataSets and GEO Profiles page.
The GEO Profiles database stores gene expression profiles derived from curated GEO DataSets. Each Profile is presented as a chart that displays the expression level of one gene across all Samples within a DataSet. Experimental context is provided in the bars along the bottom of the charts, making it possible to see at a glance whether a gene is differentially expressed across different experimental conditions. Profiles have various types of links, including internal links that connect genes that exhibit similar behavior and external links to relevant records in other NCBI databases.
GEO Profiles can be searched using many different attributes, including keywords, gene symbols, gene names and GenBank accession numbers. Searches can also be restricted to Profiles flagged as being differentially expressed.
GEO2R is an interactive web tool that allows users to compare two or more groups of Samples in a GEO Series in order to identify genes that are differentially expressed across experimental conditions. Results are presented as a table of genes ordered by significance.
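The idea of comparing groups of Samples and ordering genes by the strength of the evidence can be illustrated with a minimal sketch. It is not GEO2R's own method, which is not described here; Welch's t-statistic is used as a simple stand-in, and the gene names and expression values are hypothetical toy numbers.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t-statistic for two groups of expression values."""
    standard_error = sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    return (mean(group_a) - mean(group_b)) / standard_error


# Hypothetical toy data: gene -> (control values, treated values).
expression = {
    "GENE_A": ([5.0, 5.2, 4.8], [9.0, 9.1, 8.9]),
    "GENE_B": ([7.0, 7.5, 6.5], [7.1, 7.4, 6.6]),
    "GENE_C": ([3.0, 3.4, 2.6], [4.0, 4.6, 3.4]),
}

# Order genes from the largest to the smallest absolute t-statistic.
ranked = sorted(
    expression,
    key=lambda gene: abs(welch_t(*expression[gene])),
    reverse=True,
)

for gene in ranked:
    print(gene, round(welch_t(*expression[gene]), 2))
```

Ranking by the absolute t-statistic is only a rough proxy for ordering by significance, because it ignores degrees of freedom and multiple-testing correction.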
GEO DataSets and GEO Profiles are part of NCBI’s network of Entrez databases. As with the other Entrez databases, data of interest may be located simply by entering keywords into the GEO DataSets or GEO Profiles search boxes. The Advanced Search and Limits pages, linked at the head of the GEO DataSets and GEO Profiles pages, help in constructing complex queries.
All data in GEO can be downloaded in several formats through several mechanisms. The download options are:
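Because both databases are Entrez databases, the same kind of query can also be expressed as an E-utilities ESearch URL. The following is a minimal sketch that only builds the URL and does not contact NCBI. The helper names `build_term` and `esearch_url` are hypothetical, and the field tags used (`Organism`, `Entry Type`) are assumptions that should be checked against the Querying GEO DataSets and GEO Profiles page.

```python
from urllib.parse import urlencode

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_term(keywords, **fields):
    """Combine free-text keywords with fielded clauses joined by AND."""
    clauses = [keywords] if keywords else []
    for field, value in fields.items():
        # Underscores in the argument name stand for spaces in the field tag.
        tag = field.replace("_", " ")
        clauses.append(f'"{value}"[{tag}]')
    return " AND ".join(clauses)


def esearch_url(db, term, retmax=20):
    """Build an ESearch URL; db is 'gds' (DataSets) or 'geoprofiles' (Profiles)."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{ESEARCH}?{urlencode(params)}"


# Assumed field tags; the keyword and values are illustrative only.
term = build_term("breast cancer", Organism="Homo sapiens", Entry_Type="gds")
url = esearch_url("gds", term)
print(term)
print(url)
```

The resulting URL can be opened in a browser or fetched with any HTTP client; swapping `db` to `geoprofiles` targets GEO Profiles instead.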
- FTP link download
- Accession Display bar
- Construction of URL
- Programmatic Access
- Entrez GEO DataSets and Profiles query downloads
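Two of the options above, construction of a URL and FTP link download, follow predictable patterns that can be sketched in code. The sketch below only builds strings and does not download anything. The function names are hypothetical, and the `acc.cgi` parameters (`acc`, `targ`, `form`, `view`) and the `nnn` directory layout are stated as understood at the time of writing; verify them against the GEO download documentation before relying on them.

```python
import re
from urllib.parse import urlencode

ACC_CGI = "https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi"
FTP_ROOT = "https://ftp.ncbi.nlm.nih.gov/geo"
FTP_DIRS = {"GSE": "series", "GDS": "datasets", "GPL": "platforms", "GSM": "samples"}


def accession_url(acc, targ="self", form="text", view="quick"):
    """Build an Accession Display URL for one GEO record."""
    params = {"acc": acc, "targ": targ, "form": form, "view": view}
    return f"{ACC_CGI}?{urlencode(params)}"


def ftp_directory(acc):
    """Build the FTP-site directory for an accession such as GSE1234."""
    match = re.fullmatch(r"(GSE|GDS|GPL|GSM)(\d+)", acc.upper())
    if match is None:
        raise ValueError(f"not a recognized GEO accession: {acc!r}")
    prefix, digits = match.groups()
    # The range directory replaces the last three digits with 'nnn'.
    stub = f"{prefix}{digits[:-3]}nnn"
    return f"{FTP_ROOT}/{FTP_DIRS[prefix]}/{stub}/{prefix}{digits}/"


print(accession_url("GSE1234"))
print(ftp_directory("GSE1234"))
print(ftp_directory("GDS507"))
```

Accessions with three or fewer digits fall into the plain `nnn` directory (for example `GDSnnn`), which the slice `digits[:-3]` handles by yielding an empty string.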