The Database of Interacting Proteins (DIP) is a biological database which catalogs experimentally determined interactions between proteins. It combines information from a variety of sources to create a single, consistent set of protein–protein interactions. The data stored within DIP have been curated, both manually, by expert curators, and automatically, using computational approaches that utilize the knowledge about the protein–protein interaction networks extracted from the most reliable, core subset of the DIP data. The database was initially released in 2002.
The Database of Interacting Proteins (DIP) aims to integrate the diverse body of experimental knowledge about interacting proteins into a single, easily accessed database. Biological knowledge about protein–protein interactions is contained in many different scientific journals and in archives such as MEDLINE.
The primary goal of DIP is to extract and integrate the wealth of information about protein–protein interactions into a user-friendly environment. Organism-specific databases also exist, such as:
- YPD for yeast
- EcoCyc for Escherichia coli
- FlyNet for Drosophila
DIP can be searched through its web interface; searches may be based on the interactions.
In its original conception, information on protein interactions was stored in DIP as a single text file. To handle the growing body of data effectively, DIP has since been implemented as a relational database queried with SQL, specifically using the MySQL database management system. A relational database efficiently handles diverse types of data and enables rapid sorting and analysis. The database can be conveniently extended as required, without altering the existing database content, by adding new fields and tables to the data structure.
The DIP database is composed of three linked tables: a table of protein information, a table of protein–protein interactions, and a table describing details of experiments detecting the protein–protein interactions.
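The three-table layout, and the way a relational design can be extended without touching existing content, can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and every table name, column name and data value is a hypothetical placeholder rather than part of the actual DIP schema.

```python
import sqlite3

# Hypothetical schema: the names below are illustrative placeholders,
# not the real DIP table or column names.
SCHEMA = """
CREATE TABLE protein (
    protein_id INTEGER PRIMARY KEY,
    gene_name  TEXT NOT NULL,
    organism   TEXT NOT NULL
);
CREATE TABLE interaction (
    interaction_id INTEGER PRIMARY KEY,
    protein_a      INTEGER NOT NULL REFERENCES protein(protein_id),
    protein_b      INTEGER NOT NULL REFERENCES protein(protein_id)
);
CREATE TABLE experiment (
    experiment_id  INTEGER PRIMARY KEY,
    interaction_id INTEGER NOT NULL REFERENCES interaction(interaction_id),
    technique      TEXT NOT NULL,
    citation       TEXT
);
"""


def build_example_db():
    """Create an in-memory database with the three linked tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    # Placeholder rows, invented purely for the example.
    conn.executemany(
        "INSERT INTO protein VALUES (?, ?, ?)",
        [(1, "geneA", "organism X"), (2, "geneB", "organism X")],
    )
    conn.execute("INSERT INTO interaction VALUES (1, 1, 2)")
    conn.execute(
        "INSERT INTO experiment VALUES (1, 1, 'two-hybrid', 'placeholder citation')"
    )
    # Extending the structure: a new field is added to an existing table.
    # Rows already stored are left unchanged and hold NULL in the new column.
    conn.execute("ALTER TABLE protein ADD COLUMN superfamily TEXT")
    conn.commit()
    return conn


def interactions_with_experiments(conn):
    """Join the three tables: both protein names plus the detection technique."""
    return conn.execute(
        """
        SELECT pa.gene_name, pb.gene_name, e.technique
        FROM interaction AS i
        JOIN protein AS pa ON pa.protein_id = i.protein_a
        JOIN protein AS pb ON pb.protein_id = i.protein_b
        JOIN experiment AS e ON e.interaction_id = i.interaction_id
        ORDER BY i.interaction_id, e.experiment_id
        """
    ).fetchall()
```

Keeping experiments in their own table lets one interaction be linked to several independent experiments without duplicating the protein or interaction rows.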
DIP can be searched in a variety of ways. One can look for interactions involving a specific protein by entering its gene name or its accession code from GenBank, PIR or SWISS-PROT. More general searches can be performed for information such as organisms, protein superfamilies, keywords, experimental techniques or literature citations. Multiple fields can be searched simultaneously to narrow the query, and wildcards and regular expressions are supported to further aid searching. A search returns a list of protein–protein interactions, each hyperlinked to a DIP entry. Each resulting DIP entry reports information about the two interacting proteins, the protein domains and range of amino acids involved, the curator, the dates of entry and of updating, the articles describing the interaction, and the corresponding experiments. For example, a search on a single protein returns all of the interactions recorded in DIP in which that protein participates.
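The idea of combining several search fields, with wildcards and regular expressions, can be illustrated with a small in-memory sketch. It does not reproduce DIP's web interface or query code: the record fields, the `search` function and the example data are all hypothetical, and wildcard matching is done here with Python's `fnmatch` module as an assumed stand-in.

```python
import re
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class InteractionRecord:
    # Hypothetical, simplified view of one interaction entry.
    protein_a: str
    protein_b: str
    organism: str
    technique: str


def search(records, *, protein=None, organism=None, technique_regex=None):
    """Return records that satisfy every criterion supplied (logical AND).

    protein         -- shell-style wildcard matched against either partner
    organism        -- exact organism name
    technique_regex -- regular expression searched in the technique field
    """
    pattern = re.compile(technique_regex) if technique_regex else None
    hits = []
    for rec in records:
        if protein is not None and not (
            fnmatchcase(rec.protein_a, protein)
            or fnmatchcase(rec.protein_b, protein)
        ):
            continue
        if organism is not None and rec.organism != organism:
            continue
        if pattern is not None and not pattern.search(rec.technique):
            continue
        hits.append(rec)
    return hits


# Placeholder data, invented purely for the example.
EXAMPLE_RECORDS = [
    InteractionRecord("geneA", "geneB", "organism X", "two-hybrid"),
    InteractionRecord("geneA", "geneC", "organism X", "immunoprecipitation"),
    InteractionRecord("geneD", "geneE", "organism Y", "two-hybrid"),
]
```

With this data, `search(EXAMPLE_RECORDS, protein="geneA")` returns both interactions in which geneA participates, while adding `technique_regex=r"^immuno"` narrows the result to the second record only.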