CASP (Critical Assessment of protein Structure Prediction) is an organization that runs community-wide experiments to measure the state of the art in modeling protein structure from amino acid sequence. Its core principle is fully blinded testing of structure prediction methods, and it has held such an experiment every two years since 1994.
The experiment covers an approximately 9-month period. Sequences of proteins for which the structure is about to be solved by X-ray or NMR methods are first requested from the experimental community. These sequences are distributed to registered members of the modeling community, who submit models before there is any release of the experimental data. Models are then evaluated by a battery of automated methods and assessed by independent assessors.
Experimental structures are currently available for fewer than one in a thousand of the proteins whose sequence is known, so modeling has a major role to play in providing structural information for a wide range of biological problems. During the almost 20 years of the CASP experiments, the structure modeling field has changed enormously. In 1994, only 229 unique protein folds were known, so most sequences of interest had no detectable homology to known structures and could only be modeled by “ab initio” methods. Such modeling was regarded as a “grand challenge” problem in computational biology, and it was expected that physics-based methods, together with a better understanding of the process by which proteins fold, would lead to a solution.
At present, there are about 87,000 structures in the Protein Data Bank, spanning about 1,393 folds, so a homology model can be produced for more than half of all protein domains of known sequence. Homology models vary greatly in accuracy depending on a number of factors, and for that reason CASP has encouraged the development of methods that estimate both the likely overall accuracy of a model and its accuracy at the level of individual amino acids. The accuracy of homology models, as monitored by CASP, has improved dramatically through a combination of improved methods, larger databases of structure and sequence, and feedback from the CASP process.
Ab initio modeling methods have also improved substantially, from a very low base in the first CASP experiment. Refinement of initial models is another area where more physics-based approaches are expected to contribute. CASP has focused on the issue of refinement and encouraged members of the physics community to become involved, and these efforts bore fruit in CASP10.
CASP contestants are given protein sequences whose structures have been solved by X-ray crystallography or NMR but not yet published. Each contestant predicts the structures and submits the results to the CASP organizers before the experimental structures are made publicly available. The predictions are then compared with the newly determined structures using structure alignment programs such as VAST, SARF, and DALI. Because no predictor can see the answer in advance, new prediction methodologies can be evaluated without the possibility of bias. Predictions can be made at various levels of detail (secondary or tertiary structure) and in various categories (homology modeling, threading, ab initio). The experiment has provided valuable insight into the performance of prediction methods and has become the major driving force in their development.
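The comparison of a submitted model with the experimental structure can be illustrated with a minimal Python sketch. It is not the procedure used by VAST, SARF, DALI, or the CASP assessors. It assumes that the model and the reference have already been superimposed and that their C-alpha coordinates are listed residue by residue in the same order. The function names and the toy coordinates are hypothetical. The score is modeled loosely on distance-cutoff measures such as GDT_TS, which average the fraction of residues lying within 1, 2, 4, and 8 Å of their reference positions; a real implementation also searches over many superpositions, which this sketch omits.

```python
import math


def per_residue_distances(model, reference):
    """Distance (in angstroms) between matching C-alpha atoms.

    Assumes both structures are already superimposed and that
    residue i of the model corresponds to residue i of the reference.
    """
    if len(model) != len(reference):
        raise ValueError("model and reference must have the same number of residues")
    return [math.dist(m, r) for m, r in zip(model, reference)]


def gdt_like_score(model, reference, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified distance-cutoff score on a 0-100 scale.

    For each cutoff, compute the fraction of residues whose C-alpha
    atom lies within that distance of the reference, then average
    the fractions. No superposition search is performed.
    """
    distances = per_residue_distances(model, reference)
    if not distances:
        raise ValueError("no residues to compare")
    fractions = [
        sum(d <= cutoff for d in distances) / len(distances)
        for cutoff in cutoffs
    ]
    return 100.0 * sum(fractions) / len(cutoffs)


# Toy four-residue example (hypothetical coordinates, in angstroms).
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
model = [(0.0, 0.0, 0.0), (3.8, 0.5, 0.0), (7.6, 3.0, 0.0), (11.4, 10.0, 0.0)]

print(per_residue_distances(model, reference))  # [0.0, 0.5, 3.0, 10.0]
print(gdt_like_score(model, reference))         # 62.5
```

The per-residue distances give the kind of amino-acid-level error that accuracy estimation methods try to predict, while the averaged score summarizes the whole model in one number. Using several cutoffs rather than a single root-mean-square deviation keeps one badly placed region, such as the last residue here, from dominating the result.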