The CRISPR-Cas system is a prokaryotic immune system that confers resistance to foreign genetic elements such as those present within plasmids and phages and provides a form of acquired immunity. RNA harboring the spacer sequence helps Cas (CRISPR-associated) proteins recognize and cut foreign pathogenic DNA.
CRISPR–Cas systems were discovered only within the past decade and have since attracted considerable interest. The molecular understanding of some of their enzymatic components, e.g., the Cas9 protein, has been exploited to develop new tools for genome engineering and gene regulation that are easier to generate than existing technologies such as zinc-finger nucleases (ZFNs) and transcription activator-like effector nucleases (TALENs). CRISPR–Cas systems are present in most archaea and in 10–40% of bacteria. They are typically described as adaptive and heritable immune systems, in the sense that microorganisms acquire resistance to extra-chromosomal elements such as viruses or plasmids. This resistance is gained by integrating short DNA sequences from the invading element into the CRISPR loci of the host genome, where they act as a memory of former infections. This step, still poorly understood at the molecular level, is called adaptation. Each integrated sequence (called a spacer) is separated from the next spacer by a short identical repeat that is often palindromic. The name CRISPR derives from these islands, i.e., clusters of regularly interspaced repeats.
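The repeat-spacer layout described above can be illustrated with a minimal Python sketch. It assumes the repeat sequence is already known and that every repeat copy is exact; real arrays often contain degenerate repeats and are usually detected with dedicated tools. The function name `extract_spacers` and the example sequences are hypothetical and chosen only for illustration.

```python
from typing import List


def extract_spacers(array: str, repeat: str) -> List[str]:
    """Return the sequences found between consecutive exact copies of `repeat`.

    Assumption: repeats are identical; anything before the first repeat
    (leader) or after the last repeat is ignored.
    """
    if not repeat:
        raise ValueError("repeat must be a non-empty sequence")
    pieces = array.upper().split(repeat.upper())
    # pieces[0] and pieces[-1] lie outside the array; keep only what is
    # flanked by a repeat on both sides, and drop empty strings that arise
    # when two repeats are directly adjacent.
    return [piece for piece in pieces[1:-1] if piece]


# Hypothetical toy array: repeat, spacer 1, repeat, spacer 2, repeat
REPEAT = "GTTCCAAC"
ARRAY = REPEAT + "AAATTTGGG" + REPEAT + "CCCGGGTTT" + REPEAT
SPACERS = extract_spacers(ARRAY, REPEAT)
```

Here `SPACERS` holds the two toy spacers in their order within the array, mirroring the idea that the locus stores a chronological record of past infections.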
Following transcription of the CRISPR locus, the repeats are specifically recognized by a ribonuclease that processes the transcript into small RNAs called crRNAs (CRISPR RNAs). The crRNAs are then incorporated into large monomeric or multimeric protein complexes formed by the CRISPR-associated proteins (Cas proteins), which scan the cellular nucleic acids for the presence of a target sequence. When a nucleic acid sequence complementary to the crRNA is encountered, it is degraded either by the ribonucleoprotein (RNP) complex itself or by an additional factor that is recruited and provides the nuclease activity. This stage is called interference, and the target sequence is named the protospacer, referring to the previously encountered DNA sequence from which the spacer was derived.
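As a rough illustration of the scanning step in interference, the following Python sketch searches a DNA sequence for sites that are complementary to a crRNA spacer. It is a deliberate simplification: it requires exact matches, checks both strands, and ignores any additional requirements a real RNP complex places on target recognition. The names `reverse_complement` and `find_protospacers` and the example sequences are hypothetical.

```python
from typing import List, Tuple

# Translation table for complementing DNA bases
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(_COMPLEMENT)[::-1]


def find_protospacers(dna: str, crrna_spacer: str) -> List[Tuple[int, str]]:
    """Return (start, strand) for every exact protospacer match in `dna`.

    The crRNA pairs with the strand complementary to the protospacer, so
    the protospacer itself has the same sequence as the spacer (with T in
    place of U). Strand "+" means the protospacer reads along `dna`;
    "-" means it lies on the opposite strand.
    """
    dna = dna.upper()
    protospacer = crrna_spacer.upper().replace("U", "T")
    hits = []
    for strand, query in (("+", protospacer),
                          ("-", reverse_complement(protospacer))):
        start = dna.find(query)
        while start != -1:
            hits.append((start, strand))
            start = dna.find(query, start + 1)
    return sorted(hits)


# Hypothetical example: one target on the forward strand
HITS = find_protospacers("TTTGATTACAGGCAAA", "GAUUACAGGC")
```

An empty result corresponds to the case in which the RNP complex finds no target and no degradation takes place.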
Bioinformatic analysis of the Cas proteins has allowed the classification of CRISPR systems into different types and subtypes. The classification comprises up to five types (type I to type V), of which types I, II, and III are the best studied. A common feature of all types is the presence of the proteins Cas1 and Cas2, which are involved in the capture and integration of new spacers at the adaptation stage, as well as the presence of a crRNA-containing RNP complex for target recognition at the interference stage. By loading them with crRNAs of a specifically designed sequence, these RNP complexes can be reprogrammed to recognize practically any target of choice. The different types of CRISPR–Cas systems use different RNP complexes and are further distinguished by the presence of a specific “signature protein” that is responsible for DNA degradation: Cas3, Cas9, and Cas10 for types I, II, and III, respectively. Type I systems use a large multi-subunit RNP complex called Cascade that recognizes double-stranded DNA (dsDNA) targets. After target recognition and verification, Cascade recruits the signature protein Cas3, a fused helicase-nuclease, to degrade the DNA. In type II systems, the monomeric Cas9 protein serves both as the RNP for dsDNA target recognition and as the signature nuclease for target degradation. Using its two nuclease domains, it readily generates a double-strand break in bound targets. It represents a minimal system and has therefore become the preferred tool in CRISPR–Cas-based genome engineering applications. In type III systems, the RNP complex is multimeric, with a helical structure similar to that of Cascade (Cas7 family proteins). Despite this similarity, the type III RNP complex does not recognize dsDNA but complementary RNA sequences.
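The signature-protein rule for the three best-studied types can be written as a small lookup. The Python sketch below is only a toy illustration of that rule, not an actual classification method: real classification relies on sequence similarity and locus architecture, and the function name `infer_type` and the gene-name spelling are assumptions made for this example.

```python
from typing import Iterable, Optional

# Signature proteins named in the text for the three best-studied types
SIGNATURE_TO_TYPE = {"cas3": "I", "cas9": "II", "cas10": "III"}


def infer_type(cas_genes: Iterable[str]) -> Optional[str]:
    """Return the CRISPR-Cas type implied by a locus's signature protein.

    Returns None when no signature protein is present or when signatures
    of more than one type are found (an ambiguous toy input).
    """
    present = {gene.lower() for gene in cas_genes}
    types = {SIGNATURE_TO_TYPE[g] for g in present if g in SIGNATURE_TO_TYPE}
    if len(types) == 1:
        return types.pop()
    return None


# Cas1 and Cas2 are common to all types and carry no type information
LOCUS_TYPE = infer_type(["Cas1", "Cas2", "Cas9"])
```

Because Cas1 and Cas2 are shared by all types, a locus that contains only these two genes yields `None` in this sketch.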
At a higher level, CRISPR-Cas systems are grouped by their structural and functional characteristics into two classes, each comprising several types and subtypes. Class 1 systems account for the majority of CRISPR systems identified to date and are structurally complex, relying on multi-subunit effector complexes. Class 2 systems appear to be rarer in nature but have fewer components, using a single effector protein, which makes them more valuable for genetic engineering and biotechnology.