REPPER (REPeats and their PERiodicities) is an integrated server that detects and analyzes regions with short gapless repeats in protein sequences or alignments. It finds periodicities by Fourier Transform (FTwin) and internal similarity analysis (REPwin). Amino acid repeats (AARs) are abundant in protein sequences. They have particular roles in protein function and evolution. Simple repeat patterns generated by DNA slippage tend to introduce length variations and point mutations in repeat regions.
Neural network-based methods are another well-studied pattern-recognition strategy and can also identify similar patterns in protein sequences. A well-trained neural network can associate homologous patterns in a protein sequence with the input patterns and can be trained to adapt to those patterns. Several neural network algorithms show good accuracy and time efficiency in protein homologue detection. Long short-term memory (LSTM) networks can combine amino acid properties with patterns and do not rely on pre-defined scoring matrices for similarity measurements. The ARD neural network is designed to identify specific alpha-rod repeat patterns and has been applied to the analysis of Huntingtin protein sequences.
FTwin assigns numerical values to amino acids that show certain properties, for instance hydrophobicity, and gives information on corresponding periodicities. REPwin uses self-alignments and displays repeats that reveal significant internal similarities. Both programs use a sliding window to ensure that different periodic regions within the same protein are detected independently. FTwin and REPwin are complemented by secondary structure prediction (PSIPRED) and coiled coil prediction (COILS), making the server a versatile analysis tool for sequences of fibrous proteins.
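The sliding-window Fourier analysis described above can be illustrated with a minimal sketch. This is not the FTwin implementation: the choice of the Kyte-Doolittle hydropathy scale, the window and step sizes, the period grid, and the function names `spectrum` and `windowed_periodicity` are all assumptions made for illustration.

```python
import cmath
import math

# Kyte-Doolittle hydropathy scale (an assumed choice of amino acid property).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def spectrum(values, periods):
    """Return {period: power} for a numeric profile, after mean-centring."""
    n = len(values)
    mean = sum(values) / n
    centred = [v - mean for v in values]
    power = {}
    for p in periods:
        # Fourier component at frequency 1/p (p is measured in residues).
        component = sum(
            x * cmath.exp(-2j * math.pi * k / p) for k, x in enumerate(centred)
        )
        power[p] = abs(component) ** 2 / n
    return power


def windowed_periodicity(seq, window=28, step=7, periods=None):
    """For each window, return (start, strongest period, its power)."""
    if periods is None:
        periods = [p / 10 for p in range(20, 101)]  # 2.0 to 10.0 residues
    # Unknown residue symbols are given a neutral value of 0.0 (assumption).
    profile = [KYTE_DOOLITTLE.get(aa, 0.0) for aa in seq.upper()]
    results = []
    for start in range(0, len(profile) - window + 1, step):
        power = spectrum(profile[start:start + window], periods)
        best = max(power, key=power.get)
        results.append((start, best, power[best]))
    return results


if __name__ == "__main__":
    # Idealized heptad pattern with hydrophobic residues at two positions.
    for start, period, power in windowed_periodicity("LAALAAA" * 8):
        print(start, period, round(power, 2))
```

Because each window is analyzed separately, regions with different periodicities in the same sequence would show up as different strongest periods at different start positions. For an idealized heptad pattern such as the one in the usage example, the strongest component is expected near 3.5 residues.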
RAPPER is an ab initio conformational search algorithm for restraint-based protein modelling. It has been used for all-atom loop modeling, whole protein modelling under limited restraints, comparative modelling, ab initio structure prediction, structure validation and experimental structure determination with X-ray and nuclear magnetic resonance spectroscopy.
In REPPER (REPeats and their PERiodicities), the programs FTwin and COILS allow the user to take a multiple sequence alignment as input, and there is also the option to calculate a profile for a given single input sequence using PSI-BLAST with two iterations and an E-value cutoff of 0.001.
Many proteins display repeat patterns in their sequences. The size of these repeats may range from entire domains, such as the IG and FN domains in titin, over subdomain-sized supersecondary structures, such as the α–α hairpins in TPR proteins or the β-meanders in β-propellers, to the short elements making up fibrous proteins, such as coiled coils, collagens and β-helices.
Most currently available repeat detection tools are homology-based and built to identify divergent, gapped repeats of variable length and spacing in the size range of 20 residues and above. For example,
In conjunction with programs that predict secondary structure and the occurrence of coiled coils, FT can be very powerful in the analysis of fibrous proteins. In addition, these methods can be usefully complemented by a sequence comparison tool (REPwin) that is tailored to detect short consecutive repeats by aligning a sequence to itself, shifted by multiples of a variable offset. Therefore, a server has been built that implements new versions of FT (FTwin) and sequence self-comparison (REPwin) and combines their output with that of secondary structure prediction (PSIPRED) and coiled coil prediction (COILS) into an integrated and detailed overview.
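The idea of comparing a sequence to itself, shifted by multiples of a variable offset, can be sketched as follows. This is a simplified illustration and not the REPwin algorithm: it scores plain residue identity rather than a substitution matrix with a significance estimate, and the function names, window size, step, and period range are assumptions.

```python
def shifted_identity(seq, period, max_multiples=3):
    """Fraction of residue pairs that match at shifts of 1x, 2x, ... the period."""
    matches = 0
    total = 0
    for multiple in range(1, max_multiples + 1):
        shift = multiple * period
        # Compare each residue with the one `shift` positions further along.
        for i in range(len(seq) - shift):
            total += 1
            matches += seq[i] == seq[i + shift]
    return matches / total if total else 0.0


def windowed_repeats(seq, window=30, step=10, periods=range(2, 16), max_multiples=3):
    """For each window, return (start, best-scoring period, its score)."""
    hits = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        scores = {p: shifted_identity(chunk, p, max_multiples) for p in periods}
        # On ties, max() keeps the first (shortest) period in iteration order.
        best = max(scores, key=scores.get)
        hits.append((start, best, scores[best]))
    return hits


if __name__ == "__main__":
    # Collagen-like tripeptide repeat used as a toy input.
    for start, period, score in windowed_repeats("GPP" * 20):
        print(start, period, score)
```

Scoring each window independently mirrors the sliding-window design described above, so that a short repeat region is not diluted by unrelated sequence elsewhere in the protein. A realistic version would replace the identity test with substitution scores so that divergent but similar repeat units still contribute.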