NPDock is a novel web server for protein–nucleic acid docking that uses specific protein–nucleic acid statistical potentials for scoring and selection of modeled complexes. NPDock implements a unique workflow that combines published computational methods, and it offers a user-friendly web interface for submitting PDB structures and viewing the results. Automating the entire process makes protein–nucleic acid docking available to users who would otherwise have to install many complex programs locally and then carry out many manual steps, each requiring manual format conversions that are highly prone to human error. As a result, it can reduce the time needed by more than a factor of ten compared with running the individual methods separately and sequentially. Future plans are as follows. First, we will add further potentials for protein–nucleic acid interactions (in particular for protein–RNA interactions), including potentials developed by third parties as well as ones under development in our group.
The minimal input consists of a protein structure file and a nucleic acid (DNA or RNA) structure file in PDB, AMBER, CHARMM or mol2 format. Mol2-formatted files are converted to the PDB format using OpenBabel. AMBER- and CHARMM-formatted files are converted using a parser written in Python. Upon submission of the input files, the NPDock server checks the size of both molecules. Because of vector size limitations in the GRAMM program, the server can only process files of up to 10 000 atoms each; for larger files, it aborts the prediction and reports a problem. Input structures must be formatted properly to be accepted by the GRAMM program. Any atom that should be taken into consideration during the docking process, including non-standard residues and ligands, must be relabeled as ‘ATOM’: atoms labeled ‘HETATM’ in the input files will be ignored and may appear as ‘holes’ in the output structures. Improperly named atoms will also be ignored in the docking process and will not appear in the output.
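The two input requirements described above (the 10 000-atom limit and the relabeling of ‘HETATM’ records) can be checked locally before submission. The following is a minimal sketch, not part of NPDock itself; the function names `relabel_hetatm`, `count_atoms` and `check_size` are hypothetical, and the sketch assumes standard fixed-column PDB records in which the record name occupies columns 1–6.

```python
MAX_ATOMS = 10_000  # per-molecule limit quoted in the text


def relabel_hetatm(pdb_text: str) -> str:
    """Rewrite every HETATM record as ATOM, keeping the column layout."""
    out = []
    for line in pdb_text.splitlines():
        if line.startswith("HETATM"):
            # "HETATM" fills columns 1-6; "ATOM  " is padded to the same width.
            line = "ATOM  " + line[6:]
        out.append(line)
    return "\n".join(out)


def count_atoms(pdb_text: str) -> int:
    """Count ATOM records only (HETATM records are not counted)."""
    return sum(1 for line in pdb_text.splitlines()
               if line[:6].strip() == "ATOM")


def check_size(pdb_text: str, limit: int = MAX_ATOMS) -> int:
    """Return the ATOM count, or raise if it exceeds the limit."""
    n = count_atoms(pdb_text)
    if n > limit:
        raise ValueError(f"{n} atoms exceeds the limit of {limit}")
    return n
```

Note that this sketch relabels all ‘HETATM’ records indiscriminately, including waters and ions; in practice one would probably relabel only the residues and ligands that should take part in docking.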
In addition to the main input files, the user can modify the parameters used in the docking process. In particular, the input can include a list of interface residues for both the receptor and the ligand, and the number of protein–nucleic acid residue pairs that are required to be in contact (i.e. at a distance ≤10 Å from each other).
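The contact requirement can be illustrated with a short sketch that counts interface residue pairs within the 10 Å cutoff and compares the count with a user-defined minimum. This is an illustration only: the text does not state how the inter-residue distance is measured, so the sketch assumes the minimum atom–atom distance, and the names `min_distance`, `count_contact_pairs` and `passes_filter` are hypothetical.

```python
import math
from itertools import product
from typing import Dict, List, Tuple

Coord = Tuple[float, float, float]
Residues = Dict[str, List[Coord]]  # residue label -> atom coordinates


def min_distance(a: List[Coord], b: List[Coord]) -> float:
    """Smallest atom-atom distance between two residues (assumed metric)."""
    return min(math.dist(p, q) for p, q in product(a, b))


def count_contact_pairs(receptor: Residues, ligand: Residues,
                        cutoff: float = 10.0) -> int:
    """Number of receptor-ligand residue pairs within the cutoff (in Å)."""
    return sum(1 for r, l in product(receptor.values(), ligand.values())
               if min_distance(r, l) <= cutoff)


def passes_filter(receptor: Residues, ligand: Residues,
                  min_pairs: int, cutoff: float = 10.0) -> bool:
    """True if the pose has at least min_pairs residue pairs in contact."""
    return count_contact_pairs(receptor, ligand, cutoff) >= min_pairs
```

Here `receptor` and `ligand` would hold only the user-listed interface residues, so a pose is retained when enough of those residues face each other.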
The user can also modify the parameters in the clustering procedure. By default, the 100 best-scored models are used for clustering. The default value for the RMSD threshold in the clustering procedure is set to 5 Å, which can be modified (typically increased) for structures that are very large or generate a large number of different poses.
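To make the role of the two clustering parameters concrete, the sketch below clusters the best-scored models from a precomputed pairwise RMSD matrix. The greedy scheme used here (repeatedly take the model with the most neighbours within the threshold as a cluster centre) is an assumption chosen for illustration; the text does not specify the clustering algorithm, and the function name `cluster_models` is hypothetical. Models are assumed to be ordered from best to worst score.

```python
from typing import List, Sequence, Tuple


def cluster_models(rmsd: Sequence[Sequence[float]],
                   threshold: float = 5.0,
                   top_n: int = 100) -> List[Tuple[int, List[int]]]:
    """Greedy clustering of the top_n models (index 0 = best score).

    Returns a list of (centre_index, member_indices) tuples.
    """
    n = min(top_n, len(rmsd))
    unassigned = set(range(n))
    clusters = []

    def neighbours(i: int) -> List[int]:
        # Unassigned models within the RMSD threshold of model i (incl. i).
        return [j for j in unassigned if rmsd[i][j] <= threshold]

    while unassigned:
        # Largest neighbourhood wins; ties go to the better-scored model
        # because candidates are visited in ascending index order.
        centre = max(sorted(unassigned), key=lambda i: len(neighbours(i)))
        members = sorted(neighbours(centre))
        clusters.append((centre, members))
        unassigned -= set(members)
    return clusters
```

With this scheme, raising `threshold` merges more poses into each cluster, which is why a larger value may suit very large structures or runs that produce many distinct poses.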
The parameters of the rigid body refinement procedure can also be adjusted by the user. The number of simulation steps defines the length of the simulation. Long simulations allow the molecules to escape from a local minimum and may allow a more extensive sampling of the conformational landscape. A high temperature allows greater freedom of movement; however, it should be combined with a longer simulation time to allow the system to cool down smoothly.
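The interplay between the number of steps and the temperature can be seen in a generic Metropolis scheme with linear cooling. The sketch below is a toy, assumed for illustration, and is not NPDock's refinement code: the state is a single number rather than a rigid-body position and orientation, and the names `anneal`, `t_start` and `step_size` are hypothetical.

```python
import math
import random
from typing import Callable, Tuple


def anneal(energy: Callable[[float], float], x0: float,
           steps: int = 1000, t_start: float = 1.0,
           step_size: float = 0.5, seed: int = 0) -> Tuple[float, float]:
    """Metropolis sampling with a linearly decreasing temperature.

    Returns the lowest-energy state seen and its energy.
    """
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    for k in range(steps):
        # Temperature falls linearly from t_start and stays above zero.
        t = t_start * (1.0 - k / steps)
        cand = x + rng.uniform(-step_size, step_size)
        e_cand = energy(cand)
        delta = e_cand - e
        # Always accept downhill moves; accept uphill moves with
        # probability exp(-delta / t), which shrinks as t decreases.
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            x, e = cand, e_cand
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e
```

In this scheme a higher `t_start` accepts more uphill moves early on, while a larger `steps` lowers the temperature in smaller increments, which mirrors the advice to pair high temperatures with longer simulations.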