The average execution time for the Benchmark 4.0 (176 structures) was ca. 23 hours (the shortest: 3 hours, the longest: 6 days). The average execution time for a subset of 29 structures being executed in parallel was ca. 1 day 5 hours (the shortest: 20 hours, the longest: 1 day 13 hours).
Please send any comments and/or bug reports to: lif-swarmdock [at] crick.ac.uk.
Scope
The web service [1,2] performs flexible modelling of protein-protein complexes using the SwarmDock algorithm, which incorporates a normal modes approach [3,4]; normal modes are omitted only for short peptides. We were able to significantly improve the top10 success rate by filtering out solutions with low maximum equilibrium population. Uploaded structures (in PDB format) of the ligand and receptor must obey only three simple rules:
- Files must have a TER record after each chain (including the last one).
- Generally, only standard residues are allowed. However, our server recognises (as ATOM or HETATM) the following:
- PCA, ABA, AIB, DAL, ORN and changes them to ALA,
- DAR and changes it to ARG,
- DSG and changes it to ASN,
- ASX, DSP and changes them to ASP,
- SEC, CSD, CSW, OCS, DCY, CEA, CSO, CSS, CSX, CME and changes them to CYS,
- DGN and changes it to GLN,
- GLX, CGU, 5HP, DGL and changes them to GLU,
- SAR and changes it to GLY,
- HSE, HSP, HSD, DHI and changes them to HIS,
- DIL and changes it to ILE,
- XLE, DLE, MLE, NLE and changes them to LEU,
- DLY, KCX, LLP and changes them to LYS,
- MSE, CXM, FME and changes them to MET,
- DPN, TPQ and changes them to PHE,
- HYP, DPR and changes them to PRO,
- DSN, SEP and changes them to SER,
- BMT, DTH, TPO and changes them to THR,
- DTR and changes it to TRP,
- PTR, TYS, DTY, STY and changes them to TYR,
- DVA, DIV, MVA and changes them to VAL.
- Submitting files with missing residues is not encouraged. However, we will try to repair your files (by modelling loops with ALA) to make them ready for our server.
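The first two rules can be checked locally before uploading. The following is a minimal sketch, not the server's own code: the function names `rename_residues` and `chains_missing_ter` and the helper `_atom` are hypothetical, the substitution table is copied from the list above, and the column positions follow the standard fixed-width PDB layout. It only renames residues; any rebuilding of side-chain atoms done by the server during preprocessing is not reproduced here.

```python
# Substitutions copied from the list above: standard residue -> recognised variants.
_GROUPS = {
    "ALA": "PCA ABA AIB DAL ORN", "ARG": "DAR", "ASN": "DSG", "ASP": "ASX DSP",
    "CYS": "SEC CSD CSW OCS DCY CEA CSO CSS CSX CME", "GLN": "DGN",
    "GLU": "GLX CGU 5HP DGL", "GLY": "SAR", "HIS": "HSE HSP HSD DHI",
    "ILE": "DIL", "LEU": "XLE DLE MLE NLE", "LYS": "DLY KCX LLP",
    "MET": "MSE CXM FME", "PHE": "DPN TPQ", "PRO": "HYP DPR", "SER": "DSN SEP",
    "THR": "BMT DTH TPO", "TRP": "DTR", "TYR": "PTR TYS DTY STY",
    "VAL": "DVA DIV MVA",
}
SUBSTITUTIONS = {old: new for new, olds in _GROUPS.items() for old in olds.split()}


def rename_residues(lines):
    """Rename recognised non-standard residues (PDB columns 18-20) in atom records."""
    out = []
    for line in lines:
        if line.startswith(("ATOM", "HETATM")) and line[17:20] in SUBSTITUTIONS:
            line = line[:17] + SUBSTITUTIONS[line[17:20]] + line[20:]
        out.append(line)
    return out


def chains_missing_ter(lines):
    """Return chain IDs whose atom records are not closed by a TER record."""
    missing = []
    open_chain = None  # chain of the latest atom record not yet followed by TER
    for line in lines:
        record = line[:6].strip()
        if record in ("ATOM", "HETATM"):
            chain = line[21]  # chain identifier, PDB column 22
            if open_chain is not None and chain != open_chain:
                missing.append(open_chain)
            open_chain = chain
        elif record == "TER":
            open_chain = None
    if open_chain is not None:
        missing.append(open_chain)
    return missing


def _atom(serial, res, chain, rec="ATOM"):
    """Build a truncated, illustrative atom record with correct column positions."""
    return f"{rec:<6}{serial:>5}  CA  {res:>3} {chain}{serial:>4}"


example = [_atom(1, "MSE", "A", "HETATM"), _atom(2, "GLY", "A"), "TER",
           _atom(3, "SEP", "B")]
fixed = rename_residues(example)
print(fixed[0][17:20], fixed[3][17:20], chains_missing_ter(example))
```

In this example chain B is reported because its last atom record is not followed by TER.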
Each submitted job goes through the following stages:
- Preprocessing (checking for structural correctness, modelling missing and non-standard residues, structure minimisation).
- Docking (point generation and running PSO).
- Postprocessing (structure minimisation, rescoring and clustering).
- Results returned as an archive of PDB-formatted structures for the members of each cluster. Additional files:
- clusters_democratic.txt (hierarchical clustering at 3.0 Å with democratic scoring scheme),
- clusters_standard.txt (hierarchical clustering at 3.0 Å with Tobi potential; list of results in format: pdb file, number of members in the cluster, total number of contacts between receptor and ligand with cut-off at sum of van der Waals radii + 20%, number of contacts for receptor's residue list submitted by user, number of contacts for ligand's residue list submitted by user, mean energy of the cluster and its standard deviation),
- *.contacts (list of contacts with cut-off at sum of van der Waals radii + 20%, R-receptor, L-ligand, UR-user receptor, UL-user ligand),
- energies_tobi.txt (list of solutions with corresponding values of potential developed by Tobi),
- best10_democratic.pdb (highest score structures of the first ten clusters, hierarchical clustering with democratic scoring scheme),
- best10_standard.pdb (lowest energy structures of the first ten clusters, hierarchical clustering with Tobi potential),
- ligand.pdb and receptor.pdb (files used as input; they may differ from those uploaded by the user because of repairs),
- uploaded_ligand.pdb and uploaded_receptor.pdb (files uploaded by the user),
- job.txt (details about submitted job),
- files from the procedure of filtering away solutions with low maximum equilibrium population, with clusters_standard.txt as the input (FILTERING subfolder; see below for details): network_beforeFiltering.ratrav (RaTrav format), network_afterFiltering.ratrav (RaTrav format), occupancies_beforeFiltering.txt, occupancies_afterFiltering.txt, network_beforeFiltering.gml (GML format), network_afterFiltering.gml (GML format), subnetworks_beforeFiltering.txt, energies_opus.txt, clusters_afterFiltering.txt, best10_afterFiltering.pdb.
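The contact criterion used in clusters_standard.txt and the *.contacts files (cut-off at the sum of van der Waals radii + 20%) can be illustrated as follows. This is a sketch under stated assumptions: the cut-off is read as 1.2 times the sum of the two radii, the radii are the commonly used Bondi values (the server may use a different set), and the function names are hypothetical.

```python
import math

# Bondi van der Waals radii in Å (assumption: not necessarily the server's values).
VDW_RADII = {"C": 1.70, "N": 1.55, "O": 1.52, "S": 1.80}


def in_contact(elem_a, xyz_a, elem_b, xyz_b, tolerance=0.20):
    """True if two atoms lie within (r_a + r_b) * (1 + tolerance) of each other."""
    cutoff = (VDW_RADII[elem_a] + VDW_RADII[elem_b]) * (1.0 + tolerance)
    return math.dist(xyz_a, xyz_b) <= cutoff


def count_contacts(receptor_atoms, ligand_atoms):
    """Count receptor-ligand atom pairs in contact; atoms are (element, xyz) tuples."""
    return sum(
        in_contact(ea, pa, eb, pb)
        for ea, pa in receptor_atoms
        for eb, pb in ligand_atoms
    )


# Made-up coordinates for illustration only.
receptor = [("C", (0.0, 0.0, 0.0)), ("N", (10.0, 0.0, 0.0))]
ligand = [("O", (3.8, 0.0, 0.0)), ("C", (13.0, 0.0, 0.0))]
n_contacts = count_contacts(receptor, ligand)
```

With these coordinates the C/O pair (3.8 Å apart, cut-off 3.864 Å) and the N/C pair (3.0 Å apart, cut-off 3.9 Å) count as contacts.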
Files in RaTrav format may be used directly with RaTrav (http://sourceforge.net/projects/ratrav/) for further analysis, e.g. mean first passage time calculations. Files in GML format may be used directly with Gephi (http://www.gephi.org/) for various analyses.
If you wish to choose residues belonging to the binding site, we will provide you with information on the accessibility and conservation of the binding site residues. Residues are ordered by the product of these two factors.
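The ordering by the product of the two factors can be sketched as below. The function name `rank_residues`, the residue labels, and the score scales are hypothetical, and sorting from the highest product to the lowest is an assumption; the server's actual accessibility and conservation measures are not specified here.

```python
def rank_residues(accessibility, conservation):
    """Return residue IDs sorted by accessibility * conservation, highest first.

    Both arguments map a residue ID to a score; only residues present in both
    mappings are ranked. Ties are broken alphabetically by residue ID.
    """
    shared = accessibility.keys() & conservation.keys()
    return sorted(shared, key=lambda r: (-accessibility[r] * conservation[r], r))


# Made-up scores for illustration only.
accessibility = {"A10": 0.8, "A11": 0.2, "A12": 0.5}
conservation = {"A10": 0.5, "A11": 0.9, "A12": 0.9}
ranking = rank_residues(accessibility, conservation)
```

Here A12 (product 0.45) ranks ahead of A10 (0.40) and A11 (0.18).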
NEW SCORING SCHEME: IRaPPA (Integrative Ranking of Protein-Protein Assemblies)
In this scoring scheme, the poses are scored using the molecular descriptors in the CCharPPI web server (http://life.bsc.es/pid/ccharppi/) and are also clustered at several levels of resolution. The scores and cluster sizes are combined using an ensemble of ranking models. Results returned using this scoring scheme are flagged "Democratic".
Filtering away non-funnel-like energy structures (available since version 13.08.20)
Even if the correct solution is somewhere in the set of solutions, it may not be present in the top10 list. We were able to significantly improve the top10 success rate using Markov chain theory, i.e. by filtering away solutions with low maximum equilibrium population.
A network of conformational states is created, and a link is formed if two structures have a ligand Cα RMSD < 6.0 Å. Transition probabilities are computed based on differences in the values of the OPUSPSP potential. Occupancy probabilities, calculated by diagonalization of the Markov matrix, are multiplied by the number of nodes in the network, and the maximum occupancy in each subgraph is assigned to all structures in that subgraph. Structures with occupancies < 2.1 are filtered away (see  for details on the chosen threshold).
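The filtering procedure can be sketched on a toy network as follows. Several details are assumptions made only for illustration and are not taken from the server: transition probabilities use a Metropolis acceptance rule with a hypothetical temperature parameter `kt`, moves are proposed uniformly with probability 1/(d_max + 1), and the stationary distribution is obtained by power iteration from a uniform starting vector rather than by a full diagonalization. All function names are hypothetical.

```python
import math


def stationary_occupancies(energies, rmsd, cutoff=6.0, kt=1.0, iters=10000, tol=1e-13):
    """Return (occupancies, neighbours); occupancy = stationary probability * node count."""
    n = len(energies)
    # Link two structures when their ligand RMSD is below the cut-off.
    nbrs = [[j for j in range(n) if j != i and rmsd[i][j] < cutoff] for i in range(n)]
    d_max = max(len(nb) for nb in nbrs)
    # Row-stochastic transition matrix with Metropolis acceptance (assumed rule).
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in nbrs[i]:
            delta = energies[j] - energies[i]
            accept = 1.0 if delta <= 0 else math.exp(-delta / kt)
            P[i][j] = accept / (d_max + 1)
        P[i][i] = 1.0 - sum(P[i])  # remaining probability: stay in place
    # Power iteration from a uniform vector (stand-in for diagonalization).
    pi = [1.0 / n] * n
    for _ in range(iters):
        new = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        converged = max(abs(a - b) for a, b in zip(new, pi)) < tol
        pi = new
        if converged:
            break
    return [p * n for p in pi], nbrs


def max_occupancy_per_subgraph(occ, nbrs):
    """Assign to every node the maximum occupancy found in its connected subgraph."""
    assigned = [None] * len(occ)
    for start in range(len(occ)):
        if assigned[start] is not None:
            continue
        stack, members = [start], {start}
        while stack:  # depth-first search over the subgraph containing `start`
            for j in nbrs[stack.pop()]:
                if j not in members:
                    members.add(j)
                    stack.append(j)
        best = max(occ[m] for m in members)
        for m in members:
            assigned[m] = best
    return assigned


def filter_structures(energies, rmsd, threshold=2.1):
    """Return indices of structures whose assigned occupancy reaches the threshold."""
    occ, nbrs = stationary_occupancies(energies, rmsd)
    assigned = max_occupancy_per_subgraph(occ, nbrs)
    return [i for i, a in enumerate(assigned) if a >= threshold]


# Toy input: structures 0-2 are mutually linked, structure 3 is isolated.
energies = [-5.0, 0.0, 0.0, -10.0]
rmsd = [[0, 1, 1, 10], [1, 0, 1, 10], [1, 1, 0, 10], [10, 10, 10, 0]]
kept = filter_structures(energies, rmsd)
```

In this toy case the isolated structure has occupancy 1.0 and is removed despite having the lowest energy, while the three linked structures share a maximum occupancy of about 2.96 and are kept.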
Implementation of the method in SwarmDock Server brings additional output files (FILTERING subfolder):
- network_beforeFiltering.ratrav, network_afterFiltering.ratrav (networks before and after filtering defined in RaTrav format with weight of edge defined as Cα RMSD; PDB equivalent of each node ID may be found in occupancies_beforeFiltering.txt and occupancies_afterFiltering.txt, respectively),
- network_beforeFiltering.gml, network_afterFiltering.gml (networks before and after filtering defined in GML format with calculated transition probabilities and weights of edges defined as Cα RMSD; occupation probabilities are calculated with all weights of edges equal 1.0 as in ),
- occupancies_beforeFiltering.txt, occupancies_afterFiltering.txt (Network Node, PDB Structure, Occupancy, Max Occupancy Assigned; calculated with all weights of edges equal 1.0 as in ),
- subnetworks_beforeFiltering.txt, subnetworks_afterFiltering.txt (list of PDB structures forming each subnetwork before any filtering),
- energies_opus.txt (list of solutions with corresponding energies used for transition probabilities calculations; in case of huge structures OPUSPSP may fail and 'ERROR' will be visible in the output file; consequently filtering won't be performed),
- clusters_afterFiltering.txt (filtered clusters_standard.txt file),
- best10_afterFiltering.pdb (lowest energy structures of the first ten clusters after filtering).
Example
Let's assume that we want to dock the complex 2OUL. We have input files for both the receptor, TER_2OUL_r_u.pdb, and the ligand, TER_2OUL_l_u.pdb, with a TER record added after each chain. We submit them to the server as a full blind docking case (with the default number of normal modes, 5, for both receptor and ligand). Results for this submission are returned via the following link, which allows some visualisation of the clustered solutions (using Jmol and Gephi):
SwarmDock Server as a repair service
SwarmDock can also repair your PDB files, even if you want to use them with other docking servers. To repair your structures, choose 'I want to choose interface residues'. After the repair stage you will receive a link to a webpage where you can download the repaired PDB files; you don't have to resubmit the job for docking.
SwarmDock Server as an interface prediction service
In restrained docking mode the user is supported by a simple interface prediction tool based on solvent accessibility and residue conservation. The method was benchmarked on the full Benchmark 4.0 (176 structures; 52 enzyme/inhibitor, 25 antibody/antigen and 99 others). For each unbound receptor and ligand PDB structure, residue accessibility and conservation were computed. Then, each residue was ranked according to the product of its accessibility and conservation. True positive interface residues were known from the bound PDB structures. Finally, it was checked how often at least one correctly predicted residue was found in the first five or ten predictions. The results are as follows:
- At least one correctly predicted residue in the first five returned residues for receptor: 75% (enzyme/inhibitor), 16% (antibody/antigen), 69% (others).
- At least one correctly predicted residue in the first five returned residues for ligand: 88% (enzyme/inhibitor), 32% (antibody/antigen), 70% (others).
- At least one correctly predicted residue in the first ten returned residues for receptor: 85% (enzyme/inhibitor), 24% (antibody/antigen), 82% (others).
- At least one correctly predicted residue in the first ten returned residues for ligand: 98% (enzyme/inhibitor), 60% (antibody/antigen), 89% (others).
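The success measure reported above (at least one true interface residue among the first five or ten ranked residues) can be expressed as a short sketch. The function names and the toy data are hypothetical and do not reproduce the benchmark itself.

```python
def hit_in_top_k(ranked, true_interface, k):
    """True if any of the first k ranked residues is a true interface residue."""
    return any(residue in true_interface for residue in ranked[:k])


def success_rate(cases, k):
    """Percentage of (ranked, true_interface) cases with a hit in the top k."""
    hits = sum(hit_in_top_k(ranked, truth, k) for ranked, truth in cases)
    return 100.0 * hits / len(cases)


# Two made-up cases: ranked predictions and the known interface residues.
cases = [
    (["A10", "A12", "A31"], {"A31", "A40"}),
    (["B5", "B7", "B9"], {"B5"}),
]
rate_top2 = success_rate(cases, 2)
rate_top3 = success_rate(cases, 3)
```

In this toy set only the second case has a hit within the first two predictions, whereas both cases have one within the first three.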