Overview

Open screening endeavors play, and will continue to play, a key role in facilitating the identification of new bioactive compounds for drug discovery and chemical biology. Such open-access tools are critical to the success of drug discovery projects, in particular for academic groups. This page describes how to use the MTiOpenScreen web server, which offers two services, MTiOpenScreen and MTiAutoDock, dedicated to small-molecule docking and virtual screening of chemical libraries. The services are hosted on the Mobyle Portal. For specific requirements or to report possible bugs, please write to us.

Workflow

How MTiAutoDock works

Blind Docking using AutoDock

The Lamarckian genetic algorithm (LGA) [1] as implemented in AutoDock 4.2.6 is used to generate orientations/conformations of the compound. Ten docking runs are performed, with an initial population of 150 random individuals and a maximum number of 2,500,000 energy evaluations. The automatically generated grid envelops the entire protein structure.
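The search settings quoted above map naturally onto AutoDock 4 docking parameter file (DPF) keywords. The snippet below is a minimal sketch of that mapping, not the server's actual parameter file: the keyword names `ga_run`, `ga_pop_size` and `ga_num_evals` are standard AutoDock 4 keywords, while the dictionary and the `render_dpf_fragment` helper are illustrative names introduced here.

```python
# Search settings quoted in the text, keyed by AutoDock 4 DPF keyword names.
LGA_SETTINGS = {
    "ga_run": 10,               # number of docking runs
    "ga_pop_size": 150,         # initial population of random individuals
    "ga_num_evals": 2_500_000,  # maximum number of energy evaluations
}


def render_dpf_fragment(settings):
    """Return the settings as 'keyword value' lines (hypothetical helper)."""
    return "\n".join(f"{key} {value}" for key, value in settings.items())


print(render_dpf_fragment(LGA_SETTINGS))
```

A complete DPF contains many more entries (maps, ligand file, local search settings); only the three values stated on this page are shown.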

Binding Site Docking using AutoDock

The Lamarckian genetic algorithm (LGA) [1] is used to generate orientations/conformations of the compound. Ten docking runs are performed, with an initial population of 150 random individuals and a maximum number of 2,500,000 energy evaluations. The grid dimensions and center are either provided by the user or calculated automatically from a user-supplied list of binding-site residues.
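How the server derives the grid from a residue list is not described on this page. One plausible approach, sketched below under that caveat, is to take the bounding box of the atoms of the listed residues and enlarge it by a margin. The function names and the `padding` value are assumptions made for illustration; the column positions follow the fixed-width PDB ATOM record layout.

```python
def format_atom_line(serial, name, resname, chain, resnum, x, y, z):
    """Build a fixed-width PDB ATOM record (helper for the demo below)."""
    return (f"ATOM  {serial:5d} {name:<4s} {resname:>3s} {chain}{resnum:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


def box_from_residues(pdb_text, residues, padding=5.0):
    """Return (center, size) in Å of a box around the given residues.

    residues: set of (chain_id, residue_number) tuples.
    padding:  margin added on every side (assumed value, not from the server).
    """
    coords = []
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        chain = line[21]            # column 22: chain identifier
        resnum = int(line[22:26])   # columns 23-26: residue sequence number
        if (chain, resnum) in residues:
            coords.append((float(line[30:38]),   # x, columns 31-38
                           float(line[38:46]),   # y, columns 39-46
                           float(line[46:54])))  # z, columns 47-54
    if not coords:
        raise ValueError("no atoms found for the requested residues")
    mins = [min(c[i] for c in coords) for i in range(3)]
    maxs = [max(c[i] for c in coords) for i in range(3)]
    center = tuple((lo + hi) / 2 for lo, hi in zip(mins, maxs))
    size = tuple((hi - lo) + 2 * padding for lo, hi in zip(mins, maxs))
    return center, size


demo_pdb = "\n".join([
    format_atom_line(1, "CA", "ALA", "A", 10, 0.0, 0.0, 0.0),
    format_atom_line(2, "CA", "GLY", "A", 11, 10.0, 4.0, 2.0),
    format_atom_line(3, "CA", "SER", "A", 50, 100.0, 100.0, 100.0),
])
center, size = box_from_residues(demo_pdb, {("A", 10), ("A", 11)})
print(center, size)
```

Residue 50 is ignored in the demo because it is not in the binding-site list, so the box only spans residues 10 and 11.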

How MTiOpenScreen works

Virtual Screening using AutoDock Vina

AutoDock Vina [2] uses a gradient-based conformational search and restricts the search space to a grid box defined by its center coordinates and its x, y and z dimensions. In AutoDock Vina the grid resolution is set internally to 1 Å. We request 10 binding modes and use an exhaustiveness of 8. The grid dimensions and center are either provided by the user or calculated automatically from a user-supplied list of binding-site residues. Docking poses are scored and ligands are ranked with the Vina empirical scoring function, which approximates the binding affinity in kcal/mol.
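For users who want to reproduce a run locally, the settings above can be written as an AutoDock Vina configuration file. The sketch below only assembles the text of such a file; the key names (`center_x`, `size_x`, `exhaustiveness`, `num_modes`) are standard Vina options, whereas the `vina_config` function and the example box values are illustrative and do not come from the server.

```python
def vina_config(center, size, exhaustiveness=8, num_modes=10):
    """Return Vina configuration text for a search box.

    center, size: 3-tuples in Å (box center and box edge lengths).
    Defaults match the values quoted on this page.
    """
    lines = []
    for axis, value in zip("xyz", center):
        lines.append(f"center_{axis} = {value:.3f}")
    for axis, value in zip("xyz", size):
        lines.append(f"size_{axis} = {value:.3f}")
    lines.append(f"exhaustiveness = {exhaustiveness}")
    lines.append(f"num_modes = {num_modes}")
    return "\n".join(lines)


# Example box values are placeholders chosen for the demo.
print(vina_config(center=(1.0, 2.0, 3.0), size=(20.0, 22.0, 24.0)))
```

The receptor and ligand file entries are left out on purpose, since they depend on the local file layout.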

Predefined Chemical Compound Collections

To provide valuable starting points for open virtual screening, we offer five electronic drug-like chemical libraries: a diverse compound collection (Diverse-lib), a focused compound collection targeting protein-protein interactions (PPI) (iPPI-lib), a collection of purchasable approved drugs (Drugs-lib), a food constituent collection (FOOD-lib) and a natural product collection (NP-lib).

To prepare Diverse-lib, we downloaded 12 chemical libraries from the PubChem BioAssay Database [3], thus assembling 3,574,650 molecules (names are given as PubChem SIDs). After removing redundant molecules, we applied an in-house “soft” drug-like filter using the FAFDrugs3 web server [4] to remove molecules with undesired physicochemical properties (for the drug-likeness parameters see here). Molecules containing toxic or PAINS (Pan Assay Interference Compounds) groups as defined in FAFDrugs3 were also removed (for the list of removed toxicophores see here). To ensure chemical diversity, the 384,372 filtered drug-like molecules were then clustered using the Cluster Molecule Protocol (Accelrys Pipeline Pilot v8.5) with the FCFP-4 fingerprint and a maximum Tanimoto distance of 0.3 within clusters. Finally, we generated 3D conformations of 99,288 diverse drug-like PubChem molecules, which constitute our predefined diverse compound collection, Diverse-lib.
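To illustrate what a maximum Tanimoto distance of 0.3 means, the sketch below clusters fingerprints represented as sets of "on" bits with a simple leader algorithm. This is not the Pipeline Pilot Cluster Molecule Protocol, whose internals are not described here; the function names and the toy fingerprints are assumptions, and real FCFP-4 fingerprints would come from a cheminformatics toolkit.

```python
def tanimoto_distance(fp_a, fp_b):
    """1 - |A ∩ B| / |A ∪ B| for two fingerprints given as sets of bits."""
    union = len(fp_a | fp_b)
    if union == 0:
        return 0.0
    return 1.0 - len(fp_a & fp_b) / union


def leader_cluster(fingerprints, max_distance=0.3):
    """Assign each molecule to the first cluster center within max_distance.

    Returns (centers, assignment): indices of the cluster centers, and for
    each molecule the index of the center it was assigned to.
    """
    centers = []
    assignment = []
    for i, fp in enumerate(fingerprints):
        for c in centers:
            if tanimoto_distance(fp, fingerprints[c]) <= max_distance:
                assignment.append(c)
                break
        else:
            # No existing center is close enough: start a new cluster.
            centers.append(i)
            assignment.append(i)
    return centers, assignment


toy_fps = [{1, 2, 3, 4}, {1, 2, 3, 4, 5}, {10, 11, 12}, {10, 11, 12, 13, 14, 15}]
print(leader_cluster(toy_fps))
```

In the toy data the first two molecules share 4 of 5 bits (distance 0.2) and fall into one cluster, while the last two share only 3 of 6 bits (distance 0.5) and stay separate. Keeping one representative per cluster is one way such a step reduces a collection to a diverse subset.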

The second collection, iPPI-lib, is focused on PPI targets and was prepared starting from the drug-like compound collection of 384,372 molecules. We employed PPI-HitProfiler [5] to select PPI-friendly compounds. PPI-HitProfiler is based on a machine learning model that was trained on 66 chemically diverse low-MW PPI inhibitors and validated using experimental results for 500,000 compounds screened on four PPI targets. The remaining 204,728 molecules were then clustered using the Cluster Molecule Protocol (Accelrys Pipeline Pilot v8.5) with the FCFP-4 fingerprint and a maximum Tanimoto distance of 0.3 within clusters. Finally, we generated 3D conformations of 51,232 drug-like molecules likely to inhibit PPI.

The 3D structures of the two previous collections, Diverse-lib and the PPI-focused iPPI-lib, were generated using the freely available web server Frog2 [6]. The procedure was run keeping a maximum of 1 stereoisomer per compound and without generating multiple ring conformations. The molecules were finally protonated at pH 7 using the major macrospecies option of the ChemAxon calculator plugins.

The Drugs-lib, FOOD-lib and NP-lib were built using the same protocol [7], which includes FAF-Drugs4 [8] physico-chemical and toxicophore filtering, a visual inspection step to remove compounds not suitable for docking, and an assessment of purchasability according to the ZINC15 database [9]. The Drugs-lib was constructed starting from four databases of approved drugs: the "drug" subset of the ChEMBL database [10], the "approved" subset of DrugBank [11] version 5.0.10, the DrugCentral online compendium [12] and the "approved" SuperDrug2 database version 2.0 [13]. A total of 21276 compounds were initially included in the Drugs-lib construction process, but only 7173 stereoisomers corresponding to 4574 single isomer drugs were kept after removal of duplicates and application of the filtering protocol described above. The 26941 food constituents of FooDB with available SMILES strings were used as input for the FOOD-lib generation. After removal of duplicates and the filtering steps, the FOOD-lib included 10997 stereoisomers corresponding to 3015 single isomer food constituents. A purchasable diverse library of 1237 natural products [14] was processed to construct the NP-lib, which gathers, after the filtering steps, 1228 stereoisomers corresponding to 653 single isomer natural products.

Features

  • automated docking of up to 10 ligands over the entire protein surface using AutoDock 4.2, which helps to detect druggable cavities
  • automated ligand docking into a defined binding site for up to 10 ligands using AutoDock 4.2
  • automated virtual ligand screening via AutoDock Vina-based docking using a user-uploaded chemical library containing up to 5,000 small organic molecules
  • automated virtual ligand screening via AutoDock Vina-based docking using a chemical library of up to 10,000 small organic molecules selected from a provided predefined chemical library Diverse-lib containing 99,288 diverse drug-like molecules
  • automated virtual ligand screening via AutoDock Vina-based docking using a chemical library of up to 10,000 small organic molecules selected from a provided predefined chemical library iPPI-lib containing 51,232 drug-like molecules likely to inhibit protein-protein interactions (PPI)
  • visualization in 3D of the top scored poses of the 10 ligands docked by AutoDock and the protein receptor
  • visualization in 3D of the top scored pose of the 100 top ranked ligands docked by AutoDock Vina and the protein receptor
  • download of the 10 top scored poses for the 10 ligands docked by AutoDock
  • download of the top scored 3 poses for the 1500 top ranked compounds among 10,000 compounds screened using AutoDock Vina

Limitations

  • Compound library: For now, MTiOpenScreen does not prepare the user-provided chemical library, which is limited to 5,000 compounds
  • Compound library: Virtual screening with MTiOpenScreen is limited to 10,000 ligands selected by flexible physico-chemical criteria from the predefined libraries containing 99,288 diverse or 51,232 PPI-friendly drug-like molecules
  • Ligands: MTiAutoDock and MTiOpenScreen process small ligands not exceeding 300 atoms
  • Protein receptor: For now, MTiOpenScreen does not take protein receptor flexibility into account
  • The maximum allowed grid dimensions for MTiAutoDock are 200x200x200. The default grid resolution of 0.375 Å is replaced by 0.6 or 0.8 Å when 0.375 Å is not sufficient to enclose the entire protein receptor. Due to computational time limitations, MTiOpenDocking cannot treat receptors larger than 160 Å per side.
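The grid limits in the last item are consistent with simple arithmetic: 200 grid points at 0.8 Å spacing span 200 × 0.8 = 160 Å. The sketch below shows one way the resolution fallback could be chosen, assuming that a grid of 200 points at a given spacing covers 200 × spacing Å per side. The selection rule and the function name are assumptions; the server's actual logic is not documented here.

```python
MAX_GRID_POINTS = 200            # maximum grid dimension per axis
SPACINGS = (0.375, 0.6, 0.8)     # Å, tried from finest to coarsest


def choose_spacing(largest_extent):
    """Return the finest spacing whose grid can enclose the receptor.

    largest_extent: longest side of the receptor bounding box in Å.
    Returns None when even the coarsest grid is too small.
    """
    for spacing in SPACINGS:
        if largest_extent <= MAX_GRID_POINTS * spacing:
            return spacing
    return None


for extent in (70, 100, 150, 170):
    print(extent, choose_spacing(extent))
```

Under this assumption, receptors up to 75 Å per side keep the default 0.375 Å resolution, receptors up to 120 Å use 0.6 Å, and receptors up to 160 Å use 0.8 Å.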

Usage

Accessing the service

Submitting a new job

The “New Job” pages of MTiAutoDock and MTiOpenScreen guide the user through job submission. A new job is submitted by providing the protein receptor structure in PDB or MOL2 format (a protonated protein structure is required for MOL2), the binding site definition, and the ligands, which are either uploaded or selected from the drug-like compounds provided by the MTiOpenScreen web server.

The user can choose between docking up to 10 ligands into a user-provided protein receptor with AutoDock, and virtual screening of up to 10,000 compounds on one protein conformation with AutoDock Vina.

MTiAutoDock

If the protein receptor is provided in PDB format, the structure is cleaned and preprocessed automatically: all HETATM records are removed and hydrogens are added using MGLTools. In most cases, this produces a clean receptor structure. Otherwise, users can prepare the protein structure manually and upload it in MOL2 format, in which case the structure is used as-is without any modification; please remove all solvent molecules and add all hydrogens beforehand. If the receptor comes from a bound complex, please remember to remove the native ligand from the complex before uploading.
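To preview what the automatic cleaning keeps, the HETATM removal step can be imitated locally. The sketch below is only an approximation of that single step: the function name is hypothetical, and the hydrogen addition performed by MGLTools on the server is not reproduced.

```python
def strip_hetatm(pdb_text):
    """Return the PDB text without HETATM records (waters, ligands, ions)."""
    kept = [line for line in pdb_text.splitlines()
            if not line.startswith("HETATM")]
    return "\n".join(kept)


example = "\n".join([
    "ATOM      1  CA  ALA A  10       0.000   0.000   0.000",
    "HETATM    2  O   HOH A 201       5.000   5.000   5.000",
    "END",
])
print(strip_hetatm(example))
```

Note that modified residues and cofactors are usually stored as HETATM records too, so a receptor that needs them has to be prepared manually.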

The user should define the binding site (except for blind docking) by providing either the grid center and dimensions or a list of residues defining the binding site.

For ligands, a single conformation with all hydrogen atoms added should be given for each of up to 10 small organic molecules, in an SDF or MOL2 file. For correct docking results, no ligand should exceed 300 atoms. Please make sure that all atom types and bond types of the molecules are correct.
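The ligand limits above can be checked before uploading. The following sketch reads atom counts from an SDF file and reports violations; it assumes V2000 records terminated by `$$$$`, where the atom count occupies the first three characters of the fourth line of each record. The function names are illustrative, and atom and bond type correctness is not checked.

```python
def sdf_atom_counts(sdf_text):
    """Return the atom count of every '$$$$'-terminated V2000 record."""
    counts = []
    record = []
    for line in sdf_text.splitlines():
        if line.strip() == "$$$$":
            if len(record) >= 4:
                # Counts line: atom count in columns 1-3.
                counts.append(int(record[3][0:3]))
            record = []
        else:
            record.append(line)
    return counts


def check_ligands(counts, max_ligands=10, max_atoms=300):
    """Return a list of human-readable problems (empty if all is fine)."""
    problems = []
    if len(counts) > max_ligands:
        problems.append(f"{len(counts)} ligands exceed the limit of {max_ligands}")
    for index, n_atoms in enumerate(counts, start=1):
        if n_atoms > max_atoms:
            problems.append(f"ligand {index} has {n_atoms} atoms (limit {max_atoms})")
    return problems


demo_sdf = "\n".join([
    "mol1", "  demo", "",
    "  3  2  0  0  0  0  0  0  0  0999 V2000",
    "M  END", "$$$$",
    "mol2", "  demo", "",
    "301300  0  0  0  0  0  0  0  0999 V2000",
    "M  END", "$$$$",
])
print(check_ligands(sdf_atom_counts(demo_sdf)))
```

The demo records contain only the header and counts lines, which is enough for counting; for MTiOpenScreen uploads the same check could be run with `max_ligands=5000`.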

MTiOpenScreen via AutoDock Vina

The user can select one of the following options:

  1. Ligands uploaded by the user. No more than 5,000 ligands can be uploaded for virtual screening. A single 3D conformation with all hydrogen atoms added should be provided for each ligand in a valid MOL2 or SDF file. For correct docking results, no ligand should exceed 300 atoms. Please make sure that all atom types and bond types of the molecules are correct.
  2. Up to 10,000 ligands can be taken randomly, or selected using physico-chemical criteria defined by the user, from the predefined compound library Diverse-lib containing 99,288 diverse drug-like molecules.
  3. Up to 10,000 ligands can be taken randomly, or selected using physico-chemical criteria defined by the user, from the predefined compound library iPPI-lib containing 51,232 drug-like molecules likely to inhibit PPI.
  4. Use the Drugs-lib containing 7173 purchasable approved drugs.
  5. Use the FOOD-lib containing 10997 purchasable food constituents.
  6. Use the NP-lib containing 1226 purchasable natural products.

Prepare protein receptor structures

If the protein receptor is provided in PDB format, the structure is cleaned and preprocessed automatically: all HETATM records are removed and hydrogens are added using MGLTools. In most cases, this produces a clean receptor structure. Otherwise, users can prepare the protein structure manually and upload it in MOL2 format, in which case the structure is used as-is without any modification; please remove all solvent molecules and add all hydrogens beforehand. If the receptor comes from a bound complex, please remember to remove the native ligand from the complex before uploading.

The user should define the binding site (except for blind docking) by providing either the grid center and dimensions or a list of residues defining the binding site.

Running the service
  • You can now run MTiAutoDock or MTiOpenScreen by clicking on the Run button (1).
  • Clicking on the Reset (2) button will clear all input data you entered, and stop the process.

Upon successful submission of a new job, a unique token will be given to identify the job. Use this token to check the status of the job and retrieve the results.

Retrieve results

When the job finishes, the results will be parsed and visualized. The receptor and docked compounds will be visualized in 3D:

  • visualization of the top scored poses of the 10 ligands docked by AutoDock ranked by binding energy
  • visualization of the top scored pose of the 100 top ranked ligands screened via docking with AutoDock Vina ranked by binding energy.

The virtual screening results are also available for downloading. The user can download:

  • the 10 top scored poses for the 10 ligands docked by AutoDock in pdbqt or mol2; binding energies
  • the 3 top scored poses for the 1500 top ranked compounds screened using AutoDock Vina ranked by binding energy; binding energies; physico-chemical descriptors of ligands

Please note that bioactive compounds are rarely ranked in the top 1 % of a virtually screened compound library. The 100 top-ranked ligands visualized here by MTiOpenScreen are therefore only an illustration of the docking results and the binding site. For a final selection of the best candidate compounds, you may need to analyze a larger number of ligands among the 1500 top-ranked compounds available for download, using standalone programs such as PyMOL or AutoDockTools.
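As a worked illustration of these numbers, the sketch below ranks compounds by binding energy and computes how many compounds a given percentage of the library represents. It assumes the downloaded binding energies have already been read into (name, energy) pairs; the function names and the example values are illustrative, and the format of the downloaded files is not described here.

```python
def top_percent_size(library_size, percent):
    """Number of compounds in the top `percent` of a library (at least 1)."""
    return max(1, library_size * percent // 100)


def top_ranked(results, n):
    """Return the n best (name, energy) pairs; lower energy ranks first."""
    return sorted(results, key=lambda item: item[1])[:n]


# Made-up energies in kcal/mol, for demonstration only.
demo_results = [("cpd_a", -7.2), ("cpd_b", -9.4), ("cpd_c", -8.1), ("cpd_d", -6.0)]
print(top_percent_size(10_000, 1))
print(top_ranked(demo_results, 2))
```

For a screen of 10,000 compounds, the top 1 % corresponds to the 100 ligands shown in the viewer, while the 1500 downloadable compounds correspond to the top 15 %.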

The data associated with any job will be kept for only a limited time period. All jobs terminated due to any error will be deleted within 24 hours and all successfully finished jobs will be kept for only 30 days.

Benchmark

MTiOpenScreen has been benchmarked to assess the performance of our implementation of the AutoDock and Vina software. Docking accuracy was validated on 27 crystal structures of protein-drug complexes taken from the DrugPort database of EMBL-EBI. We selected several classes of important protein targets (nuclear receptors, GPCRs, kinases, serine proteases and PPI targets, among others) with available high-quality crystal structures whose electron density quality was manually verified using VHELIBS. The ability to discriminate small-molecule binders from non-binders was assessed on two enzymes (the catalytic domains of the RTK VEGFR2 and of the serine protease coagulation Factor Xa) and on the PPI target Bcl-xL, using up to 40 diverse active compounds and ~1000 diverse decoys per protein target. An example is shown for the enrichment obtained for the catalytic domain of the serine protease coagulation Factor Xa after virtual screening with MTiOpenScreen.
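Enrichment in such active/decoy experiments is commonly summarized by an enrichment factor: the proportion of actives found in the top-ranked fraction divided by the proportion of actives in the whole set. The page does not state which metric was used for the benchmark, so the sketch below is a generic illustration with made-up data, and the function name is an assumption.

```python
def enrichment_factor(ranked_is_active, fraction):
    """Enrichment factor for the top `fraction` of a ranked list.

    ranked_is_active: booleans ordered from best to worst docking score,
                      True for known actives and False for decoys.
    """
    n_total = len(ranked_is_active)
    n_actives = sum(ranked_is_active)
    if n_total == 0 or n_actives == 0:
        raise ValueError("need a non-empty list with at least one active")
    n_top = max(1, int(n_total * fraction))
    hits_top = sum(ranked_is_active[:n_top])
    # (hits_top / n_top) / (n_actives / n_total), written with one division.
    return (hits_top * n_total) / (n_top * n_actives)


# Made-up ranking: all 5 actives ranked ahead of 95 decoys.
perfect = [True] * 5 + [False] * 95
print(enrichment_factor(perfect, 0.10))
```

A value of 1 corresponds to random ranking; with roughly 40 actives among about 1040 compounds, as in the benchmark sets described above, the highest possible value at 1 % would be about 26.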

References

[1] Morris GM, Huey R, Lindstrom W, Sanner MF, Belew RK, Goodsell DS, Olson AJ.
AutoDock4 and AutoDockTools4: Automated docking with selective receptor flexibility.
J Comput Chem. 2009, 30(16):2785-91.
[2] Trott O, Olson AJ.
AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading.
J Comput Chem. 2010, 31(2):455-61.
[3] Wang Y, Suzek T, Zhang J, Wang J, He S, Cheng T, Shoemaker BA, Gindulyte A, Bryant SH.
PubChem BioAssay: 2014 update.
Nucleic Acids Res. 2014, 42(Database issue):D1075-82.
[4] Lagorce D, Maupetit J, Baell J, Sperandio O, Tufféry P, Miteva MA, Galons H, Villoutreix BO.
The FAF-Drugs2 server: a multistep engine to prepare electronic chemical compound collections.
Bioinformatics. 2011, 27(14):2018-20.
[5] Reynès C, Host H, Camproux AC, Laconde G, Leroux F, Mazars A, Deprez B, Fahraeus R, Villoutreix BO, Sperandio O.
Designing focused chemical libraries enriched in protein-protein interaction inhibitors using machine-learning methods.
PLoS Comput Biol. 2010, 6(3):e1000695.
[6] Miteva MA, Guyon F, Tufféry P.
Frog2: Efficient 3D conformation ensemble generator for small compounds.
Nucleic Acids Res. 2010, 38(Web Server issue):W622-7.
[7] Lagarde N, Rey J, Gyulkhandanyan A, Tufféry P, Miteva MA, Villoutreix B.
Online structure-based screening of purchasable approved drugs and natural compounds: retrospective examples of drug repositioning on cancer targets.
Publication submitted.
[8] Lagorce D, Bouslama L, Becot J, Miteva MA, Villoutreix BO.
FAF-Drugs4: free ADME-tox filtering computations for chemical biology and early stages drug discovery.
Bioinformatics. 2017, 33(22):3658-60.
[9] Sterling T, Irwin JJ.
ZINC 15--Ligand Discovery for Everyone.
J Chem Inf Model. 2015, 55(11):2324-37.
[10] Gaulton A, Bellis LJ, Bento AP, Chambers J, Davies M, Hersey A, Light Y, McGlinchey S, Michalovich D, Al-Lazikani B, Overington JP.
ChEMBL: a large-scale bioactivity database for drug discovery.
Nucleic Acids Res. 2012, 40(Database issue):D1100-7.
[11] Wishart DS, Feunang YD, Guo AC, Lo EJ, Marcu A, Grant JR, Sajed T, Johnson D, Li C, Sayeeda Z, Assempour N, Iynkkaran I, Liu Y, Maciejewski A, Gale N, Wilson A, Chin L, Cummings R, Le D, Pon A, Knox C, Wilson M.
DrugBank 5.0: a major update to the DrugBank database for 2018.
Nucleic Acids Res. 2018, 46(D1):D1074-82.
[12] Ursu O, Holmes J, Knockel J, Bologa CG, Yang JJ, Mathias SL, Nelson SJ, Oprea TI.
DrugCentral: online drug compendium.
Nucleic Acids Res. 2017, 45(D1):D932-9.
[13] Siramshetty VB, Eckert OA, Gohlke BO, Goede A, Chen Q, Devarakonda P, Preissner S, Preissner R.
SuperDRUG2: a one stop resource for approved/marketed drugs.
Nucleic Acids Res. 2018, 46(D1):D1137-43.
[14] O'Hagan S, Kell DB.
Analysing and Navigating Natural Products Space for Generating Small, Diverse, But Representative Chemical Libraries.
Biotechnol J. 2018, 13(1).