PatchSearch

A service to identify structurally conserved regions on protein surfaces.

Overview

Many therapeutic molecules bind several proteins, not only the one initially targeted. Such unexpected interactions with proteins, called off-targets, can lead to adverse effects. Identifying potential off-targets is therefore important, both to anticipate drug side effects and to discover new targets for existing drugs.

This service implements the PatchSearch method [1], which performs a local, nonsequential search for similar regions, called patches, on protein surfaces. It is based on the detection of quasi-cliques in product graphs that represent all possible matchings between a patch and the compared structures.
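
To make the product-graph idea concrete, here is a minimal Python sketch under stated assumptions. It is not the PatchSearch implementation: the function names, the distance tolerance `tol` and the density measure are illustrative choices. Each node pairs one patch atom with one target atom, two nodes are linked when the pairings preserve the inter-atomic distance, and a quasi-clique is a node subset whose link density is close to 1.

```python
from itertools import combinations
from math import dist


def product_graph(patch, target, tol=1.0):
    """Build a correspondence (product) graph between two 3D point sets.

    A node (i, j) matches patch atom i to target atom j. Two nodes are
    linked when they use distinct atoms on both sides and the distance
    between the two patch atoms equals the distance between the two
    target atoms within `tol` (hypothetical tolerance, in angstroms).
    """
    nodes = [(i, j) for i in range(len(patch)) for j in range(len(target))]
    edges = set()
    for (i, j), (k, l) in combinations(nodes, 2):
        if i == k or j == l:
            continue
        if abs(dist(patch[i], patch[k]) - dist(target[j], target[l])) <= tol:
            edges.add(((i, j), (k, l)))
    return nodes, edges


def density(subset, edges):
    """Fraction of node pairs in `subset` that are linked (1.0 for a clique)."""
    pairs = list(combinations(sorted(subset), 2))
    if not pairs:
        return 0.0
    linked = sum((a, b) in edges or (b, a) in edges for a, b in pairs)
    return linked / len(pairs)


# Toy data: the target is the patch translated by 5 angstroms on each axis.
patch = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
target = [(x + 5.0, y + 5.0, z + 5.0) for x, y, z in patch]
nodes, edges = product_graph(patch, target, tol=0.1)
print(density({(0, 0), (1, 1), (2, 2)}, edges))  # identity matching: 1.0
```

A subset with density below 1 but above a chosen threshold would count as a quasi-clique, which tolerates a few distance mismatches between the two surfaces.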

Note: A patch can be defined either as a set of residues of the protein or by a ligand bound to the protein. LigandIdentification is an ancillary tool that facilitates ligand identification in a PDB coordinate file.

Access the service through the RPBS Mobyle portal:

When using this service, please cite the following reference:

[1] Rasolohery I, Moroy G, Guyon F.
PatchSearch: A Fast Computational Method for Off-Target Detection.
J Chem Inf Model. 2017 Apr 24;57(4):769-777.

PatchSearch Usage

Note: Selecting Yes to test the service with preset data will fill the input fields with the 4FJP PDB entry, residue number 711:A and the list of proteins to mine: 2I5F, 4FM5, 2P39, 1X9D, 4COX, 2CE2, 3N8X, 2IUW, 1PXX, 6COX, 2FMA, 3QMO, 3Q7D, 3N8W, 2OYE, 4FJP, 1Y93, 2PLZ, 3NTB, 2BRF, 2I53, 2B69. The results of the search can be accessed here.

Input

The Web server requires three inputs:

  1. Protein structure: the PDB file from which to extract the patch (either a protein-ligand complex or a protein without ligand).

  2. Patch definition:
    • If a complex is input: the chain identifier and the residue sequence number of a ligand. To find these two values, the user can use the LigandIdentification service. The atoms at the protein surface that lie less than 5.0 Å from the ligand form the query patch.
    • or

    • An alternative way to define a patch is to give a list of residues forming a site of interest. The surface atoms of these residues form the query patch. The residues MUST be specified in the following format: "residue number" + ":" + "chain identifier". Multiple residues can be separated by commas (e.g. "664:A, 671:A, 680:A") or listed on separate lines.

  3. Collection of proteins to mine:
    • A list of PDB identifiers.
    • or

    • Users can explicitly upload a protein structure (PDB format).
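
The residue list format described in input 2 can be checked before submission with a short Python sketch like the one below. The function name is hypothetical, and the rules it applies (an integer residue number, a single alphanumeric chain identifier) are assumptions drawn from the format description above, not the server's own validation.

```python
import re

# "residue number" + ":" + "chain identifier", e.g. "664:A"
_RESIDUE = re.compile(r"(-?\d+):([A-Za-z0-9])")


def parse_residue_list(text):
    """Parse residues separated by commas and/or line breaks.

    Returns a list of (residue_number, chain_identifier) tuples and
    raises ValueError on any item that does not follow the format.
    """
    residues = []
    for item in re.split(r"[,\n]", text):
        item = item.strip()
        if not item:
            continue
        match = _RESIDUE.fullmatch(item)
        if match is None:
            raise ValueError(f"expected 'number:chain', got {item!r}")
        residues.append((int(match.group(1)), match.group(2)))
    return residues


print(parse_residue_list("664:A, 671:A, 680:A"))
# [(664, 'A'), (671, 'A'), (680, 'A')]
```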

Results

  • Progress report

    This section incrementally reports job progress and any errors. A typical run should produce a report similar to the following:

    Progress Report

    Errors related to the input data specified are also reported in this field.

  • Search results

    PatchSearch results consist of an alignment between the query atoms and the atoms on the target protein surface, together with a similarity score called the PatchSearch Score (relative Binet-Cauchy score). The relative Binet-Cauchy score is based on the BC-score [2], weighted by the ratio of the number of retrieved atoms to the number of query atoms.

    Similar patches are output to a table with the PDB identifier, the number of patch atoms, alignment length (the number of retrieved atoms), the RMSD (between the query patch and the retrieved patch) and the PatchSearch Score.

  • Visualization of the best hits

    An interactive page lets the user browse the retrieved patches. The residues forming the patch are colored red. The user can visualize the 20 best solutions according to the PatchSearch Score.

  • Downloadable results

    All the alignment results are downloadable. The file contains alignment scores and pairs of matched atoms.
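
As a worked illustration of the scoring and ranking described above, the following Python sketch assumes that "weighted by" means a simple multiplication of the BC-score by the fraction of query atoms retrieved; the exact formula is given in [1]. The function names, the row layout and the numeric values are invented for illustration and are not service output.

```python
def relative_bc_score(bc_score, n_retrieved, n_query):
    """BC-score scaled by the fraction of query atoms that were retrieved.

    Assumption: the weighting is multiplicative.
    """
    if n_query <= 0:
        raise ValueError("the query patch must contain at least one atom")
    return bc_score * n_retrieved / n_query


def best_hits(rows, limit=20):
    """Sort result rows by decreasing score and keep the first `limit`."""
    return sorted(rows, key=lambda row: row["score"], reverse=True)[:limit]


# Invented placeholder rows: (label, BC-score, retrieved atoms, query atoms).
raw = [("hitA", 0.9, 20, 40), ("hitB", 0.8, 36, 40), ("hitC", 0.5, 40, 40)]
rows = [
    {"name": name, "score": relative_bc_score(bc, n_ret, n_query)}
    for name, bc, n_ret, n_query in raw
]
for row in best_hits(rows):
    print(row["name"], round(row["score"], 3))
```

Under this assumption, a hit with a high BC-score that covers only half of the query atoms ranks below a slightly less similar hit that covers most of them.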

Example, sample test

In this example, we aim to detect proteins that may interact with Naproxen. We used the structure of bovine lactoferrin complexed with Naproxen as the query patch against a list of 22 protein structures.

As a simple test, you can:

  1. Use Mobyle facilities: fetch a protein structure from the PDB databank using a PDB identifier.

    1. Click the db button.

    2. Select pdb in the menu.

    3. Enter a PDB identifier: 4FJP.

    4. Press the select button.

  2. Define the patch:

    1. Enter the residue number: "711"

    2. Enter the chain identifier: "A"

  3. Define the targeted proteins:

    Input a list of targeted proteins into the field on the left: "4fjp, 3q7d, 4cox, 3ntb, 6cox, 3n8w, 1pxx, 3n8x, 4fm5, 3qmo, 2oye, 1y93, 2iuw, 2i5f, 2b69, 1x9d, 2fma, 2brf, 2p39, 2plz, 2ce2, 2i53"

  4. Run:

    Launch PatchSearch by clicking Run at the top of the page.

  5. Results:

    The results of the search can be accessed here.
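
The list of targeted proteins in step 3 can be sanity-checked before pasting it into the form. The Python sketch below is a hypothetical helper, not part of the service: it assumes classic four-character PDB identifiers (a digit followed by three alphanumeric characters), upper-cases them and drops duplicates. How the server itself treats case or duplicates is not documented here. The check at the end compares the example list with the preset list from the PatchSearch Usage note.

```python
import re

_PDB_ID = re.compile(r"[0-9][A-Za-z0-9]{3}")


def parse_pdb_ids(text):
    """Split on commas/whitespace, validate, upper-case, drop duplicates."""
    ids = []
    for token in re.split(r"[,\s]+", text.strip()):
        if not token:
            continue
        if _PDB_ID.fullmatch(token) is None:
            raise ValueError(f"not a 4-character PDB identifier: {token!r}")
        code = token.upper()
        if code not in ids:
            ids.append(code)
    return ids


EXAMPLE = ("4fjp, 3q7d, 4cox, 3ntb, 6cox, 3n8w, 1pxx, 3n8x, 4fm5, 3qmo, 2oye, "
           "1y93, 2iuw, 2i5f, 2b69, 1x9d, 2fma, 2brf, 2p39, 2plz, 2ce2, 2i53")
PRESET = ("2I5F, 4FM5, 2P39, 1X9D, 4COX, 2CE2, 3N8X, 2IUW, 1PXX, 6COX, 2FMA, "
          "3QMO, 3Q7D, 3N8W, 2OYE, 4FJP, 1Y93, 2PLZ, 3NTB, 2BRF, 2I53, 2B69")

ids = parse_pdb_ids(EXAMPLE)
print(len(ids), set(ids) == set(parse_pdb_ids(PRESET)))  # 22 True
```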

LigandIdentification Usage

To extract a patch for PatchSearch computations, the residue number and the chain identifier of the ligand are required. The LigandIdentification service helps retrieve this information from a PDB file. All the patches interacting with ligands are extracted and can be visualized in a graphical window.

Input

The Web server needs a PDB file.
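
For readers who want to inspect a PDB file locally, the Python sketch below lists candidate ligands from its HETATM records using the fixed-column layout of the PDB format (residue name in columns 18-20, chain identifier in column 22, residue sequence number in columns 23-26). It is only an approximation of what LigandIdentification reports: the function name is hypothetical, skipping water (HOH) is an assumption, and the sample records are synthetic.

```python
def list_ligands(pdb_text, skip=("HOH",)):
    """Return sorted, unique (residue name, chain, residue number) tuples
    found in HETATM records, ignoring residue names listed in `skip`."""
    found = set()
    for line in pdb_text.splitlines():
        if not line.startswith("HETATM"):
            continue
        res_name = line[17:20].strip()   # columns 18-20
        if res_name in skip:
            continue
        chain = line[21]                 # column 22
        res_seq = int(line[22:26])       # columns 23-26
        found.add((res_name, chain, res_seq))
    return sorted(found)


# Synthetic records for illustration only (not taken from a real entry).
SAMPLE = """\
ATOM      1  CA  ALA A   5       1.000   2.000   3.000
HETATM    2  C1  LIG A 711      10.000  12.000   8.000
HETATM    3  C2  LIG A 711      11.200  12.500   8.300
HETATM    4  C1  LIG B 712      30.000  12.000   8.000
HETATM    5  O   HOH A 801      15.000   9.000   4.000
"""

for res_name, chain, res_seq in list_ligands(SAMPLE):
    print(f"{res_name}  {res_seq}:{chain}")
```

The printed "number:chain" pairs have the same shape as the values PatchSearch expects for a ligand-defined patch.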

Results

  • Progress report

    This section incrementally reports job progress and any errors. A typical run should produce a report similar to the following:

    Progress Report
  • Search results

    Each patch located less than 5.0 Å from a ligand can be visualized. The patch is colored red and the ligand is shown as yellow sticks. The chain identifier and the residue number of each ligand are displayed beside the graphical window.
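
The 5.0 Å distance criterion can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the service's code: it applies only the distance cutoff and omits the surface-exposure filter that PatchSearch also uses, and the function name, atom labels and coordinates are invented.

```python
from math import dist


def atoms_near_ligand(protein_atoms, ligand_atoms, cutoff=5.0):
    """Return labels of protein atoms closer than `cutoff` angstroms
    to at least one ligand atom.

    protein_atoms: list of (label, (x, y, z)); ligand_atoms: list of (x, y, z).
    """
    return [
        label
        for label, xyz in protein_atoms
        if any(dist(xyz, lig) < cutoff for lig in ligand_atoms)
    ]


# Invented coordinates for illustration.
protein = [
    ("CA:5:A", (0.0, 0.0, 0.0)),
    ("CB:5:A", (3.0, 0.0, 0.0)),
    ("CA:9:A", (12.0, 0.0, 0.0)),
]
ligand = [(4.0, 0.0, 0.0), (5.0, 1.0, 0.0)]
print(atoms_near_ligand(protein, ligand))  # ['CA:5:A', 'CB:5:A']
```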


Example

In this example, we identify the ligands and the associated patches in the 4FJP PDB file.

As a simple test, you can:

  1. Use Mobyle facilities: fetch a protein structure from the PDB databank using a PDB identifier.

    1. Click the db button.

    2. Select pdb in the menu.

    3. Enter a PDB identifier: 4FJP.

    4. Press the select button.

  2. Run:

    Launch LigandIdentification by clicking Run at the top of the page.

  3. All ligands with their residue chains and their residue numbers are displayed beside the graphical window.


References

[1] Rasolohery I, Moroy G, Guyon F.
PatchSearch: A Fast Computational Method for Off-Target Detection.
J Chem Inf Model. 2017 Apr 24;57(4):769-777.
[2] Guyon F, Tufféry P.
Fast protein fragment similarity scoring using a Binet-Cauchy kernel.
Bioinformatics. 2014 Mar 15;30(6):784-91.