MS2MODELS
An integrative proteomics pipeline for detecting biomolecular complexes
Mass spectrometry (MS) has become essential for characterizing molecular species and their interactions. Most proteomic studies, however, stop at listing the interacting proteins without analysing the identified sequences any further. This is a missed opportunity, because structural and evolutionary information gives biologists a powerful analysis framework, e.g. for interpreting patient mutations that interfere with assemblies, designing directed mutagenesis and functional dissection experiments, or performing virtual screening.
The MS2MODELS proteomics pipeline integrates structural biology with MS data to enhance the analysis of protein-protein interaction networks. Relevant structures from the Protein Data Bank (PDB) [1] are detected by homology using HHsearch [2]. Annotations of homomultimeric complexes, as well as interaction data from BioGRID [3] and the Eukaryotic Linear Motif (ELM) resource [4], are also integrated into the analysis.
Access the service through the RPBS Mobyle portal:
The web server requires one mandatory and three optional inputs:
This section incrementally reports job progression and any errors. A typical run should produce a report similar to the following:
Errors related to the input data specified are also reported in this field.
The results of the search for protein complexes are presented in four HTML tables. Table 1 shows the PDB entry found for each candidate protein, along with related information (resolution, organism, etc.). Table 2 shows the results of the search for the homo-oligomeric state. Table 3 lists the complexes detected among the input list, along with their potentially undetected subunits. Finally, Table 4 shows the interactions identified by integrating the ELM data.
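The idea behind Table 3 can be illustrated with a minimal sketch. It assumes that a known complex is available as a set of subunit identifiers. The function name `split_subunits` and the placeholder identifiers are hypothetical, and this is not the pipeline's actual implementation.

```python
def split_subunits(complex_subunits, input_proteins):
    """Split the subunits of one complex into detected and undetected ones.

    A subunit counts as detected when it also appears in the input list.
    Both returned lists are sorted alphabetically.
    """
    subunits = set(complex_subunits)
    observed = set(input_proteins)
    detected = sorted(subunits & observed)
    undetected = sorted(subunits - observed)
    return detected, undetected


# Hypothetical placeholder identifiers, for illustration only.
example_complex = {"A", "B", "C"}
example_input = ["A", "C", "X"]

detected, undetected = split_subunits(example_complex, example_input)
print("detected subunits:", detected)
print("potentially undetected subunits:", undetected)
```

Set operations keep the comparison independent of the order of the input list and ignore duplicated entries.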
The results produced by the pipeline can be visualized as a graph using Cytoscape.js [5]. The nodes are of three types: input proteins, undetected partners, and BioGRID partners; the last type results from the integration of the BioGRID data. The buttons above the graph viewer select nodes and edges according to their attributes.
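Attribute-based selection of this kind can be sketched as follows. This is only an illustration in plain Python, not the viewer's code: the attribute name `type`, the labels `input`, `undetected` and `biogrid`, and the node identifiers are all assumptions made for the example.

```python
# Assumed labels for the three node types described in the text.
NODE_TYPES = ("input", "undetected", "biogrid")


def select_nodes(nodes, node_type):
    """Return the ids of all nodes whose 'type' attribute equals node_type."""
    if node_type not in NODE_TYPES:
        raise ValueError(f"unknown node type: {node_type!r}")
    return [node["id"] for node in nodes if node["type"] == node_type]


def select_edges(edges, node_ids):
    """Return the edges that touch at least one of the given node ids."""
    wanted = set(node_ids)
    return [e for e in edges if e["source"] in wanted or e["target"] in wanted]


# Hypothetical graph, for illustration only.
nodes = [
    {"id": "n1", "type": "input"},
    {"id": "n2", "type": "undetected"},
    {"id": "n3", "type": "biogrid"},
    {"id": "n4", "type": "input"},
]
edges = [
    {"source": "n1", "target": "n2"},
    {"source": "n1", "target": "n4"},
    {"source": "n3", "target": "n4"},
]

biogrid_ids = select_nodes(nodes, "biogrid")
print("BioGRID partners:", biogrid_ids)
print("edges involving them:", select_edges(edges, biogrid_ids))
```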
Clicking on any node of the graph activates a molecular viewer, MolArt [6], which displays the 3D structure of the interaction partner. Thus, the protein chain can be visualized in the context of the oligomeric assembly. The MolArt viewer also provides various annotations of the protein sequence, depending on the data available.
[1] Rose, P. W., Prlić, A., Altunkaya, A., Bi, C., Bradley, A. R., Christie, C. H., ... & Green, R. K.
The RCSB protein data bank: integrative view of protein, gene and 3D structural information.
Nucleic acids research 2016; gkw1000.
[2] Söding, J.
Protein homology detection by HMM-HMM comparison.
Bioinformatics 2004; 21(7), 951-960.
[3] Oughtred, R., Stark, C., Breitkreutz, B. J., Rust, J., Boucher, L., Chang, C., ... & Zhang, F.
The BioGRID interaction database: 2019 update.
Nucleic acids research 2019; 47(D1), D529-D541.
[4] Gouw, M., Michael, S., Sámano-Sánchez, H., Kumar, M., Zeke, A., Lang, B., ... & Diella, F.
The eukaryotic linear motif resource - 2018 update.
Nucleic acids research 2017; 46(D1), D428-D434.
[5] Franz, M., Lopes, C. T., Huck, G., Dong, Y., Sumer, O., & Bader, G. D.
Cytoscape.js: a graph theory library for visualisation and analysis.
Bioinformatics 2015; 32(2), 309-311.
[6] Hoksza, D., Gawron, P., Ostaszewski, M., & Schneider, R.
MolArt: a molecular structure annotation and visualization tool.
Bioinformatics 2018; 34(23), 4127-4128.