Integrating 3D structure information to proteomics data

Appendix: graph visualization

Details on the features implemented for the Proteo3Dnet graph viewer are presented in this section.

  • Color code

    There are three types of nodes: those that correspond to input proteins are blue; those added by the structure-based analysis are gray; and those added by the BioGRID analysis are green. When the “Fold change” (FC) advanced option is used, input nodes with a negative log(FC) are red, while the others remain blue.

    The Proteo3Dnet pipeline represents interactions between proteins with four types of edges:
    (i) Edges produced by the structure-based analysis and connecting input proteins. Those are thick, partially opaque and take one of five colors (blue, green, yellow, red, black) depending on the evolutionary distance (sequence identity ≥95%, 80%, 50%, 30%, 0%, respectively) between the two input proteins and their homologs that are found interacting within an experimental structure of the PDB.
    (ii) Edges produced by the structure-based analysis and connecting input proteins and undetected partners. Those are thin and black.
    (iii) Edges produced by the ELM/BioGRID analysis and connecting input proteins. Those are thin and gray.
    (iv) Edges produced by the ELM/BioGRID analysis and connecting input proteins and additional partners from BioGRID. Those are thin and green.
    When selected, nodes and edges become magenta (except structure-based edges, which keep their color).
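
    The identity bins listed for type (i) edges can be sketched as a simple lookup. This is only an illustration, not the Proteo3Dnet code: the function name edge_color is hypothetical, and the reading of each percentage as the lower bound of its bin is an assumption.

```python
# Illustrative sketch only; not taken from the Proteo3Dnet source code.
# Assumption: each threshold is the lower bound (inclusive) of its color bin.
IDENTITY_COLORS = [
    (95.0, "blue"),
    (80.0, "green"),
    (50.0, "yellow"),
    (30.0, "red"),
    (0.0, "black"),
]


def edge_color(seq_identity: float) -> str:
    """Return the color of a structure-based edge for a sequence identity in %."""
    if not 0.0 <= seq_identity <= 100.0:
        raise ValueError("sequence identity must be between 0 and 100")
    # Thresholds are sorted from highest to lowest, so the first match wins.
    for threshold, color in IDENTITY_COLORS:
        if seq_identity >= threshold:
            return color
    return "black"  # not reached for valid input; kept as a safe default


if __name__ == "__main__":
    for identity in (97.0, 85.0, 62.5, 30.0, 12.0):
        print(identity, edge_color(identity))
```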

    [Screenshot: Graph viewer]
  • Selection

    Users can select nodes and edges by left-clicking on them directly.
    Holding the SHIFT key while clicking allows users to select more than one item, either by (i) successive left clicks on multiple items or (ii) dragging a selection area.
    To unselect all items, simply click on an empty area of the viewer.

    The use of the right click is described below.
    Middle-clicking has no effect.

    [Screenshot: Viewer inputs]
    Users can also interact with the graph representation through several inputs implemented in the viewer.

    The numbers below refer to the screenshot above.

    #1: Typing one or several protein names here (e.g. SIAH1, TERF2) and then clicking on the “Select” button will select the corresponding nodes. This is helpful to find a particular protein when there are many nodes.

    #2: Clicking on one of the elements of this list will select the corresponding complex (named “c001”, “c002”, etc.) identified by Proteo3Dnet. Clicking on “All” will select all proteins that have been found in 3D complexes. This is helpful to see at a glance the proportion of input proteins that are covered by the structure-based analysis.

    #3, #4, #5: Checking one or more of these three boxes will select proteins according to their origin: the input dataset, the structure-based analysis, or the ELM/BioGRID analysis. This is helpful to identify proteins submitted by the user, especially when they are numerous.

    Once an item is selected, the selection can be extended.

    #6: “Neighbors” are nodes that are directly connected to the selected node(s). Multiple consecutive clicks on this button will select nodes with increasing degrees of separation. This is helpful to observe indirect interactions.
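
    The effect of repeated clicks on “Neighbors” can be sketched as a one-hop expansion applied several times. This is a minimal sketch, not the viewer's implementation: the function expand_to_neighbors, the adjacency dictionary and the node names P1 to P4 are hypothetical, and it assumes that already selected nodes stay selected.

```python
# Illustrative sketch only; not taken from the Proteo3Dnet viewer code.
# Assumption: the current selection is kept and its direct neighbors are added.
def expand_to_neighbors(adjacency, selected):
    """Return the selection extended by all nodes directly connected to it."""
    expanded = set(selected)
    for node in selected:
        expanded.update(adjacency.get(node, ()))
    return expanded


# Hypothetical toy graph: a chain P1 - P2 - P3 - P4.
adjacency = {
    "P1": {"P2"},
    "P2": {"P1", "P3"},
    "P3": {"P2", "P4"},
    "P4": {"P3"},
}

if __name__ == "__main__":
    selection = {"P1"}
    for click in range(1, 4):
        # Each call plays the role of one click on the button.
        selection = expand_to_neighbors(adjacency, selection)
        print(click, sorted(selection))
```

    In this toy chain, each pass reaches nodes one further degree of separation away from the starting node.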

    #7: The “Invert” button simultaneously (i) selects the unselected nodes and (ii) unselects the selected nodes. This is mainly helpful in combination with the “Action” buttons presented below.
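
    In set terms, inverting a selection amounts to taking its complement among all nodes of the graph. The sketch below only illustrates that idea; the function invert_selection and the node names are hypothetical and do not come from the viewer's code.

```python
# Illustrative sketch only; not taken from the Proteo3Dnet viewer code.
def invert_selection(all_nodes, selected):
    """Return the nodes of the graph that are not currently selected."""
    return set(all_nodes) - set(selected)


if __name__ == "__main__":
    all_nodes = {"P1", "P2", "P3", "P4"}  # hypothetical node names
    selected = {"P1", "P2"}
    inverted = invert_selection(all_nodes, selected)
    print(sorted(inverted))
    # Inverting twice gives back the original selection.
    print(sorted(invert_selection(all_nodes, inverted)))
```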

    #8: For a selected edge, the two connected nodes can be selected as well, by clicking on the “Connected nodes” button. This is helpful for visually delineating PPIs within dense graphs.

    These functions are not mutually exclusive: users can combine them to create custom selections.

    The properties of the current selection of nodes are shown in two windows.

    #9: Here, the number and names of the selected proteins are displayed, separated by semicolons. This text area can be expanded, and its content can be copied and pasted. Once saved in this way, a custom list of nodes can later be selected again by using input #1.

    #10: If a node belongs to one or several complexes identified by the structural analysis, its selection will trigger the display of the complex names in this window. This is useful to observe the overlap between local PPI networks.

  • Action

    Several modifications can be applied to the selected items.

    #11: The “Focus” button will increase or decrease the zoom, so that the whole selection is visible within the viewer. This is particularly helpful, in combination with input #1, to browse through large graphs.

    Note: the zoom can be otherwise modified, either by scrolling or by using the cursor at the top left of the viewer.

    #12, #13, #14: Selected items can be hidden, in order to ease the visualization of dense graphs. When a connected node is hidden, its partner(s) will display a black border. Such black-bordered nodes can be selected and, by clicking on the “Show” button, have their hidden connections revealed again. Combined with the aforementioned “Invert” button, hiding nodes is one way to highlight specific nodes in a graph. The other way is presented below. Successive hide/show manipulations can alter the representation in a seemingly irreversible manner. To restore the initial display, users can click on the “Reset” button.

    #15, #16: Highlighting is enabled by clicking on the “Activate” button. All the non-selected items will have their opacity decreased. However, unlike with the Invert+Hide combination of buttons, the non-highlighted nodes and edges are still selectable. This highlighting can be removed with the “Cancel” button.

  • Right click

    Nodes and edges (whether they are selected or not) can be right-clicked. This will trigger a tooltip, the content of which varies depending on the type of item.

    The tooltip appearing on every right-clicked node contains two hyperlinks: one redirects (in a new tab) to the corresponding UniProt page; the other opens the MolArt protein viewer, which allows users to visualize the 3D structure of the protein, as well as some linear annotations.

    Right-clicking on edges established by the structural analysis will display the PDB IDs of the protein structures that made it possible to infer this interaction. Each PDB ID comes with a sequence identity (in %), which measures the evolutionary distance between the two input proteins and the two PDB chains found for these proteins (with our procedure for detecting distant homologies). The structural analysis also establishes edges with additional nodes (labeled “undetected”). In this case, the list of PDB IDs is displayed without the sequence identity.
    The second type of edges are those found with the ELM/BioGRID analysis between input nodes. In this case, a right-click on the edge will display hyperlinks for the ELM motif found on one node and the Pfam domain found on the other.
    Finally, edges between input proteins and additional BioGRID partners will simply display a link to the BioGRID database.