Software

HHalign-Kbest automatically produces optimized alignments and models when the sequence identity between a query and a template protein is low (<35%). Rather than returning only the optimal alignment, which may contain small to large errors, it can generate k suboptimal alignments (e.g. the top-k scoring ones).

Yu J, Picord G, Tufféry P, Guerois R.
HHalign.KBest: exploring sub-optimal alignments for remote homology comparative modeling
Submitted.
Download
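
The 35% threshold above refers to the sequence identity between query and template. As a minimal sketch, assuming identity is counted over aligned (non-gap) columns only, it can be computed from a pairwise alignment as follows. The function name `percent_identity` is hypothetical and is not part of HHalign-Kbest; other definitions of identity use a different denominator.

```python
def percent_identity(aligned_query, aligned_template, gap="-"):
    """Percent identity over columns where neither sequence has a gap.

    Assumption: both strings come from the same pairwise alignment and
    therefore have the same length.
    """
    if len(aligned_query) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")
    # Keep only the columns where both sequences carry a residue.
    pairs = [
        (a, b)
        for a, b in zip(aligned_query, aligned_template)
        if a != gap and b != gap
    ]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a.upper() == b.upper())
    return 100.0 * matches / len(pairs)


if __name__ == "__main__":
    print(percent_identity("ACDE-F", "ACNEGF"))
```

Applied to each of the k alignments of the same query and template, such a measure gives a quick way to see how much the candidate alignments differ.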

iSuperpose performs the 3D superposition of protein structures by optimally superimposing their alpha-carbons (or backbones), given an alignment specifying the correspondence between the structures. If no alignment is provided, a structural alignment is calculated using TMalign. Once the alignment is identified, the superposition is computed with a quaternion-based procedure that relies on a dedicated eigenvalue calculation implementation. See QBestFit.

Maupetit J, Tufféry P.
Download

XmMol is a desktop macromolecular visualization and modeling tool designed to be easy to use, configure and enhance. Its graphics are based on X11, and part of its user interface is based on Motif. Thus it provides a way of displaying structures on any X11 server. Its main features are:

  • interactive graphics of macromolecules on any X11 display. Structures are drawn as wireframes to preserve interactivity. Space-filling static images can be obtained through an interface to external rendering programs such as MolScript and Raster3D.
  • strong ability to be interfaced with external programs. A communication protocol allows XmMol to fork external programs called "delegates" and exchange information with them. This makes it easy to add new features to XmMol: automatic script execution, I/O for new file formats, modification of coordinate files, external renderers, etc. XmMol can thus also be used as a graphic debugger for numerical methods applied to molecules (minimizers, etc.). Examples of how to implement molecular superimposition, dynamic trajectory animation, as well as calls to external standard programs such as babel or hbplus are provided.
  • some modelling tools are supported, such as docking facilities and interactive backbone deformation (part of the Forme package). However, the aim of XmMol is mostly to give users the opportunity to interface their own methods.
Tufféry P.
XmMol: an X11 and motif program for macromolecular visualization and modeling.
J Mol Graph. 1995 Feb;13(1):67-72, 62.
Download

As our understanding of the mechanisms underlying evolution becomes more accurate, and the amount of protein data increases, the investigation of more and more sophisticated hypotheses becomes tractable. Also, the analysis of particular features associated with particular families of proteins becomes our concern.

Simulated protein sequences can provide an expectation under a null hypothesis against which real data can be compared. Various programs have been designed for such simulations, mostly to test generic features such as the efficiency of phylogeny reconstruction methods, or to evaluate competing phylogenetic hypotheses. To investigate hypotheses about the evolution of particular protein families, simulations need to take into account as much as possible of the information that can be inferred from a particular phylogenetic reconstruction.

CS-PSeq-Gen is derived from PSeq-Gen, a program developed by Nick C. Grassly and Andrew Rambaut to simulate the evolution of protein sequences along evolutionary trees. The modifications in CS-PSeq-Gen aim at simulating the evolution of protein sequences under the constraints provided by a particular reconstructed phylogeny: the "root sequence" that initiates the simulation and the rate heterogeneity among sites are specific to each protein family, and CS-PSeq-Gen allows simulations to take such information into account. In addition, exploring the evolution of a protein family and testing hypotheses often makes it necessary to have some control over the variability of the parameters, so CS-PSeq-Gen allows the simulated tree and branch lengths to vary around an average value. Finally, a particular category of applications for such simulations is the search for significant co-evolution of sites. CS-PSeq-Gen offers some facilities to generate sequences under such hypotheses, and proposes a basic detection scheme that programmers can easily adapt.
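
To make the idea concrete, here is a minimal sketch of a constrained simulation: a given root sequence is evolved along a tree with per-site rates and optional jitter on branch lengths. It is not CS-PSeq-Gen's code or interface. The tree encoding, the function names, the `length_jitter` parameter and the equal-frequency Poisson substitution model are all assumptions made for illustration.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def evolve_branch(sequence, branch_length, site_rates, rng):
    """Evolve a sequence along one branch under an equal-frequency Poisson model.

    branch_length is in expected substitutions per site for a site of rate 1.
    """
    out = []
    for residue, rate in zip(sequence, site_rates):
        # Probability that at least one replacement event hits this site;
        # the 20/19 factor accounts for events that redraw the same residue.
        p_event = 1.0 - math.exp(-(20.0 / 19.0) * rate * branch_length)
        if rng.random() < p_event:
            residue = rng.choice(AMINO_ACIDS)
        out.append(residue)
    return "".join(out)


def simulate(tree, parent_sequence, site_rates, rng, length_jitter=0.0):
    """Simulate sequences for all leaves of a tree.

    tree is a hypothetical nested tuple: (name, branch_length, children).
    length_jitter is the relative standard deviation applied to each
    branch length around its average value.
    """
    name, length, children = tree
    if length_jitter > 0:
        length = max(0.0, rng.gauss(length, length_jitter * length))
    sequence = evolve_branch(parent_sequence, length, site_rates, rng)
    if not children:
        return {name: sequence}
    leaves = {}
    for child in children:
        leaves.update(simulate(child, sequence, site_rates, rng, length_jitter))
    return leaves


if __name__ == "__main__":
    rng = random.Random(0)
    tree = ("root", 0.0, [("A", 0.5, []), ("B", 0.2, [("C", 0.3, []), ("D", 0.1, [])])])
    root_sequence = "MKTAYIAKQR"
    # The first five sites are invariant, the last five evolve quickly.
    rates = [0.0] * 5 + [2.0] * 5
    for leaf, seq in simulate(tree, root_sequence, rates, rng, 0.1).items():
        print(leaf, seq)
```

Sites with rate 0 stay identical to the root sequence in every leaf, which is the kind of family-specific constraint described above.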

This program may be used and distributed freely but only as the original compressed archive file. The author is grateful for any comments, suggestions or bug reports.

Tufféry P.
CS-PSeq-Gen: simulating the evolution of protein sequence under constraints.
Bioinformatics. 2002 Jul;18(7):1015-6.
Download

Some Python classes.

Fasta Python classes to manage single and multi fasta sequences. Download
Mol2 Python classes to manage simple and multiple mol2 files. Download
PDBpy A python parser for PDB files. Download
BioMoby-python API Python for BioMoby. Download
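
For readers unfamiliar with the multi-FASTA format handled by the Fasta classes, the following is a minimal, independent sketch of a parser. It does not reproduce the API of the downloadable classes; the function name `parse_fasta` and the returned list of (header, sequence) tuples are assumptions chosen for illustration.

```python
def parse_fasta(text):
    """Parse single or multi FASTA text into a list of (header, sequence)."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None:
            # Sequence lines are concatenated until the next header.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


if __name__ == "__main__":
    example = ">seq1 first\nMKTAY\nIAKQR\n\n>seq2\nGAVLI\n"
    for name, seq in parse_fasta(example):
        print(name, len(seq))
```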
BCscore A score based on the Binet-Cauchy kernel. Allows for the search of both similar and mirror conformations. Addresses two major issues of the widely used root mean square deviation (RMSD):
  • achieves length-independent statistics, even for short fragments,
  • shows better performance in the discrimination of medium-range RMSD values.
It also provides the means for large-scale mining of protein structures.
Download
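
As an illustration of a Binet-Cauchy style score, the sketch below computes det(XᵀY) / sqrt(det(XᵀX) · det(YᵀY)) for two centered sets of paired 3D coordinates. This formula is stated here as an assumption about the BCscore definition and should be checked against the BCscore distribution; the function names are hypothetical. With this form, a value near 1 indicates similar conformations and a value near -1 indicates mirror conformations, and no prior superposition is needed.

```python
import math


def centered(points):
    """Translate points so that their centroid is at the origin."""
    n = len(points)
    c = [sum(p[i] for p in points) / n for i in range(3)]
    return [[p[i] - c[i] for i in range(3)] for p in points]


def cross_product_matrix(a, b):
    """3x3 matrix A^T B for two lists of paired 3D points."""
    return [[sum(x[i] * y[j] for x, y in zip(a, b)) for j in range(3)] for i in range(3)]


def det3(m):
    """Determinant of a 3x3 matrix."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


def bc_score(x, y):
    """Binet-Cauchy style score between two paired coordinate sets."""
    if len(x) != len(y):
        raise ValueError("coordinate sets must have the same length")
    a, b = centered(x), centered(y)
    denom = det3(cross_product_matrix(a, a)) * det3(cross_product_matrix(b, b))
    if denom <= 0:
        # Happens for fewer than 4 points or for coplanar points.
        raise ValueError("coordinates must span three dimensions")
    return det3(cross_product_matrix(a, b)) / math.sqrt(denom)


if __name__ == "__main__":
    frag = [(0, 0, 0), (1, 0, 0), (0, 2, 0), (0, 0, 3), (1, 1, 1)]
    mirror = [(px, py, -pz) for px, py, pz in frag]
    print(bc_score(frag, frag), bc_score(frag, mirror))
```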
QBestFit 3D superposition module, written in C and based on quaternions. Very reliable. Download
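
The quaternion-based superposition used by iSuperpose and QBestFit can be illustrated with the sketch below, which follows Horn's closed-form approach: the optimal rotation is the unit quaternion given by the dominant eigenvector of a symmetric 4x4 matrix built from the paired coordinates. This is not the QBestFit source. The function names are hypothetical, and a simple shifted power iteration stands in for whatever eigenvalue routine QBestFit actually implements.

```python
import math


def centroid(points):
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(3)]


def fit_rotation(mobile, target, iterations=500):
    """Return (rotation, mobile centroid, target centroid) for paired points."""
    cm, ct = centroid(mobile), centroid(target)
    a = [[p[i] - cm[i] for i in range(3)] for p in mobile]
    b = [[p[i] - ct[i] for i in range(3)] for p in target]
    # Cross-covariance: s[i][j] = sum over points of mobile_i * target_j.
    s = [[sum(x[i] * y[j] for x, y in zip(a, b)) for j in range(3)] for i in range(3)]
    (sxx, sxy, sxz), (syx, syy, syz), (szx, szy, szz) = s
    # Horn's symmetric 4x4 matrix.
    n = [
        [sxx + syy + szz, syz - szy, szx - sxz, sxy - syx],
        [syz - szy, sxx - syy - szz, sxy + syx, szx + sxz],
        [szx - sxz, sxy + syx, -sxx + syy - szz, syz + szy],
        [sxy - syx, szx + sxz, syz + szy, -sxx - syy + szz],
    ]
    # Shifting by a Gershgorin bound makes all eigenvalues non-negative, so
    # power iteration converges to the eigenvector of the largest eigenvalue.
    shift = max(sum(abs(v) for v in row) for row in n)
    q = [1.0, 0.5, 0.25, 0.125]  # arbitrary, non-symmetric starting vector
    for _ in range(iterations):
        q = [sum(n[i][j] * q[j] for j in range(4)) + shift * q[i] for i in range(4)]
        norm = math.sqrt(sum(v * v for v in q))
        if norm == 0:
            raise ValueError("degenerate coordinates")
        q = [v / norm for v in q]
    q0, qx, qy, qz = q
    rot = [
        [q0 * q0 + qx * qx - qy * qy - qz * qz, 2 * (qx * qy - q0 * qz), 2 * (qx * qz + q0 * qy)],
        [2 * (qy * qx + q0 * qz), q0 * q0 - qx * qx + qy * qy - qz * qz, 2 * (qy * qz - q0 * qx)],
        [2 * (qz * qx - q0 * qy), 2 * (qz * qy + q0 * qx), q0 * q0 - qx * qx - qy * qy + qz * qz],
    ]
    return rot, cm, ct


def superpose(mobile, target):
    """Move mobile onto target; return (moved coordinates, RMSD)."""
    rot, cm, ct = fit_rotation(mobile, target)
    moved = []
    for p in mobile:
        d = [p[i] - cm[i] for i in range(3)]
        moved.append([sum(rot[i][j] * d[j] for j in range(3)) + ct[i] for i in range(3)])
    sq = sum(sum((m[i] - t[i]) ** 2 for i in range(3)) for m, t in zip(moved, target))
    return moved, math.sqrt(sq / len(target))


if __name__ == "__main__":
    mobile = [(0, 0, 0), (1, 0, 0), (0, 2, 0), (0, 0, 3), (1, 1, 1)]
    target = [(-y + 5, x - 2, z + 1) for x, y, z in mobile]
    print(superpose(mobile, target)[1])
```

The pairing of points is taken as given, which corresponds to the alignment step described for iSuperpose. Power iteration converges slowly for nearly planar or collinear point sets, so a production implementation would use a proper symmetric eigensolver.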

The two libraries described here were built from an analysis of crystallographic protein structures from the Protein Data Bank (PDB). Only proteins at least 30 amino acids long, with no chain breaks, and solved by X-ray diffraction at a resolution better than 2.5 Angstroms were retained. As of November 2003, this resulted in a collection of 2926 protein chains sharing less than 30% sequence identity (489528 amino acids, not counting ALA, GLY and PRO). The libraries store the chi values and the frequency of occurrence of each conformation. Unlike for our previous libraries, rotameric values are determined using Gaussian filters, in a way similar to that described by Lovell et al., Proteins 2000 40:389-408.

Backbone independent rotamer library Library built without considering backbone conformation. Download
Backbone dependent rotamer library Result of an analysis of side-chain conformations in light of the backbone conformation of the 4-residue peptide fragment encompassing the residue, characterized by its structural alphabet (SA) state. Download
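
The general idea of deriving rotameric values with Gaussian filters can be sketched as follows: observed chi angles are smoothed with a circular Gaussian kernel, and the peaks of the smoothed density are taken as rotamer positions. This only illustrates the principle and is not the procedure used to build the libraries. The kernel width `sigma`, the grid `step` and the function names are assumptions.

```python
import math


def smoothed_density(chi_angles, sigma=10.0, step=5):
    """Circular Gaussian-smoothed density of chi angles (degrees) on a grid."""
    grid = list(range(0, 360, step))
    density = []
    for g in grid:
        total = 0.0
        for chi in chi_angles:
            # Shortest angular distance, accounting for the 360 degree wrap.
            d = abs((chi - g + 180.0) % 360.0 - 180.0)
            total += math.exp(-0.5 * (d / sigma) ** 2)
        density.append(total)
    return grid, density


def density_peaks(grid, density):
    """Grid angles whose density is strictly higher than both neighbours."""
    n = len(grid)
    return [
        grid[i]
        for i in range(n)
        if density[i] > density[i - 1] and density[i] > density[(i + 1) % n]
    ]


if __name__ == "__main__":
    # Hypothetical chi1 observations clustered around three rotamers.
    observed = [55, 60, 65, 175, 180, 185, 295, 300, 305]
    grid, density = smoothed_density(observed)
    print(density_peaks(grid, density))
```

The frequency of each rotamer could then be estimated by assigning every observation to its nearest peak.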

TEF is open-source software for decomposing protein structures into simpler yet informative units named Tightened End Fragments (or closed loops), which can be studied independently to understand protein architecture, folding, and evolution.

Stratmann D, Pathmanathan JS, Postic G, Rey J, Chomilier J.
TEF2.0: a graph-based method for decomposing protein structures into closed loops.
Submitted.
Lamarine M, Mornon JP, Berezovsky N, Chomilier J.
Distribution of tightened end fragments of globular proteins statistically matches that of topohydrophobic positions: towards an efficient punctuation of protein folding?
Cell Mol Life Sci. 2001 Mar;58(3):492-8.

The Structural Bioinformatics Library (SBL) is a generic C++ / Python library, providing models and algorithms for structural bioinformatics, so as to foster our understanding of the relationship between the structure and the function of macro-molecules and their complexes.

Its design accommodates various molecular geometric models coding the physical and chemical properties of macro-molecular structures, and the variety of operations undertaken on these models.

Software components of the SBL are organized into four categories:

  • Applications: packages for end-users, providing programs targeting specific biophysical problems.
  • Core: low-level generic C++ classes for algorithms and data structures.
  • Models: C++ models matching the C++ concepts required to instantiate classes from Core.
  • Modules: components used to define the workflow of applications. Modules are C++ classes parameterized by classes from Core and Models.

The SBL runs on most Unix systems (Linux and macOS), and also on Windows.

Cazals F, Dreyfus T.
The structural bioinformatics library: modeling in biomolecular science and beyond.
Bioinformatics. 2017 Apr 1;33(7):997-1004.