CS-PSeq-Gen: Simulation of protein sequences under constraints
version 1.0

P. Tuffery,
Universite Paris 7
2 place Jussieu
75251 Paris cedex 05, France

What is CS-PSeq-Gen?

As our understanding of the mechanisms underlying evolution becomes more accurate and the amount of protein data increases, increasingly sophisticated hypotheses become tractable to investigate. The analysis of features specific to particular protein families also becomes of interest.
Simulated protein sequences provide an expectation under a null hypothesis against which real data can be compared. Several programs have been designed for such simulations, mostly to address generic questions such as the efficiency of phylogeny reconstruction methods or the evaluation of competing phylogenetic hypotheses. To investigate hypotheses about the evolution of a particular protein family, simulations must take into account as much as possible of the information that can be inferred from a particular phylogenetic reconstruction.
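
To illustrate how simulated data can serve as a null expectation, here is a minimal Python sketch of a one-sided Monte Carlo p-value. It is not part of CS-PSeq-Gen; the function name `empirical_p_value` and the idea of summarising each simulated data set by a single statistic are assumptions made for this example.

```python
def empirical_p_value(observed, simulated):
    """One-sided Monte Carlo p-value (hypothetical helper, not a CS-PSeq-Gen function).

    observed  : statistic measured on the real alignment
    simulated : statistics measured on alignments simulated under the null hypothesis
    """
    # Count null replicates at least as extreme as the observed value.
    n_extreme = sum(1 for s in simulated if s >= observed)
    # Add-one correction: the estimate is never exactly zero with finite replicates.
    return (n_extreme + 1) / (len(simulated) + 1)
```

A small p-value suggests that the observed statistic is unlikely under the simulated null model; the resolution of the estimate is limited by the number of replicates.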

CS-PSeq-Gen is derived from PSeq-Gen, a program developed by Nick C. Grassly and Andrew Rambaut to simulate the evolution of protein sequences along evolutionary trees. The modifications in CS-PSeq-Gen aim at simulating the evolution of protein sequences under the constraints provided by a particular reconstructed phylogeny: the "root sequence" that initiates the simulation and the rate heterogeneity among sites are specific to each protein family, and CS-PSeq-Gen allows simulations to take such information into account. In addition, exploring the evolution of one protein family and testing hypotheses often makes it necessary to have some control over the variability of the parameters. CS-PSeq-Gen allows some control over the simulated tree and branch lengths, which can vary around an average value. Finally, a particular category of applications for such simulations is the search for significant co-evolution of sites. CS-PSeq-Gen offers some facilities to generate sequences under such hypotheses and proposes a basic detection scheme that programmers can easily adapt.
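
The constraints described above (a user-supplied root sequence, site-specific rates, and branch lengths varying around an average) can be sketched in a few lines of Python. This is only an illustration, not the CS-PSeq-Gen implementation: it assumes a simple equal-rates (Poisson) amino acid model instead of an empirical substitution matrix, a uniform perturbation of branch lengths, and the hypothetical function names `perturbed_branch_length` and `evolve_branch`.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def perturbed_branch_length(mean_length, spread, rng):
    """Draw a branch length uniformly within +/- spread (a fraction) of the mean.

    The uniform distribution is an assumption made for this sketch.
    """
    return mean_length * rng.uniform(1.0 - spread, 1.0 + spread)

def evolve_branch(sequence, site_rates, branch_length, rng):
    """Evolve a sequence along one branch with one relative rate per site.

    Assumes a 20-state equal-rates model, with branch_length expressed in
    expected substitutions per site for a site of relative rate 1.
    """
    evolved = []
    for residue, rate in zip(sequence, site_rates):
        # Probability that the residue at the end of the branch differs from the start.
        p_change = (19 / 20) * (1.0 - math.exp(-(20 / 19) * rate * branch_length))
        if rng.random() < p_change:
            # Under equal rates, every other amino acid is equally likely.
            evolved.append(rng.choice([a for a in AMINO_ACIDS if a != residue]))
        else:
            evolved.append(residue)
    return "".join(evolved)

rng = random.Random(1)
root = "MKTAYIAK"                                    # root sequence supplied by the user
rates = [0.0, 0.5, 1.0, 2.0, 0.0, 1.0, 1.5, 0.2]     # one relative rate per site
length = perturbed_branch_length(0.3, 0.2, rng)      # average 0.3, varying by up to 20%
child = evolve_branch(root, rates, length, rng)
```

Sites with a rate of 0 are invariant in this sketch, so the first and fifth residues of `child` always match the root.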


This program may be used and distributed freely but only as the original compressed archive file. The author is grateful for any comments, suggestions or bug reports.

Source code is available here.