SaGePhy (short for “Simulation framework for Subgene and Gene Phylogenies”) is an easy-to-use, open-source, and platform-independent software package for simulating gene family evolution within species trees, as well as subgene or protein-domain evolution within one or more gene trees. SaGePhy can (1) generate species trees using a probabilistic birth-death process, (2) generate gene trees within a given species tree using a probabilistic model of gene evolution that allows for gene duplications, horizontal gene transfers, and gene losses, and (3) generate subgene or domain family phylogenies inside one or more gene trees by allowing for subgene duplications, subgene losses, and horizontal subgene transfers within and across gene families (and either within or across species boundaries).
SaGePhy implements a range of important features not found in other software packages for phylogeny simulation. These include the ability to simulate (i) subgene or domain level events inside one or more gene families, (ii) both additive and replacing horizontal gene and subgene/domain transfers, (iii) distance-biased horizontal transfers, and (iv) gene family and subgene/domain family birth at randomly sampled nodes of the species tree or gene tree(s). SaGePhy therefore makes it possible to perform more realistic simulations of gene, domain, and subgene evolution.
The SaGePhy software package also includes a supplemental tool to create simulated gene family sequences affected by subgene or domain level evolutionary events. Further details on the functionality of SaGePhy appear in the manual.
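As a rough illustration of the kind of birth-death process used for species tree generation, the following Python sketch grows a tree forward in time, with each lineage speciating or going extinct at constant rates. It is not SaGePhy's implementation or interface: the names `Node`, `simulate_birth_death`, and `to_newick`, the parameters, and the Newick labelling scheme are all assumptions made for this example.

```python
import random


class Node:
    """One lineage (branch) of the simulated tree. Hypothetical helper class."""

    def __init__(self, name, birth_time, parent=None):
        self.name = name
        self.birth_time = birth_time  # time at which the lineage arose
        self.end_time = None          # time of speciation, extinction, or sampling
        self.parent = parent
        self.children = []
        self.extinct = False


def simulate_birth_death(birth_rate, death_rate, max_time, seed=None):
    """Simulate a constant-rate birth-death tree forward in time.

    Illustrative sketch only; not the algorithm or API used by SaGePhy.
    """
    if birth_rate < 0 or death_rate < 0 or birth_rate + death_rate <= 0:
        raise ValueError("rates must be non-negative and not both zero")
    rng = random.Random(seed)
    root = Node("n0", 0.0)
    active = [root]  # lineages alive at the current time
    counter = 1
    t = 0.0
    while active:
        # Waiting time to the next event over all living lineages.
        t += rng.expovariate(len(active) * (birth_rate + death_rate))
        if t >= max_time:
            break
        lineage = active.pop(rng.randrange(len(active)))
        lineage.end_time = t
        if rng.random() < birth_rate / (birth_rate + death_rate):
            # Speciation: the lineage is replaced by two children.
            for _ in range(2):
                child = Node(f"n{counter}", t, lineage)
                counter += 1
                lineage.children.append(child)
                active.append(child)
        else:
            # Extinction: the lineage ends and leaves no descendants.
            lineage.extinct = True
    # Lineages still alive are cut off at the end of the simulation.
    for lineage in active:
        lineage.end_time = max_time
    return root


def to_newick(node):
    """Serialize the subtree below `node` as Newick text (no trailing ';')."""
    length = node.end_time - node.birth_time
    if node.children:
        inner = ",".join(to_newick(child) for child in node.children)
        return f"({inner}){node.name}:{length:.4f}"
    label = node.name + ("_extinct" if node.extinct else "")
    return f"{label}:{length:.4f}"


if __name__ == "__main__":
    tree = simulate_birth_death(birth_rate=1.0, death_rate=0.3, max_time=3.0, seed=42)
    print(to_newick(tree) + ";")
```

A gene tree simulation follows the same pattern, except that gene lineages evolve inside the branches of a fixed species tree and the event types are duplication, transfer, and loss. The manual describes the models and options that SaGePhy actually provides.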
SaGePhy is built upon the open-source GenPhyloData framework and is itself available as open source under the New BSD licence.
- Executable: sagephy.zip (platform independent Java executable)
- Manual: sagephy-manual.pdf
- Source code: https://github.com/soumyakundu/SaGePhy (available open source under the New BSD licence)
SaGePhy can be cited as follows:
- SaGePhy: An improved phylogenetic simulation framework for gene and subgene evolution
Soumya Kundu and Mukul S. Bansal
Bioinformatics; in press.
Funding: Development of the software resource(s) available from this webpage was funded in part by U.S. National Science Foundation awards IIS 1553421, MCB 1616514, and IES 1615573.