ModelPhy is a language for interchange of phylogenetic models between different software packages. It provides a clear, concise syntax for defining probabilistic models of sequence evolution, tree priors, and observed data.
Phylogenetic analyses often require setting up complex models that include multiple components:
- Substitution models (e.g., JC69, HKY, GTR)
- Rate heterogeneity models (e.g., Gamma, invariant sites)
- Tree priors (e.g., Yule, Birth-Death)
- Clock models (e.g., strict clock, relaxed clock)
Currently, each software package (MrBayes, BEAST, RevBayes, etc.) has its own syntax for specifying these models, making it difficult to share models between researchers and platforms. ModelPhy aims to solve this problem by providing a standard language that can be translated to and from different software packages.
- Human-readable syntax for defining phylogenetic models
- Mathematical types corresponding to phylogenetic concepts
- Stochastic assignments for random variables
- Deterministic assignments for derived quantities
- Support for common probability distributions
- Observations for tying models to data
// Define transition transversion ratio prior
Real kappa ~ LogNormal(meanlog=1.0, sdlog=0.5);
// Define nucleotide frequency prior
Simplex pi ~ Dirichlet(alpha=[1.0, 1.0, 1.0, 1.0]);
// Create HKY substitution model
QMatrix subst_model = HKY(kappa=kappa, baseFrequencies=pi);
// Define birth rate and create Yule tree prior
Real birth_rate ~ Exponential(mean=0.1);
TimeTree phylogeny ~ Yule(birthrate=birth_rate, n=3);
// Create phylogenetic CTMC model
Alignment seq ~ PhyloCTMC(tree=phylogeny, Q=subst_model);
// Attach observed sequence data
seq observe [
human = Sequence(str="ACGTACGTACGTACGTACGTACGT"),
chimp = Sequence(str="ACGTACGTACGTACGTATGTACGT"),
gorilla = Sequence(str="ACGTACGTACGCACGTACGTACGT")
];
/spec: Language specification and documentation/grammar: ANTLR grammar files/java: Java implementation of ModelPhy parser/cpp: C++ implementation of ModelPhy parser/examples: Example ModelPhy files/converters: Converters to/from other formats (BEAST XML, RevBayes scripts, etc.)
- Java 11 or higher
- ANTLR 4.9 or higher
- CMake 3.10 or higher (for C++ implementation)
# Clone the repository
git clone https://github.com/yourusername/modelphy.git
cd modelphy
# Build Java implementation
cd java
./gradlew build
# Build C++ implementation
cd ../cpp
mkdir build && cd build
cmake ..
make# Parse and validate a ModelPhy file
java -jar modelphy.jar validate example.mph
# Convert ModelPhy to BEAST XML
java -jar modelphy.jar convert --to beast example.mph > example.xml
# Convert ModelPhy to RevBayes script
java -jar modelphy.jar convert --to revbayes example.mph > example.revThe full language specification is available in SPECIFICATION.md.
- Complete ANTLR grammar
- Java reference implementation
- C++ implementation
- Converter to/from BEAST XML
- Converter to/from RevBayes
- Support for more complex models (partitioned analyses, relaxed clocks)
- Web-based model builder and validator
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
- This project was inspired by the need for better interoperability between phylogenetic software packages
- Thanks to the developers of BEAST, MrBayes, RevBayes, and other phylogenetic software for their pioneering work