crabML Documentation

crabML is a high-performance reimplementation of PAML’s codeml for phylogenetic maximum likelihood analysis, powered by Rust.

Features

  • Command-Line Interface: Simple crabml command with five analysis modes

    • site-model: Site-class model tests (M1a vs M2a, M7 vs M8)

    • branch-model: Branch model tests (multi-ratio, free-ratio)

    • branch-site: Branch-site model tests

    • fit: Fit single models

    • simulate: Generate synthetic sequences under evolutionary models

  • Unified Python API: Simple functions for all model types with specialized result classes

  • Site-class models: M0, M1a, M2a, M3, M4, M5, M6, M7, M8, M8a, M9

  • Branch models: Free-ratio and multi-ratio models for lineage-specific selection

  • Branch-site models: Model A (test for positive selection on specific lineages)

  • Sequence simulation: Generate test data under the M0, M1a, M2a, M7, and M8 models

  • Hypothesis testing: Complete LRT framework for detecting positive selection

  • High-performance Rust backend: 300-500x faster than NumPy, 3-10x faster than PAML

  • PAML validation: All models produce exact numerical matches with PAML output

Quick Start

Command-Line Interface:

# Site-class model tests (positive selection)
crabml site-model -s alignment.fasta -t tree.nwk --test both

# Branch model tests (lineage-specific selection)
crabml branch-model -s alignment.fasta -t labeled_tree.nwk --test multi-ratio

# Branch-site model test (site + lineage selection)
crabml branch-site -s alignment.fasta -t labeled_tree.nwk

# Fit a single model
crabml fit -m M0 -s alignment.fasta -t tree.nwk

# Simulate sequences for validation
crabml simulate m2a -t tree.nwk -o sim.fasta -l 1000 \
    --p0 0.5 --p1 0.3 --omega0 0.1 --omega2 2.5
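
The simulate example above passes only two proportions and two omega values. Under the standard M2a parameterization, the third class proportion is the remainder (p2 = 1 - p0 - p1) and the neutral class has omega fixed at 1. The sketch below works through that arithmetic for the values in the command. It assumes crabML follows this standard convention; the function name m2a_site_classes is hypothetical and not part of the crabML API.

```python
def m2a_site_classes(p0, p1, omega0, omega2):
    """Return [(proportion, omega), ...] for the three M2a site classes.

    Assumes the standard M2a parameterization: the neutral class has
    omega = 1 and the third proportion is whatever remains after p0 and p1.
    This is an illustrative helper, not a crabML function.
    """
    p2 = 1.0 - p0 - p1
    if min(p0, p1, p2) < 0.0:
        raise ValueError("p0 and p1 must be non-negative and sum to at most 1")
    return [(p0, omega0), (p1, 1.0), (p2, omega2)]


# Values taken from the simulate command above.
classes = m2a_site_classes(p0=0.5, p1=0.3, omega0=0.1, omega2=2.5)

# Mean omega across sites is the proportion-weighted average of class omegas.
mean_omega = sum(p * w for p, w in classes)

for label, (p, w) in zip(("purifying", "neutral", "positive"), classes):
    print(f"{label:>9}: proportion = {p:.2f}, omega = {w:.2f}")
print(f"mean omega = {mean_omega:.2f}")
```

With these inputs, roughly a fifth of the simulated sites fall in the positively selected class, which is useful to keep in mind when checking whether a site-model test recovers the signal.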

Python API - Fit a single model:

from crabml import optimize_model

result = optimize_model("M0", "alignment.fasta", "tree.nwk")
print(result.summary())
print(f"omega = {result.omega:.4f}")

Python API - Test for positive selection:

from crabml import positive_selection

results = positive_selection("alignment.fasta", "tree.nwk", test="both")
print(results['M1a_vs_M2a'].summary())
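
The M1a vs M2a comparison is a likelihood ratio test (LRT). The arithmetic behind such a test can be sketched independently of crabML: twice the log-likelihood difference is compared against a chi-square distribution, conventionally with 2 degrees of freedom for both M1a vs M2a and M7 vs M8. The sketch below uses only the standard library and relies on the closed form of the chi-square survival function for 2 degrees of freedom. The function name lrt_df2 and the log-likelihood values are hypothetical, and this is not necessarily how crabML computes or reports its results.

```python
import math


def lrt_df2(lnl_null, lnl_alt):
    """Likelihood ratio test against a chi-square with 2 degrees of freedom.

    Returns (statistic, p_value). For df = 2 the chi-square survival
    function has the closed form exp(-x / 2), so no SciPy is needed.
    """
    # The test statistic is twice the gain in log-likelihood. Clamp at zero
    # in case the alternative model converged to a slightly worse optimum.
    stat = max(2.0 * (lnl_alt - lnl_null), 0.0)
    p_value = math.exp(-stat / 2.0)
    return stat, p_value


# Hypothetical log-likelihoods for a null (e.g. M1a) and alternative (e.g. M2a) fit.
stat, p = lrt_df2(lnl_null=-1234.56, lnl_alt=-1228.10)
print(f"2*delta_lnL = {stat:.2f}, p = {p:.4g}")
```

A small p-value indicates that the model allowing a class of sites with omega > 1 fits significantly better than the null model.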
