Input/Output
Classes for reading and writing sequence alignments and phylogenetic trees.
Sequence Alignment
Alignment
- class crabml.io.sequences.Alignment(names, sequences, n_species, n_sites, seqtype)
Multiple sequence alignment.
Attributes
- names : list[str]
Sequence names/labels
- sequences : ndarray, shape (n_species, n_sites)
Encoded sequences as integer arrays
- n_species : int
Number of sequences
- n_sites : int
Number of sites (alignment length)
- seqtype : str
Sequence type: ‘codon’, ‘aa’, or ‘dna’
- classmethod from_phylip(filepath, seqtype='codon')
Parse PHYLIP format alignment file.
Custom parser for the sequential PHYLIP format used by PAML. The first line contains n_sequences and sequence_length. Each sequence record then starts with a name line, followed by the sequence data.
- Return type:
Alignment
Parameters
- filepath : Path or str
Path to PHYLIP format file
- seqtype : str
Sequence type: ‘codon’, ‘aa’, or ‘dna’
Returns
- Alignment
Parsed alignment
Examples
>>> aln = Alignment.from_phylip("lysozyme.txt", seqtype='codon')
>>> aln.n_species
7
>>> aln.n_sites
130
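The sequential layout described for from_phylip can be illustrated with a small standalone sketch. This is not crabml's parser: the function name `parse_sequential_phylip` is hypothetical, it returns plain strings rather than an encoded Alignment, and it assumes that sequence data may wrap over several lines.

```python
from io import StringIO


def parse_sequential_phylip(handle):
    """Illustrative reader: header line, then a name line followed by sequence lines."""
    # Drop blank lines so spacing between records does not matter.
    lines = [line.strip() for line in handle if line.strip()]
    n_sequences, sequence_length = (int(tok) for tok in lines[0].split()[:2])

    names, sequences = [], []
    i = 1
    for _ in range(n_sequences):
        names.append(lines[i])  # name line
        i += 1
        chunks = []
        # Assumption: data may wrap, so read until the declared length is reached.
        while sum(len(c) for c in chunks) < sequence_length:
            chunks.append(lines[i].replace(" ", ""))
            i += 1
        sequences.append("".join(chunks))
    return names, sequences


example = StringIO(
    "2 6\n"
    "human\n"
    "ATGAAA\n"
    "mouse\n"
    "ATG\n"
    "AAG\n"
)
names, sequences = parse_sequential_phylip(example)
print(names)      # ['human', 'mouse']
print(sequences)  # ['ATGAAA', 'ATGAAG']
```

The sketch counts the declared length in characters and does no validation beyond that; a malformed file would surface as an IndexError or ValueError.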
- classmethod from_fasta(filepath, seqtype='codon')
Parse FASTA format alignment file.
- Return type:
Alignment
Parameters
- filepath : Path or str
Path to FASTA format file
- seqtype : str
Sequence type: ‘codon’, ‘aa’, or ‘dna’
Returns
- Alignment
Parsed alignment
Examples
>>> aln = Alignment.from_fasta("alignment.fasta", seqtype='codon')
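For readers unfamiliar with the FASTA layout, the following standalone sketch shows one way such a file could be read into names and sequences. It is an illustration only, not crabml's implementation: `parse_fasta` is a hypothetical helper, and keeping the full header text as the name is an assumption.

```python
from io import StringIO


def parse_fasta(handle):
    """Illustrative reader: '>' starts a record, other lines extend its sequence."""
    names, chunks = [], []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            names.append(line[1:].strip())  # assumption: whole header is the name
            chunks.append([])
        elif chunks:
            chunks[-1].append(line)
    sequences = ["".join(parts) for parts in chunks]
    # An alignment needs every row to have the same length.
    if len({len(s) for s in sequences}) > 1:
        raise ValueError("sequences differ in length; not an alignment")
    return names, sequences


fasta_names, fasta_sequences = parse_fasta(
    StringIO(">human\nATG\nAAA\n\n>mouse\nATGAAG\n")
)
print(fasta_names)      # ['human', 'mouse']
print(fasta_sequences)  # ['ATGAAA', 'ATGAAG']
```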
- to_phylip(filepath)
Write alignment to PHYLIP format file.
Parameters
- filepath : Path or str
Output file path
- to_fasta(filepath)
Write alignment to FASTA format file.
Parameters
- filepath : Path or str
Output file path
- __init__(names, sequences, n_species, n_sites, seqtype)
Phylogenetic Trees
Tree
- class crabml.io.trees.Tree(root, n_nodes, n_leaves, leaf_names)
Phylogenetic tree.
Attributes
- root : TreeNode
Root node of the tree
- n_nodes : int
Total number of nodes
- n_leaves : int
Number of leaf nodes
- leaf_names : list[str]
Names of leaf nodes
- root: TreeNode
- classmethod from_newick(newick_string)
Parse Newick format tree string.
- Return type:
Tree
Parameters
- newick_string : str
Newick format tree
Returns
- Tree
Parsed tree
- postorder()
Return nodes in post-order traversal (leaves to root).
- Return type:
list[TreeNode]
Returns
- list[TreeNode]
Nodes in post-order
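A post-order traversal visits every child before its parent, which is why the result runs from leaves to root. The sketch below shows the idea on a stand-in node type; the `Node` class and its `children` field are assumptions made for illustration and are not crabml's TreeNode.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical stand-in for a tree node; not crabml's TreeNode.
    name: str
    children: list["Node"] = field(default_factory=list)


def postorder(root):
    """Return nodes so that every child appears before its parent."""
    order = []
    stack = [(root, False)]  # (node, children already pushed?)
    while stack:
        node, expanded = stack.pop()
        if expanded:
            order.append(node)
        else:
            stack.append((node, True))
            # Push children in reverse so the leftmost child is visited first.
            for child in reversed(node.children):
                stack.append((child, False))
    return order


# Tree ((A,B)AB,C)root
tree = Node("root", [Node("AB", [Node("A"), Node("B")]), Node("C")])
visit_order = [n.name for n in postorder(tree)]
print(visit_order)  # ['A', 'B', 'AB', 'C', 'root']
```

An explicit stack is used instead of recursion so that very deep trees do not hit Python's recursion limit.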
- get_branches()
Get all branches as (parent, child) pairs.
Returns
- list[tuple[TreeNode, TreeNode]]
List of (parent, child) tuples for each branch
- get_branch_labels()
Get integer branch labels for branch-site models.
Converts string labels like ‘#0’, ‘#1’ to integers. Branches without labels are assigned 0 (background).
Returns
- list[int]
Branch labels as integers (0=background, 1=foreground, etc.)
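The label conversion described for get_branch_labels can be sketched as a small standalone function. `label_to_int` is a hypothetical helper, and representing an unlabelled branch as `None` or an empty string is an assumption; crabml may store labels differently.

```python
def label_to_int(label):
    """Map a label such as '#1' to 1; treat a missing label as background (0)."""
    if not label:
        return 0
    return int(label.lstrip("#"))


# Assumption: unlabelled branches appear as None or an empty string.
branch_labels = [label_to_int(x) for x in ["#1", None, "#0", ""]]
print(branch_labels)  # [1, 0, 0, 0]
```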
- validate_branch_site_labels()
Validate branch labels for branch-site models.
Branch-site models (Model A, A1) require exactly 2 label types:
- 0 (background)
- 1 (foreground)
Raises
- ValueError
If labels are not valid for branch-site models
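One plausible reading of the rule for validate_branch_site_labels is that the set of labels found on the tree must be exactly {0, 1}. The sketch below implements that reading on a plain list of integers; `check_branch_site_labels` is a hypothetical name, and the exact condition and error message used by crabml are not shown in this reference.

```python
def check_branch_site_labels(labels):
    """Raise ValueError unless both background (0) and foreground (1) occur, and nothing else."""
    found = set(labels)
    if found != {0, 1}:
        raise ValueError(
            f"branch-site models need labels 0 and 1 only, found {sorted(found)}"
        )


check_branch_site_labels([0, 1, 0, 0])  # passes silently

try:
    check_branch_site_labels([0, 0, 0])  # no foreground branch marked
except ValueError as exc:
    message = str(exc)
    print(message)
```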
- to_newick()
Convert tree to Newick format string.
- Return type:
str
Returns
- str
Tree in Newick format
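To make the Newick output concrete, here is a minimal serialization sketch over nested `(name, children)` tuples. The tuple representation and the `newick` function are assumptions for illustration; branch lengths and branch labels are left out, and crabml's to_newick may format its output differently.

```python
def newick(node):
    """Serialize a (name, children) tuple; internal nodes wrap their children in parentheses."""
    name, children = node
    if not children:
        return name
    return "(" + ",".join(newick(child) for child in children) + ")" + name


# Unnamed internal nodes use an empty string as their name.
example_tree = ("", [("", [("A", []), ("B", [])]), ("C", [])])
newick_string = newick(example_tree) + ";"  # a Newick string ends with a semicolon
print(newick_string)  # ((A,B),C);
```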
- __init__(root, n_nodes, n_leaves, leaf_names)