format_utils
Shared format structure (NEXUS, PHYLIP). Class-specific (de)serialization lives in core/*/io.
Nexus
Shared NEXUS format structure.
Provides parse_nexus (labels plus all blocks) and write helpers (header, TAXA block, generic block). Class-specific parsing of block content stays in core/distance/io, core/sequence/io, core/split/io.
- phylozoo.utils.io.format_utils.nexus.nexus_header() → str
Return the NEXUS file header (#NEXUS and blank line).
- Returns:
The NEXUS file header.
- Return type:
str
- phylozoo.utils.io.format_utils.nexus.parse_nexus(nexus_string: str) → tuple[list[str], dict[str, str]]
Parse a NEXUS string into labels (from TAXA block) and data blocks.
- Parameters:
nexus_string (str) – Full NEXUS file content.
- Returns:
(labels, blocks). labels from TAXA block; blocks maps canonical block name (DISTANCES, CHARACTERS, SPLITS) to full block content (including trailing “;”).
- Return type:
tuple[list[str], dict[str, str]]
- Raises:
PhyloZooParseError – If no TAXA block with TAXLABELS is found.
Notes
A file may contain multiple data blocks (e.g. DISTANCES and SPLITS). Callers use only the block they need (e.g. DistanceMatrix uses “DISTANCES”).
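The contract described above (labels from the TAXA block, plus a mapping from block name to full block text) can be illustrated with a small standalone sketch. This is not phylozoo's implementation: sketch_parse_nexus, DATA_BLOCKS, and the regular expressions are hypothetical, ValueError stands in for PhyloZooParseError, and the real parser may handle comments, quoting, and block-name aliases differently.

```python
import re

# Assumption: only these three block names are collected, mirroring the
# names listed in the Returns description above.
DATA_BLOCKS = {"DISTANCES", "CHARACTERS", "SPLITS"}


def sketch_parse_nexus(nexus_string: str) -> tuple[list[str], dict[str, str]]:
    """Illustrative stand-in for parse_nexus (hypothetical, simplified)."""
    # A block runs from "BEGIN <name>;" to the next "END;" (case-insensitive).
    block_re = re.compile(
        r"\bBEGIN\s+(\w+)\s*;(.*?)\bEND\s*;", re.IGNORECASE | re.DOTALL
    )
    labels: list[str] = []
    blocks: dict[str, str] = {}
    for match in block_re.finditer(nexus_string):
        name = match.group(1).upper()
        if name == "TAXA":
            # TAXLABELS lists whitespace-separated labels up to the next ";".
            tax = re.search(
                r"TAXLABELS\s+(.*?);", match.group(2), re.IGNORECASE | re.DOTALL
            )
            if tax:
                labels = tax.group(1).split()
        elif name in DATA_BLOCKS:
            # Keep the whole block text, including the closing "END;".
            blocks[name] = match.group(0)
    if not labels:
        # Stand-in for PhyloZooParseError.
        raise ValueError("no TAXA block with TAXLABELS found")
    return labels, blocks


example = """#NEXUS

BEGIN TAXA;
    DIMENSIONS ntax=3;
    TAXLABELS A B C;
END;

BEGIN DISTANCES;
    DIMENSIONS ntax=3;
    MATRIX
    A 0 1 2
    B 1 0 3
    C 2 3 0
    ;
END;
"""

labels, blocks = sketch_parse_nexus(example)
print(labels)        # ['A', 'B', 'C']
print(list(blocks))  # ['DISTANCES']
```

A caller that only needs distances would then read blocks["DISTANCES"] and ignore any other entries, which matches the note above about files with multiple data blocks.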
Phylip
Shared PHYLIP matrix format structure.
Generic layout: first line = n, then n lines of (label, rest). Class-specific parsing (e.g. float matrix, symmetry) stays in core/distance/io.
- phylozoo.utils.io.format_utils.phylip.parse_phylip_matrix(phylip_string: str) → tuple[int, list[tuple[str, str]]]
Parse PHYLIP matrix layout into n and rows (label, rest of line).
- Parameters:
phylip_string (str) – Full PHYLIP file content.
- Returns:
(n, rows). n = number of taxa; rows = list of (label, rest) per line. Label is first 10 chars or until whitespace; rest is the remainder.
- Return type:
tuple[int, list[tuple[str, str]]]
- Raises:
PhyloZooParseError – If the string is empty or the first line is not an integer.
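The generic layout (a count line followed by label/rest rows) can be sketched as follows. This is an illustration under stated assumptions, not the library's code: sketch_parse_phylip_matrix is a hypothetical name, ValueError stands in for PhyloZooParseError, blank lines are skipped, and "first 10 chars or until whitespace" is read as "split at the first whitespace, falling back to a fixed 10-character label when the first token is longer than 10 characters".

```python
def sketch_parse_phylip_matrix(phylip_string: str) -> tuple[int, list[tuple[str, str]]]:
    """Illustrative stand-in for parse_phylip_matrix (hypothetical, simplified)."""
    # Assumption: blank lines carry no information and are ignored.
    lines = [ln.strip() for ln in phylip_string.splitlines() if ln.strip()]
    if not lines:
        raise ValueError("empty PHYLIP string")  # stand-in for PhyloZooParseError
    try:
        n = int(lines[0])
    except ValueError as exc:
        raise ValueError("first line must be an integer") from exc

    rows: list[tuple[str, str]] = []
    for line in lines[1 : n + 1]:
        parts = line.split(None, 1)
        if len(parts[0]) > 10:
            # No whitespace within the first 10 characters: fixed-width label.
            label, rest = line[:10], line[10:].strip()
        else:
            label = parts[0]
            rest = parts[1].strip() if len(parts) > 1 else ""
        rows.append((label, rest))
    return n, rows


text = "3\nA 0 1 2\nB 1 0 3\nC 2 3 0\n"
n, rows = sketch_parse_phylip_matrix(text)
print(n)     # 3
print(rows)  # [('A', '0 1 2'), ('B', '1 0 3'), ('C', '2 3 0')]
```

The rest of each row is left as an unparsed string on purpose: converting it to floats and checking symmetry is the class-specific step that the text above assigns to core/distance/io.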