format_utils

Shared structural helpers for the NEXUS and PHYLIP file formats. Class-specific (de)serialization lives in core/*/io.

Nexus

Shared NEXUS format structure.

Provides parse_nexus (taxon labels plus all data blocks) and the write helpers nexus_header, write_taxa_block, and write_block. Class-specific parsing of block content stays in core/distance/io, core/sequence/io, and core/split/io.

phylozoo.utils.io.format_utils.nexus.nexus_header() -> str

Return the NEXUS file header (#NEXUS and blank line).

Returns:

The NEXUS file header.

Return type:

str

phylozoo.utils.io.format_utils.nexus.parse_nexus(nexus_string: str) -> tuple[list[str], dict[str, str]]

Parse a NEXUS string into labels (from TAXA block) and data blocks.

Parameters:

nexus_string (str) – Full NEXUS file content.

Returns:

(labels, blocks). labels from TAXA block; blocks maps canonical block name (DISTANCES, CHARACTERS, SPLITS) to full block content (including trailing “;”).

Return type:

tuple[list[str], dict[str, str]]

Raises:

PhyloZooParseError – If no TAXA block with TAXLABELS is found.

Notes

A file may contain multiple data blocks (e.g. DISTANCES and SPLITS). Callers use only the block they need (e.g. DistanceMatrix uses “DISTANCES”).
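The (labels, blocks) return shape can be illustrated with a standalone sketch. This is not phylozoo's implementation: sketch_parse_nexus is a hypothetical stand-in that uses only the standard library, raises ValueError where the real function raises PhyloZooParseError, and assumes unquoted, whitespace-separated taxon labels.

```python
import re


def sketch_parse_nexus(nexus_string: str) -> tuple[list[str], dict[str, str]]:
    """Hypothetical stand-in for parse_nexus (assumption, not the library code)."""
    # Match each "BEGIN <name>; ... END;" span, ignoring case.
    block_re = re.compile(
        r"\bBEGIN\s+(\w+)\s*;(.*?)\bEND\s*;", re.IGNORECASE | re.DOTALL
    )
    labels = None
    blocks: dict[str, str] = {}
    for match in block_re.finditer(nexus_string):
        name = match.group(1).upper()  # canonical upper-case block name
        if name == "TAXA":
            tax = re.search(
                r"TAXLABELS\s+(.*?);", match.group(2), re.IGNORECASE | re.DOTALL
            )
            if tax:
                labels = tax.group(1).split()
        else:
            # Keep the full block text, including the trailing ";".
            blocks[name] = match.group(0)
    if labels is None:
        raise ValueError("no TAXA block with TAXLABELS found")
    return labels, blocks


example = """#NEXUS

BEGIN TAXA;
  DIMENSIONS ntax=3;
  TAXLABELS A B C;
END;

BEGIN DISTANCES;
  MATRIX
    A 0 1 2
    B 1 0 3
    C 2 3 0
  ;
END;
"""

labels, blocks = sketch_parse_nexus(example)
distances_block = blocks["DISTANCES"]  # a caller picks only the block it needs
```

A caller such as a distance-matrix reader would then parse distances_block itself; the shared helper stops at the block boundary.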

phylozoo.utils.io.format_utils.nexus.write_block(block_name: str, body: str) -> str

Build a NEXUS data block (BEGIN name; body END;).

Parameters:
  • block_name (str) – Block name (e.g. DISTANCES, CHARACTERS, SPLITS).

  • body (str) – Block body (commands and MATRIX etc.); need not include trailing “;”.

Returns:

Full block including BEGIN/END.

Return type:

str
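As a rough illustration of the BEGIN/END wrapping, here is a minimal sketch. sketch_write_block is a hypothetical name, and the exact whitespace and the rule of appending ";" only when the body lacks one are assumptions; the library's output may differ.

```python
def sketch_write_block(block_name: str, body: str) -> str:
    """Hypothetical stand-in for write_block (assumption, not the library code)."""
    body = body.strip()
    # The body "need not include trailing ';'", so add one only if it is missing.
    if not body.endswith(";"):
        body += ";"
    return f"BEGIN {block_name};\n{body}\nEND;\n"


block = sketch_write_block("DISTANCES", "MATRIX\nA 0 1\nB 1 0")
```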

phylozoo.utils.io.format_utils.nexus.write_taxa_block(labels: list[str]) -> str

Build the TAXA block string (BEGIN TAXA; DIMENSIONS ntax=N; TAXLABELS … ; END;).

Parameters:

labels (list[str]) – Taxon labels.

Returns:

Full TAXA block including BEGIN/END.

Return type:

str
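The header and TAXA block are typically the first two pieces of a NEXUS file. The sketch below shows how they might be assembled; sketch_nexus_header and sketch_write_taxa_block are hypothetical stand-ins, and the indentation and line breaks are assumptions rather than the library's exact output.

```python
def sketch_nexus_header() -> str:
    """Hypothetical stand-in for nexus_header: '#NEXUS' plus a blank line."""
    return "#NEXUS\n\n"


def sketch_write_taxa_block(labels: list[str]) -> str:
    """Hypothetical stand-in for write_taxa_block (assumed layout)."""
    taxlabels = " ".join(labels)
    return (
        "BEGIN TAXA;\n"
        f"    DIMENSIONS ntax={len(labels)};\n"
        f"    TAXLABELS {taxlabels};\n"
        "END;\n"
    )


document = sketch_nexus_header() + sketch_write_taxa_block(["A", "B", "C"])
```

Data blocks (DISTANCES, CHARACTERS, SPLITS) would be appended after this prefix by the class-specific writers.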

Phylip

Shared PHYLIP matrix format structure.

Generic layout: the first line holds the number of taxa n, followed by n lines, each made up of a label and the rest of the line. Class-specific parsing (e.g. float matrix, symmetry) stays in core/distance/io.

phylozoo.utils.io.format_utils.phylip.parse_phylip_matrix(phylip_string: str) -> tuple[int, list[tuple[str, str]]]

Parse PHYLIP matrix layout into n and rows (label, rest of line).

Parameters:

phylip_string (str) – Full PHYLIP file content.

Returns:

(n, rows). n = number of taxa; rows = list of (label, rest) per line. Label is first 10 chars or until whitespace; rest is the remainder.

Return type:

tuple[int, list[tuple[str, str]]]

Raises:

PhyloZooParseError – If string is empty or first line is not an integer.
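The label rule ("first 10 chars or until whitespace") is easiest to see in code. The sketch below is a hypothetical stand-in, not the library's implementation: it reads the rule as "whichever comes first", skips blank lines, and raises ValueError in place of PhyloZooParseError. All three choices are assumptions.

```python
def sketch_parse_phylip_matrix(phylip_string: str) -> tuple[int, list[tuple[str, str]]]:
    """Hypothetical stand-in for parse_phylip_matrix (assumption, not library code)."""
    lines = [line for line in phylip_string.splitlines() if line.strip()]
    if not lines:
        raise ValueError("empty PHYLIP string")
    try:
        n = int(lines[0].strip())
    except ValueError as exc:
        raise ValueError("first line must be an integer") from exc

    rows = []
    for line in lines[1 : n + 1]:
        stripped = line.strip()
        # Label: up to the first whitespace, capped at 10 characters.
        label = stripped.split(None, 1)[0][:10]
        # Rest: everything after the label, left unparsed for the caller.
        rest = stripped[len(label):].strip()
        rows.append((label, rest))
    return n, rows


example = (
    "3\n"
    "Alpha     0.0 1.0 2.0\n"
    "Beta      1.0 0.0 3.0\n"
    "Gamma     2.0 3.0 0.0\n"
)
n, rows = sketch_parse_phylip_matrix(example)
```

The rest strings stay as text; converting them to floats and checking symmetry is left to the class-specific code in core/distance/io.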

phylozoo.utils.io.format_utils.phylip.write_phylip_matrix(n: int, rows: list[tuple[str, str]]) -> str

Build PHYLIP matrix string (first line n, then n lines of label + rest).

Parameters:
  • n (int) – Number of taxa.

  • rows (list[tuple[str, str]]) – Each element is (label, rest) for one row. Label is padded to 10 chars.

Returns:

Full PHYLIP matrix content (no trailing newline required by callers).

Return type:

str
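A minimal sketch of the writing side, assuming labels are left-justified and space-padded to 10 characters and the result carries no trailing newline. sketch_write_phylip_matrix is a hypothetical name; it does not truncate labels longer than 10 characters or check that len(rows) equals n, and the library may handle both cases differently.

```python
def sketch_write_phylip_matrix(n: int, rows: list[tuple[str, str]]) -> str:
    """Hypothetical stand-in for write_phylip_matrix (assumption, not library code)."""
    lines = [str(n)]
    for label, rest in rows:
        # Pad the label to a 10-character column, then append the row content.
        lines.append(f"{label:<10}{rest}")
    return "\n".join(lines)


text = sketch_write_phylip_matrix(2, [("A", "0.0 1.0"), ("B", "1.0 0.0")])
```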