split
Split module.
This module provides classes for working with phylogenetic splits and split systems. A split is a bipartition of a set of taxa, representing a division of the taxa into two non-empty subsets. Split systems are collections of splits that can represent phylogenetic trees or networks. The public API (Split, SplitSystem, WeightedSplitSystem, to_weightedsplitsystem, and the algorithms, classifications, and io submodules) is re-exported here; the implementation is split across the base, splitsystem, weighted_splitsystem, algorithms, classifications, and io submodules.
Main Classes
Splits module.
This module provides classes for working with phylogenetic splits. A split is a 2-partition {A, B} of a set of elements where A ∪ B equals the full set and A ∩ B = ∅.
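To make the definition concrete, here is a minimal standalone sketch that does not use phylozoo. It models a split as an unordered pair of disjoint frozensets; the helper name `make_split` is hypothetical, and the sketch only mirrors the overlap condition stated above, not the library's implementation.

```python
def make_split(set1, set2):
    """Build a split {A, B} as a frozenset of two frozensets (hypothetical helper)."""
    a, b = frozenset(set1), frozenset(set2)
    if a & b:
        # A ∩ B must be empty for a valid 2-partition.
        raise ValueError(f"sides overlap on {sorted(a & b)}")
    return frozenset({a, b})


split = make_split({1, 2}, {3, 4})
elements = frozenset().union(*split)  # A ∪ B, the full element set
print(sorted(elements))  # [1, 2, 3, 4]
```

Storing the two sides in a frozenset makes the pair unordered, so `make_split({1, 2}, {3, 4})` and `make_split({3, 4}, {1, 2})` compare equal.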
- class phylozoo.core.split.base.Split(set1: set[T], set2: set[T])[source]#
Bases: Partition
Class for 2-partitions of sets, a child class of the general Partition class.
A split is a 2-partition of a set of elements. It takes as input two sets of elements that form the split.
- Parameters:
- Raises:
PhyloZooValueError – If the sets overlap (i.e., the split is invalid).
Examples
>>> split = Split({1, 2}, {3, 4})
>>> split.is_trivial()
False
>>> split.elements
{1, 2, 3, 4}
>>> split2 = Split({1}, {2, 3, 4})
>>> split2.is_trivial()
True
- elements#
Set containing all elements from both sides of the split (inherited from Partition).
- Type:
Split systems module.
This module provides classes for working with split systems. A split system is a collection of splits where each split covers the complete set of elements. Weighted split systems assign positive weights to each split.
- class phylozoo.core.split.splitsystem.SplitSystem(splits: set[Split] | list[Split] | None = None)[source]#
Bases: IOMixin
Class for a split system: set of full splits (complete partitions of elements).
A split system is a collection of splits where each split covers the complete set of elements. This class validates that all splits cover the same element set and provides methods for working with split systems.
- Parameters:
splits (set[Split] | list[Split], optional) – Set or list of splits. If a list is provided, it will be converted to a set to ensure uniqueness. By default None (empty set).
- Raises:
PhyloZooValueError – If not all splits cover the complete set of elements (i.e., not a set of full splits).
Notes
Supported I/O formats:
nexus (default): .nexus, .nex, .nxs
Examples
>>> split1 = Split({1, 2}, {3, 4})
>>> split2 = Split({1, 3}, {2, 4})
>>> system = SplitSystem([split1, split2])
>>> len(system)
2
>>> system.elements == {1, 2, 3, 4}
True
>>> split1 in system
True
- __iter__() Iterator[Split][source]#
Return an iterator over the splits.
- Returns:
Iterator over splits.
- Return type:
Iterator[Split]
- __len__() int[source]#
Return the number of splits in the system.
- Returns:
Number of splits.
- Return type:
- __repr__() str[source]#
Return string representation of the split system.
- Returns:
String representation.
- Return type:
- __setattr__(name: str, value: any) None[source]#
Prevent modification of attributes after initialization.
- Raises:
AttributeError – If attempting to modify any attribute after initialization.
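One common way to get this behaviour in Python is sketched below. This is an assumption about the general pattern, not the library's actual code: the class name `FrozenSystem` and the `_frozen` flag are hypothetical.

```python
class FrozenSystem:
    """Toy container that becomes read-only once __init__ has finished."""

    def __init__(self, splits):
        # object.__setattr__ bypasses the guard below during initialization.
        object.__setattr__(self, "splits", frozenset(splits))
        object.__setattr__(self, "_frozen", True)

    def __setattr__(self, name, value):
        if getattr(self, "_frozen", False):
            raise AttributeError(f"cannot modify {name!r} after initialization")
        object.__setattr__(self, name, value)


system = FrozenSystem(["12|34", "13|24"])
try:
    system.splits = frozenset()
except AttributeError as err:
    message = str(err)  # "cannot modify 'splits' after initialization"
```

Immutability of this kind is what makes it safe to hash such objects or share them between data structures.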
- __str__() str[source]#
Return human-readable string representation of the split system.
Displays the split system showing all splits, one per line. No truncation is applied.
- Returns:
Human-readable string representation.
- Return type:
Examples
>>> split1 = Split({1, 2}, {3, 4})
>>> split2 = Split({1, 3}, {2, 4})
>>> system = SplitSystem([split1, split2])
>>> str(system)
'SplitSystem({\n Split(1 2 | 3 4),\n Split(1 3 | 2 4)\n})'
Weighted split systems module.
This module provides classes for working with weighted split systems. A weighted split system assigns positive weights to each split in the system.
- class phylozoo.core.split.weighted_splitsystem.WeightedSplitSystem(splits: set[Split] | list[Split] | dict[Split, float] | list[tuple[Split, float]] | None = None)[source]#
Bases: SplitSystem
Class for a weighted split system: set of full splits with positive weights.
A weighted split system is a function that maps each possible split on a set of elements to a weight. This implementation inherits from SplitSystem and only stores splits with positive weights. Zero-weight splits are not allowed.
- Parameters:
splits (set[Split] | list[Split] | dict[Split, float] | list[tuple[Split, float]] | None, optional) –
Input splits with weights. Can be:
A set or list of splits (each assigned weight 1.0)
A dictionary mapping splits to their weights
A list of (split, weight) tuples
By default None (empty system).
- Raises:
PhyloZooValueError – If not all splits cover the complete set of elements, if any weight is not positive (zero or negative), if duplicate splits are found, or if split elements don’t match system elements.
Notes
Supported I/O formats:
nexus (default): .nexus, .nex, .nxs
Examples
>>> # From list of splits (weight 1.0 each)
>>> split1 = Split({1, 2}, {3, 4})
>>> split2 = Split({1, 3}, {2, 4})
>>> system = WeightedSplitSystem([split1, split2])
>>> system.get_weight(split1)
1.0
>>> system.get_weight(split2)
1.0
>>> # From dictionary with weights
>>> weights = {split1: 2.5, split2: 1.0}
>>> system = WeightedSplitSystem(weights)
>>> system.get_weight(split1)
2.5
>>> system.total_weight
3.5
>>> # From list of tuples
>>> system = WeightedSplitSystem([(split1, 0.8), (split2, 0.2)])
>>> system.get_weight(split1)
0.8
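The accepted input shapes can all be reduced to a single split-to-weight mapping. The sketch below illustrates that idea without using phylozoo; the function `normalize_weights` is hypothetical, and it assumes splits are hashable objects that are not themselves tuples.

```python
def normalize_weights(splits=None):
    """Turn any of the accepted input shapes into a {split: weight} dict."""
    if splits is None:
        return {}
    if isinstance(splits, dict):
        pairs = list(splits.items())
    else:
        # Bare splits get weight 1.0; (split, weight) tuples keep their weight.
        pairs = [item if isinstance(item, tuple) else (item, 1.0) for item in splits]
    weights = {}
    for split, weight in pairs:
        if weight <= 0:
            raise ValueError(f"weight must be positive, got {weight}")
        if split in weights:
            raise ValueError("duplicate split")
        weights[split] = float(weight)
    return weights


s1 = frozenset({frozenset({1, 2}), frozenset({3, 4})})
s2 = frozenset({frozenset({1, 3}), frozenset({2, 4})})
from_list = normalize_weights([s1, s2])
from_dict = normalize_weights({s1: 2.5, s2: 1.0})
from_pairs = normalize_weights([(s1, 0.8), (s2, 0.2)])
total = sum(from_dict.values())  # 3.5
```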
- __repr__() str[source]#
Return string representation of the weighted split system.
- Returns:
String representation that can be used to recreate the object.
- Return type:
- __str__() str[source]#
Return human-readable string representation of the weighted split system.
Displays the weighted split system showing all splits with their weights, one per line. No truncation is applied.
- Returns:
Human-readable string representation.
- Return type:
Examples
>>> split1 = Split({1, 2}, {3, 4})
>>> split2 = Split({1, 3}, {2, 4})
>>> system = WeightedSplitSystem({split1: 2.5, split2: 1.0})
>>> str(system)
'WeightedSplitSystem({\n Split(1 2 | 3 4): 2.5,\n Split(1 3 | 2 4): 1.0\n})'
- get_weight(split: Split) float[source]#
Get the weight of a split.
Returns 0.0 if the split is not in the system (i.e., has no weight assigned).
- Parameters:
split (Split) – Split to get the weight for.
- Returns:
Weight of the split, or 0.0 if the split is not in the system.
- Return type:
- Raises:
PhyloZooValueError – If the split does not cover the same elements as the split system.
Examples
>>> split1 = Split({1, 2}, {3, 4})
>>> split2 = Split({1, 3}, {2, 4})
>>> system = WeightedSplitSystem({split1: 2.5, split2: 1.0})
>>> system.get_weight(split1)
2.5
>>> system.get_weight(Split({1, 4}, {2, 3}))  # Split not in system
0.0
- phylozoo.core.split.weighted_splitsystem.to_weightedsplitsystem(system: SplitSystem, default_weight: float = 1.0) WeightedSplitSystem[source]#
Convert a SplitSystem to a WeightedSplitSystem.
Assigns the same weight (default_weight) to each split in the system.
- Parameters:
system (SplitSystem) – The split system to convert.
default_weight (float, optional) – The weight to assign to each split. Must be positive. By default 1.0.
- Returns:
A weighted split system with all splits having the specified weight.
- Return type:
- Raises:
PhyloZooValueError – If default_weight is not positive.
Examples
>>> split1 = Split({1, 2}, {3, 4})
>>> split2 = Split({1, 3}, {2, 4})
>>> system = SplitSystem([split1, split2])
>>> weighted = to_weightedsplitsystem(system, default_weight=2.0)
>>> isinstance(weighted, WeightedSplitSystem)
True
>>> weighted.get_weight(split1)
2.0
>>> weighted.get_weight(split2)
2.0
Classification Functions
Split system classifications module.
This module provides functions for classifying split systems, such as checking pairwise compatibility.
- phylozoo.core.split.classifications.has_all_trivial_splits(system: SplitSystem) bool[source]#
Check if the split system contains all trivial splits.
For a split system with n elements, there should be n trivial splits, where each trivial split has one element in one set and all other n-1 elements in the other set.
- Parameters:
system (SplitSystem) – The split system to check.
- Returns:
True if all trivial splits are present, False otherwise.
- Return type:
Examples
>>> from phylozoo.core.split import Split, SplitSystem
>>> # System with 3 elements should have 3 trivial splits
>>> split1 = Split({1}, {2, 3})
>>> split2 = Split({2}, {1, 3})
>>> split3 = Split({3}, {1, 2})
>>> system = SplitSystem([split1, split2, split3])
>>> has_all_trivial_splits(system)
True
>>> # Missing one trivial split
>>> system2 = SplitSystem([split1, split2])
>>> has_all_trivial_splits(system2)
False
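The check itself amounts to a membership test per element. A standalone sketch follows, with splits modelled as frozensets of two frozensets; the names `split` and `contains_all_trivial_splits` are hypothetical and the code does not depend on phylozoo.

```python
def split(a, b):
    """Unordered pair of sides (hypothetical stand-in for a Split object)."""
    return frozenset({frozenset(a), frozenset(b)})


def contains_all_trivial_splits(splits, elements):
    """True if x | (elements - x) is present for every element x."""
    elements = frozenset(elements)
    return all(split({x}, elements - {x}) in splits for x in elements)


trivial = {split({1}, {2, 3}), split({2}, {1, 3}), split({3}, {1, 2})}
print(contains_all_trivial_splits(trivial, {1, 2, 3}))  # True
print(contains_all_trivial_splits({split({1}, {2, 3})}, {1, 2, 3}))  # False
```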
- phylozoo.core.split.classifications.is_compatible(split1: Split, split2: Split) bool[source]#
Check if two splits are compatible.
Two splits are compatible if they have the same set of elements, and one of the sets of one split is a subset of one of the sets of the other split (and hence the other set is a superset).
- Parameters:
- Returns:
True if the splits are compatible, False otherwise.
- Return type:
- Raises:
PhyloZooValueError – If either argument is not a Split instance.
Examples
>>> split1 = Split({1, 2}, {3, 4})
>>> split2 = Split({1}, {2, 3, 4})
>>> is_compatible(split1, split2)
True
>>> split3 = Split({1, 2, 3}, {4})
>>> is_compatible(split1, split3)
True
>>> split4 = Split({1, 3}, {2, 4})
>>> is_compatible(split1, split4)
False
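The subset condition in this definition can be written out directly. The sketch below is standalone and does not call phylozoo; `split` and `compatible` are hypothetical names, with a split modelled as a frozenset of two frozensets.

```python
def split(a, b):
    """Unordered pair of sides (hypothetical stand-in for a Split object)."""
    return frozenset({frozenset(a), frozenset(b)})


def compatible(s1, s2):
    """Same element set, and some side of one split is nested in a side of the other."""
    a, b = s1
    c, d = s2
    if a | b != c | d:
        return False
    return any(x <= y or y <= x for x in (a, b) for y in (c, d))


base = split({1, 2}, {3, 4})
print(compatible(base, split({1}, {2, 3, 4})))  # True
print(compatible(base, split({1, 3}, {2, 4})))  # False
```

On a shared element set this is equivalent to requiring that at least one of the four intersections A∩C, A∩D, B∩C, B∩D is empty.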
- phylozoo.core.split.classifications.is_pairwise_compatible(system: SplitSystem) bool[source]#
Check if all pairs of splits in the system are compatible.
A split system is pairwise compatible if every pair of splits in the system is compatible with each other.
- Parameters:
system (SplitSystem) – The split system to check.
- Returns:
True if all pairs of splits are compatible, False otherwise.
- Return type:
Examples
>>> from phylozoo.core.split import Split, SplitSystem
>>> split1 = Split({1, 2}, {3, 4})
>>> split2 = Split({1}, {2, 3, 4})
>>> split3 = Split({1, 2, 3}, {4})
>>> system = SplitSystem([split1, split2, split3])
>>> is_pairwise_compatible(system)
True
>>> split4 = Split({1, 3}, {2, 4})
>>> system2 = SplitSystem([split1, split4])
>>> is_pairwise_compatible(system2)
False
- phylozoo.core.split.classifications.is_subsplit(split1: Split, split2: Split) bool[source]#
Check if one split is a subsplit of another split.
A split is a subsplit of another if one of its sides is a subset of one side of the other split, and the other side of this split is a subset of the other side of the other split. For example, 12|56 is a subsplit of 123|456.
- Parameters:
- Returns:
True if split1 is a subsplit of split2, False otherwise.
- Return type:
- Raises:
PhyloZooValueError – If either argument is not a Split instance.
Examples
>>> split1 = Split({1, 2, 6}, {3, 4, 5})
>>> split2 = Split({1, 2}, {3, 4})
>>> is_subsplit(split2, split1)
True
>>> split3 = Split({1, 3}, {2, 4})
>>> is_subsplit(split3, split1)
False
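As a standalone illustration of the subsplit condition (not the library's code; `split` and `is_subsplit_sketch` are hypothetical names), the two possible side-to-side pairings can be tested explicitly:

```python
def split(a, b):
    """Unordered pair of sides (hypothetical stand-in for a Split object)."""
    return frozenset({frozenset(a), frozenset(b)})


def is_subsplit_sketch(small, big):
    """Each side of `small` must sit inside a different side of `big`."""
    a, b = small
    c, d = big
    return (a <= c and b <= d) or (a <= d and b <= c)


big = split({1, 2, 6}, {3, 4, 5})
print(is_subsplit_sketch(split({1, 2}, {3, 4}), big))  # True
print(is_subsplit_sketch(split({1, 3}, {2, 4}), big))  # False
```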
- phylozoo.core.split.classifications.is_tree_compatible(system: SplitSystem) bool[source]#
Check if a split system is compatible with a tree.
A split system is tree-compatible if:
1. All pairs of splits are compatible (pairwise compatible)
2. All trivial splits are present in the system
- Parameters:
system (SplitSystem) – The split system to check.
- Returns:
True if the system is compatible with a tree, False otherwise.
- Return type:
Examples
>>> from phylozoo.core.split import Split, SplitSystem
>>> # Tree-compatible system
>>> split1 = Split({1}, {2, 3, 4})
>>> split2 = Split({2}, {1, 3, 4})
>>> split3 = Split({3}, {1, 2, 4})
>>> split4 = Split({4}, {1, 2, 3})
>>> split5 = Split({1, 2}, {3, 4})
>>> system = SplitSystem([split1, split2, split3, split4, split5])
>>> is_tree_compatible(system)
True
>>> # Incompatible system (splits conflict)
>>> split6 = Split({1, 3}, {2, 4})
>>> system2 = SplitSystem([split1, split2, split3, split4, split5, split6])
>>> is_tree_compatible(system2)
False
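The two conditions combine into a short check. The following is a standalone sketch under the same modelling assumption as before (splits as frozensets of two frozensets); `split`, `compatible`, and `tree_compatible` are hypothetical helpers, not phylozoo functions.

```python
from itertools import combinations


def split(a, b):
    """Unordered pair of sides (hypothetical stand-in for a Split object)."""
    return frozenset({frozenset(a), frozenset(b)})


def compatible(s1, s2):
    """Some side of one split is nested in a side of the other."""
    return any(x <= y or y <= x for x in s1 for y in s2)


def tree_compatible(splits, elements):
    """Condition 1: pairwise compatibility. Condition 2: all trivial splits present."""
    elements = frozenset(elements)
    trivial = {split({x}, elements - {x}) for x in elements}
    pairwise = all(compatible(s, t) for s, t in combinations(splits, 2))
    return pairwise and trivial <= set(splits)


taxa = {1, 2, 3, 4}
tree_like = [split({x}, taxa - {x}) for x in taxa] + [split({1, 2}, {3, 4})]
print(tree_compatible(tree_like, taxa))  # True
print(tree_compatible(tree_like + [split({1, 3}, {2, 4})], taxa))  # False
```

This sketch assumes all splits already cover the same element set, so `compatible` skips that check.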
Algorithms
Split system algorithms module.
This module provides algorithms for working with split systems, including conversion to phylogenetic networks and computation of distance matrices.
- phylozoo.core.split.algorithms.distances_from_splitsystem(system: SplitSystem | WeightedSplitSystem) DistanceMatrix[source]#
Compute distance matrix from a split system.
The distance between two elements x and y is the sum of weights of all splits that separate x and y. A split separates x and y if one element is in set1 and the other is in set2.
This function uses vectorized numpy operations for efficiency, avoiding nested Python loops by using boolean arrays and broadcasting.
- Parameters:
system (SplitSystem | WeightedSplitSystem) – The split system. If WeightedSplitSystem, split weights are used. If SplitSystem, each split has implicit weight 1.0.
- Returns:
A distance matrix on the elements of the split system, where the distance between x and y is the sum of weights of splits that separate them.
- Return type:
Examples
>>> from phylozoo.core.split.base import Split
>>> split1 = Split({1, 2}, {3, 4})
>>> split2 = Split({1, 3}, {2, 4})
>>> weights = {split1: 2.0, split2: 1.5}
>>> system = WeightedSplitSystem(weights)
>>> dm = distances_from_splitsystem(system)
>>> dm.get_distance(1, 2)  # Separated by split2 only
1.5
>>> dm.get_distance(1, 3)  # Separated by split1 only
2.0
>>> dm.get_distance(1, 4)  # Separated by both splits
3.5
>>> dm.get_distance(2, 3)  # Separated by both splits
3.5
>>> # Unweighted split system (each split has weight 1.0)
>>> system2 = SplitSystem([split1, split2])
>>> dm2 = distances_from_splitsystem(system2)
>>> dm2.get_distance(1, 4)  # Separated by both splits, each with weight 1.0
2.0
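The distance definition can be reproduced with plain loops. The sketch below deliberately uses nested Python loops instead of the vectorized numpy approach mentioned above, and it returns a dictionary keyed by unordered pairs rather than a DistanceMatrix; `split` and `split_distances` are hypothetical names.

```python
def split(a, b):
    """Unordered pair of sides (hypothetical stand-in for a Split object)."""
    return frozenset({frozenset(a), frozenset(b)})


def split_distances(weights):
    """Map each unordered pair {x, y} to the total weight of splits separating it."""
    distances = {}
    for sides, weight in weights.items():
        side_a, side_b = sides
        # A split separates exactly the pairs with one element on each side.
        for x in side_a:
            for y in side_b:
                pair = frozenset({x, y})
                distances[pair] = distances.get(pair, 0.0) + weight
    return distances


weights = {split({1, 2}, {3, 4}): 2.0, split({1, 3}, {2, 4}): 1.5}
dist = split_distances(weights)
print(dist[frozenset({1, 2})])  # 1.5
print(dist[frozenset({1, 4})])  # 3.5
```

Pairs that no split separates do not appear in the dictionary; their distance is 0.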
- phylozoo.core.split.algorithms.induced_quartetsplits(split: Split, include_trivial: bool = False) set[Split][source]#
Return a set of all subsplits of size 4 of the split.
Generates all quartet splits (2|2 splits) that can be induced from this split by selecting 2 elements from each side.
- Parameters:
- Returns:
A set of quartet splits induced from this split.
- Return type:
Examples
>>> split = Split({1, 2, 3}, {4, 5, 6})
>>> quartets = induced_quartetsplits(split)
>>> len(quartets) > 0
True
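Choosing 2 elements from each side is a product of two combinations. The standalone sketch below shows the enumeration for the 2|2 case only; it does not model the include_trivial option, and `induced_quartets` is a hypothetical name.

```python
from itertools import combinations


def induced_quartets(side_a, side_b):
    """All 2|2 splits formed by picking two elements from each side."""
    return {
        frozenset({frozenset(pair_a), frozenset(pair_b)})
        for pair_a in combinations(sorted(side_a), 2)
        for pair_b in combinations(sorted(side_b), 2)
    }


quartets = induced_quartets({1, 2, 3}, {4, 5, 6})
print(len(quartets))  # 9, i.e. C(3,2) * C(3,2)
```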
- phylozoo.core.split.algorithms.quartets_from_splitsystem(system: SplitSystem | WeightedSplitSystem) QuartetProfileSet[source]#
Compute quartet profile set from a split system.
For each split in the system, this function extracts all quartets induced by it (all 2|2 splits: 2 elements from one side, 2 from the other). Quartets are then grouped by their 4-taxon set into profiles, with weights equal to how often each quartet appeared (summing weights if the system is weighted).
- Parameters:
system (SplitSystem | WeightedSplitSystem) – The split system. If WeightedSplitSystem, split weights are used. If SplitSystem, each split has implicit weight 1.0.
- Returns:
A quartet profile set where each profile corresponds to a 4-taxon set, and contains quartets weighted by how often they appeared in the splits.
- Return type:
Examples
>>> from phylozoo.core.split.base import Split
>>> split1 = Split({1, 2, 3}, {4, 5, 6})
>>> split2 = Split({1, 2}, {3, 4, 5, 6})
>>> system = SplitSystem([split1, split2])
>>> profileset = quartets_from_splitsystem(system)
>>> len(profileset) > 0
True
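The grouping step can be sketched with nested dictionaries. This is a standalone illustration that returns plain dicts rather than a QuartetProfileSet; `split` and `quartet_profiles` are hypothetical names.

```python
from collections import defaultdict
from itertools import combinations


def split(a, b):
    """Unordered pair of sides (hypothetical stand-in for a Split object)."""
    return frozenset({frozenset(a), frozenset(b)})


def quartet_profiles(weights):
    """Group induced quartets by their 4-taxon set, summing split weights."""
    profiles = defaultdict(lambda: defaultdict(float))
    for sides, weight in weights.items():
        side_a, side_b = sides
        for pair_a in combinations(sorted(side_a), 2):
            for pair_b in combinations(sorted(side_b), 2):
                taxa = frozenset(pair_a + pair_b)
                profiles[taxa][split(pair_a, pair_b)] += weight
    return profiles


# Unweighted input: every split counts with weight 1.0.
weights = {split({1, 2, 3}, {4, 5, 6}): 1.0, split({1, 2}, {3, 4, 5, 6}): 1.0}
profiles = quartet_profiles(weights)
print(len(profiles))  # 12 distinct 4-taxon sets
print(profiles[frozenset({1, 2, 4, 5})][split({1, 2}, {4, 5})])  # 2.0
```

The quartet 12|45 is induced by both splits, so its weight is 2.0, while 12|34 is induced only by the second split.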
- phylozoo.core.split.algorithms.tree_from_splitsystem(system: SplitSystem, check_compatibility: bool = True) SemiDirectedPhyNetwork[source]#
Convert a split system to a tree (SemiDirectedPhyNetwork).
Builds a tree that induces all splits in the system using a star tree approach:
Start with a star tree (center node connected to all leaves)
For each non-trivial split S = A|B, find a cut-vertex v whose partition is a refinement of S, replace v with a cut-edge (two internal nodes u and w connected by an edge), and reconnect components: parts in A connect to u, parts in B connect to w.
This approach iteratively refines the star tree by splitting cut-vertices into cut-edges, creating the final tree structure that induces all splits.
- Parameters:
system (SplitSystem) – The split system to convert to a tree.
check_compatibility (bool, optional) – Whether to check if the system is compatible with a tree before building. If False, assumes compatibility (e.g., if known by construction). By default True.
- Returns:
A tree network displaying all splits in the system.
- Return type:
- Raises:
PhyloZooValueError – If check_compatibility is True and the system is not tree-compatible, or if a split cannot be created (indicating incompatibility).
Examples
>>> from phylozoo.core.split.base import Split
>>> from phylozoo.core.network.sdnetwork.classifications import is_tree
>>> split1 = Split({1, 2}, {3, 4})
>>> split2 = Split({1}, {2, 3, 4})
>>> split3 = Split({2}, {1, 3, 4})
>>> split4 = Split({3}, {1, 2, 4})
>>> split5 = Split({4}, {1, 2, 3})
>>> system = SplitSystem([split1, split2, split3, split4, split5])
>>> tree = tree_from_splitsystem(system)
>>> is_tree(tree)
True
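The star-tree refinement described for this function can be sketched on a plain adjacency dictionary. This is a simplified illustration, not the library's implementation: it returns a dict instead of a SemiDirectedPhyNetwork, reuses the cut-vertex as one endpoint of the new edge, and the names `tree_from_splits`, `leaves_beyond`, and the `("internal", i)` node labels are all hypothetical.

```python
def tree_from_splits(elements, nontrivial_splits):
    """Refine a star tree; returns {node: set of neighbours}."""
    elements = set(elements)
    center = ("internal", 0)
    adj = {center: set(elements)}
    for leaf in elements:
        adj[leaf] = {center}

    def leaves_beyond(v, neighbour):
        """Leaves in the component of (tree minus v) that contains neighbour."""
        seen, stack, found = {v, neighbour}, [neighbour], set()
        while stack:
            node = stack.pop()
            if node in elements:
                found.add(node)
            for nxt in adj[node] - seen:
                seen.add(nxt)
                stack.append(nxt)
        return found

    for count, (side_a, side_b) in enumerate(nontrivial_splits, start=1):
        for v in [n for n in adj if n not in elements]:
            parts = {nb: leaves_beyond(v, nb) for nb in adj[v]}
            to_a = [nb for nb, part in parts.items() if part <= side_a]
            to_b = [nb for nb, part in parts.items() if part <= side_b]
            # v qualifies if every component lies on one side, with >= 2 per side.
            if len(to_a) + len(to_b) == len(parts) and min(len(to_a), len(to_b)) >= 2:
                w = ("internal", count)
                adj[w] = {v}
                for nb in to_b:  # move the B-side components from v to w
                    adj[v].remove(nb)
                    adj[nb].remove(v)
                    adj[nb].add(w)
                    adj[w].add(nb)
                adj[v].add(w)  # the new cut-edge v-w induces the split
                break
        else:
            raise ValueError("split is incompatible with the tree built so far")
    return adj


splits = [
    (frozenset({1, 2}), frozenset({3, 4, 5})),
    (frozenset({1, 2, 3}), frozenset({4, 5})),
]
tree = tree_from_splits({1, 2, 3, 4, 5}, splits)
print(len(tree))  # 8 nodes: 5 leaves and 3 internal nodes
```

Each refinement recomputes the components around every internal node, which is simple but not efficient; it is meant only to show the cut-vertex to cut-edge step.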
I/O Support
Split system I/O module.
Split systems support reading and writing in NEXUS format. This module provides format handlers registered with FormatRegistry for use with the IOMixin system.
The following format handlers are defined and registered:
nexus: NEXUS format for split systems (extensions: .nexus, .nex, .nxs).
Writer: to_nexus_split_system() converts SplitSystem to NEXUS string.
Reader: from_nexus_split_system() parses NEXUS string to SplitSystem.
Writer: to_nexus_weighted_split_system() converts WeightedSplitSystem to NEXUS string.
Reader: from_nexus_weighted_split_system() parses NEXUS string to WeightedSplitSystem.
These handlers are automatically registered when this module is imported. SplitSystem and WeightedSplitSystem inherit from IOMixin, so you can use:
system.save('file.nexus') - Save to file (auto-detects format)
system.load('file.nexus') - Load from file (auto-detects format)
system.to_string(format='nexus') - Convert to string
system.from_string(string, format='nexus') - Parse from string
SplitSystem.convert('in.nexus', 'out.nexus') - Convert between formats
Notes
The NEXUS format for splits supports weights via the FORMAT WEIGHTS=YES option in the SPLITS block. WeightedSplitSystem uses this feature, while SplitSystem writes splits without weights.
- phylozoo.core.split.io.from_nexus_split_system(nexus_string: str, **kwargs: Any) SplitSystem[source]#
Parse a NEXUS format string and create a SplitSystem.
- Parameters:
nexus_string (str) – NEXUS format string containing split system data.
**kwargs – Additional arguments (currently unused, for compatibility).
- Returns:
Parsed split system.
- Return type:
- Raises:
PhyloZooParseError – If the NEXUS string is malformed or cannot be parsed (e.g., missing TAXA or SPLITS blocks, invalid split format, split sets overlap or don’t cover all taxa).
PhyloZooValueError – If weights are non-positive.
Examples
>>> from phylozoo.core.split.io import from_nexus_split_system
>>>
>>> nexus_str = '''#NEXUS
...
... BEGIN TAXA;
... DIMENSIONS NTAX=4;
... TAXLABELS
... 1
... 2
... 3
... 4
... ;
... END;
...
... BEGIN SPLITS;
... DIMENSIONS NSPLITS=2;
... FORMAT LABELS=YES;
... MATRIX
... [1] (1 2) (3 4)
... [2] (1 3) (2 4)
... ;
... END;'''
>>>
>>> system = from_nexus_split_system(nexus_str)
>>> len(system)
2
Notes
This parser expects:
A TAXA block with TAXLABELS
A SPLITS block with FORMAT LABELS=YES (weights optional, ignored if present)
Split definitions in format: [n] (taxa1 taxa2 …) (taxa3 taxa4 …) [weight]
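A single split definition in this format can be matched with a regular expression. The sketch below is an assumption about one reasonable way to parse such a line, not the parser used by the library; `SPLIT_LINE` and `parse_split_line` are hypothetical names, and taxa are kept as strings.

```python
import re

# [n] (taxa ...) (taxa ...) optional-weight
SPLIT_LINE = re.compile(r"^\[(\d+)\]\s*\(([^)]*)\)\s*\(([^)]*)\)\s*(\S+)?$")


def parse_split_line(line):
    """Return (side_a, side_b, weight or None) for one MATRIX line."""
    match = SPLIT_LINE.match(line.strip())
    if match is None:
        raise ValueError(f"invalid split line: {line!r}")
    _, left, right, weight = match.groups()
    side_a, side_b = frozenset(left.split()), frozenset(right.split())
    if side_a & side_b:
        raise ValueError(f"split sides overlap: {line!r}")
    # float() raises ValueError for a malformed weight token.
    return side_a, side_b, (float(weight) if weight is not None else None)


print(parse_split_line("[1] (1 2) (3 4) 0.8")[2])  # 0.8
print(parse_split_line("[2] (1 3) (2 4)")[2])  # None
```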
- phylozoo.core.split.io.from_nexus_weighted_split_system(nexus_string: str, **kwargs: Any) WeightedSplitSystem[source]#
Parse a NEXUS format string and create a WeightedSplitSystem.
- Parameters:
nexus_string (str) – NEXUS format string containing weighted split system data.
**kwargs – Additional arguments (currently unused, for compatibility).
- Returns:
Parsed weighted split system.
- Return type:
- Raises:
PhyloZooParseError – If the NEXUS string is malformed or cannot be parsed (e.g., missing TAXA or SPLITS blocks, invalid split format, missing weights when WEIGHTS=YES, invalid weight format).
PhyloZooValueError – If weights are non-positive.
Examples
>>> from phylozoo.core.split.io import from_nexus_weighted_split_system
>>>
>>> nexus_str = '''#NEXUS
...
... BEGIN TAXA;
... DIMENSIONS NTAX=4;
... TAXLABELS
... 1
... 2
... 3
... 4
... ;
... END;
...
... BEGIN SPLITS;
... DIMENSIONS NSPLITS=2;
... FORMAT LABELS=YES WEIGHTS=YES;
... MATRIX
... [1] (1 2) (3 4) 0.8
... [2] (1 3) (2 4) 0.6
... ;
... END;'''
>>>
>>> system = from_nexus_weighted_split_system(nexus_str)
>>> len(system)
2
Notes
This parser expects:
A TAXA block with TAXLABELS
A SPLITS block with FORMAT LABELS=YES WEIGHTS=YES (or just LABELS=YES if weights are optional)
Split definitions in format: [n] (taxa1 taxa2 …) (taxa3 taxa4 …) weight
If FORMAT WEIGHTS=YES is specified, all splits must have weights
- phylozoo.core.split.io.to_nexus_split_system(split_system: SplitSystem, **kwargs: Any) str[source]#
Convert a SplitSystem to a NEXUS format string.
- Parameters:
split_system (SplitSystem) – The split system to convert.
**kwargs – Additional arguments (currently unused, for compatibility).
- Returns:
The NEXUS format string representation of the split system.
- Return type:
Examples
>>> from phylozoo.core.split import Split, SplitSystem
>>> from phylozoo.core.split.io import to_nexus_split_system
>>>
>>> split1 = Split({1, 2}, {3, 4})
>>> split2 = Split({1, 3}, {2, 4})
>>> system = SplitSystem([split1, split2])
>>> nexus_str = to_nexus_split_system(system)
>>> print(nexus_str)
#NEXUS
BEGIN TAXA;
DIMENSIONS NTAX=4;
TAXLABELS
1
2
3
4
;
END;
BEGIN SPLITS;
DIMENSIONS NSPLITS=2;
FORMAT LABELS=YES;
MATRIX
[1] (1 2) (3 4)
[2] (1 3) (2 4)
;
END;
Notes
The NEXUS format includes:
TAXA block with taxon labels
SPLITS block with split definitions (no weights for unweighted systems)
- phylozoo.core.split.io.to_nexus_weighted_split_system(weighted_system: WeightedSplitSystem, **kwargs: Any) str[source]#
Convert a WeightedSplitSystem to a NEXUS format string.
- Parameters:
weighted_system (WeightedSplitSystem) – The weighted split system to convert.
**kwargs – Additional arguments (currently unused, for compatibility).
- Returns:
The NEXUS format string representation of the weighted split system.
- Return type:
Examples
>>> from phylozoo.core.split import Split, WeightedSplitSystem
>>> from phylozoo.core.split.io import to_nexus_weighted_split_system
>>>
>>> split1 = Split({1, 2}, {3, 4})
>>> split2 = Split({1, 3}, {2, 4})
>>> system = WeightedSplitSystem({split1: 0.8, split2: 0.6})
>>> nexus_str = to_nexus_weighted_split_system(system)
>>> print(nexus_str)
#NEXUS
BEGIN TAXA;
DIMENSIONS NTAX=4;
TAXLABELS
1
2
3
4
;
END;
BEGIN SPLITS;
DIMENSIONS NSPLITS=2;
FORMAT LABELS=YES WEIGHTS=YES;
MATRIX
[1] (1 2) (3 4) 0.800000
[2] (1 3) (2 4) 0.600000
;
END;
Notes
The NEXUS format includes:
TAXA block with taxon labels
SPLITS block with FORMAT WEIGHTS=YES and split definitions with weights
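The weighted MATRIX lines in the example output can be produced with ordinary string formatting. The sketch below is an assumption based only on that example (six decimal places, sorted taxa within each side); `matrix_lines` is a hypothetical helper, not the library's writer.

```python
def matrix_lines(weighted_splits):
    """Format (side_a, side_b, weight) triples as SPLITS MATRIX lines."""
    lines = []
    for index, (side_a, side_b, weight) in enumerate(weighted_splits, start=1):
        left = " ".join(str(taxon) for taxon in sorted(side_a))
        right = " ".join(str(taxon) for taxon in sorted(side_b))
        # Six decimals is inferred from the example output, not from a spec.
        lines.append(f"[{index}] ({left}) ({right}) {weight:.6f}")
    return lines


lines = matrix_lines([({1, 2}, {3, 4}, 0.8), ({1, 3}, {2, 4}, 0.6)])
print("\n".join(lines))
# [1] (1 2) (3 4) 0.800000
# [2] (1 3) (2 4) 0.600000
```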