FASTA#
FASTA is a simple text-based format for sequence data, widely used in bioinformatics.
Each record consists of a header line that starts with > and names the sequence,
followed by one or more lines of sequence characters.
See also
FASTA format — Wikipedia
Classes and extensions#
Classes: MSA (default format)
File extensions: .fasta, .fa, .fas
Structure#
>taxon1
ACGTACGT
>taxon2
TGCAACGT
>taxon3
AAAAACGT
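To make the record layout above concrete, here is a minimal sketch of a reader written in plain Python. It is not the parser phylozoo uses; the function name `parse_fasta` and its error handling are hypothetical and only illustrate the rule that a > line starts a new record and every following line, until the next >, belongs to that record's sequence.

```python
from typing import Dict, Iterable, Optional


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Illustrative helper: map each header (without '>') to its joined sequence."""
    records: Dict[str, str] = {}
    name: Optional[str] = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A header line opens a new record
            name = line[1:].strip()
            if name in records:
                raise ValueError(f"duplicate header: {name}")
            records[name] = ""
        elif name is None:
            raise ValueError("sequence data found before the first header")
        else:
            # Sequence lines are concatenated, so wrapped sequences are rejoined
            records[name] += line
    return records


text = """>taxon1
ACGTACGT
>taxon2
TGCA
ACGT
>taxon3
AAAAACGT
"""
records = parse_fasta(text.splitlines())
```

Here taxon2 is deliberately split across two lines to show that wrapped sequence lines are joined back into a single string.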
Examples#
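from phylozoo import MSA

# Build an alignment from a mapping of taxon name to sequence
sequences = {
    "taxon1": "ACGTACGT",
    "taxon2": "TGCAACGT",
    "taxon3": "AAAAACGT",
}
msa = MSA(sequences)

# Write the alignment to a FASTA file, then read it back
msa.save("alignment.fasta")
msa2 = MSA.load("alignment.fasta")

# Get the FASTA text as a string instead of writing a file
fasta_str = msa.to_string(format="fasta", line_length=60)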
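The line_length argument above presumably sets the width at which sequence lines are wrapped in the output. The following sketch shows that kind of wrapping in plain Python, under that assumption. The function `format_fasta` is a hypothetical stand-in and not part of phylozoo, whose actual output may differ in detail.

```python
from typing import Dict


def format_fasta(records: Dict[str, str], line_length: int = 60) -> str:
    """Illustrative helper: write records as FASTA text with wrapped sequence lines."""
    if line_length < 1:
        raise ValueError("line_length must be a positive integer")
    out = []
    for name, seq in records.items():
        out.append(f">{name}")
        # Emit the sequence in chunks of at most line_length characters
        for start in range(0, len(seq), line_length):
            out.append(seq[start:start + line_length])
    return "\n".join(out) + "\n"


# A small width makes the wrapping visible on a short sequence
wrapped = format_fasta({"taxon1": "ACGTACGT"}, line_length=5)
```

With a width of 5, the eight-character sequence is written as two lines of five and three characters under its header.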
See also#
I/O Operations — Save/load and format detection
NEXUS — NEXUS Characters block for alignments
Multiple Sequence Alignments — MSA and sequences