Documentation for TreeTools.
TreeTools.Generate.YuleCoalescentTreeTools.MiscDataTreeTools.SplitTreeTools.SplitListTreeTools.TreeTreeTools.TreeNodeTreeTools.TreeNodeDataBase.insert!TreeTools.Generate.balanced_binary_treeTreeTools.Generate.birth_deathTreeTools.Generate.genealogyTreeTools.Generate.ladder_treeTreeTools.Generate.star_treeTreeTools.ancestorsTreeTools.arecompatibleTreeTools.binarize!TreeTools.branch_length!TreeTools.check_treeTreeTools.data!TreeTools.delete_branches!TreeTools.delete_null_branches!TreeTools.depthTreeTools.diameterTreeTools.distanceTreeTools.distanceTreeTools.distance_matrixTreeTools.graft!TreeTools.heightTreeTools.internalsTreeTools.is_ancestorTreeTools.iscompatibleTreeTools.isleafTreeTools.isrootTreeTools.isrootTreeTools.label!TreeTools.label!TreeTools.ladderize!TreeTools.lcaTreeTools.lcaTreeTools.lcaTreeTools.leavesTreeTools.leavesTreeTools.newickTreeTools.node2treeTreeTools.nodesTreeTools.parse_newick_stringTreeTools.postorder_traversalTreeTools.print_tree_asciiTreeTools.prune!TreeTools.read_treeTreeTools.read_treeTreeTools.rootTreeTools.root!TreeTools.root!TreeTools.share_labelsTreeTools.traversalTreeTools.write_newick
TreeTools.MiscData — Type
struct MiscData <: TreeNodeData
dat::Dict{Any,Any}
endTreeTools.Split — Type
Splitdat::Array{Int,1}: indices of leaves in the split
TreeTools.SplitList — Type
SplitList{T}leaves::Array{T,1}: labels of leavessplits::Array{Split,1}mask::Array{Bool,1}: subset of leaves for which splits apply.splitmap::Dict{T,Split}: indicate the split corresponding to the branch above a node. Only initialized if built from a tree with labels on internal nodes.
Constructors
SplitList(leaves::Array{T,1}) where TEmpty SplitList.
SplitList(t::Tree[, mask])List of splits in t. mask defaults to ones.
SplitList(r::TreeNode, leaves[, mask])Compute the list of splits below r, including r itself. Assume that r is part of a tree with leaves. mask defaults to the set of leaves that are descendents of r.
TreeTools.Tree — Type
mutable struct Tree{T <: TreeNodeData}TreeTools.TreeNode — Type
mutable struct TreeNode{T <: TreeNodeData}Structural information on the tree, i.e. topology and branch length.
anc::Union{Nothing,TreeNode}: Ancestorchild::Array{TreeNode,1}: List of childrenisleaf::Boolisroot::Booltau::Union{Missing, Float64}data::T
TreeTools.TreeNodeData — Type
abstract type TreeNodeDataAbstract supertype for data attached to the dat field of TreeNode objects. Implemented concrete types are
EmptyData: empty struct. Use if you do not have to attach data to nodes.MiscData: contains a dictionary for attaching extra data to nodes. Also behaves like aDictfor indexing/iterating, e.g.x.dat[key] == x[key].
Base.insert! — Method
insert!(tree, node; name, time)Insert a singleton named name above node, at height time on the branch. Return the inserted singleton. time can be a Number or missing.
TreeTools.ancestors — Method
ancestors(node::TreeNode)Return all ancestors of node up to and including the root, as a Vector{TreeNode}. The first element is node itself.
TreeTools.arecompatible — Method
arecompatible(s::Split, t::Split[, mask::Array{Bool}])Are splits s and t compatible in the cluster sense.
arecompatible(s::SplitList, i::Integer, j::Integer; mask=true)Are s.splits[i] and s.splits[j] compatible in the cluster sense?
TreeTools.binarize! — Method
binarize!(t::Tree; time=0.)Make t binary by adding arbitrary internal nodes with branch length time.
TreeTools.branch_length! — Method
branch_length!(n::TreeNode, τ)Set the branch length above n to τ.
TreeTools.check_tree — Method
check_tree(t::Tree; strict=true)- Every non-leaf node should have at least one child (two if
strict) - Every non-root node should have exactly one ancestor
- If n.child[...] == c, c.anc == n is true
- Tree has only one root
TreeTools.data! — Method
data!(n::TreeNode{T}, dat::T)Set the the data field of n to dat.
TreeTools.delete_branches! — Method
delete_branches!(f, tree::Tree)
delete_branches!(f, n::TreeNode)Delete branch above node n if f(n) returns true when called on node n. When called on a Tree propagates recursively down the tree. Only when called on a Tree will nodes be additionally deleted from the lnodes and lleaves dictionaries. If keep_time, the branch length of the deleted branches will be added to child branches.
TreeTools.delete_null_branches! — Method
delete_null_branches!(tree::Tree; threshold=1e-10)Delete internal node with branch length smaller than threshold. Propagates recursively down the tree. For leaf nodes, set branch length to 0 if smaller than threshold.
TreeTools.depth — Method
depth(node::TreeNode)Topological distance from node to root (number of edges).
TreeTools.diameter — Method
diameter(tree::Tree; topological=false)Diameter of tree: the longest path between any two leaves. If topological=true, counts edges instead of summing branch lengths.
TreeTools.distance — Method
distance(t1::Tree, t2::Tree; type = :RF, normalize = false)Compute distance between two trees. See TreeTools.tree_distance_types for allowed types. If normalize, the distance is normalized to [0,1].
TreeTools.distance — Method
distance(t::Tree, n1::AbstractString, n2::AbstractString; topological=false)
distance(n1::TreeNode, n2::TreeNode; topological=false)Compute branch length distance between n1 and n2 by summing the branch_length values. If topological, the value 1. is summed instead, counting the number of branches separating the two nodes (Note: the output is not an Int!).
TreeTools.distance_matrix — Method
distance_matrix(tree::Tree)Pairwise distance between leaves. The element D[i,j] is the distance along branches between leaf i and j. Leaves are arranged in post-order.
TreeTools.graft! — Method
graft!(tree::Tree, n, r; graft_on_leaf=false, time=branch_length(n))Graft n onto r and return the grafted node. r can be a label or a TreeNode, and should belong to tree. n can be a TreeNode or a Tree. In the latter case, n will be copied before being grafted. None of the nodes of the subtree of n should belong to tree.
If r is a leaf and graft_on_leaf is set to false (default), will raise an error.
TreeTools.height — Method
height(tree::Tree; topological=false)Height of tree: the maximum distance from root to any leaf. If topological=true, counts edges instead of summing branch lengths.
TreeTools.internals — Function
nodes(t; skiproot=false)
leaves(t)
internals(t; skiproot=false)Iterator over all nodes / leaves / internal nodes of a tree. If skiproot, the root node will be skipped by the iterator.
Note
length cannot be called on internals(t) as the latter is based on Iterators.filter. A way to get the number of internal nodes of a tree is for example by calling length(nodes(t)) - length(leaves(t)).
TreeTools.is_ancestor — Method
is_ancestor(t::Tree, a::AbstractString, n::AbstractString)
is_ancestor(a::TreeNode, n::TreeNode)Check if a is an ancestor of n, in the sense that ancestor(ancestor(...(node))) == a.
TreeTools.iscompatible — Function
iscompatible(s::Split, S::SplitList, mask=S.mask; usemask=true)Is s compatible with all splits in S?
TreeTools.isleaf — Method
isleaf(s::Split)
isleaf(s::Split, mask::Array{Bool,1})Check if s is a leaf split.
TreeTools.isroot — Method
isroot(s::Split, mask::Array{Bool,1})Check if s is the root split when restricted to mask.
TreeTools.isroot — Method
isroot(S::SplitList, i)Is S[i] the root?
TreeTools.label! — Method
label!(t::Tree, label::AbstractString)Change the label of the Tree to label.
TreeTools.label! — Method
label!(t::Tree, node::TreeNode, new_label::AbstractString)
label!(t::Tree, old_label, new_label)Change the label of a TreeNode to new_label.
TreeTools.ladderize! — Method
ladderize!(tree::Tree)Ladderize tree by placing nodes with largest clades left in the newick string.
TreeTools.lca — Method
lca(t::Tree, labels::Array{<:AbstractString,1})
lca(t::Tree, labels...)TreeTools.lca — Method
lca(i_node::TreeNode, j_node::TreeNode)Find and return lowest common ancestor of i_node and j_node. Idea is to go up in the tree in an asymmetric way on the side of the deeper node, until both are at equal distance from root. Then, problem is solved by going up in a symmetric way. (https://stackoverflow.com/questions/1484473/how-to-find-the-lowest-common-ancestor-of-two-nodes-in-any-binary-tree/6183069#6183069)
TreeTools.lca — Method
lca(nodelist)Find the common ancestor of all nodes in nodelist. nodelist is an iterable collection of TreeNode objects.
TreeTools.leaves — Function
nodes(t; skiproot=false)
leaves(t)
internals(t; skiproot=false)Iterator over all nodes / leaves / internal nodes of a tree. If skiproot, the root node will be skipped by the iterator.
Note
length cannot be called on internals(t) as the latter is based on Iterators.filter. A way to get the number of internal nodes of a tree is for example by calling length(nodes(t)) - length(leaves(t)).
TreeTools.leaves — Method
leaves(S::SplitList, i)Return array of leaves in S.splits[i], taking S.mask into account.
TreeTools.newick — Method
newick(tree::Tree; internal_labels=true, write_root=true)Return the Newick string correpsonding to tree. If internal_labels == false, do not write labels of internal nodes in the string. If !write_root, do not write label and time for the root node (unrooted tree).
TreeTools.node2tree — Method
node2tree(root::TreeNode{T}; label = default_tree_label(), force_new_labels=false)Create a Tree object from root with name label. If force_new_labels, a random string is added to node labels to make them unique.
TreeTools.nodes — Function
nodes(t; skiproot=false)
leaves(t)
internals(t; skiproot=false)Iterator over all nodes / leaves / internal nodes of a tree. If skiproot, the root node will be skipped by the iterator.
Note
length cannot be called on internals(t) as the latter is based on Iterators.filter. A way to get the number of internal nodes of a tree is for example by calling length(nodes(t)) - length(leaves(t)).
TreeTools.parse_newick_string — Method
parse_newick_string(
nw::AbstractString;
node_data_type=DEFAULT_NODE_DATATYPE, force_new_labels=false
)Parse newick string into a tree. See read_tree for more informations.
TreeTools.postorder_traversal — Method
postorder_traversal([f], tree; root=true, leaves=true, internals=true)Traverse tree in postorder, iterating over nodes. The children of node are returned in the same order as children(node).
Keep only nodes n such that f(n) is true. If leaves, internals or root are set to false, the corresponding nodes are excluded.
Examples
for node in postorder_traversal(tree; root=false)
isroot(node) # always false
end
[label(x) for x in postorder_traversal(tree; internals=false)] # labels of leaf nodesTreeTools.print_tree_ascii — Method
print_tree_ascii(io, t::Tree)Julia implementation of Bio.Phylo.drawascii function: https://github.com/biopython/biopython/blob/master/Bio/Phylo/utils.py
TreeTools.prune! — Method
prune!(tree, node; kwargs...)
prune!(tree, labels::AbstractArray)
prune!(tree, labels...)Prune node from tree. node can be a label or a TreeNode. Return the subtree defined by node as a Tree object as well as the previous ancestor of node.
If a list of labels is provided, the MRCA of the corresponding nodes is pruned.
kwargs
remove_singletons: remove singletons (internals with one child) in the tree after pruning. Defaulttrue.clade_only: if a list of labels is provided, check that it corresponds to a clade before pruning. Defaulttrue.create_leaf: if the ancestor ofrhas only one child (singleton), pruningrwill create a leaf. Ifcreate_leaf == :warn, this will trigger a warning. Ifcreate_leaf = false, it will trigger an error. Ifcreate_leaf = true, then this is allowed. Default::warn.
Example
using TreeTools # hide
tree = parse_newick_string("(A:1.,(B:1.,(X1:0.,X2:0.)X:5.)BX:1.)R;")
prune!(tree, ["X1", "X2"])
map(label, nodes(tree))
# output
3-element Vector{String}:
"B"
"A"
"R"TreeTools.read_tree — Method
read_tree(
nwk_filename::AbstractString;
node_data_type=DEFAULT_NODE_DATATYPE,
label,
force_new_labels=false,
check=true,
)
read_tree(io::IO; kwargs...)Read Newick file and create a Tree{node_data_type} object from it. If the input file contains multiple Newick strings on different lines, the output is an array of Tree objects.
The call node_data_type() must return a valid instance of a subtype of TreeNodeData. You can implement your own subtypes, or see ?TreeNodeData for already implemented ones. The default is EmptyData.
Use force_new_labels=true to force the renaming of all internal nodes. By default the tree will be assigned a label by calling default_tree_label(). This can be changed using the label argument.
If you have a variable containing a Newick string and want to build a tree from it, use parse_newick_string instead.
Note on labels
The Tree type identifies nodes by their labels. This means that labels have to be unique. For this reason, the following is done when reading a tree:
- if an internal node does not have a label, a unique one will be created of the form
"NODE_i" - if a node has a label that was already found before in the tree, a random identifier will be appended to it to make it unique. Note that the identifier is created using
randstring(8), unicity is technically not guaranteed. - if
force_new_labelsis used, a unique identifier is appended to node labels - if an internal node label is identified as a confidence/bootstrap value, a unique identifier is appended to it so it can serve as a plain node label. Bootstrap values are not parsed. Leaf labels are never treated as bootstrap values. See
?TreeTools.isbootstrapfor which labels are identified as confidence values.
TreeTools.root! — Method
root!(tree; method=:midpoint, topological = false)Root tree using method. Only implemented method is :midpoint.
Methods
:midpoint
Distance between nodes can be either topological (number of branches) or based on branch length. Does not create a polytomy at the root: if the midpoint is an already existing internal node (not the root), creates a new root node at infinitesimal distance below it.
Midpoint rooting will exit without doing anything if
- the distance between the two farthest leaves is
0. This happens if all branch lengths are 0. - the current root is already the midpoint.
:model
Provide keyword argument model::Tree. Try to root tree like model. If the two trees only differ by rooting, they will have the same topology at the end of this. Else, tree will be rerooted but a warning will be given.
TreeTools.root! — Method
root!(tree::Tree, node::AbstractString; root_on_leaf, time=0., remove_singletons=true)Root tree at tree.lnodes[node]. Equivalent to outgroup rooting. If time is non-zero, root above node at height time, inserting a new node.
If remove_singletons, singleton nodes are removed after re-rooting. This is useful to remove the old root, which often ends up being a singleton.
TreeTools.root — Method
root(node::TreeNode; maxdepth=1000)Return root of the tree to which node belongs.
TreeTools.share_labels — Method
share_labels(tree1, tree2)Check if tree1 and tree2 share the same labels for leaf nodes.
TreeTools.traversal — Function
traversal([f], tree, style=:postorder; internals, leaves, root)
traversal([f], node, style=:postorder; internals, leaves, root)Iterate through nodes of tree according to style, skipping nodes for which f returns false. style must be in collect(keys(TreeTools.traversal_styles)). For now its just :postorder.
See postorder_traversal for extended docstring.
TreeTools.write_newick — Method
write_newick(io::IO, tree::Tree; kwargs...)
write_newick(filename::AbstractString, tree::Tree, mode="w"; kwargs...)Write Newick string corresponding to tree to io or filename. Prefer write(io, tree) / write(filename, tree) to write to IO or a file, and newick(tree) to obtain the Newick string directly.
TreeTools.Generate.YuleCoalescent — Type
mutable struct YuleCoalescentn::Int = 2
b::Float64 = 1 # birth rateRate of coalescence (n-1)b. The expected height of the tree is (I think) ~log(n)/b
TreeTools.Generate.balanced_binary_tree — Function
balanced_binary_tree(n::Integer, time::Union{Missing, Real} = missing)Return a balanced binary tree of n nodes with all branches of length time. n must be a power of 2.
TreeTools.Generate.birth_death — Method
birth_death(n::Integer, λ::Real, μ::Real; active=false, warn_incomplete=true)Simulate a birth death process with rates λ (birth) and μ (death). Stop when there are n lineages. Return the a Tree and a boolean indicating completion of the process: it is false if all lineages died before reaching the target n.
Keyword argument active (default: false) controls whether only active lineages (i.e. non dead) count towards completion. If active=false, the number of leaves of output tree is n if the process completed. If active=true, it is larger than n.
TreeTools.Generate.genealogy — Method
genealogy(C::Coalescent; coalescence_times = false)Return a tree sampled from the coalescent C. If coalescence_times, also return the times Tn during which n lineages are present, in the form of a dictionary n => Tn.
Examples
For the Kingman coalescent with population size N and n=100 lineages:
tree = genealogy(KingmanCoalescent(100, N))TreeTools.Generate.ladder_tree — Function
ladder_tree(n[, t=missing])Return a ladder tree with n leaves with total height t. For 4 leaves A, B, C, D, this would be (A:t,(B:2t/3,(C:t/3,D:t/3)));. The elementary branch length is t/(n-1) if n>1.
TreeTools.Generate.star_tree — Function
star_tree(n, times)Create a star tree with n leaves. times can be an iterable of length n or a number/missing.