Documentation for TreeTools.

TreeTools.SplitListType
SplitList{T}
  • leaves::Array{T,1}: labels of leaves
  • splits::Array{Split,1}
  • mask::Array{Bool,1}: subset of leaves for which splits apply.
  • splitmap::Dict{T,Split}: indicate the split corresponding to the branch above a node. Only initialized if built from a tree with labels on internal nodes.

Constructors

SplitList(leaves::Array{T,1}) where T

Empty SplitList.

SplitList(t::Tree[, mask])

List of splits in t. mask defaults to ones.

SplitList(r::TreeNode, leaves[, mask])

Compute the list of splits below r, including r itself. Assume that r is part of a tree with leaves. mask defaults to the set of leaves that are descendents of r.

source
TreeTools.TreeNodeType
mutable struct TreeNode{T <: TreeNodeData}

Structural information on the tree, i.e. topology and branch length.

  • anc::Union{Nothing,TreeNode}: Ancestor
  • child::Array{TreeNode,1}: List of children
  • isleaf::Bool
  • isroot::Bool
  • tau::Union{Missing, Float64}
  • data::T
source
TreeTools.TreeNodeDataType
abstract type TreeNodeData

Abstract supertype for data attached to the dat field of TreeNode objects. Implemented concrete types are

  • EmptyData: empty struct. Use if you do not have to attach data to nodes.
  • MiscData: contains a dictionary for attaching extra data to nodes. Also behaves like a Dict for indexing/iterating, e.g. x.dat[key] == x[key].
source
Base.insert!Method
insert!(tree, node; name, time)

Insert a singleton named name above node, at height time on the branch. Return the inserted singleton. time can be a Number or missing.

source
TreeTools.ancestorsMethod
ancestors(node::TreeNode)

Return all ancestors of node up to and including the root, as a Vector{TreeNode}. The first element is node itself.

source
TreeTools.arecompatibleMethod
arecompatible(s::Split, t::Split[, mask::Array{Bool}])

Are splits s and t compatible in the cluster sense.

arecompatible(s::SplitList, i::Integer, j::Integer; mask=true)

Are s.splits[i] and s.splits[j] compatible in the cluster sense?

source
TreeTools.binarize!Method
binarize!(t::Tree; time=0.)

Make t binary by adding arbitrary internal nodes with branch length time.

source
TreeTools.check_treeMethod
check_tree(t::Tree; strict=true)
  • Every non-leaf node should have at least one child (two if strict)
  • Every non-root node should have exactly one ancestor
  • If n.child[...] == c, c.anc == n is true
  • Tree has only one root
source
TreeTools.delete_branches!Method
delete_branches!(f, tree::Tree)
delete_branches!(f, n::TreeNode)

Delete branch above node n if f(n) returns true when called on node n. When called on a Tree propagates recursively down the tree. Only when called on a Tree will nodes be additionally deleted from the lnodes and lleaves dictionaries. If keep_time, the branch length of the deleted branches will be added to child branches.

source
TreeTools.delete_null_branches!Method
delete_null_branches!(tree::Tree; threshold=1e-10)

Delete internal node with branch length smaller than threshold. Propagates recursively down the tree. For leaf nodes, set branch length to 0 if smaller than threshold.

source
TreeTools.depthMethod
depth(node::TreeNode)

Topological distance from node to root (number of edges).

source
TreeTools.diameterMethod
diameter(tree::Tree; topological=false)

Diameter of tree: the longest path between any two leaves. If topological=true, counts edges instead of summing branch lengths.

source
TreeTools.distanceMethod
distance(t1::Tree, t2::Tree; type = :RF, normalize = false)

Compute distance between two trees. See TreeTools.tree_distance_types for allowed types. If normalize, the distance is normalized to [0,1].

source
TreeTools.distanceMethod
distance(t::Tree, n1::AbstractString, n2::AbstractString; topological=false)
distance(n1::TreeNode, n2::TreeNode; topological=false)

Compute branch length distance between n1 and n2 by summing the branch_length values. If topological, the value 1. is summed instead, counting the number of branches separating the two nodes (Note: the output is not an Int!).

source
TreeTools.distance_matrixMethod
distance_matrix(tree::Tree)

Pairwise distance between leaves. The element D[i,j] is the distance along branches between leaf i and j. Leaves are arranged in post-order.

source
TreeTools.graft!Method
graft!(tree::Tree, n, r; graft_on_leaf=false, time=branch_length(n))

Graft n onto r and return the grafted node. r can be a label or a TreeNode, and should belong to tree. n can be a TreeNode or a Tree. In the latter case, n will be copied before being grafted. None of the nodes of the subtree of n should belong to tree.

If r is a leaf and graft_on_leaf is set to false (default), will raise an error.

source
TreeTools.heightMethod
height(tree::Tree; topological=false)

Height of tree: the maximum distance from root to any leaf. If topological=true, counts edges instead of summing branch lengths.

source
TreeTools.internalsFunction
nodes(t; skiproot=false)
leaves(t)
internals(t; skiproot=false)

Iterator over all nodes / leaves / internal nodes of a tree. If skiproot, the root node will be skipped by the iterator.

Note

length cannot be called on internals(t) as the latter is based on Iterators.filter. A way to get the number of internal nodes of a tree is for example by calling length(nodes(t)) - length(leaves(t)).

source
TreeTools.is_ancestorMethod
is_ancestor(t::Tree, a::AbstractString, n::AbstractString)
is_ancestor(a::TreeNode, n::TreeNode)

Check if a is an ancestor of n, in the sense that ancestor(ancestor(...(node))) == a.

source
TreeTools.iscompatibleFunction
iscompatible(s::Split, S::SplitList, mask=S.mask; usemask=true)

Is s compatible with all splits in S?

source
TreeTools.isleafMethod
isleaf(s::Split)
isleaf(s::Split, mask::Array{Bool,1})

Check if s is a leaf split.

source
TreeTools.isrootMethod
isroot(s::Split, mask::Array{Bool,1})

Check if s is the root split when restricted to mask.

source
TreeTools.label!Method
label!(t::Tree, node::TreeNode, new_label::AbstractString)
label!(t::Tree, old_label, new_label)

Change the label of a TreeNode to new_label.

source
TreeTools.ladderize!Method
ladderize!(tree::Tree)

Ladderize tree by placing nodes with largest clades left in the newick string.

source
TreeTools.lcaMethod
lca(t::Tree, labels::Array{<:AbstractString,1})
lca(t::Tree, labels...)
source
TreeTools.lcaMethod
lca(i_node::TreeNode, j_node::TreeNode)

Find and return lowest common ancestor of i_node and j_node. Idea is to go up in the tree in an asymmetric way on the side of the deeper node, until both are at equal distance from root. Then, problem is solved by going up in a symmetric way. (https://stackoverflow.com/questions/1484473/how-to-find-the-lowest-common-ancestor-of-two-nodes-in-any-binary-tree/6183069#6183069)

source
TreeTools.lcaMethod
lca(nodelist)

Find the common ancestor of all nodes in nodelist. nodelist is an iterable collection of TreeNode objects.

source
TreeTools.leavesFunction
nodes(t; skiproot=false)
leaves(t)
internals(t; skiproot=false)

Iterator over all nodes / leaves / internal nodes of a tree. If skiproot, the root node will be skipped by the iterator.

Note

length cannot be called on internals(t) as the latter is based on Iterators.filter. A way to get the number of internal nodes of a tree is for example by calling length(nodes(t)) - length(leaves(t)).

source
TreeTools.leavesMethod
leaves(S::SplitList, i)

Return array of leaves in S.splits[i], taking S.mask into account.

source
TreeTools.newickMethod
newick(tree::Tree; internal_labels=true, write_root=true)

Return the Newick string correpsonding to tree. If internal_labels == false, do not write labels of internal nodes in the string. If !write_root, do not write label and time for the root node (unrooted tree).

source
TreeTools.node2treeMethod
node2tree(root::TreeNode{T}; label = default_tree_label(), force_new_labels=false)

Create a Tree object from root with name label. If force_new_labels, a random string is added to node labels to make them unique.

source
TreeTools.nodesFunction
nodes(t; skiproot=false)
leaves(t)
internals(t; skiproot=false)

Iterator over all nodes / leaves / internal nodes of a tree. If skiproot, the root node will be skipped by the iterator.

Note

length cannot be called on internals(t) as the latter is based on Iterators.filter. A way to get the number of internal nodes of a tree is for example by calling length(nodes(t)) - length(leaves(t)).

source
TreeTools.parse_newick_stringMethod
parse_newick_string(
	nw::AbstractString;
	node_data_type=DEFAULT_NODE_DATATYPE, force_new_labels=false
)

Parse newick string into a tree. See read_tree for more informations.

source
TreeTools.postorder_traversalMethod
postorder_traversal([f], tree; root=true, leaves=true, internals=true)

Traverse tree in postorder, iterating over nodes. The children of node are returned in the same order as children(node).

Keep only nodes n such that f(n) is true. If leaves, internals or root are set to false, the corresponding nodes are excluded.

Examples

for node in postorder_traversal(tree; root=false)
    isroot(node) # always false
end

[label(x) for x in postorder_traversal(tree; internals=false)] # labels of leaf nodes
source
TreeTools.print_tree_asciiMethod
print_tree_ascii(io, t::Tree)

Julia implementation of Bio.Phylo.drawascii function: https://github.com/biopython/biopython/blob/master/Bio/Phylo/utils.py

source
TreeTools.prune!Method
prune!(tree, node; kwargs...)
prune!(tree, labels::AbstractArray)
prune!(tree, labels...)

Prune node from tree. node can be a label or a TreeNode. Return the subtree defined by node as a Tree object as well as the previous ancestor of node.

If a list of labels is provided, the MRCA of the corresponding nodes is pruned.

kwargs

  • remove_singletons: remove singletons (internals with one child) in the tree after pruning. Default true.
  • clade_only: if a list of labels is provided, check that it corresponds to a clade before pruning. Default true.
  • create_leaf: if the ancestor of r has only one child (singleton), pruning r will create a leaf. If create_leaf == :warn, this will trigger a warning. If create_leaf = false, it will trigger an error. If create_leaf = true, then this is allowed. Default: :warn.

Example

using TreeTools # hide
tree = parse_newick_string("(A:1.,(B:1.,(X1:0.,X2:0.)X:5.)BX:1.)R;")
prune!(tree, ["X1", "X2"])
map(label, nodes(tree))

# output

3-element Vector{String}:
 "B"
 "A"
 "R"
source
TreeTools.read_treeMethod
read_tree(
	nwk_filename::AbstractString;
	node_data_type=DEFAULT_NODE_DATATYPE,
    label,
    force_new_labels=false,
    check=true,
)
read_tree(io::IO; kwargs...)

Read Newick file and create a Tree{node_data_type} object from it. If the input file contains multiple Newick strings on different lines, the output is an array of Tree objects.

The call node_data_type() must return a valid instance of a subtype of TreeNodeData. You can implement your own subtypes, or see ?TreeNodeData for already implemented ones. The default is EmptyData.

Use force_new_labels=true to force the renaming of all internal nodes. By default the tree will be assigned a label by calling default_tree_label(). This can be changed using the label argument.

If you have a variable containing a Newick string and want to build a tree from it, use parse_newick_string instead.

Note on labels

The Tree type identifies nodes by their labels. This means that labels have to be unique. For this reason, the following is done when reading a tree:

  • if an internal node does not have a label, a unique one will be created of the form "NODE_i"
  • if a node has a label that was already found before in the tree, a random identifier will be appended to it to make it unique. Note that the identifier is created using randstring(8), unicity is technically not guaranteed.
  • if force_new_labels is used, a unique identifier is appended to node labels
  • if an internal node label is identified as a confidence/bootstrap value, a unique identifier is appended to it so it can serve as a plain node label. Bootstrap values are not parsed. Leaf labels are never treated as bootstrap values. See ?TreeTools.isbootstrap for which labels are identified as confidence values.
source
TreeTools.root!Method
root!(tree; method=:midpoint, topological = false)

Root tree using method. Only implemented method is :midpoint.

Methods

:midpoint

Distance between nodes can be either topological (number of branches) or based on branch length. Does not create a polytomy at the root: if the midpoint is an already existing internal node (not the root), creates a new root node at infinitesimal distance below it.

Midpoint rooting will exit without doing anything if

  • the distance between the two farthest leaves is 0. This happens if all branch lengths are 0.
  • the current root is already the midpoint.

:model

Provide keyword argument model::Tree. Try to root tree like model. If the two trees only differ by rooting, they will have the same topology at the end of this. Else, tree will be rerooted but a warning will be given.

source
TreeTools.root!Method
root!(tree::Tree, node::AbstractString; root_on_leaf, time=0., remove_singletons=true)

Root tree at tree.lnodes[node]. Equivalent to outgroup rooting. If time is non-zero, root above node at height time, inserting a new node.

If remove_singletons, singleton nodes are removed after re-rooting. This is useful to remove the old root, which often ends up being a singleton.

source
TreeTools.rootMethod
root(node::TreeNode; maxdepth=1000)

Return root of the tree to which node belongs.

source
TreeTools.traversalFunction
traversal([f], tree, style=:postorder; internals, leaves, root)
traversal([f], node, style=:postorder; internals, leaves, root)

Iterate through nodes of tree according to style, skipping nodes for which f returns false. style must be in collect(keys(TreeTools.traversal_styles)). For now its just :postorder.

See postorder_traversal for extended docstring.

source
TreeTools.write_newickMethod
write_newick(io::IO, tree::Tree; kwargs...)
write_newick(filename::AbstractString, tree::Tree, mode="w"; kwargs...)

Write Newick string corresponding to tree to io or filename. Prefer write(io, tree) / write(filename, tree) to write to IO or a file, and newick(tree) to obtain the Newick string directly.

source
TreeTools.Generate.YuleCoalescentType
mutable struct YuleCoalescent
n::Int = 2
b::Float64 = 1 # birth rate

Rate of coalescence (n-1)b. The expected height of the tree is (I think) ~log(n)/b

source
TreeTools.Generate.balanced_binary_treeFunction
balanced_binary_tree(n::Integer, time::Union{Missing, Real} = missing)

Return a balanced binary tree of n nodes with all branches of length time. n must be a power of 2.

source
TreeTools.Generate.birth_deathMethod
birth_death(n::Integer, λ::Real, μ::Real; active=false, warn_incomplete=true)

Simulate a birth death process with rates λ (birth) and μ (death). Stop when there are n lineages. Return the a Tree and a boolean indicating completion of the process: it is false if all lineages died before reaching the target n.

Keyword argument active (default: false) controls whether only active lineages (i.e. non dead) count towards completion. If active=false, the number of leaves of output tree is n if the process completed. If active=true, it is larger than n.

source
TreeTools.Generate.genealogyMethod
genealogy(C::Coalescent; coalescence_times = false)

Return a tree sampled from the coalescent C. If coalescence_times, also return the times Tn during which n lineages are present, in the form of a dictionary n => Tn.

Examples

For the Kingman coalescent with population size N and n=100 lineages:

tree = genealogy(KingmanCoalescent(100, N))
source
TreeTools.Generate.ladder_treeFunction
ladder_tree(n[, t=missing])

Return a ladder tree with n leaves with total height t. For 4 leaves A, B, C, D, this would be (A:t,(B:2t/3,(C:t/3,D:t/3)));. The elementary branch length is t/(n-1) if n>1.

source