PottsGraph
The PottsGraph structure represents a Potts model, a statistical model used to describe interactions in a system with multiple states (e.g., amino acids in a protein sequence). It is defined by the following fields:
J: A 4-dimensional array of dimensionsq x q x L x Lrepresenting the coupling parameters between states at different positions.h: A 2-dimensional array of dimensionsq x Lrepresenting the local field parameters for each state at each position.β: The inverse temperature parameter, used during sampling.alphabet: An optionalAlphabetobject to map states to characters (e.g., amino acids), see the BioSequenceMappings.jl package.
q represents the number of states (e.g., 21 for amino acids + gap), and L is the length of the sequence.
Structure and Fields
The PottsGraph is defined as follows:
@kwdef mutable struct PottsGraph{T<:AbstractFloat}
J::Array{T,4}
h::Array{T,2}
β::T = 1.0
alphabet::Union{Nothing, BioAequenceMappings.Alphabet{Char,<:Integer}} = aa_alphabet
endCreating a PottsGraph
You can create a PottsGraph with null or random initialization:
julia> using PottsEvolverjulia> L, q = (10, 21);julia> g_null = PottsGraph(L, q, Float64; init=:null); # null initializationjulia> g_null.J[:, :, 1, 2]21×21 Matrix{Float64}: 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 … 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 … 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ⋮ ⋮ ⋱ ⋮ ⋮ 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 … 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 … 0.0 0.0 0.0 0.0 0.0 0.0 0.0julia> g_rand = PottsGraph(L, q, Float64; init=:rand); # random initializationjulia> size(g_rand)(L = 10, q = 21)julia> g_rand.J[:, :, 1, 2] # coupling matrix between sites 1 and 221×21 Matrix{Float64}: -0.0603865 -0.06204 -0.202254 … -0.0920158 0.0503226 -0.0880537 0.154888 -0.0368273 0.0788336 0.200912 0.0364512 0.0151703 0.059547 -0.112366 0.0233217 -0.0959824 0.0844823 0.0500079 -0.0398984 0.015367 -0.314799 -0.0834199 0.0858495 -0.287309 -0.0437886 -0.0635159 -0.0100393 -0.0519691 … -0.00200401 -0.0711762 0.0219989 0.111479 -0.141196 0.0283503 0.0683627 0.143414 -0.0159089 -0.0261254 0.0441965 -0.184584 0.0123456 -0.118256 -0.0486653 -0.0878733 -0.0406157 -0.117116 -0.0529542 -0.130584 0.0630611 0.163666 ⋮ ⋱ ⋮ -0.0384241 0.0127512 -0.0372773 -0.0366409 -0.0687255 -0.0645968 -0.153036 0.0849313 -0.039283 0.159733 -0.0635798 -0.0799232 -0.0800636 -0.0858292 -0.0274848 -0.171622 0.19772 -0.0999202 … 0.0293537 -0.0537473 -0.190267 0.0126827 0.0906458 0.0712207 -0.00468208 0.205619 0.147886 -0.0482052 0.0669406 0.0906053 -0.0533296 0.0544218 -0.0560534 -0.179928 0.125887 -0.0165422 -0.0362424 -0.00131298 -0.00564719 -0.0366123 0.149738 -0.172874 0.0584519 … 0.121888 -0.00718466
It is sometimes easier to view the couplings J as a square matrix.
julia> J_matrix = PottsEvolver.couplings_as_matrix(g_rand.J);
julia> size(J_matrix) # (L*q x L*q)
(420, 420)
julia> J_matrix ≈ J_matrix' # The coupling matrix should be symetric
true
julia> J_tensor = PottsEvolver.matrix_as_couplings(J_matrix, L, q); # back to tensor
julia> J_tensor ≈ g_rand.J
trueThere is no check that the coupling matrix is symetric, although it should normally be the case for Potts models.
Reading and writing PottsGraph
Reading from a File
The read_graph function reads a PottsGraph from a file. The file should follow the format
J i j a b value # coupling lines
...
h i a value # field lineswhere i and j are sequence positions and a, b are integer states (e.g. representing amino acids).
julia> # Read the graph in the example folder g = read_graph("../../example/toy_potts.dat)ERROR: ParseError: # Error @ 8;;file:///home/runner/work/PottsEvolver.jl/PottsEvolver.jl/docs/REPL#2:46REPL:2:468;; # Read the graph in the example folder g = read_graph("../../example/toy_potts.dat) # └ ── unterminated string literal
Writing to a File
The write function writes the parameters of a PottsGraph to a file.
write("path/to/output.txt", g; sigdigits=5, index_style=1)Basic Functions
Energy of a Sequence
The energy function calculates the statistical energy of a sequence.
julia> seq = AASequence(map(aa_alphabet, collect("ACDEFGHIKLMNPQRSTVWY")))AASequence{Int64}([2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21])julia> E = energy(seq, g_rand)ERROR: ArgumentError: Lengths of sequence and model do not match length(s) == (size(g)).L must hold. Got length(s) => 20 (size(g)).L => 10
Gauge Change
The set_gauge! function allows you to change the gauge of the PottsGraph. Supported gauges include :zero_sum and :lattice_gas.
julia> PottsEvolver.set_gauge!(g_rand, :zero_sum);
julia> all(≈(0; atol=1e-14), sum(g_rand.h, dims=1)) # sum of h values along each sequence position is zero
true
julia> all(≈(0; atol=1e-14), sum(g_rand.J, dims=[1,2])) #same goes for J
trueAdditional Utilities
Profile Model
The profile_model function creates a PottsGraph with only fields that fit the single-site frequencies in a given matrix.
julia> f1 = rand(21, 10); # Example frequency matrix - does not have to be normalized
julia> profile = PottsEvolver.profile_model(f1; pc=1e-2);
julia> all(≈(0; atol=1e-14), profile.J)
true