PottsGraph

The PottsGraph structure represents a Potts model, a statistical model used to describe interactions in a system with multiple states (e.g., amino acids in a protein sequence). It is defined by the following fields:

J: A 4-dimensional array of dimensions q x q x L x L representing the coupling parameters between states at different positions.
h: A 2-dimensional array of dimensions q x L representing the local field parameters for each state at each position.
β: The inverse temperature parameter, used during sampling.
alphabet: An optional Alphabet object to map states to characters (e.g., amino acids), see the BioSequenceMappings.jl package.

q represents the number of states (e.g., 21 for amino acids + gap), and L is the length of the sequence.

Structure and Fields

The PottsGraph is defined as follows:

@kwdef mutable struct PottsGraph{T<:AbstractFloat}
    J::Array{T,4}
    h::Array{T,2}
    β::T = 1.0
    alphabet::Union{Nothing, BioAequenceMappings.Alphabet{Char,<:Integer}} = aa_alphabet
end

Creating a PottsGraph

You can create a PottsGraph with null or random initialization:

julia> using PottsEvolver
julia> L, q = (10, 21);
julia> g_null = PottsGraph(L, q, Float64; init=:null); # null initialization
julia> g_null.J[:, :, 1, 2]21×21 Matrix{Float64}:
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  …  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  …  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 ⋮                        ⋮              ⋱       ⋮                        ⋮
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  …  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  …  0.0  0.0  0.0  0.0  0.0  0.0  0.0
julia> g_rand = PottsGraph(L, q, Float64; init=:rand); # random initialization
julia> size(g_rand)(L = 10, q = 21)
julia> g_rand.J[:, :, 1, 2] # coupling matrix between sites 1 and 221×21 Matrix{Float64}:
 -0.0603865  -0.06204    -0.202254    …  -0.0920158    0.0503226
 -0.0880537   0.154888   -0.0368273       0.0788336    0.200912
  0.0364512   0.0151703   0.059547       -0.112366     0.0233217
 -0.0959824   0.0844823   0.0500079      -0.0398984    0.015367
 -0.314799   -0.0834199   0.0858495      -0.287309    -0.0437886
 -0.0635159  -0.0100393  -0.0519691   …  -0.00200401  -0.0711762
  0.0219989   0.111479   -0.141196        0.0283503    0.0683627
  0.143414   -0.0159089  -0.0261254       0.0441965   -0.184584
  0.0123456  -0.118256   -0.0486653      -0.0878733   -0.0406157
 -0.117116   -0.0529542  -0.130584        0.0630611    0.163666
  ⋮                                   ⋱                ⋮
 -0.0384241   0.0127512  -0.0372773      -0.0366409   -0.0687255
 -0.0645968  -0.153036    0.0849313      -0.039283     0.159733
 -0.0635798  -0.0799232  -0.0800636      -0.0858292   -0.0274848
 -0.171622    0.19772    -0.0999202   …   0.0293537   -0.0537473
 -0.190267    0.0126827   0.0906458       0.0712207   -0.00468208
  0.205619    0.147886   -0.0482052       0.0669406    0.0906053
 -0.0533296   0.0544218  -0.0560534      -0.179928     0.125887
 -0.0165422  -0.0362424  -0.00131298     -0.00564719  -0.0366123
  0.149738   -0.172874    0.0584519   …   0.121888    -0.00718466

It is sometimes easier to view the couplings J as a square matrix.

julia> J_matrix = PottsEvolver.couplings_as_matrix(g_rand.J);

julia> size(J_matrix) # (L*q x L*q)
(420, 420)

julia> J_matrix ≈ J_matrix' # The coupling matrix should be symetric
true

julia> J_tensor = PottsEvolver.matrix_as_couplings(J_matrix, L, q); # back to tensor

julia> J_tensor ≈ g_rand.J
true

Note

There is no check that the coupling matrix is symetric, although it should normally be the case for Potts models.

Reading and writing `PottsGraph`

Reading from a File

The read_graph function reads a PottsGraph from a file. The file should follow the format

J i j a b value # coupling lines
...
h i a value # field lines

where i and j are sequence positions and a, b are integer states (e.g. representing amino acids).

julia> # Read the graph in the example folder
       g = read_graph("../../example/toy_potts.dat)ERROR: ParseError:
# Error @ 8;;file:///home/runner/work/PottsEvolver.jl/PottsEvolver.jl/docs/REPL#2:46REPL:2:468;;
# Read the graph in the example folder
g = read_graph("../../example/toy_potts.dat)
#                                           
└ ── unterminated string literal

Writing to a File

The write function writes the parameters of a PottsGraph to a file.

write("path/to/output.txt", g; sigdigits=5, index_style=1)

Basic Functions

Energy of a Sequence

The energy function calculates the statistical energy of a sequence.

julia> seq = AASequence(map(aa_alphabet, collect("ACDEFGHIKLMNPQRSTVWY")))AASequence{Int64}([2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21])
julia> E = energy(seq, g_rand)ERROR: ArgumentError: Lengths of sequence and model do not match
length(s) == (size(g)).L must hold. Got
length(s) => 20
(size(g)).L => 10

Gauge Change

The set_gauge! function allows you to change the gauge of the PottsGraph. Supported gauges include :zero_sum and :lattice_gas.

julia> PottsEvolver.set_gauge!(g_rand, :zero_sum);

julia> all(≈(0; atol=1e-14), sum(g_rand.h, dims=1)) # sum of h values along each sequence position is zero
true

julia> all(≈(0; atol=1e-14), sum(g_rand.J, dims=[1,2])) #same goes for J
true

Additional Utilities

Profile Model

The profile_model function creates a PottsGraph with only fields that fit the single-site frequencies in a given matrix.

julia> f1 = rand(21, 10);  # Example frequency matrix - does not have to be normalized

julia> profile = PottsEvolver.profile_model(f1; pc=1e-2);

julia> all(≈(0; atol=1e-14), profile.J)
true