
Background

Classical spatial models for discrete data typically use a Markov random field (MRF), which specifies a joint distribution over an undirected graph. However, the MRF likelihood involves an intractable normalizing constant (the partition function), making exact Bayesian inference computationally burdensome.

The Mixture of Directed Graphical Models (MDGM) is an alternative that takes the same undirected graph as input but defines a mixture over compatible directed acyclic graphs (DAGs). Each DAG admits a tractable factorization, so the MDGM avoids the partition function entirely. The joint distribution marginalizes over the DAG space:

$$p(\mathbf{z} \mid \boldsymbol{\xi}) = \sum_D p(D)\, p(\mathbf{z} \mid D, \boldsymbol{\xi})$$
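Concretely, the marginal is a finite mixture: each DAG's tractable likelihood is weighted by its prior mass and the terms are summed. A minimal sketch of that sum in log space (illustrative function names, not the package's API; log-sum-exp keeps the computation stable):

```python
import math

def mixture_loglik(dag_logpriors, dag_logliks):
    """log sum_D p(D) p(z | D, xi), computed via log-sum-exp:
    dag_logpriors[d] = log p(D_d), dag_logliks[d] = log p(z | D_d, xi)."""
    terms = [lp + ll for lp, ll in zip(dag_logpriors, dag_logliks)]
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))
```

In practice the DAG space is far too large to enumerate, which is why the sampler below draws DAGs via MCMC rather than summing them explicitly.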

For full details, see Carter and Calder (2024).

The MDGM prior

Let $G = (V, E)$ be an undirected graph (the “natural undirected graph” or NUG) encoding potential neighbor relationships. The MDGM places a prior over DAGs $D$ that are compatible with $G$: every directed edge $(u, v)$ in $D$ corresponds to an undirected edge $\{u, v\}$ in $G$.

Three DAG constructions are supported:

Spanning trees

A spanning tree $T$ of $G$ is a connected, acyclic subgraph containing all vertices. Edges are directed from child to parent. The posterior sampling uses Wilson’s algorithm with data-dependent edge weights:

$$w(u, v) = \exp\bigl(\psi \cdot \mathbf{1}[z_u = z_v]\bigr)$$

where $\psi > 0$ is the spatial dependence parameter and $z$ is the color assignment. This provides a direct (non-MH) posterior sample of the spanning tree.
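A loop-erased random-walk sketch of this step (Python for concreteness; the adjacency-list layout and function names are illustrative, not the package's internals). The walk steps to neighbors with probability proportional to the edge weights, so edges joining same-colored vertices are favored:

```python
import math
import random

def edge_weight(psi, z_u, z_v):
    """w(u, v) = exp(psi * 1[z_u = z_v]): upweight edges whose
    endpoints share a color, so sampled trees tend to respect the data."""
    return math.exp(psi) if z_u == z_v else 1.0

def wilson_tree(adj, weight, root, rng=random):
    """Sample a spanning tree rooted at `root` by Wilson's algorithm:
    loop-erased random walks with transitions proportional to edge
    weights. Returns a child -> parent map (edges directed to the root)."""
    in_tree = {root}
    parent = {}
    for start in adj:
        if start in in_tree:
            continue
        nxt = {}                      # last exit taken from each vertex
        u = start
        while u not in in_tree:       # walk until the current tree is hit;
            nbrs = adj[u]             # overwriting nxt[u] erases loops
            ws = [weight(u, v) for v in nbrs]
            step = rng.choices(nbrs, weights=ws)[0]
            nxt[u] = step
            u = step
        u = start
        while u not in in_tree:       # retrace the loop-erased path
            parent[u] = nxt[u]
            in_tree.add(u)
            u = nxt[u]
    return parent
```

Usage would wrap the weight as `lambda u, v: edge_weight(psi, z[u], z[v])`; because the walk uses the current $z$, the returned tree is a draw from the tree's conditional posterior rather than a Metropolis proposal.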

Acyclic orientations

An acyclic orientation assigns a direction to every edge in $G$ such that no directed cycle exists. Equivalently, this is defined by a vertex permutation $\sigma$: edge $\{u, v\}$ is directed as $(u, v)$ if $\sigma(u) > \sigma(v)$. The MCMC proposes new permutations and accepts via a Metropolis-Hastings step based on the exact DAG log-likelihood ratio.
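The permutation construction takes only a few lines (illustrative sketch, in Python). Acyclicity is immediate: the rank $\sigma$ strictly decreases along every directed edge, so no directed cycle can close:

```python
def acyclic_orientation(edges, sigma):
    """Direct the undirected edge {u, v} as (u, v) exactly when
    sigma[u] > sigma[v]. Rank strictly decreases along each directed
    edge, so the resulting orientation contains no directed cycle."""
    return [(u, v) if sigma[u] > sigma[v] else (v, u) for u, v in edges]
```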

Rooted DAGs

A rooted DAG is constructed from $G$ by choosing a root vertex and orienting edges via a breadth-first or depth-first traversal. The MCMC updates the root via random walk proposals on $G$.
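One way to realize this construction (a sketch under assumptions, not necessarily the package's exact traversal rule) is to orient every edge from the endpoint discovered earlier in a BFS from the root toward the one discovered later; discovery order is then a topological order of the result, so it is acyclic by construction:

```python
from collections import deque

def rooted_dag(adj, root):
    """Orient each edge of G from its earlier-discovered endpoint (in a
    BFS from `root`) to its later-discovered one. Every directed edge
    increases discovery rank, so the orientation is acyclic."""
    rank = {root: 0}
    queue = deque([root])
    while queue:                      # standard BFS to assign ranks
        u = queue.popleft()
        for v in adj[u]:
            if v not in rank:
                rank[v] = len(rank)
                queue.append(v)
    # Emit each undirected edge once (u < v), oriented by rank.
    return [(u, v) if rank[u] < rank[v] else (v, u)
            for u in adj for v in adj[u] if u < v]
```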

Spatial field model

Given a DAG $D$, each vertex $i$ has a set of parents $\text{pa}_D(i)$. The conditional distribution of the color $z_i$ given its parents is:

$$p(z_i = k \mid z_{\text{pa}(i)}, \psi) \propto \exp\Bigl(\psi \sum_{j \in \text{pa}(i)} \mathbf{1}[z_j = k] + \alpha_k\Bigr)$$

where $\alpha_k$ are marginal log-probabilities (currently fixed at 0 for a uniform marginal).
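Numerically, this conditional is a softmax over agreement counts with the parents. A sketch (illustrative names, in Python; subtracting the max logit before exponentiating avoids overflow for large $\psi$):

```python
import math

def color_probs(parent_colors, psi, K, alpha=None):
    """p(z_i = k | parents) ∝ exp(psi * #{j in pa(i): z_j = k} + alpha_k),
    normalized over k = 0..K-1; alpha defaults to 0 (uniform marginal)."""
    alpha = alpha if alpha is not None else [0.0] * K
    logits = [psi * sum(1 for z_j in parent_colors if z_j == k) + alpha[k]
              for k in range(K)]
    m = max(logits)                     # stabilized softmax
    w = [math.exp(l - m) for l in logits]
    s = sum(w)
    return [x / s for x in w]
```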

Standalone vs. hierarchical models

Standalone model

In the standalone model, the spatial field $z$ is observed directly. The MCMC updates only the DAG structure $D$ and the dependence parameter $\psi$. This is useful when the data are categorical labels on a spatial domain.

Hierarchical model

In the hierarchical model, $z$ is a latent field and observations $y_i$ are generated through an emission distribution:

$$y_{ij} \mid z_i, \theta \sim f(y_{ij} \mid \theta_{z_i})$$

Currently supported emission families:

  • Bernoulli: $y_{ij} \mid z_i = k \sim \text{Bernoulli}(p_k)$, with identifiability constraint $p_0 < p_1 < \cdots$ enforced via truncated Beta posterior sampling.
  • Gaussian: $y_{ij} \mid z_i = k \sim \mathcal{N}(\mu_k, \sigma_k^2)$, with identifiability constraint $\mu_0 < \mu_1 < \cdots$. Independent Normal and Inverse-Gamma conjugate updates for $\mu_k$ and $\sigma_k^2$.
  • Poisson: $y_{ij} \mid z_i = k \sim \text{Poisson}(\lambda_k)$, with identifiability constraint $\lambda_0 < \lambda_1 < \cdots$. Conjugate truncated Gamma updates.

The MCMC additionally updates $z$ (Gibbs scan over vertices) and the emission parameters (conjugate updates). See the Emission Models vignette for worked examples.
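A sketch of one vertex update in that Gibbs scan (Python; the data layout and `emission_loglik(i, k)` are illustrative stand-ins for whatever emission family is in use). One subtlety worth making explicit: because $z_i$ sits in the parent sets of its children, the children's conditionals, normalizers included, enter the full conditional alongside $z_i$'s own term and the emission term:

```python
import math
import random

def _log_node_cond(z, node, parents, psi, K):
    """log p(z[node] = its current value | its parents), with alpha_k = 0."""
    logits = [psi * sum(1 for j in parents[node] if z[j] == k)
              for k in range(K)]
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(l - m) for l in logits))
    return logits[z[node]] - log_norm

def gibbs_update_z(i, z, parents, children, psi, K, emission_loglik,
                   rng=random):
    """One Gibbs draw for the latent color z[i] in the hierarchical model:
    combine z_i's conditional given its parents, the conditionals of
    every child of i (their normalizers depend on z_i), and the emission
    log-likelihood of y_i under each candidate color k."""
    logw = []
    for k in range(K):
        z[i] = k                                # tentatively set z_i = k
        lw = _log_node_cond(z, i, parents, psi, K) + emission_loglik(i, k)
        for c in children[i]:
            lw += _log_node_cond(z, c, parents, psi, K)
        logw.append(lw)
    m = max(logw)
    w = [math.exp(l - m) for l in logw]
    z[i] = rng.choices(range(K), weights=w)[0]
    return z[i]
```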

Prior specification

Dependence parameter

The spatial dependence parameter $\psi > 0$ has a half-Cauchy prior:

$$p(\psi) = \frac{2}{\pi(1 + \psi^2)}, \quad \psi > 0$$

Updates use a Metropolis-Hastings random walk with a normal proposal.
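This update can be sketched as follows (Python; `loglik` abstracts the model log-likelihood as a function of $\psi$, and the step size is an illustrative choice). Because the normal proposal is symmetric, the acceptance ratio needs only the half-Cauchy prior and the likelihood, and proposals outside $\psi > 0$ are rejected outright since the prior density is zero there:

```python
import math
import random

def update_psi(psi, loglik, step=0.3, rng=random):
    """One MH random-walk update for psi > 0 under a half-Cauchy(0, 1)
    prior: propose psi' ~ N(psi, step^2), accept with probability
    min(1, posterior(psi') / posterior(psi))."""
    prop = psi + rng.gauss(0.0, step)
    if prop <= 0:                       # off the prior's support: reject
        return psi
    def log_post(p):
        return loglik(p) + math.log(2.0 / (math.pi * (1.0 + p * p)))
    if math.log(rng.random()) < log_post(prop) - log_post(psi):
        return prop
    return psi
```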

Emission parameters

  • Bernoulli: Each $p_k$ has a $\text{Beta}(a, b)$ prior. `emission_prior_params = c(a, b)`.
  • Gaussian: $\mu_k \sim \mathcal{N}(\mu_0, \sigma^2_0)$ and $\sigma_k^2 \sim \text{InverseGamma}(\alpha_0, \beta_0)$, independently. `emission_prior_params = c(mu_0, sigma2_0, alpha_0, beta_0)`.
  • Poisson: Each $\lambda_k$ has a $\text{Gamma}(\alpha_0, \beta_0)$ prior (rate parameterization). `emission_prior_params = c(alpha_0, beta_0)`.

All conjugate posteriors use truncated sampling to enforce parameter ordering for identifiability.
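Truncated sampling of this kind is typically done by inverse-CDF sampling on the restricted interval. A sketch for the Gaussian means (Python standard library only; in the actual update the location and scale would come from the usual Normal conjugate formulas, and the interval $(\mu_{k-1}, \mu_{k+1})$ is what enforces the ordering):

```python
import random
from statistics import NormalDist

def trunc_normal(m, s, lo, hi, rng=random):
    """Inverse-CDF draw from N(m, s^2) restricted to (lo, hi): map a
    uniform variate into the CDF mass between the truncation bounds,
    then invert. Monotonicity of the CDF keeps the draw inside (lo, hi)."""
    d = NormalDist(m, s)
    u_lo, u_hi = d.cdf(lo), d.cdf(hi)
    return d.inv_cdf(u_lo + rng.random() * (u_hi - u_lo))
```

The Beta and Gamma cases work identically with their own CDF/quantile pairs, which is why a single ordering constraint suffices across all three emission families.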

MCMC algorithm

Each iteration of the MCMC sampler performs:

  1. Update DAG $D$ — Sample a new spanning tree (direct posterior sample via Wilson’s) or propose a new acyclic orientation/root (MH step).
  2. Update $z$ (hierarchical only) — Gibbs scan over vertices in random order. The full conditional combines the spatial prior with the emission likelihood.
  3. Update $\psi$ — Metropolis-Hastings with normal random walk proposal.
  4. Update $\theta$ (hierarchical only) — Conjugate posterior sampling with identifiability constraints.