# Michael's Wiki

##### Tree Decomposition

Directed tree-structured Markov networks can easily be transformed into an efficient product-form distribution, represented by a “cluster tree”. The tree width is the maximum number of variables within a single cluster, minus one. If the cluster tree is a path, then the path width is the tree width when decomposed into a chain graph. $w^* \leq pw^* \leq \dots$
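
As a minimal sketch, the width of a decomposition can be read directly off its cluster scopes. The scopes below are the cliques of the 7-variable example used later in this section (the underlying graph is an assumption):

```python
# Width of a tree decomposition = size of the largest cluster, minus one.
# Cluster scopes are the 7-variable example's cliques (an assumption).
clusters = [{1, 3}, {2, 3}, {3, 4}, {4, 5}, {4, 6}, {4, 7}]
width = max(len(c) for c in clusters) - 1   # largest cluster minus one
print(width)  # 1
```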

#### Directed Trees

Given a directed tree graph G, produce a product-form distribution by taking the product, starting at the root of G. $p(1,\dots,7) = p(3)p(1|3)p(2|3)p(4|3)p(5|4)p(6|4)p(7|4)$
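
A hedged sketch of evaluating this product form: starting at the root, multiply $p(\text{root})$ by $p(\text{child}\mid\text{parent})$ for every edge. The numeric tables below are made-up binary-variable examples, not from the source; only the parent structure matches the factorization above.

```python
# Sketch: evaluating the product-form distribution of a directed tree,
# following p(1..7) = p(3)p(1|3)p(2|3)p(4|3)p(5|4)p(6|4)p(7|4).
# All probability tables are illustrative assumptions.

def joint(assign, root_p, cond_p, parents):
    """Multiply p(root) by p(child | parent) for every non-root node."""
    prob = root_p[assign[3]]                      # variable 3 is the root
    for child, parent in parents.items():
        prob *= cond_p[child][assign[parent]][assign[child]]
    return prob

# parent of each non-root variable, matching the factorization in the text
parents = {1: 3, 2: 3, 4: 3, 5: 4, 6: 4, 7: 4}
root_p = {0: 0.6, 1: 0.4}                         # hypothetical p(3)
# cond_p[child][parent_value][child_value]; uniform here for simplicity
cond_p = {c: {0: {0: 0.5, 1: 0.5}, 1: {0: 0.5, 1: 0.5}} for c in parents}

# a valid product form must sum to 1 over all 2^7 assignments
total = sum(joint({v: (bits >> i) & 1 for i, v in enumerate(range(1, 8))},
                  root_p, cond_p, parents)
            for bits in range(2 ** 7))
print(total)
```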

#### Product Division

Alternately, the product-form distribution can be constructed by multiplying the joint distributions of the cliques together and dividing by the distributions of their intersections. $p(1,\dots,7) = \frac{p(1,3)p(2,3)}{p(3)} \cdot \frac{p(3,4)}{p(3)} \cdot \frac{p(4,5)}{p(4)} \cdot \frac{p(4,6)}{p(4)} \cdot \frac{p(4,7)}{p(4)}$

This method has the benefit that it applies to any graph of cliques arranged in a tree structure.
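
A minimal sketch of this identity on a tiny chain $1 - 2 - 3$ with cliques $\{1,2\}$, $\{2,3\}$ and separator $\{2\}$: build a joint that is Markov on the chain, compute the clique and separator marginals from it, and check that the clique product divided by the separator recovers the joint. All numbers are illustrative assumptions.

```python
# Sketch: clique-product / separator-division form on the chain 1 - 2 - 3.
# p12 and p3_given_2 are hypothetical tables; the identity holds because
# the joint factorizes over the chain.
import itertools

p12 = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.4}   # hypothetical p(1,2)
p3_given_2 = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}      # hypothetical p(3|2)

# the true joint, and the clique/separator marginals computed from it
joint = {(a, b, c): p12[a, b] * p3_given_2[b][c]
         for a, b, c in itertools.product([0, 1], repeat=3)}
p2 = {b: sum(joint[a, b, c] for a in [0, 1] for c in [0, 1]) for b in [0, 1]}
p23 = {(b, c): sum(joint[a, b, c] for a in [0, 1])
       for b, c in itertools.product([0, 1], repeat=2)}

# product of clique marginals divided by the separator marginal
for a, b, c in joint:
    reconstructed = p12[a, b] * p23[b, c] / p2[b]
    assert abs(reconstructed - joint[a, b, c]) < 1e-12
print("clique product form matches the joint")
```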

#### Separation

For a tree-decomposition with an edge between $B_i$ and $B_j$, $sep(i,j) = \text{scope}(B_i) \cap \text{scope}(B_j)$

The separator-width is the size of the largest separator in the tree.

#### Elimination

For a tree-decomposition with an edge between $B_i$ and $B_j$, $elim(i,j) = \text{scope}(B_i) - sep(i,j) = \text{scope}(B_i) - (\text{scope}(B_i) \cap \text{scope}(B_j))$
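
Both definitions are plain set operations on cluster scopes; a minimal sketch (the two scopes below are examples chosen from the cliques of the 7-variable factorization):

```python
# Sketch: sep and elim as set operations on cluster scopes.
def sep(scope_i, scope_j):
    return scope_i & scope_j                       # variables shared across the edge

def elim(scope_i, scope_j):
    return scope_i - sep(scope_i, scope_j)         # variables summed out sending i -> j

B_i, B_j = {3, 4}, {4, 5}                          # example cluster scopes
print(sep(B_i, B_j))    # {4}
print(elim(B_i, B_j))   # {3}
```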

##### Decomposition

So we desire to decompose cyclic graphs into a tree structure. Chordal I-maps are decomposable. The fill-in algorithm can test for chordality and, if the graph is not chordal, triangulate it to make it so.
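
A sketch of the elimination-based fill-in idea: given a vertex ordering, process vertices from last to first and connect all earlier neighbors of each vertex; a graph is chordal iff some ordering adds no fill edges. The 4-cycle example below is an assumption chosen to show a single fill edge being added.

```python
# Sketch of fill-in along a given ordering. Returns the set of fill
# (chord) edges added; an empty result means the ordering witnesses
# chordality.
def fill_in(edges, order):
    adj = {v: set() for v in order}
    for u, v in edges:
        adj[u].add(v); adj[v].add(u)
    pos = {v: i for i, v in enumerate(order)}
    fill = set()
    for v in reversed(order):                      # process in reverse order
        earlier = [u for u in adj[v] if pos[u] < pos[v]]
        for i, a in enumerate(earlier):            # clique-ify earlier neighbors
            for b in earlier[i + 1:]:
                if b not in adj[a]:
                    adj[a].add(b); adj[b].add(a)
                    fill.add(frozenset((a, b)))
    return fill

# a 4-cycle 1-2-3-4-1 is not chordal: one chord gets added
print(fill_in([(1, 2), (2, 3), (3, 4), (4, 1)], [1, 2, 3, 4]))
```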

Cutset conditioning can produce a tree-decomposition from a dependency graph.

#### Minimal Decomposition

A tree decomposition is minimal if, for every pair of neighboring clusters, neither cluster's scope is a subset of the other's.
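
A minimal sketch of this check (the cluster scopes and tree edges below are made-up examples):

```python
# Sketch: a decomposition is minimal iff no edge joins a cluster to a
# neighbor whose scope contains it.
def is_minimal(scopes, tree_edges):
    return all(not (scopes[i] <= scopes[j] or scopes[j] <= scopes[i])
               for i, j in tree_edges)

scopes = {0: {1, 3}, 1: {3, 4}, 2: {3}}          # cluster 2 is subsumed by 1
print(is_minimal(scopes, [(0, 1)]))              # True
print(is_minimal(scopes, [(0, 1), (1, 2)]))      # False: {3} is a subset of {3,4}
```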

##### Join-Tree Clustering

This is a method for constructing a join-tree, a.k.a. “junction tree”. Input: a graphical model $\mathcal{M} = \langle X,D,F,\Pi \rangle$, its scopes $S = S_1,\dots,S_r$, and its primal graph $G=(X,E)$, where $X = \{X_1,\dots,X_n\}$ and $F = \{f_1,\dots,f_r\}$. Output: a join-tree decomposition.

1. Select a variable ordering $d = (X_1,\dots,X_n)$
2. Triangulate the graph in reverse order
3. Create a join-tree of the induced triangulated graph:
    1. Identify all maximal cliques in the chordal graph
    2. Create a tree structure over the cliques: in reverse variable order, add each clique to the tree by connecting it to another node with which it shares the largest subset of variables
4. Place each input function in a cluster containing all variables in its scope

Simpler version:

1. Use the fill-in algorithm if necessary
2. Identify all cliques of G
3. Order the cliques as $C_1,\dots,C_t$ by rank of the highest vertex in the clique
4. Connect each $C_i$ to a predecessor $C_j$ where $(j < i)$ and $C_j$ shares the most vertices with $C_i$
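
The simpler version above can be sketched directly; the cliques below are the running example's, and the implementation (clique ranking by highest vertex, connection to the best-overlapping predecessor) is an illustrative reading of the four steps, assuming the input graph is already chordal:

```python
# Sketch of the "simpler version" of join-tree construction: rank cliques
# by their highest vertex in the ordering, then connect each to an earlier
# clique sharing the most vertices.
def join_tree(cliques, order):
    rank = {v: i for i, v in enumerate(order)}
    cliques = sorted(cliques, key=lambda c: max(rank[v] for v in c))
    edges = []
    for i in range(1, len(cliques)):
        # predecessor j < i maximizing shared vertices with clique i
        j = max(range(i), key=lambda j: len(cliques[i] & cliques[j]))
        edges.append((cliques[j], cliques[i]))
    return cliques, edges

cliques = [{1, 3}, {2, 3}, {3, 4}, {4, 5}, {4, 6}, {4, 7}]
ordered, edges = join_tree(cliques, [1, 2, 3, 4, 5, 6, 7])
for parent, child in edges:
    print(sorted(parent), "-", sorted(child))
```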

Once a join tree is created, “messages” can be passed between clusters by marginalizing over non-shared variables. Each cluster starts with a function of its member variables, but then receives “messages” in the form of functions over subsets of variables shared with adjacent clusters. After receiving all messages, a joint marginal can be computed from the product of all of these functions within the cluster.

##### Cluster-Tree Elimination

A generalization of Bucket-Tree Elimination, Cluster-Tree Elimination (CTE) is a message-passing algorithm for producing a tree with [model explicit](http://en.wikipedia.org/wiki/Explicit_and_implicit_methods) clusters, so that each cluster becomes an independent representation of the variables within its scope.

* Input: a tree decomposition
* While not converged:
    * For every edge $(u,v)$ in the tree:
        * If $u$ has received messages from all adjacent vertices other than $v$:
            * Compute the message $u \to v$: take the product of all messages received by $u$ from neighbors other than $v$ with the local functions of $u$, then sum over the variables in $elim(u,v)$
            * Send the message $u \to v$ to $v$

Convergence is guaranteed, and occurs when no new messages are created. If processing is performed from leaves to root, convergence is guaranteed after two passes.
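
A minimal sketch of one such exchange on the tiny chain decomposition $\{1,2\} - \{2,3\}$ with separator $\{2\}$: each message sums the sender's local function over its eliminated variable, and after both messages are delivered the two cluster beliefs agree on the shared variable. The function tables are illustrative assumptions.

```python
# Sketch of CTE message passing on the chain {1,2} - {2,3}.
# f12 and f23 are hypothetical local functions over binary variables.
f12 = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.4}   # cluster {1,2}
f23 = {(0, 0): 0.7, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.8}   # cluster {2,3}

# message {1,2} -> {2,3}: sum out the eliminated variable 1
m12_to_23 = {b: sum(f12[a, b] for a in [0, 1]) for b in [0, 1]}
# message {2,3} -> {1,2}: sum out the eliminated variable 3
m23_to_12 = {b: sum(f23[b, c] for c in [0, 1]) for b in [0, 1]}

# cluster beliefs = local function times the incoming message
belief12 = {(a, b): f12[a, b] * m23_to_12[b] for a, b in f12}
belief23 = {(b, c): f23[b, c] * m12_to_23[b] for b, c in f23}

# after convergence, both clusters agree on the marginal of shared variable 2
marg2_from_12 = {b: sum(belief12[a, b] for a in [0, 1]) for b in [0, 1]}
marg2_from_23 = {b: sum(belief23[b, c] for c in [0, 1]) for b in [0, 1]}
print(marg2_from_12, marg2_from_23)
```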

Time complexity: $O((r+N) \cdot \text{degree} \cdot k^{w+1})$

Space complexity: $O(N \cdot k^{\text{max separator size}})$

* N = number of vertices in the tree-decomposition
* w = the tree-width of the decomposition
* r = the number of functions in the model
* degree = the maximum degree in the tree-decomposition
* k = maximum domain size of any variable

#### Super Clusters

A tree-decomposition can be transformed to reduce separator-width by combining the nodes separated by the greatest width. Cluster-tree elimination applied to the resulting decomposition will have time complexity $O(m \cdot \text{degree} \cdot k^{r})$ and space complexity $O(n \cdot k^{s})$, where $r$ is the maximal number of nodes in a cluster and $s$ is the separator-width of the resulting decomposition.
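
A sketch of one super-cluster step under these assumptions: find the edge with the widest separator and merge its two clusters, rewiring the absorbed cluster's other edges. The scopes below are made-up examples; repeating the step trades separator width for cluster size.

```python
# Sketch: one super-cluster merge step - combine the two clusters joined
# by the widest separator, shrinking separator-width at the cost of a
# larger cluster.
def widest_edge(scopes, edges):
    return max(edges, key=lambda e: len(scopes[e[0]] & scopes[e[1]]))

def merge(scopes, edges, i, j):
    """Combine cluster j into cluster i, rewiring j's other edges to i."""
    scopes = dict(scopes)
    scopes[i] = scopes[i] | scopes.pop(j)
    new_edges = []
    for a, b in edges:
        if {a, b} == {i, j}:
            continue                              # drop the merged edge itself
        a, b = (i if a == j else a), (i if b == j else b)
        new_edges.append((a, b))
    return scopes, new_edges

scopes = {0: {1, 2, 3}, 1: {2, 3, 4}, 2: {4, 5}}  # example cluster scopes
edges = [(0, 1), (1, 2)]
i, j = widest_edge(scopes, edges)                 # edge (0,1): separator {2,3}
scopes, edges = merge(scopes, edges, i, j)
print(scopes, edges)
```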