MCMCMS 0.3.4
User Guide

Nicos Angelopoulos                 James Cussens
Department of Computer Science
University of York, York, UK
{nicos,jc}@cs.york.ac.uk

Abstract

We describe a collection of Prolog programs which implements a generic method for applying MCMC over model structures defined by Stochastic Logic Programs (SLPs). In broad terms, an SLP is used to constructively define all possible statistical models that may explain some input data and also to define a prior distribution over these models. MCMC is then run over these possible models by making moves on the tree defining all computation paths of the SLP. The methodology is generic in two ways. Firstly, in being applicable to any statistical model space that can be expressed as an SLP and secondly, in allowing for a variety of alternative proposal moves. For the former, the sole prerequisite is that we have the means of computing the likelihood of each generated model: that is, for computing the probability of data given the model.

Contents

1  Introduction
    1.1  Acknowledgements
2  Tutorial
    2.1  Simple concept learning
    2.2  A different prior
    2.3  Changing the likelihood
    2.4  A mixture prior
    2.5  Using continuous distributions
3  Syntax
4  Files and directories
    4.1  Files
    4.2  Directories
    4.3  C Code recompilation
    4.4  Development and bug reports
5  Running experiments
    5.1  Backtracking proposals
6  Model spaces
    6.1  BNs
        6.1.1  Displaying BNs
    6.2  RPDAGs
    6.3  C&RTs
        6.3.1  BCW and K
        6.3.2  PIMA
    6.4  Pedigrees
7  New model spaces
    7.1  Likelihood
    7.2  Displaying models
    7.3  Writing SLPs
8  Other features
    8.1  Sampling
    8.2  Global Priors Ratio
    8.3  Tempering
A  Options for run/1
B  Arguments for run_data/11
C  Failure Model (fm) universe vs. Real Model (rm) universe
        C.0.1  FM- sampling
        C.0.2  FM- MCMC
        C.0.3  RM- sampling
        C.0.4  RM- MCMC
    C.1  SLP Program
    C.2  Transformed Program

1  Introduction

MCMCMS (Markov chain Monte Carlo over Model Structures) is a collection of programs which implement the ideas introduced in [ CussensCussens2000], and further analysed in [ Angelopoulos  CussensAngelopoulos  Cussens2001]. The programs were initially developed under the EPSRC research grant Induction of Stochastic Logic Programs, November 2000-October 2001. Currently the development is supported by EPSRC's MATHfit programme, under the grant Stochastic Logic Programs for MCMC, September 2003-August 2005. The software can be downloaded from http://www.cs.york.ac.uk/~nicos/sware/slps/mcmcms/ while most of the papers can be downloaded from http://www.cs.york.ac.uk/~jc.
To date we have run experiments over Bayesian networks (BNs), Restricted Acyclic Partially Directed Graphs (RPDAGs), pedigrees and Classification and Regression Trees (C&RTs). The methodology is quite generic and can be applied to any model space, provided an SLP with the desired prior is constructed, and a method for computing the likelihood of constructed models is incorporated.
The remainder of this guide is structured as follows: Section 2 is a tutorial tour of the software. Section 4 describes the basic directory structure and the main files. Section 5 shows how to run MCMC experiments over SLPs. Section 6 gives information on the model spaces for which MCMCMS has been used so far. Section 7 details how the programs can work over new spaces. Finally Section 8 briefly reviews some features of the system which are still to be documented fully.

1.1  Acknowledgements

The development of this software has been supported by two EPSRC grants: Induction of Stochastic Logic Programs, November 2000-October 2001 and Stochastic Logic Programs for MCMC, September 2003-August 2005.

2  Tutorial

This section aims to show how to use MCMCMS to perform Bayesian inference using very simple examples: the goal here is entirely pedagogic and so unrealistically simple examples are used. Real examples of using MCMCMS can be found in Section 6. All the files needed to run the examples described in this tutorial can be found in the tutorial directory of the MCMCMS distribution. The output files which running these examples should give you can be found in the file tutorial_out.tgz which is available from the MCMCMS webpage: http://www.cs.york.ac.uk/~nicos/sware/slps/mcmcms/.

2.1  Simple concept learning

Our first example is the `concept learning' problem illustrated in Fig 1. In concept learning the assumption is that there is an unknown target (or `true') concept which is simply a subset of some data space. Data consists of positive examples (examples in the target concept) and negative examples (examples outside the target concept). There is also a hypothesis space: sets which the learning system can hypothesise to be the true concept.
Consider a very simple concept learning scenario where the hypothesis space contains a mere five hypotheses, each of which is a rectangle (Fig 1). The data consists of two negative examples (black dots in Fig 1) and a single positive example (the white dot). Clearly, only the b and c rectangles are consistent with the data.
exs.png
Figure 1: Concept learning with five hypotheses and three examples
In MCMCMS the user is required to define a prior over the hypothesis space using an SLP. Fig 2 shows an SLP defining one possible prior over our tiny hypothesis space. This SLP can be found in tutorial/slps/tabular.slp. (This filename, and all others in this section are given relative to the top level of MCMCMS once you have unpacked it.)
0.1 :: hyp(rectangle(a,1,2,3,4)).
0.1 :: hyp(rectangle(b,1,5,9,6)).
0.3 :: hyp(rectangle(c,5,6,6,7)).
0.3 :: hyp(rectangle(d,2,1,8,3)).
0.2 :: hyp(rectangle(e,4,2,7,9)).

Figure 2: SLP defining a prior over a hypothesis space of five rectangles (in the file tutorial/slps/tabular.slp).
Each rectangle is represented by a term in first-order logic which gives the name of the rectangle and the co-ordinates of the bottom-left and top-right corners respectively. Hypotheses can be represented as the user sees fit: the important thing, as we shall see below, is to choose a representation that makes it easy to compute likelihoods.
A full explanation of how SLPs define probability distributions can be found, for example, in [ CussensCussens2001]. In this simple example, each probability label in Fig 2 gives a prior probability for the corresponding rectangle. The choice of predicate symbol hyp/1 is arbitrary.
Similarly, the choice of data representation is a matter of convenience, although using anything other than Prolog might involve a lot of extra work. Fig 3 shows how we have chosen to represent the three data points of Fig 1: we just give co-ordinates and the class for each datapoint. This data can be found in tutorial/data/exs.pl
datum((5.5,6.5),pos).
datum((5.5,2.5),neg).
datum((6.5,8.5),neg).

Figure 3: Data for concept learning (in the file tutorial/data/exs.pl).
Now that prior and data have been determined we need to consider the likelihood function. To see what sort of likelihood function needs to be defined, consider the form of Bayes theorem relevant to concept learning. Each datapoint Ei is of the form (Xi,Ci), where Xi is the datapoint's position and Ci is its class. A hypothesis H is only required to predict an example's class given its position. It need not make any predictions concerning positions. (In statistical parlance, the co-ordinates giving the position of the example are called either explanatory variables, predictor variables or covariates.) Let X and C represent the vectors of positions and classes of examples, respectively. In the current case this means that X=((5.5,6.5),(5.5,2.5),(6.5,8.5)) and C=(pos,neg,neg). The posterior P(H|X,C) for any hypothesis H can then be calculated as follows:
P(H|X,C) = P(H|X)P(C|H,X)

P(C|X)
(1)
We will assume that each hypothesis attaches a class label to an example independently of other examples we have:
P(C|H,X) =
&Pi
i 
P(Ci|H,Xi)
(2)
The LHS of (2) is the likelihood function for this scenario. Since each hypothesis H classifies examples categorically, it is easy to see that each term on the RHS of (2) is either 1, if H classifies (Xi,Ci) correctly, and 0 otherwise. It follows that the overall likelihood P(C|H,X) is 1 if H is consistent with all datapoints and 0 otherwise. Note that the prior P(H|X) is only prior to the classes of the examples, their positions can influence the prior. In the current scenario, as Fig 2 shows, this potential influence is not used. See the work on C&RT described in Section 6.3 for a case where the values of predictor variables in the data do affect the prior.
Code for computing likelihoods (in fact, log-likelihoods) is shown in Fig 4. The first clause of predicate model_llhood/2 declares that a Model (i.e. a hypothesis) has log-likelihood -&infin (i.e. likelihood 0) if there exists an example (X,C) which is inconsistent with the model. If there are no inconsistent examples, then a log-likelihood of 0 (i.e. a likelihood of 1) is returned by the second clause. Note the use of a cut (!) to ensure that model_llhood/2 defines a function. This code can be found in the file tutorial/simple_concept_learning.pl.
model_llhood(Model,-inf) :-          % Model has zero likelihood if ..
        datum(X,C),                  % .. there exists a datapoint ..
        inconsistent(C,X,Model),     % .. which is inconsistent with Model
        !.
model_llhood(_Model,0).              % hypothesis consistent with the data

inconsistent(pos,(XX,XY),rectangle(_Name,X1,Y1,X2,Y2)) :-
        (XX<X1 ; XX>X2 ; XY<Y1 ; XY>Y2).       % pos outside the rectangle
inconsistent(neg,(XX,XY),rectangle(_Name,X1,Y1,X2,Y2)) :-
        XX>=X1 , XX=<X2 , XY>=Y1 , XY=<Y2.     % neg inside the rectangle

Figure 4: Computing the log-likelihood for rectangular hypotheses (in the file tutorial/simple_concept_learning.pl).
The likelihood is the place where the representations of the hypothesis space and the data meet. What is essential is that the definition of model_llhood/2 is consistent with whatever hypothesis and data representations are chosen.
Since MCMCMS uses the Metropolis-Hastings algorithm it does not require the log-likelihoods directly. All that is needed is that, given any two hypotheses M* and Mi, the likelihood ratio is computed. In our current example this ratio is simply P(C|M*,X)/P(C|Mi,X). In some cases it is possible to compute this ratio without going to the trouble of computing the individual likelihoods (or log-likelihoods). However, in our current case there is no such cleverness, we compute the ratio by computing log-likelihoods and then computing the likelihood ratio. The code for doing this is in Fig 5 and can be found in tutorial/simple_concept_learning.pl. Note that LLi is instantiated at the time lhood_canonical/5 is called, so lhood_canonical/5 only needs to compute LLst, the log-likelihood of MSt, to get the likelihood ratio.
lhood_canonical(Mst,_Mi,LLst,LLi,Ratio) :-
        model_llhood(Mst,LLst),  % compute log-likelihood of Mst
        Ratio is exp(LLst - LLi).

Figure 5: Code for computing likelihood ratios (in the file tutorial/simple_concept_learning.pl)
To run MCMCMS, both model_llhood/2 and lhood_canonical/5 must be defined by the user. However, if lhood_canonical/5 does not use model_llhood/2 then the latter can be given a `dummy' definition. See Section 7.1 for further details.
Now that prior, data and likelihood have been defined we can run MCMCMS to produce a sample from the posterior. To do this we start up Prolog (either Sicstus or Yap) and load the file tutorial/simple_concept_learning.pl which is displayed in Fig 6.
:- ensure_loaded( '../run.pl' ).

run_test :- run([runs_file('runs/concept_learning_run'),
             results_dir('out/simple_concept_learning_out')]).

exs.   %do nothing with the data, except load it from data/exs.pl

lhood_canonical(Mst,_Mi,LLst,LLi,Ratio) :-
        model_llhood(Mst,LLst),  % compute log-likelihood of Mst
        Ratio is exp(LLst - LLi).


model_llhood(Model,-inf) :-          % Model has zero likelihood if ..
        datum(X,C),                  % .. there exists a datapoint ..
        inconsistent(C,X,Model),     % .. which is inconsistent with Model
        !.
model_llhood(_Model,0).              % hypothesis consistent with the data

inconsistent(pos,(XX,XY),rectangle(_Name,X1,Y1,X2,Y2)) :-
        (XX<X1 ; XX>X2 ; XY<Y1 ; XY>Y2).       % pos outside the rectangle
inconsistent(neg,(XX,XY),rectangle(_Name,X1,Y1,X2,Y2)) :-
        XX>=X1 , XX=<X2 , XY>=Y1 , XY=<Y2.     % neg inside the rectangle

Figure 6: Main file (tutorial/simple_concept_learning.pl) for simple concept learning.
Most of the code in tutorial/simple_concept_learning.pl concerns the likelihood and has already been discussed. As for the rest, the first line loads in the main MCMCMS file ../run.pl. The predicate run_test/0 is there just to abbreviate the given call to the run/1 predicate. run/1 is a MCMCMS predicate (i.e. the user does not define it). The user calls run/1 with a list of options to perform a given MCMC run. These options are described in detail in Appendix A. In this case we just supply a runs_file and specify a directory for output.
A runs_file should be a Prolog file which defines a predicate run_data/11which sets parameters for a number of runs of the system. Fig 7 shows the run_data/11code relevant to our current example. This code can be found in tutorial/runs/concept_learning_run.pl. Detailed documentation for run_data/11can be found in Appendix B. The code given in Fig 7 says that the prior is defined in the file tutorial/slps/tabular.slp, where the slps directory is relative to the `main' file for running MCMCMS, that the predicate hyp/1 defines the prior, that data is in data/exs.pl (again data is relative to the main file) and that the run should be for 100000 iterations.
run_data(
         1,     % id
         uc,    % backtrack strategy
         tabular,       % SLP for prior in slps/tabular.slp
         rm,    % real models, space
         exs,   % (i) 'exs.' is called and (ii) data/exs.pl is loaded
         hyp/[],% hyp(X) is the call to generate a hypothesis X
         tab,   % 'tab' added to output filenames
         100000,% length of Markov chain
			[],    % hot chains identifiers- here none
         1,     % selects first random seed from ../auxil/seeds.pl
         all    % do all post-processing operations, if any
        ).

Figure 7: run_data/11fact for concept learning (found in the file tutorial/runs/concept_learning_run.pl).
The rm parameter instructs that only `real' models should be considered. Whenever a failure is encountered, Prolog's backtracking is used to find the nearest model. Alternatively to this space, when fm is used, failures are reported at the top level of the MCMC and the algorithm registers a non-jump iteration. For an example illustrating the differences between rm and fm see [ Angelopoulos  CussensAngelopoulos  Cussens2004,p.5].
The exs parameter plays a dual role. Firstly, this parameter determines the name of a file to load. Unless this file name is null, it is an error if the file does not exist or is not readable. This file will generally contain the data (it does in our current example), although this is not mandatory. As long as lhood_canonical/5 is defined to give the correct likelihood ratios, MCMCMS does not mind how the data is found. Secondly, the goal exs is called. This is a `hook' which can be used to do e.g. data pre-processing. As can be seen from Fig 6, in this case exs is defined to do nothing. If exs/0 were not defined the program would still run, but a warning would be generated.
Once tutorial/simple_concept_learning.pl has been compiled, we kick off the run by simply entering run_test. at the Prolog prompt. Once the call to run_test/0 has completed the data from the run can be found in tutorial/out/simple_concept_learning_out. Recall that out/simple_concept_learning_out was the subdirectory specified in the results_dir option given to run/1. Fig 8 shows what the MCMCMS run looks like. As Fig 8 indicates, the run/0 call actually performs two MCMC runs since there are two run_data/9 facts in tutorial/runs/concept_learning_run.pl. The second experiment will be discussed in Section 2.2.
pc275 ~/research/slps/mcmcms-0_1_1/tutorial sicstus
SICStus 3.11.1 (x86-linux-glibc2.3): Fri Feb 20 18:38:25 CET 2004
Licensed to cs.york.ac.uk
| ?- compile(simple_concept_learning).
% compiling /u5/jc/research/slps/mcmcms-0_1_1/tutorial/simple_concept_learning.pl...
%  compiling /u5/jc/research/slps/mcmcms-0_1_1/run.pl...
%   loading /usr/local/pkg/sicstus-3.11.1/lib/sicstus-3.11.1/library/system.po...
%   module system imported into user
%    loading foreign resource /usr/local/pkg/sicstus-3.11.1/lib/sicstus-3.11.1/library/x86-linux-glibc2.3/system.so in module system
%   loaded /usr/local/pkg/sicstus-3.11.1/lib/sicstus-3.11.1/library/system.po in module system, 0 msec 27736 bytes
%   loading /usr/local/pkg/sicstus-3.11.1/lib/sicstus-3.11.1/library/lists.po...%   module lists imported into user
%   loaded /usr/local/pkg/sicstus-3.11.1/lib/sicstus-3.11.1/library/lists.po in
module lists, 0 msec 13776 bytes
%   compiling /u5/jc/research/slps/mcmcms-0_1_1/src/init_lib.pl...
%    compiling /u5/jc/research/slps/mcmcms-0_1_1/src/lib/pl.pl...
%    compiled /u5/jc/research/slps/mcmcms-0_1_1/src/lib/pl.pl in module user, 10 msec 4612 bytes
%    compiling /u5/jc/research/slps/mcmcms-0_1_1/src/lib/sics39_pncms.pl...
yes
| ?- run_test.
 
running_experiment(1)
Successful extensions: ; end of extensions.
ok('gzip -f out/simple_concept_learning_out/tr_uc_rm_exs_tab_i100K__s1')
 
running_experiment(2)
Successful extensions: ; end of extensions.
ok('gzip -f out/simple_concept_learning_out/tr_uc_rm_exs_r2304_i100K__s1')
 
ok('cp /u5/jc/research/slps/mcmcms-0_1_1/tutorial/runs/concept_learning_run.pl /u5/jc/research/slps/mcmcms-0_1_1/tutorial/out/simple_concept_learning_out')
yes
| ?-

Figure 8: Running MCMCMS. sicstus, compile(simple_concept_learning). and run_test. are user input.
The models (in this case rectangles) visited during the MCMC run are saved in the file tr_uc_rm_exs_tab_i100K__s1.gz. Note that the filename is constructed from the various options defined using run_data/11. This file is simply a (gzipped) text-file trace of the models visited and so is 200001 lines long. (There is one extra initial line to record the initial model.) In the trace file lines starting with a `b' indicate proposed models and those with a `c' indicate the current model.
Fig 9 shows the start of the trace file which we produced for this MCMCMS run. Rectangle d is the initial model (it is chosen using the prior). Rectangle c is then proposed and as the third line shows it is accepted and thus becomes the current model. Rectangle a is then proposed but the proposal is not accepted, so rectangle c remains the current model. In fact, inspection of the trace shows that the chain remains stuck at rectangle c until the 22nd iteration.
c(rectangle(d,2,1,8,3)).
b(rectangle(c,5,6,6,7)).
c(rectangle(c,5,6,6,7)).
b(rectangle(c,5,6,6,7)).
c(rectangle(c,5,6,6,7)).
b(rectangle(a,1,2,3,4)).
c(rectangle(c,5,6,6,7)).

Figure 9: The start of a trace produced by an MCMCMS run (found in the file tutorial/out/simple_concept_learning_out/tr_uc_rm_exs_tab_i100K__s1.gz).
In this current tiny example it is trivial to pull out a simple posterior from the trace. Fig 10 shows how to pull out all the counts in the trace file. The last three lines of Fig 10 show the models actually visited (as opposed to merely proposed). The posterior probabilities estimated from the trace for rectangles b and c are 0.248 and 0.752, respectively, close to the correct values of 0.25 and 0.75. Note that d gets an estimated posterior of 1/200001 = 5 ×10-5. By chance it happened to be the initial model. A small burn-in (throwing away an early part of the trace) would be enough to delete it entirely. The first five `b' lines of Fig 10 show that models (rectangles) are being proposed according to the prior distribution.
zcat tr_uc_rm_exs_tab_i100K__s1.gz | sort | uniq -c
   9934 b(rectangle(a,1,2,3,4)).
   9924 b(rectangle(b,1,5,7,8)).
  29943 b(rectangle(c,5,6,6,7)).
  30091 b(rectangle(d,2,1,8,3)).
  20108 b(rectangle(e,4,2,7,9)).
  24671 c(rectangle(b,1,5,7,8)).
  75329 c(rectangle(c,5,6,6,7)).
      1 c(rectangle(d,2,1,8,3)).

Figure 10: Extracting counts from the MCMC trace file

2.2  A different prior

In the previous section there were only six hypotheses, so the use of MCMC was entirely unnecessary, except for tutorial purposes. In this section we start to move towards more realistic examples. Fig 11 shows an alternative prior over a hypothesis space of 8 ×8 ×6 ×6 = 2304 rectangles. Rectangles are sampled from this prior by first choosing the X and Y values for the bottom left corner of the rectangle, and then choosing the horizontal and vertical lengths of the rectangle. The definitions of x_dist/1 and y_dist/1 give a bias towards smaller rectangles. The definitions of bottom_left_x/1 and bottom_right_x/1 mean that (2,3) and (6,7) are a priori the most probable locations for the bottom left-hand corner. Note that now all rectangles get the same name: dummy. We keep this dummy argument in our representation of hypotheses to save us the bother of re-writing the likelihood functions. The code displayed in Fig 11 can be found in tutorial/slps/rects2304.slp.
hyp(rectangle(dummy,X1,Y1,X2,Y2)) :-
        bottom_left_x(X1),
        bottom_left_y(Y1),
        x_dist(XD),
        y_dist(YD),
        X2 is X1 + XD,
        Y2 is Y1 + YD.


0.1 :: bottom_left_x(1).
0.2 :: bottom_left_x(2).
0.1 :: bottom_left_x(3).
0.1 :: bottom_left_x(4).
0.1 :: bottom_left_x(5).
0.2 :: bottom_left_x(6).
0.1 :: bottom_left_x(7).
0.1 :: bottom_left_x(8).

bottom_left_y(Y) :-
        bottom_left_x(X),
        Y is X + 1.

0.3 :: x_dist(1).
0.2 :: x_dist(2).
0.2 :: x_dist(3).
0.1 :: x_dist(4).
0.1 :: x_dist(5).
0.1 :: x_dist(6).

y_dist(D) :-
        x_dist(XD),
        D is XD + 1.

Figure 11: An SLP for a simple prior over 2304 rectangles (found in the file tutorial/slps/rects2304.slp).
To use this new prior it suffices to have an appropriate run_data/11fact in the runs file. Fig 12 shows the relevant fact which can be found in tutorial/runs/concept_learning_run.pl. This is identical to the run_data/11fact shown in Fig 7, except that it points to a different file for locating the SLP prior (and uses a different id and output filename string).
run_data(
         2,     % id
         uc,    % backtrack strategy
         rects2304,     % SLP for prior in slps/rects2034.slp
         rm,    % real models, space
         exs,   % (i) 'exs.' is called and (ii) data/exs.pl is loaded
         hyp/[],% hyp(X) is the call to generate a hypothesis X
         r2304, % 'r2304' added to output filenames
         100000,% length of Markov chain
			[],    % hot chains identifiers- here none
         1,     % selects first random seed from ../auxil/seeds.pl
         all    % do all post-processing operations, if any
        ).


Figure 12: Run information for using a more complex prior (found in the file tutorial/runs/concept_learning_run.pl).
Doing:
zcat tr_uc_rm_exs_r2304_i100K__s1.gz - grep '^c' - sort - uniq -c - sort -rn
where tr_uc_rm_exs_r2304_i100K__s1.gz is the output file, gives us the frequency with which each rectangle was visited. Fig 13 shows that the most frequently visited rectangles (and hence those with highest estimated posterior probability) are as expected. They are consistent with the data and had high prior probability. However, the least visited rectangle rectangle(dummy,6,4,8,7) is not consistent with the data so it might seem odd that it was visited at all-after all it was visited 29 times which leads to an estimated posterior probability of 0.00029 rather than the correct value of 0. What has happened is that, by chance, rectangle(dummy,6,4,8,7) is the initial model. For the first 28 iterations another inconsistent model was proposed, and the proposal is rejected. On the 29th iteration rectangle(dummy,3,4,9,8), a consistent rectangle, is eventually proposed. From that point on the chain visits only consistent models. Again, this shows the importance of discarding (for real applications) an initial `burn-in' part of the sample.
   1715 c(rectangle(dummy,4,3,6,7)).
   1606 c(rectangle(dummy,5,3,6,7)).
   1288 c(rectangle(dummy,5,6,6,8)).
   1259 c(rectangle(dummy,5,3,7,7)).
   1178 c(rectangle(dummy,4,6,6,8)).
   1169 c(rectangle(dummy,2,6,8,8)).
   1073 c(rectangle(dummy,2,3,8,7)).
   1070 c(rectangle(dummy,5,5,6,7)).
   1067 c(rectangle(dummy,2,3,6,7)).
   1056 c(rectangle(dummy,4,3,7,7)).
...
     82 c(rectangle(dummy,1,5,6,10)).
     49 c(rectangle(dummy,1,6,6,12)).
      4 c(rectangle(dummy,6,4,8,7)).

Figure 13: Highest and lowest counts using the more complex prior

2.3  Changing the likelihood

A slightly more realistic version of concept learning is to assume that each concept (in our case, each rectangle) probabilistically (or `noisily') classifies examples as positive or negative. This means that a different likelihood function must be used. For example, the likelihood function given in Fig 14 sets P(Ci|H,Xi) = 0.8 if Xi in inside H and Ci = pos or if Xi in outside H and Ci = neg. Otherwise P(Ci|H,Xi) = 0.2. The code displayed in Fig 14 can be found in tutorial/noisy_concept_learning.pl. As before, we assume that the examples are classified independently so that to get logP(C|H,X) we add the log-likelihoods of each example. In the interests of clarity, the code in Fig 14 is very inefficient. Each time that model_llhood/2 is called the same list of examples Data is constructed and sent to plod_thru_data/4. This is a case where some pre-processing, perhaps via exs/0, would be more efficient.
Fig 14 contains a clause for the exec_extension/2 predicate. Users can use such clauses to define post-processing utilities which are run prior to the trace file being gzipped. The two declarations above the clause definition are required. In Fig 14 a Unix shell command for producing counts is composed using the predicate flatoms/2 and then called using the Prolog built-in shell/1. When exec_extension/2 is called, the variable File is instantiated to the name of the output file. flatoms/2 is a predicate which MCMCMS uses internally, but is also available to the user. flatoms/2 concatenates a list of Prolog atoms together: its definition can be found in mcmcms/src/lib/flatoms.pl.
:- ensure_loaded( '../run.pl' ).

run_test :- run([runs_file('runs/concept_learning_run'),
             results_dir('out/noisy_concept_learning_out')]).

exs.   %do nothing with the data, except load it from data/exs.pl

lhood_canonical(Mst,_Mi,LLst,LLi,Ratio) :-
        model_llhood(Mst,LLst),  % compute log-likelihood of Mst
        Ratio is exp(LLst - LLi).

model_llhood(Model,LL) :-
        findall(dat(XX,XY,C),datum((XX,XY),C),Data),
        plod_thru_data(Data,Model,0,LL).

plod_thru_data([],_Model,OutLL,OutLL).
plod_thru_data([dat(XX,XY,C)|T],Model,InLL,OutLL) :-
        (
          inconsistent(C,XX,XY,Model)
        ->
          MidLL is log(0.2) + InLL
        ;
          MidLL is log(0.8) + InLL
        ),
        plod_thru_data(T,Model,MidLL,OutLL).

inconsistent(pos,XX,XY,rectangle(_Name,X1,Y1,X2,Y2)) :-
        (XX<X1 ; XX>X2 ; XY<Y1 ; XY>Y2).       % pos outside the rectangle

inconsistent(neg,XX,XY,rectangle(_Name,X1,Y1,X2,Y2)) :-
        XX>=X1 , XX=<X2 , XY>=Y1 , XY=<Y2.     % neg inside the rectangle

:- multifile( exec_extension/2 ).
:- dynamic( exec_extension/2 ).

exec_extension(counts,File) :-
        flatoms(['sort ', File, '| uniq -c >', File, '.counts'],Com),
        shell(Com).

Figure 14: The file tutorial/noisy_concept_learning.pl, including a likelihood for `noisy' concept learning.
The counts produced by the MCMCMS run for the simple `six-hypothesis' prior and this `noisy' likelihood can be found in Fig 15. Recall that the `b' counts record the frequency with which hypotheses are proposed and the `c' counts record the frequency with which MCMCMS actually visited each hypothesis. The `b' counts in Fig 15 are similar to those found in Fig 10. This is as expected: rectangles are being proposed using the prior which is the same in both runs. Turning to the hypotheses actually visited, in Fig 10 only rectangles b and c are present (except for an initial visit to d) because only these two rectangles have non-zero likelihood. In contrast, in Fig 15 b and c have the highest frequencies, but the other rectangles are still visited: because the likelihood no longer rules them out entirely, they still have some posterior probability of being the `true' rectangle.
   9934 b(rectangle(a,1,2,3,4)).
   9924 b(rectangle(b,1,5,7,8)).
  29943 b(rectangle(c,5,6,6,7)).
  30091 b(rectangle(d,2,1,8,3)).
  20108 b(rectangle(e,4,2,7,9)).
   5347 c(rectangle(a,1,2,3,4)).
  21740 c(rectangle(b,1,5,7,8)).
  66187 c(rectangle(c,5,6,6,7)).
   4044 c(rectangle(d,2,1,8,3)).
   2683 c(rectangle(e,4,2,7,9)).

Figure 15: Counts from an MCMCMS run with with the tabular.slp prior and a `noisy' likelihood function (found in the file tutorial/out/noisy_concept_learning_out/tr_uc_rm_exs_tab_i100K__s1.counts).

2.4  A mixture prior

Because of the nature of the prior distributions defined by SLPs, it is very easy to formulate mixture priors. Fig 16 contains our original hyp/1 prior over rectangles, a new circ_hyp/1 prior over circles and a mixture prior mix_hyp/1 which mixes the hyp/1 and circ_hyp/1 priors to give a distribution over rectangles and circles. The likelihood function must now be extended to compute the likelihood of circles: see Fig 17 to see how this is done. Since the mixture prior has a bias towards circles, circles are visited more frequently when MCMCMS is run.
0.4 :: mix_hyp(Rectangle) :- hyp(Rectangle).
0.6 :: mix_hyp(Circle) :- circ_hyp(Circle).


circ_hyp(circle((X,Y),Radius)) :-
        bottom_left_x(X),
        bottom_left_y(Y),
        radius(Radius).

0.4 :: radius(0.2).
0.3 :: radius(0.3).
0.3 :: radius(0.4).

hyp(rectangle(dummy,X1,Y1,X2,Y2)) :-
        bottom_left_x(X1),
        bottom_left_y(Y1),
        x_dist(XD),
        y_dist(YD),
        X2 is X1 + XD,
        Y2 is Y1 + YD.


0.1 :: bottom_left_x(1).
0.2 :: bottom_left_x(2).
0.1 :: bottom_left_x(3).
0.1 :: bottom_left_x(4).
0.1 :: bottom_left_x(5).
0.2 :: bottom_left_x(6).
0.1 :: bottom_left_x(7).
0.1 :: bottom_left_x(8).

bottom_left_y(Y) :-
        bottom_left_x(X),
        Y is X + 1.

0.3 :: x_dist(1).
0.2 :: x_dist(2).
0.2 :: x_dist(3).
0.1 :: x_dist(4).
0.1 :: x_dist(5).
0.1 :: x_dist(6).

y_dist(D) :-
        x_dist(XD),
        D is XD + 1.

Figure 16: Mixture prior over rectangles and circles (found in the file tutorial/slps/mixture.slp).
:- ensure_loaded( '../run.pl' ).

run_test :- run([runs_file('runs/mixture_concept_learning_run'),
             results_dir('out/mixture_concept_learning_out')]).

exs.   %do nothing with the data, except load it from data/exs.pl

lhood_canonical(Mst,_Mi,LLst,LLi,Ratio) :-
        model_llhood(Mst,LLst),  % compute log-likelihood of Mst
        Ratio is exp(LLst - LLi).

model_llhood(Model,LL) :-
        findall(dat(XX,XY,C),datum((XX,XY),C),Data),
        plod_thru_data(Data,Model,0,LL).

plod_thru_data([],_Model,OutLL,OutLL).
plod_thru_data([dat(XX,XY,C)|T],Model,InLL,OutLL) :-
        (
          inconsistent(C,XX,XY,Model)
        ->
          MidLL is log(0.2) + InLL
        ;
          MidLL is log(0.8) + InLL
        ),
        plod_thru_data(T,Model,MidLL,OutLL).


inconsistent(pos,XX,XY,circle((X,Y),R)) :-
        SqD is (XX-X)**2 + (XY-Y)**2,
        R2 is R**2,
        R2 < SqD.

inconsistent(neg,XX,XY,circle((X,Y),R)) :-
        SqD is (XX-X)**2 + (XY-Y)**2,
        R2 is R**2,
        R2 >= SqD.

inconsistent(pos,XX,XY,rectangle(_Name,X1,Y1,X2,Y2)) :-
        (XX<X1 ; XX>X2 ; XY<Y1 ; XY>Y2).       % pos outside the rectangle

inconsistent(neg,XX,XY,rectangle(_Name,X1,Y1,X2,Y2)) :-
        XX>=X1 , XX=<X2 , XY>=Y1 , XY=<Y2.     % neg inside the rectangle

:- multifile( exec_extension/2 ).
:- dynamic( exec_extension/2 ).

exec_extension(counts,File) :-
        flatoms(['sort ', File, '| uniq -c >', File, '.counts'],Com),
        shell(Com).

Figure 17: The file tutorial/mixture_concept_learning.pl which includes additional clauses to compute `noisy' likelihoods for circle hypotheses

2.5  Using continuous distributions

The examples so far have all concerned entirely discrete spaces. Continuous distributions can be implemented by including calls to Prolog built-ins which generate floating point values according to various distributions. (In some cases it may be necessary to call C code if the Prolog you are using does not have built-ins for the distributions you need. See Section 4.3 for pointers to the C language interface.)
In Fig 18 there is a prior that adapts the mixture prior (Fig 16) so that the radii of circles is uniformly distributed over the range [0,1]. random/1 is a predicate of library(random) in both Sicstus and Yap.
0.4 :: mix_hyp(Rectangle) :- hyp(Rectangle).
0.6 :: mix_hyp(Circle) :- circ_hyp(Circle).


circ_hyp(circle((X,Y),Radius)) :-
        bottom_left_x(X),
        bottom_left_y(Y),
        radius(Radius).

radius(X) :-
        random(X).   % random/1 is a Sicstus Prolog built-in. 


hyp(rectangle(dummy,X1,Y1,X2,Y2)) :-
        bottom_left_x(X1),
        bottom_left_y(Y1),
        x_dist(XD),
        y_dist(YD),
        X2 is X1 + XD,
        Y2 is Y1 + YD.


0.1 :: bottom_left_x(1).
0.2 :: bottom_left_x(2).
0.1 :: bottom_left_x(3).
0.1 :: bottom_left_x(4).
0.1 :: bottom_left_x(5).
0.2 :: bottom_left_x(6).
0.1 :: bottom_left_x(7).
0.1 :: bottom_left_x(8).

bottom_left_y(Y) :-
        bottom_left_x(X),
        Y is X + 1.

0.3 :: x_dist(1).
0.2 :: x_dist(2).
0.2 :: x_dist(3).
0.1 :: x_dist(4).
0.1 :: x_dist(5).
0.1 :: x_dist(6).

y_dist(D) :-
        x_dist(XD),
        D is XD + 1.



Figure 18: A mixture prior with a (truncated) uniform distribution over the radii of circles (found in the file tutorial/slps/continuous.slp).

3  Syntax

Initially an SLP clause was a Prolog clause with the addition of probability label. This label had to be a number between 0 and 1. For instance the program:
1/2 :: nat( 0 ).
1/2 :: nat( s(X) ) :-
         nat( X ).

fits a geometric distribution over the natural numbers. Posing the query ?- nat( Y ). 10,000 times will instantiate variable Y to each number with probability similar to that shown in Fig. 19. In the plot, the y-axis plots probability for each instantiation while the x-axis plots the ith instantiation term starting from i=1, Y = 0.
figs/natdistro.png
Figure 19: Geometric distribution over naturals.
The restriction to probabilistic labels that are known numbers has been proven too strong in practice. In MCMCMS we allow for functions involving a number or variables and which are evaluated at call-time. For example the following program defines a uniform distribution over element selections from a list.
:- pvars( umember(L,_E,_R), [T-length(L,T)] ). 

1/L       : [L] : umember( El, [El|T] ).
1 - (1/L) : [L] : umember( El, [_|T] ) :-
                [L-1] : umember( El, T ).

The pvars/2 directive is optional. When it is present, umember(X,[a,b,c]) is a valid query, whereas the query [3] :: umember(X,[a,b,c]) can be used both when the directive is present and when it is absent.
Returning to the definition of the predicate, note that the base case is selected with probability 1/L and the recursive case with probability 1 - 1/L. When L is the length of the list in the second argument, this predicate will select an element from the list with uniform distribution. An example of this happening is shown in Fig. 20. The probability with which each element is selected, is equal to the product of the edges in the path from the root to its leaf. In the case of Y=b, p(Y=b) = 2/3 * 1/2 = 1/3. Similarly for the other cases.
figs/umember_sld.png
Figure 20: Selecting with uniform distribution.
For the purposes of MCMC the query umember(X,[a,b,c]) creates a maximum of 3 choice points. For example when X=b the path is [2-2/3,1-1/2], where 1 and 2 are internal clause indices assigned order to the clauses of umember/2 from top to bottom in the order they appear in the source file. The number of, probablitistic, choice points is a crucial dimension in the selection of a bactrack point as described in Section 5.1, also see [ Angelopoulos  CussensAngelopoulos  Cussens2004].
It is possible to reduce the number of choice points umember(X,[a,b,c]) creates by using the declaration
:- s_tail_recursive( umember/2 ).
When this is present, the above query will always create a single choice point: 1-1/3 in the case of X = a and 2-2/3 otherwise. The base case clause of the predicate should appear at the top, followed by the recursive clause(s). Note that the declaration is only effective when the body of a recursive clause does not leave any probabilistic choice-points.
The directive, :- s_no_bpoint( Spec ). forces the probabilistic backtrack points of predicate with specification Spec to be ingnored. Only and all the points created at calls to this predicate are ignored. Points left by the body of clauses of such a predicate to other predicates are valid backtracking choices.

4  Files and directories

Once unpacked, the programs can be loaded and run in the Prolog engines supported. Currently these are, the SICStus and Yap systems. MCMCMS was developed on and is supported in the Linux architecture. We use the Slackware distribution, it may thus be necessary to recompile some foreign code used in computing the likelihoods as described in Section 4.3.

4.1  Files

There are two files at the top-level. mcmcms.pl is the main entry point for the MCMC algorithm. It defines mcmcms/11 which only experienced users should call directly. Instead users are expected to call run(Opts), where Opts is a list specifying options. Recognised options are listed in Appendix A. run/1 is defined in the top-level file run.pl so this file should always be loaded. run/1 runs a number of MCMC experiments according to its call options. One of these options specifies a runs_file where a definition for the predicate run_data/11 is expected. A description of run_data/11 is given in Appendix B.

4.2  Directories

Directories at the top-level are divided between generic and model specific. The former have code that implement the MCMC algorithm and assist in making the running of experiments an easier task, while the latter hold programs that are only relevant to individual or classes of model spaces. In adddition directory pbl contains scripts for re-running experiments that have appeared in various epublications.
Directories in pbl/:
uai01/ Experiments from [ Angelopoulos  CussensAngelopoulos  Cussens2001]
ijcai05/ Experiments from [ Angelopoulos  CussensAngelopoulos  Cussens2005]
The generic directories are:
auxil/ Auxiliary programs such as creating counts and visits for the chain. Also for sampling random models over an SLP. See Section 8.1.
src/ The Prolog source files that implement the MCMC algorithm. Includes a lib/ directory of supporting predicates.
The model specific directories are:
bns/ Files only pertinent to MCMC over BNs. It is worth noting that most SLPs, in directory slps, are failure-free.
bns/rpdags Programs for running MCMCMS over Restricted Acyclic Partially Directed Graphs. This is a class approximating Markov equivalence classes of BNs. It shares many programs with BNs, thus it is a subdirectory of bns/.
carts/ Programs for running experiments over C&RT models. Two radically different priors and two data sets have been explored.
tutorial/ Files for the tutorial described in Section 2.
In each of the model specific directories, there might exist subdirectories similar to the top-level generic ones: src/ and auxil/. In addition the following directories have been used:
data/ This is where programs try to find the (training) data.
runs/ Repository of scripts. Example files defining run_data/11, which is used by run/1.
slps/ Where SLPs are looked for. Generally speaking, we keep all the files for constructing models here. These will normally also include the Prolog programs on which the SLPs are based.
See files Readme.txt within each model-specific directory for details on running experiments with the respective models. Also within the directories (bns/, bns/rpdags/, carts/bcw/ and carts/kyphosis/ e.t.c.) the user can find examples of Prolog code that uses run/1 (see files run_test.pl and run_disp.pl). These files ensure appropriate files for experiment-runs, likelihood, backtracking proposal, displaying and post-processing, are loaded before run/1 is called. The run_disp.pl version allows visualisation of current and proposed models. run_test and run_disp put their output files into out_test and out_disp respectively. In few cases we have created run_graphs scripts. These are normally identical to out_test but for creating a pair of plots for each experiment run. These are the stays and visits plots of the chain, which are simple indicators of how much the chain moved in the space of models.

4.3  C Code recompilation

MCMCMS is developed in systems that run the Slackware distribution of Linux. If you need to run on a different distribution or operating system, you might need to recompile the glue that enables Prolog to call the logamma function from C language libraries. This should be a straightforward operation, which is documented in files bns/src/llhood/logamma_sicstus.pl and bns/src/llhood/logamma_yap.pl. The logamma function is used in computing the likelihood of proposed models given the data. In recent versions of Yap we use the built-in function lgamma/1 so there is no need for linking to the C libraries.

4.4  Development and bug reports

If you develop you own SLPs or want to run your own experiments on the provided SLPs, by far the cleanest and easiest way is to create a new directory. The structure of this directory should follow the structure of the provided experiment directories. Create subdirectories slps/, data/ and runs/. Copy or create the appropriate files to these directories. Finally built scripts such as run_test.pl and run_disp.pl to run your experiments.
If things do not go as expected, you can add the line :- bb_put( sr, off ). to the top of your script file. When re-executing the run, everything will be printed to the terminal. Failing everything else do not hesitate to send us a message. Preferably, make a tar.gz copy of your directory and put the file in a public location. Then send us an email with the location (see title page for our emails). Please also follow this procedure if you wish to file a bug report.

5  Running experiments

Predicate run/1, defined in mcmcms/run.pl can be used to run a number of experiments producing a number of output files. We will refer to calls of run/1 as a `run'. Each run will execute a number of experiments according to the arguments of run_data/11. For each clause of run_data/11 the MCMC algorithm will be run with the corresponding parameters (see Appendix B). As a result a number of files will be created for each experiment. All filenames associated with a single experiment share a common stem, reflecting the parameters of the experiment. The extension of each file, reveals the kind of output stored in it. The core extentions are:
.gz the gzipped main output of the experiment. This usually records the current and proposed models at each iteration. Possibly along with their likelihoods and model size characteristics.
.stats execution statistics and the internal form of the SLP used are recorded here.
.stays for each model the chain moves to, this records the number of iterations the chain remains stuck at that model. You must load the file auxil/create_visits.pl to get this.
.visits the total number of times a single model appears in the chain. You must load the file auxil/create_visits.pl to get this.
Additional post-processing on the pre-gzipped output file can be effected by defining exec_extension(Ext,Stem). For instance see auxil/create_mgraphs.pl or Fig 14 in the tutorial. Other aspects of the behaviour for run/1 are controlled by the list of options given as its argument (see Appendix A). Each `run' prints some simple diagnostics on how things are proceeding. These are written on the error stream which is connected to the terminal.
Throughout the model specific directories we have used simple Prolog programs which (a) make sure all the appropriate files are loaded, and (b) call run/1. For example consider file bns/run_test.pl in Table 1. The first directive loads run.pl from top-level directory. The second, loads the likelihood to be used and the third the code for displaying Bayesian networks (only included here for illustration). The last four ensure_loaded directives, load predicates that post-process the output files. The commented out directive bb_put( sr, off ) can be added to force all output to the terminal; this is useful for debugging. Finally run_test/0, is a wrap for calling run_test/1 with run_data/11 defined in runs/simple.pl creating output at `points' 24, 28 and 29, and putting all resulting files into directory test/. (See Appendix A for further information on the use of output `points'.)
:- ensure_loaded( '../run' ).
:- ensure_loaded( 'src/llhood/lhood_canonical_uai01' ).
:- ensure_loaded( 'src/display_model' ).
:- ensure_loaded( '../auxil/create_graphs' ).
:- ensure_loaded( '../auxil/create_visits' ).
:- ensure_loaded( 'auxil/create_counts' ).
:- ensure_loaded( 'auxil/create_blankets' ).

data_format(data(_,_,_,_,_,_,_,_)). % needed by lhood.

% :- bb_put( sr, off ).

run_test :-
   Runs = [
      runs_file('simple'),
      print( [24,28,29] ),
      results_dir(test)
   ],
   run( Runs ).
   
Table 1: bns/run_test.pl

5.1  Backtracking proposals

As noted in [ Angelopoulos  CussensAngelopoulos  Cussens2001], a query over an SLP defines a stochastic SLD-tree. Each leaf of the tree corresponds to a model. The MCMC can then be viewed as a walk between leaves of the tree. The method by which a proposed model is reached from the current model defines a proposal distribution. A wide variety of proposals are supported, provided that some weak conditions are met. In the MCMC used here, we have experimented with a number of proposal distributions. Because all proposals over the SLD-tree are concerned with finding a backtrack point on the path of the current model, and then sampling forward from that point, we call these backtracking proposals. To make the process clearer consider the general scenario in Fig. 21.
jump.png
Figure 21: Reaching from Mi in the SLD-tree.
Given that the process is at Mi a proposal M* needs to be reached. This is done by choosing one of the backtrack points (Bj) from the path defined by the root of the tree to Mi. Once such a point (Bi) has been chosen, M* is reached by sampling forward according to the prior. In this section we detail the different backtracking proposals available in MCMCMS.
In [ Angelopoulos  CussensAngelopoulos  Cussens2001] a cyclic probability (CP) backtracking strategy was followed. The algorithm is parametrised by the expected depth Depth and its formulated as follows:
00. Let j := 1.
10. Let pb = 1 - 2-j
20. Backtrack one step.
25. If at the top of the tree then go to 2, otherwise with probability pb backtrack one more point and repeat 1, or, with probability 1 - pb , go to 2.
28. Sample for a new leaf M* by selecting branches with probability proportional to clause labels. On the first choice, do not descend to the branch leading to Mi (branch Ci in Fig. 21).
30. Let k : = j + 1. If k > D then j : = 1 else j : = k
40. Unless iterations limit reached Goto 10.
A simpler version of CP uses a constant, non-cyclic, pb. In the software this strategy is identified by pb(PB).
One of the shortcomings of CP is that is dependant on Mi . For deterministic strategies, strategies that do not depend on Mi , the calculation of a, the acceptance probability, can be simplified. We have therefore also implemented a cyclic deterministic (CD) strategy. It is again parametrised by a number; the expected Distance. The algorithm is much simpler. It simply cycles j from 1 to Distance, increasing by one at each MCMC iteration. Backtracking chooses Bj, the choice point j levels up from the leaf Mi , at each proposal.
A third scheme (UC, for uniform choice) randomly picks with a uniform distribution between all choice points between Mi and the root.
Each of the three backtracking proposals can be chosen by selecting the appropriate term for the second argument of run_data/11(see Appendix B) which controls the corresponding experiment. bp(Bp) is the backtracking probabilistically proposal with backtracking probability equal to Bp, cp(D) is the cyclic probabilistic backtracking with Depth = D, cd(D) is the deterministic strategy with Distance = D and uc for the uniform choice strategy. We have tried to make the backtracking proposal as modular as possible. If you wish to implement a different proposal you should see file src/bp.pl.

6  Model spaces

We have constructed SLPs and likelihood functions for three model spaces: Bayesian networks (BNs), restricted acyclic partially directed graphs (RPDAGs) and classification and regression trees (C&RT)s. The following three subsections give details on how to use these programs.

6.1  BNs

BNs are well researched statistical models. In particular, functions for computing the likelihood of models have been known for some time. Therefore, they were chosen to be the first model space tackled in our framework. Our work relates to that of [ Friedman  KollerFriedman  Koller2000]. However, the benefit of using SLPs for defining the whole model space is that complicated and precise priors including hard constraints and statistically expressed preferences can be defined. This is an area with very little attention in the literature, where it is normal to assume a uniform, uninformative, prior.
BN related files are stored in mcmcms/bns/. The generator SLPs are in slps/ and it is worthwhile noting that most of them have failure-free versions. This is a particularly important for the BP and CP proposals (Section 5.1) which were what we originally used with these SLPs. You can run queries run_test. and run_disp. from files run_test.pl and run_disp.pl for small demonstrations. The likelihood used was:

&Pi
j Î ui 
G(aij)

G(aij + Nij)

&Pi
k Î Dom(Xi) 
G(aijk + Nijk)

G(aijk)
(3)
and it can be found in bns/src/llhood/lhood_canonical_uai01.pl. (For details see [ Angelopoulos  CussensAngelopoulos  Cussens2001].) Various auxiliary programs for manipulating BNs and the generated output can be found in bns/auxiliary/.

6.1.1  Displaying BNs

This is not an essential part of MCMCMS but it is pretty nifty to be able to visualise all those BNs. You need to install graphviz http://www.research.att.com/sw/tools/graphviz/download.html then either add to TCLLIBPATH the location where Tcldot resides, or copy/move this file, somewhere where your tcl looks for lib files. If you are using latest graphviz (currently 1.10) you also need to copy file $prefix/share/graphviz/demo/doted to a directory in your $PATH. If you are using an older version, such as 1.7, make sure that the wish command in your (bash) $PATH is version 8.3 or older.
To display the current BN and the proposed one, use the display(next_display_at) run/1 option. For example run( [display(next_display_at), id(2)] ). This will display the first pair and then ask for a number. Pressing return, or 1 and then return, will display the next pair as well. An integer larger than 1 means display nth step from here, 0 halts Prolog and a negative number -N will skip to the Nth jump from the current position.
To display individual BNs, represented as Prolog terms use,
           cd auxil; prolog
           ?- [disp_bn].
           ?- disp_asia.

which should display the ASIA BN. Nodes are colour coded. Red, for nodes with no parents, green for nodes with no children, but with parents, and orange for nodes with both children and parents. Other BNs can be displayed with disp_bn( BN ). See auxil/disp_bns.pl for the term representation of BNs.
To display bns sampled independently use
?- ['auxil/disp_sampled_bns.pl']. For example
?- disp_sbns( bn_un_2p, 10, donot_stop, [a,b,c,d,e,f,g,h] ).
displays 10 bns without waiting for user interaction in the between. The arguments of disp_sbns/4 are as follows:
SLP basename for SLP to be used. Looked for in `.' and ../slps/. The .slp extention is not necessary.
HowMany number of bns to generate.
Stop set to stop for pausing, on user-interaction, after each generated BN.
Nodes the nodes for the generated BNs.
To compare a learned model against a generator BN (here ASIA is the assumed default generator) do
           cd auxil ; prolog
           ?- [disp_cut_off].
           eg (after you have run run_test. above)
           ?- disp_cut_off( 0.95, '../test/tr_op0.8_or_3p_i1K_all_s1.counts' ).

The generic use is ?- disp_cut_off( CutOff, File ). The first argument is in the range 0 £ CutOff £ 1 which means that the `learned' model is constructed by taking an edge to be in the model iff the Times the edge appeared over the total number of Iterations is greater that the CutOff point, i.e. Times/Iterations > CutOff. The edges in the contrasting BN are coloured as follows
red for edges learned correctly
yellow for false negative (edges present in original only)
pink for false positive (edges present in learned only)
The second argument, File, should point to a .counts file.

6.2  RPDAGs

Programs dealing with Restricted Acyclic Partially Directed Graphs (RPDAGs) [ Acid  de CamposAcid  de Campos2003] are in bns/rpdags since they share a lot of code with BNs. RPDAGs are a compromise between BNs and their equivalence classes. While learning, one ideally will like to learn in the space of equivalence classes since they represent a set of BNs whose structure explains the data equally well. However, it is computationally more expensive to move around in the space of equivalence classes. In response, have proposed RPDAGs. They have the property that many BNs from a single class map to a single RPDAG. It is though also the case, that more than one RPDAGs may map to a single equivalence class. Computationally, RPDAGs are harder to construct than BNs, but easier than Markov equivalence classes.
The SLP generator of all RPDAGs can be found in rpdags/rpdag.slp while a Prolog version can be found in rpdag.pl. These are based on the additive operations presented in [ Acid  de CamposAcid  de Campos2003,p.467,Table 2]. The number of RPDAGs according to our programs are shown in Table 2. Also in the table are shown the number of BNs, second row, and number of equivalence classes, in the last row. These are taken from [ Gillispie  PerlmanGillispie  Perlman2001]
Nodes 1 2 3 4 5 6 7
BNs 1 3 25 543 29281 3781518 1138762420
RPDAGs1 2 13 258 13521 1713008 496236097
Eq.Class1 211185 8782 1067825 312510571
Table 2: Number of RPDAGs in relation to BNs and equivalence classes.
Postscript files detailing all RPDAGs for three and four nodes can be found in directories rpdags/doc/three.ps.gz and rpdags/doc/four.ps.gz respectively.
The likelihood used is an extension of the likelihood used in the case of BNs. Similarly, RPDAGs are displayed with the same routines as BNs. These have been extended to display unidirectional arcs as links of black colour. To create a small Markov chain you can use query ?- run_test. defined in file run_test.pl. To visualise the models while creating the same chain use ?- run_disp. from file run_disp.pl.
RPDAGs are represented by terms of the form: rpdag(Nodes,Dag,BiD). Nodes is a list of nodes to node information; a triplet containing the children the parents and the neighbours of the node. Neighbours are connected by bidirectional links. Dag is the Directed Acyclic Graph representing the directed part of the RPDAG. BiD is the graph containing the bidirectional links of the RPDAG. Both Dag and BiD are manipulated using the Prolog ugraph library.

6.3  C&RTs

Classification and Regression trees are binary decision trees that predict a response Y from a set of features X. If Y is qualitative, then the tree is a classification tree, whereas for quantitative Y we get a regression tree.
The programs implementing experiments over C&RTs are in carts/. We have implemented two priors based on work done independently by [ Chipman HChipman H1998], [ Denison H. A.Denison H. A.1998] and later by further developed in [ Denison, Holmes, Mallick,  SmithDenison et al.2002]. Three datasets from the two papers are included: The Wisconsin breast cancer data (BCW), the Kyphosis dataset (K) and the Pima Indians (PIMA) Diabetes dataset (ftp://ftp.ics.uci.edu/pub/machine-learning-databases/pima-indians-diabetes/). BCW contains 16 missing data values. Following [ Chipman HChipman H1998] we have simply deleted datapoints which contain missing values. In both cases the machine learning task is binary classification using integer-valued predictors-nine predictors in the case of BCW, and three for K. All splits are binary: made by splitting on some threshold. There are 683 datapoints for BCW and 81 for K.

6.3.1  BCW and K

Experiments with BCW can be run with programs in carts/bcw/ and with K in carts/kyphosis. For the former we used two different priors, the GROWTREE and EDITTREE priors. The GROWTREE prior can be found in carts/bcw/grow/slps/unify.slp while the EDITTREE prior is in carts/bcw/edit/slps/edit.slp. For K, a much smaller data set, only the GROWTREE is present.
The GROWTREE prior, is essentially that used by [ Chipman HChipman H1998]. The stochastic process which defines a GROWTREE prior grows a C&RT tree by starting with a single leaf node and then repeatedly splits each leaf node h with a probability a(1+dh)-b, where dh is the depth of node h and a and b are prior parameters set by the user to control the size of trees. Unsplit nodes become leaves of the tree. If a node is split, the splitting rule for that split is chosen uniformly. The source is in carts/bcw/grow/slps/unify.slp.
The other type of prior is the EDITTREE prior. Here an `initial' C&RT model is supplied. This can be any tree, for example the tree produced by the standard greedy algorithm or one manually entered by the user. This tree is then probabilistically edited. This approach is closely related to the proposal mechanisms of [ Chipman HChipman H1998, Denison H. A.Denison H. A.1998]. The source is in file carts/bcw/edit/slps/edit.slp where, with probability 4/5 the tree is edited. There are three possible editions: growing the tree by splitting a leaf, pruning the tree by merging neighbouring leaves and changing a splitting rule. Our likelihood, file carts/src/cart_llhood, is implementing [ Chipman HChipman H1998,Eq.16]. C&RTs, are also displayed using disp_bn/1. Undirected links are drawn in black, with no arrows in either direction.

6.3.2  PIMA

The PIMA dataset was added recently and is still under development. It can be found in carts/pima. Canned examples are provided and the more advanced features can be found in directory in_progress. A variety of more advanced GROW like priors were devised for this dataset.

6.4  Pedigrees

We are currently working on constructing pedigree model structures. A pedigree is a special case of a BN where each node has exactly two parents. We have encoded an SLP approximation of the prior used in the Familias program (http://www.math.chalmers.se/ mostad/familias/) [ Egeland, Pettera, Mevåg,  StenersenEgeland et al.2000]. Latest code is in pedigrees/weighed which includes samples from the approximating prior and some simple mcmc runs.

7  New model spaces

To run experiments over different model spaces one needs to: (a) write a generator SLP, (b) write a Prolog program that implements the likelihood function of a model given some input data, and (c) if desired, write programs for displaying the models on screen with graphics software.
You are strongly advised to create trivial scripts similar to run_test.pl and run_disp.pl as done in directories bns/, bns/rpdags/, carts/bcw/grow and carts/kyphosis/. During debugging you can add directive :- bb_put( sr, off ). to the script. This will direct all output to the terminal. The current version of MCMCMS can be found with mcmcms_version/1.

7.1  Likelihood

To actually hook your likelihood function to MCMCMS you need to define
           lhood_canonical( +MODst, +MODi, -LLst, +LLi, -Ratio ).
           and     model_llhood( +Model, -Llhood )
Figs 6, 14 and 17 in the tutorial provide examples of definitions of these predicates. Variables prefixed by + stand for `input' variables which are instantiated at the time of the call, while those prefixed by - are `output' variables which are not instantiated until exit time. MODst is the proposed model, MODi is the current model, LLst is the log-likelihood of MODst, LLi is the log-likelihood of MODi and Ratio is the likelihood ratio of MODst over MODi.
For model_likelihood/2, Model is a model, and Llhood is its log-likelihood. This predicate should always be defined but if it is not called from lhood_canonical/5 then it can instantiate Llhood to any odd non-numeric value, e.g. model_llhood( _, dont_know_it ).

7.2  Displaying models

This is not a necessary step for MCMCMS but it is used when visualisation of the current and proposed models is requested. You simply need to define                display_model( +Title, +Model, +Llhood ).
Title is the title of the model, i.e. where it is the current or the proposal of iteration i. Model is the representation of the model to be displayed, and Llhood is its log-likelihood. When the predicate is entered all three arguments should be instantiated appropriately, with the possible exception of Llhood. This depends on whether model_llhood/2, see preceeding sub-Section, instantiated the likelihood correctly. Your program should be aware and take appropriate action if that is not the case (e.g. not display the log-likelihood).
The main route for displaying models is via the run/1 option. This provides a hookable mechanism, the details of which can be found in the sources: mcmcms/run.pl. Here we document an implemented hook invoked by display(next_display_at). This always prints the first pair of models (current and proposed) and interrogates the user for when the next display should be. The number 1 and any non integer input will result in the display of the next pair of models, while 0 will halt the execution. A positive integer will display the Nth comparison from the current position. A negative integer will skip the Nth jump from the current position.

7.3  Writing SLPs

Strictly non-stochastic predicates are un-labelled (non-stochastic) predicates that only contain strictly non-probabilistic predicates in their bodies. When writing an SLP that contains such predicates it might be more efficient and cleaner if they are placed in a separate source file. The source file can either be compiled as a module file if you are using the SICStus Prolog system, or loaded without translation (on any of the supported engines). In the former scenario, the module file can be compiled in SICStus and then saved with : save_modules( [Mod], <FILENAME>_npbc ) where Mod is the module name. In the latter case, directive :- els(NonPbcSrc) can be used to load source file when SLP translation is to be avoided.
The best way to find out how to run experiments are by looking at the experiment scripts in the distribution. For example pbl/ contains scripts for experimental results which have been published. In scripts that load run.pl or mcmcms.pl, you can use directive :- elm( RelPos ). This directive loads file RelPos whose position is taken relative to the top MCMCMS directory.

8  Other features

Here we briefly review some features of the system which are still to be documented fully.

8.1  Sampling

There are facilities for sampling models from an SLP. Furthermore the output can be displayed in postscript format, and a latex report can be generated. Sampling is not a part of the MCMC algorithms but it can be very useful in providing feedback for newly developed SLPs: in particular, about their digestibility by the software and the distributions they define. For the time being, and until we expand on this section, please look at the source files: auxil/n_samples.pl and auxil/n_samples_tex
A round-about way to sampling is to run MCMC with a constant ratio of likelihood that is equal to something substantially larger than 1. In this way, the chain will almost certainly consist of the proposed models. (i.e. the proposed model is always accepted.) Such a likelihood can be found in auxil/lhood_constant.pl. An example of this technique is in auxil/from_prior_ex/srun.pl.

8.2  Global Priors Ratio

It is possible to view the prior of a model, P(M), as consisting of two parts: global and local. The local is the probability of the branch of computation that leading to M and the global is an factor that can be computed after producing the model. As idicated by the name, we want to use the global part in computing the contribution of global features of the model to the prior. An example of the importance of such global features can be found in the prior used by the Familias program [ Egeland, Pettera, Mevåg,  StenersenEgeland et al.2000]. MCMCMSwill look into memory for the predicate global_priors_ratio/3. If this is present it will be called each time the acceptance probability is computed. The call is: global_priors_ratio( +M* , +Mi , -Ratio ). The user-defined predicate is expected to return Ratio = [(PG(M* ) )/(PG(Mi ) )].

8.3  Tempering

A number of hot chains that are developed in parallel to the main cold chain can be run. Hot chains move on smoother likelihood spaces, so getting less stuck. At each iteration two chains are chosen for a possible swap of their, next-iteration, current models.
Power tempering is the only kind of tempering supported. In the future other kinds might be possible. In power tempering the space is smoothed by raising the likelihood to a power. For hot chain j we sample from stationary distribution pbj(x) µ p0(x)L(x)bj for 0 < bj < 1. Proposals and acceptances within this chain work exactly as for the cold chain. The only difference is that there is a different likelihood. So, if at model Mij we propose model M*j then:
aj(Mij ,M*j ) = min
ì
í
î
1,k(Mij ,M*j ) æ
è
L(M*j )

L(Mij )
ö
ø
bj

 
ü
ý
þ
where k(Mij ,M*j ) is the combined prior and proposal conttribution.
The chain-swap acceptance probability for moves between chains does not depend on the proposal distribution which generates moves within chains. So the acceptance probability is the same for all within-chain proposals, namely:
a(Maj1 ,Maj2 ) = min
ì
í
î
1, é
ë
L(Maj2 )

L(Maj1 )
ù
û
(b1 - b2)

 
ü
ý
þ
where Maj1 is the accepted within-chain model in chain j1.
An example run-script using power tempering can be found in carts/pima/run_hot.pl.

A  Options for run/1

The following are all optional. If an option term is not present the indicated default value is used.
copy(Copy) copies some of the files used to the results directory. The idea is that results directories can be made self explanatory. Copy should be one of `none' or `all' or it should be a list of : `data',`runfile',`script'(Scr) and `slp'. The default value is [runfile]. Note that you should substitute the scriptname used in your run of experiments, for Scr.
display(DMethod) method controlling display of models. Use `next_display_at'. If option is absent no models will be displayed.
id(ID) a number of instances of this option are allowed. Each ID should correspond to the first argument of a run_data/11 clause. If absent, then all experiments described by run_data/11 will be pursued.
initial(M) registers initial model, M. If it is not present the initial model is sampled from the prior. This option is intended for advanced users only. You need to be very careful that your SLP will be able to reach efficiently to the/a proof of the model. (mcmcms/carts/bcw/edit/irun.pl demonstrates the option's usage.)
print(PointsList) points to be recorded to output files. If absent the default value of [24,28,29] is used. Other numbers are undocumented. If more detail is needed, please look at the source files for dbg/2. Point 24 prints the initial model, point 28 prints all subsequent visited models in the chain while 29 prints proposed models. The latter is printed in the form of b(Model) terms, while current models are deposited as c(Model). On how Models are extracted from the top query, see Appendix B, 5th argument (Call).
progress(Atom,Percentile) if present, a progress bar will be displayed for every experiment within the run. Atom will be written once for every completed Percentile of the number of repetitions in each experiment.
results_dir(ResDir) is the directory for the output files from the experiments. Currently if the directory exists, there will be no complaints, and no clean-up. So unless you know what you are doing it is better if the directory does not exist. If ResDir is a path, e.g. of the form dir1/sub1/test, which does not exist, then it will be created. Default value: `test'.
runs_file(RFile) file containing the run_data/11 definitions. Default: `run_test'.
source_dir(SDir) a directory relative to which runs/ slps/ and data/ are looked in for. Default value: `.'.

B  Arguments for run_data/11

1:ID unique id for the experiment. Usually a small positive integer.
2:BP backtracking proposal to be used. One of
bp(P) probabilistic backtracking, with fixed probability; as presented in [ Angelopoulos  CussensAngelopoulos  Cussens2001].
cp(D) cyclic probabilistic backtracking, of focus D; as presented in [ Angelopoulos  CussensAngelopoulos  Cussens2001].
cd(S) cyclic deterministic backtracking; cycling k from 1 to S. Where k=1 is the root of the SLD-tree.
uc uniformly chosen backtracking, which chooses amongst the available points.
3:SLP the location of the source file for the SLP to be used. Extension .slp is not necessary and files will be looked for at current and ./slps/ directories.
4:MSD model space identifier. Can either be set to `rm' or to `fm' (see Appendix C). The former identifies real model spaces- upon failure to reach a model usual Prolog backtracking is employed to explore alternative choices, in a exhaustive shallow-first search. The latter allows failure models to be returned at top level of the MCMC. When this is the case, there will be no attempt to jump to the proposed, imaginary, model. Instead, the count of staying with current model will increase by one.
5:DataCall a callable term. This argument performs two roles. Firstly, file DataName{.pl}, the predicate name of DataCall, is compiled. Secondly, DataCall is called as a Prolog goal before the main MCMC sampling begins. It is therefore, a `hook' which can be useful for any needed pre-processing. DataCall is normally defined in DataName{.pl} and it should at least succeed. It may have shared arguments with 6:Query; so data may provide parameters of the space of all possible models (e.g. C&RT in directory cart/). Optionally, `clean_up_'DataCall may also be defined in DataName{.pl}. This is only necessary when a single run experiment uses more than one data set. See file bns/data/jc_asia_data_dyn_all.pl for an example.
6:Query a Name/Args structure, where Name is a predicate symbol and Args is a list. If L is the length of the list Args then Name/(L+1) is the top-level predicate for the chain and the predicate should be defined in SLP. The last argument of calls to Name/(L+1) will be a fresh variable. The software expects the this variable to be instantiated to a model each time the call to Name/(L+1) succeeds. The first L arguments in calls to Name/(L+1) will be those supplied in the list Args.
7:Contr an atom that will be added to the stem filename of the output files created from this experiment.
8:Repeats a large integer, stating the length of the Markov chain.
9:HotChnsIds a list of hot chain identifiers. For power tampering, which is the only kind currently supported, each identifier should be a power bj with 0 < bj < 1. See Section 8.3.
10:RndSeedIdx a random seeds index: an integer between 1 and 1000, see file auxil/seeds.pl.
11:Exts the extensions to be created from the main output file. This can be used to restrict the files generated from applications of exec_extension/2. The safer value is `all'. Alternatively a list of extensions matching the first argument of defined exec_extension/2 clauses can be given. The empty list, and equivalently `none' are also acceptable values which create no extention files.

C  Failure Model (fm) universe vs. Real Model (rm) universe

Some notes on comparing two different `backtracking' strategies. These, are strategies definiting the model space distributions, as oppose to the proposal backtracking.
The two alternatives examined are: real model space (RM) and failure model space (FM). The former uses native backtracking to always reach a real model. Every time a label is chosen from a list of labels corresponding to matching to unified clauses, a backtrack point is created. Upon failure, the remaining of the labels are tried. (Probabilistictly chosen among the untried ones). The latter does not create such backtrack points. In the event of a failure, the entire call should fail. This is caught by the top level, and the special model `__failure' is outputed. The MCMC will always chose not to jump in this case.
In passing lets note that non-probabilistic predicates need to be (near) deterministic for this to work. To allow more flexibility a special token can be generated instead of failure. This is similar to EM software, but it means that the transformed SLP clauses have to be aware of this special token.
We have tested RM and FM by sampling N (N=10000) independent samples from the prior described in CICLOPS04 ([ Angelopoulos  CussensAngelopoulos  Cussens2004]), and by running N iterations of the MCMC with a constant likelihood. An extra argument has been added to rundata/8 which should be set to `rm' or 'fm'.

C.0.1  FM- sampling

figs/fm_samples_10k.png
Figure 22: Using fm to draw 10000 samples from ciclops 04 prior- observing X after calling p(X).
The frequencies in detail: 'a-4985' 'c-3267' 'failure-1748'

C.0.2  FM- MCMC

figs/fm_mcmc_vis.png
Figure 23: Using FM to do MCMC 10000 iterations from ciclops 04 prior- observing X after calling p(X).

C.0.3  RM- sampling

figs/rm_samples_10k.png
Figure 24: Using fm to draw 10000 samples from ciclops 04 prior- observing X after calling p(X).
The frequencies in detail:
'a-4982 ' 'c-5018 '

C.0.4  RM- MCMC

figs/rm_mcmc_vis.png
Figure 25: Using RM to do MCMC 10000 iterations from ciclops 04 prior- observing X after calling p(X).

C.1  SLP Program

1/2 :: p( a ).
1/2 :: p( X ) :- q(X), t(X).

1/3 :: q( b ).
2/3 :: q( c ).

1 :: t(c).

do_trace( b ) :- trace.
do_trace( c ) :- true.
     

C.2  Transformed Program

:- module( slp, [] ).
:- compile(library('../runtime_fm')).
do_trace(1, [1|A]/B, [1|C]/D, b) :-
        user:trace,
        A=B,
        C=D.
do_trace(2, [2|A]/A, [2|B]/B, c).
p([A|B]/C, [D|E]/F, G) :-
        select_id([[a],[H]], [G], [0.5,0.5], 3, A, B, D, I, J),
        (   I=:=3 ->
            [a]=[G],
            user:true,
            J=C,
            E=F
        ;   [H]=[G],
            q(J/K, E/L, H),
            t(K/C, L/F, H)
        ).
q([A|B]/C, [D|E]/F, G) :-
        select_id([[b],[c]], [G], [0.3333333333333333,0.6666666666666666], 5, 
                A, B, D, H, I),
        (   H=:=5 ->
            [b]=[G],
            user:true,
            I=C,
            E=F
        ;   [c]=[G],
            user:true,
            I=C,
            E=F
        ).
t([A|B]/C, [D|E]/F, G) :-
        select_id([[c]], [G], [1], 7, A, B, D, _, H),
        [c]=[G],
        user:true,
        H=C,
        E=F.
     

References

[ Acid  de CamposAcid  de Campos2003]
Acid, S.  de Campos, L. M. 2003. Searching for Bayesian network structures in the space of restricted acyclic partially directed graphs  Journal of Artificial Intelligence Research, 18, 445-490.
[ Angelopoulos  CussensAngelopoulos  Cussens2001]
Angelopoulos, N.  Cussens, J. 2001. Markov chain Monte Carlo using tree-based priors on model structure  In Breese, J.  Koller, D., Uncertainty in Artificial Intelligence: Proceedings of the Seventeenth Conference (UAI-2001),  16-23 Seattle, USA. Morgan Kaufmann.
[ Angelopoulos  CussensAngelopoulos  Cussens2004]
Angelopoulos, N.  Cussens, J. 2004. On the implementation of MCMC proposals over stochastic logic programs  In Colloquium on Implementation of Constraint and LOgic Programming Systems. Satellite workshop to ICLP'04 Saint-Malo, France.
[ Angelopoulos  CussensAngelopoulos  Cussens2005]
Angelopoulos, N.  Cussens, J. 2005. Exploiting informative priors for bayesian classification and regression trees  In Nineteenth International Joint Conference on Artificial Intelligence Edinburgh, UK. To appear.
[ Chipman HChipman H1998]
Chipman H, George E, M. R. 1998. Bayesian CART model search (with discussion)  Journal of the American Statistical Association, 93, 935-960.
[ CussensCussens2000]
Cussens, J. 2000. Stochastic logic programs: Sampling, inference and applications  In Sixteenth Annual Conference on Uncertainty in Artificial Intelligence (UAI-2000),  115-122 San Francisco, CA. Morgan Kaufmann.
[ CussensCussens2001]
Cussens, J. 2001. Parameter estimation in stochastic logic programs  Machine Learning, 44 (3), 245-271.
[ Denison, Holmes, Mallick,  SmithDenison et al.2002]
Denison, D. G. T., Holmes, C. C., Mallick, B. K.,  Smith, A. F. M. 2002. Bayesian Methods for Nonlinear Classification and Regression. Wiley.
[ Denison H. A.Denison H. A.1998]
Denison H. A., Mallick B. K., S. A. F. M. 1998. A Bayesian CART algorithm  Biometrika, 85, 363-377.
[ Egeland, Pettera, Mevåg,  StenersenEgeland et al.2000]
Egeland, T., Pettera, M., Mevåg, B.,  Stenersen, M. 2000. Beyond traditional paternity and identification cases. selecting the most probable pedigree  Forensic Science International, 110 (1).
[ Friedman  KollerFriedman  Koller2000]
Friedman, N.  Koller, D. 2000. Being Bayesian about network structure  In Uncertainty in Artificial Intelligence: Proceedings of the Sixteenth Conference (UAI-2000),  201-210 San Francisco, CA. Morgan Kaufmann Publishers.
[ Gillispie  PerlmanGillispie  Perlman2001]
Gillispie, S. B.  Perlman, M. D. 2001. Enumerating markov equivalence classes of acyclic digraph models  In UAI 2001.



File translated from TEX by TTH, version 3.68.
On 3 Nov 2005, 11:53.