GOBNILP[Benchmarks] [Papers using GOBNILP]
GOBNILP (Globally Optimal Bayesian Network learning using Integer Linear Programming) is a C program which learns Bayesian networks from local scores. The GOBNILP distribution provides a separate C program to generate BDeu local scores from complete discrete data. GOBNILP uses the SCIP framework for Constraint Integer Programming. Detailed information on installing, running, licence etc can be found in the:
which is also contained in the GOBNILP distribution.
GOBNILP has been developed by James Cussens and Mark Bartlett. It is supported by MRC grant A graphical model approach to pedigree construction using constrained optimisation (G1002312). Goblin 'P' logo courtesy of Robby Higgins.
GOBNILP has been developed using a set of local score files used originally in the paper Bayesian network learning with cutting planes. Most of these local score input files are in this gzipped tar file. The local score input files alarm, mirna, phen and wdbc were kindly sent to me by Tommi Jaakkola, the others I generated myself. More recently a couple of local score input files (developed by Mark Bartlett) for learning pedigrees have been used as part of the Pedigree Reconstruction using Integer Linear Programming project. GOBNILP 1.2 and upwards contains software for generating local scores from data. Where available the original data and the local scores generated from it are available from this directory. (This directory also contain some local score files where the data from which they were generated is not available.)
The following table provides a summary of benchmarking results for GOBNILP 1.2 and GOBNILP 1.3 using SCIP 2.1 (the recommended version of SCIP) and default parameter settings. All times are in seconds. The GOBNILP output for each of these runs is available from this directory. (Benchmark results for older GOBNILP versions are available here.) 'BN' is the name of the BN from which the data was originally generated. 'p' is the number of variables in the original dataset (and thus in the optimal BN). 'n' is the number of datapoints in the data. 'palim' is the limit on the size of parent sets in any learned BN. '|PaSets|' is the number of candidate parent sets generated by GOBNILP's local scoring algorithm - this is the size of the input to the main GOBNILP learning algorithm. 'Scoring' is the time taken to produce the local scores. 'Finding' is the time taken to find an optimal BN. 'Solving' is the time taken to solve the problem (find an optimal BN and prove that it is optimal.) A blank entry in 'Finding' indicates that GOBNILP failed to solve the problem. In such cases, if GOBNILP made at least some progress the 'Solving' column records how large the 'gap' was after the indicated number of seconds. If GOBNILP made no progress, then 'Solving' has the entry FAIL.
Benchmarks run on a single core of the following machine: Linux 3.0.18-x86_64 #1 SMP PREEMPT Thu Jan 26 12:54:10 GMT 2012 x86_64 Intel(R) Core(TM)2 Duo CPU E8400 @ 3.00GHz GenuineIntel GNU/Linux. CPLEX 12.1.0.0 was used as the LP solver.
| BN | p | n | palim | |PaSets| | Scoring | Finding (1.2) | Solving (1.2) | Finding (1.3) | Solving (1.3) |
|---|---|---|---|---|---|---|---|---|---|
| Diabetes | 413 | 100 | 2 | 4441 | 1135 | 16800(1.23%) | 2259 | 2259 | |
| Diabetes | 413 | 1000 | 2 | 21493 | 2754 | 7954 (117.71%) | 93420 (6.58%) | ||
| Diabetes | 413 | 10000 | 2 | 262129 | 5497 | 2215 (357.07%) | 8213 (46.09%) | ||
| Link | 724 | 100 | 2 | 7431402 | 411 | FAIL | FAIL | ||
| Link | 724 | 1000 | 2 | 2379361 | 1150 | FAIL | FAIL | ||
| Link | 724 | 10000 | 2 | 2421283 | 9005 | FAIL | FAIL | ||
| Mildew | 35 | 100 | 3 | 3520 | 17 | 4 | 4 | 2 | 2 |
| Mildew | 35 | 1000 | 3 | 161 | 64 | 1 | 1 | 1 | 1 |
| Mildew | 35 | 10000 | 3 | 463 | 167 | 1 | 2 | 2 | 3 |
| Pigs | 441 | 100 | 2 | 2692 | 151 | 1217 | 1217 | 100 | 100 |
| Pigs | 441 | 1000 | 2 | 15847 | 203 | 55236 (0.13%) | 2226 | 2226 | |
| Pigs | 441 | 10000 | 2 | 304219 | 2082 | 1319 (13.06%) | 1025 (3.35%) | ||
| Water | 32 | 100 | 3 | 482 | 1 | 3 | 3 | 5 | 5 |
| Water | 32 | 1000 | 3 | 573 | 1 | 9 | 9 | 6 | 6 |
| Water | 32 | 10000 | 3 | 961 | 7 | 20 | 20 | 26 | 26 |
| alarm | 37 | 100 | 3 | 907 | 1 | 3 | 3 | 3 | 3 |
| alarm | 37 | 1000 | 3 | 1928 | 3 | 8 | 8 | 5 | 5 |
| alarm | 37 | 10000 | 3 | 6473 | 16 | 117 | 1702 | 797 | 836 |
| asia | 8 | 100 | ∞ | 41 | 1 | 1 | 1 | 1 | 1 |
| asia | 8 | 1000 | ∞ | 112 | 1 | 1 | 1 | 1 | 1 |
| asia | 8 | 10000 | ∞ | 169 | 1 | 1 | 1 | 1 | 1 |
| carpo | 60 | 100 | 3 | 5068 | 5 | 519 | 872 | 709 | 960 |
| carpo | 60 | 1000 | 3 | 3827 | 14 | 120 | 138 | 98 | 100 |
| carpo | 60 | 10000 | 3 | 16391 | 99 | 2373 | 2529 | 1063 | 1346 |
| hailfinder | 56 | 100 | 3 | 244 | 20 | 1 | 1 | 1 | 1 |
| hailfinder | 56 | 1000 | 3 | 761 | 39 | 7 | 7 | 9 | 9 |
| hailfinder | 56 | 10000 | 3 | 3768 | 124 | 143 | 216 | 63 | 92 |
| insurance | 27 | 100 | 3 | 279 | 1 | 1 | 1 | 1 | 1 |
| insurance | 27 | 1000 | 3 | 774 | 1 | 2 | 2 | 4 | 4 |
| insurance | 27 | 10000 | 3 | 3652 | 5 | 22 | 22 | 10 | 10 |
| kredit | 18 | 1000 | 3 | 70 | 1 | 1 | 1 | ||
| kredit | 18 | 1000 | ∞ | 70 | 248 | 1 | 1 | ||
| alarm | 37 | 5630 | 11 | 11 | |||||
| mirna | 22 | 1279 | 3 | 3 | |||||
| phen | 25 | 2296 | 11 | 11 | |||||
| wdbc | 31 | 5921 | 24 | 24 | |||||
| pedigree1 | 100 | 1195 | 11 | 11 | |||||
| pedigree2 | 100 | 1195 | 11 | 11 |
Last modified: Wed Apr 3 10:33:39 BST 2013