5.2.1.2 Stability Parameter Tables

Tables 5.5 through 5.8 summarize the majority of our stability
experiments on sparse binary datasets. Each table organizes results
from ten-fold cross-validation experiments over combinations of four
settings: the pair `modelmin` and `modelmax`, `margin`, `rrlambda`, and
the pair `cgwindow` and `cgdecay`. Because the tests are
cross-validations, the testing set for each fold comes from the same
distribution as the training set for that fold.
Each parameter can take one of two values, as shown in
Table 5.4: one value effectively disables the parameter,
and the other is chosen to illustrate that parameter's effect on
computations. The asymmetry seen in the `modelmin` and `modelmax` ``on''
values is due to the asymmetry of the IEEE 754 floating point
representation, in which denormalized values allow greater resolution
near zero. Note that `cgwindow` and `cgdecay` are disabled by making them very
large. Unless stated otherwise, `binitmean` is disabled and `wmargin` is
zero.
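
The asymmetry between the two ``on'' values can be checked directly. The following is a minimal sketch that assumes the parameters are stored as IEEE 754 64-bit doubles, which the text does not state explicitly; the variable names mirror the parameter names but are otherwise illustrative.

```python
import sys

# "On" values from Table 5.4 (assumed here to be 64-bit doubles).
modelmin = 1e-100
modelmax = 0.99999998

# Near zero, a bound as small as 1e-100 is still distinct from 0.0.
assert modelmin > 0.0

# Near one, the mirror-image bound 1 - 1e-100 is not representable:
# it rounds to exactly 1.0 and so would not constrain anything.
assert 1.0 - 1e-100 == 1.0

# The spacing of doubles just below 1.0 is epsilon / 2 (about 1.1e-16),
# so a usable upper bound must sit at least that far from 1.0.
gap_below_one = sys.float_info.epsilon / 2
assert 1.0 - gap_below_one < 1.0
assert modelmax < 1.0

# Smallest positive normal double; denormalized values extend below it.
smallest_normal = sys.float_info.min
assert smallest_normal < modelmin
```

A lower bound can therefore sit many orders of magnitude closer to zero than an upper bound can sit to one, which is consistent with the values chosen in Table 5.4.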

| Parameter(s) | ``Off'' (-) values | ``On'' (x) values |
|---|---|---|
| `modelmin`, `modelmax` | 0.0, 1.0 | 1e-100, 0.99999998 |
| `margin` | 0.0 | 0.001 |
| `rrlambda` | 0.0 | 10.0 |
| `cgwindow`, `cgdecay` | 1000, 1000 | 3, 2 |

The columns of the stability experiment tables are arranged in four
groups. The first group has ``-'' and ``x'' symbols for each of the
binarized parameters, or pairs of parameters. A ``-'' indicates that the
parameter or pair of parameters was set to its ``off'' state as
defined in Table 5.4, while ``x'' indicates the
``on'' state from the same table. The `mm` column represents the
state of the pair `modelmin` and `modelmax`, the `mar` column represents
`margin`, `rrl` represents `rrlambda`, and `cgw` represents the pair `cgwindow` and
`cgdecay`.
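
The mapping from the first column group to concrete parameter values can be sketched as follows. This is an illustration only: the dictionaries copy the values from Table 5.4, but the helper `expand`, the data layout, and the assumption that a table enumerates all sixteen combinations in this order are ours, not taken from the experiments' actual code.

```python
from itertools import product

# Column abbreviations used in the first group of the stability tables.
COLUMNS = ("mm", "mar", "rrl", "cgw")

# "Off" (-) and "on" (x) values copied from Table 5.4.
OFF = {
    "mm": {"modelmin": 0.0, "modelmax": 1.0},
    "mar": {"margin": 0.0},
    "rrl": {"rrlambda": 0.0},
    "cgw": {"cgwindow": 1000, "cgdecay": 1000},
}
ON = {
    "mm": {"modelmin": 1e-100, "modelmax": 0.99999998},
    "mar": {"margin": 0.001},
    "rrl": {"rrlambda": 10.0},
    "cgw": {"cgwindow": 3, "cgdecay": 2},
}


def expand(flags):
    """Map a tuple of '-'/'x' flags, one per column, to parameter values."""
    params = {}
    for column, flag in zip(COLUMNS, flags):
        params.update(ON[column] if flag == "x" else OFF[column])
    return params


# Hypothetical enumeration: every on/off combination of the four columns.
rows = [(flags, expand(flags)) for flags in product("-x", repeat=len(COLUMNS))]

for flags, params in rows:
    print(" ".join(flags), params)
```

Under these assumptions a row marked `- x - x` would run with the model bounds and ridge penalty disabled, `margin` set to 0.001, and `cgwindow` and `cgdecay` set to 3 and 2.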

The second, third and fourth groups of columns represent the
performance attained when the stability parameters are set as
indicated by the first group of columns. The title ``Loose Epsilon''
above the second group indicates that `cgeps` and `lreps` were set to
the rather large values 0.1 and 0.5, respectively. The third group
uses moderate epsilons, with `cgeps` set to 0.001 and `lreps` set to
0.1. The fourth group has ``tight'' epsilons, with `cgeps` set to
0.000001 and `lreps` set to 0.0001. The sub-columns of each group
represent the AUC score, whether NaN values were encountered
during computation, the minimum average deviance achieved during the
ten folds of the cross-validation, and the number of real seconds
elapsed during computation. We do not provide confidence intervals
for the scores because the focus is on stability and not on optimal
performance or speed. Our indication of stable computations is a
good score and a good speed, as judged against other results in the
same table.
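
For readers who want to tabulate similar experiments, the three epsilon groups and their four sub-columns can be written down as a small data structure. This is a sketch only: the epsilon values are those given above, while the class name `Cell` and its field names are hypothetical and do not come from the original software.

```python
from dataclasses import dataclass

# Termination epsilons for the second, third and fourth column groups.
EPSILON_GROUPS = {
    "Loose Epsilon": {"cgeps": 0.1, "lreps": 0.5},
    "Moderate Epsilon": {"cgeps": 0.001, "lreps": 0.1},
    "Tight Epsilon": {"cgeps": 0.000001, "lreps": 0.0001},
}


@dataclass
class Cell:
    """One group of sub-columns for a single row (hypothetical layout)."""
    auc: float               # AUC score
    saw_nan: bool            # whether NaN values were encountered
    min_avg_deviance: float  # minimum average deviance over the ten folds
    seconds: float           # real seconds elapsed


# Sanity check: each group is strictly tighter than the one before it.
groups = list(EPSILON_GROUPS.values())
for looser, tighter in zip(groups, groups[1:]):
    assert tighter["cgeps"] < looser["cgeps"]
    assert tighter["lreps"] < looser["lreps"]
```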

The purpose of the Loose Epsilon, Moderate Epsilon and Tight Epsilon groups is to explore how well the stability parameters compensate for different optimality criteria. Once we have analyzed the stability parameters using their binarized values, we can explore optimal settings. After this work with the stability parameters is finished, Section 5.2.2 will focus on finding widely applicable termination criteria that balance optimality and speed.