next up previous contents
Next: 5.2.1.5 Basic Stability Test: Up: 5.2.1 Indirect (IRLS) Stability Previous: 5.2.1.3 Basic Stability Test:   Contents


5.2.1.4 Basic Stability Test: imdb


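Table 5.6: IRLS stability experiments for imdb. binitmean is disabled and wmargin is 0. The first four columns give the state of modelmin and modelmax (mm), margin (mar), rrlambda (rrl), and cgwindow and cgdecay (cgw). An x marks an enabled parameter or, in the NaN column, that NaN values occurred during computation; a - marks the opposite.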
                |     Loose Epsilon      |    Moderate Epsilon    |     Tight Epsilon
mm  mar rrl cgw | AUC    NaN  DEV   Time | AUC    NaN  DEV   Time | AUC    NaN  DEV   Time
-   -   -   -   | 0.965  -    9745  120  | 0.972  -    9236  146  | 0.908  x    1910  7576
x   -   -   -   | 0.965  -    9745  120  | 0.972  -    9236  145  | 0.904  x    1655  7840
-   x   -   -   | 0.965  -    8776  120  | 0.971  -    8366  148  | 0.911  x    1347  7570
x   x   -   -   | 0.965  -    8776  121  | 0.971  -    8366  133  | 0.907  x    1154  7844
-   -   x   -   | 0.965  -    9745  120  | 0.972  -    9236  149  | 0.983  -    3469  932
x   -   x   -   | 0.965  -    9745  121  | 0.972  -    9236  133  | 0.983  -    3469  930
-   x   x   -   | 0.965  -    8776  120  | 0.971  -    8367  135  | 0.983  -    2915  873
x   x   x   -   | 0.965  -    8776  121  | 0.971  -    8367  135  | 0.983  -    2915  874
-   -   -   x   | 0.965  -    9745  117  | 0.972  -    9236  134  | 0.971  x    2947  693
x   -   -   x   | 0.965  -    9745  120  | 0.972  -    9236  148  | 0.961  x    1762  922
-   x   -   x   | 0.965  -    8776  120  | 0.971  -    8366  146  | 0.973  x    2292  789
x   x   -   x   | 0.965  -    8776  121  | 0.971  -    8366  149  | 0.968  x    1362  985
-   -   x   x   | 0.965  -    9745  119  | 0.972  -    9236  148  | 0.983  -    3469  889
x   -   x   x   | 0.965  -    9745  120  | 0.972  -    9236  148  | 0.983  -    3469  892
-   x   x   x   | 0.965  -    8776  121  | 0.971  -    8367  147  | 0.983  -    2915  834
x   x   x   x   | 0.965  -    8776  119  | 0.971  -    8367  150  | 0.983  -    2915  841

The next dataset to consider is imdb, and the experiments are summarized in Table 5.6. The modelmin and modelmax parameters make a difference only in the Tight Epsilon group with rrlambda disabled, where enabling them decreases both the deviance and the AUC (for example, from 0.908 to 0.904 in the first two rows). We can see that margin affects the deviance in every epsilon range, but produces only small variations in the AUC scores and speed. The rrlambda parameter has a noticeable effect only in the tight epsilon experiments, where it raises the AUC to 0.983 in every case, apparently by preventing over-fitting. In the tight epsilon range we also see cgwindow and cgdecay improving AUC and time significantly for experiments with rrlambda deactivated (times fall from over 7500 to under 1000), and producing a smaller decrease in time where rrlambda is activated. NaN values are present during computation only in the Tight Epsilon group, and activating rrlambda eliminates them.
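The comparisons in the paragraph above can be checked mechanically. The following is a minimal sketch, not part of the original experiments: it copies the Tight Epsilon columns of Table 5.6 into a list and averages AUC and time for each state of one parameter. The names TIGHT, FLAGS and summarize are hypothetical and introduced only for this illustration; 1 stands for x and 0 for - in the table.

```python
from statistics import mean

# Tight Epsilon columns of Table 5.6, one tuple per row:
# (mm, mar, rrl, cgw, AUC, NaN occurred, DEV, Time)
TIGHT = [
    (0, 0, 0, 0, 0.908, True, 1910, 7576),
    (1, 0, 0, 0, 0.904, True, 1655, 7840),
    (0, 1, 0, 0, 0.911, True, 1347, 7570),
    (1, 1, 0, 0, 0.907, True, 1154, 7844),
    (0, 0, 1, 0, 0.983, False, 3469, 932),
    (1, 0, 1, 0, 0.983, False, 3469, 930),
    (0, 1, 1, 0, 0.983, False, 2915, 873),
    (1, 1, 1, 0, 0.983, False, 2915, 874),
    (0, 0, 0, 1, 0.971, True, 2947, 693),
    (1, 0, 0, 1, 0.961, True, 1762, 922),
    (0, 1, 0, 1, 0.973, True, 2292, 789),
    (1, 1, 0, 1, 0.968, True, 1362, 985),
    (0, 0, 1, 1, 0.983, False, 3469, 889),
    (1, 0, 1, 1, 0.983, False, 3469, 892),
    (0, 1, 1, 1, 0.983, False, 2915, 834),
    (1, 1, 1, 1, 0.983, False, 2915, 841),
]

# Position of each parameter flag within a row tuple.
FLAGS = {"mm": 0, "mar": 1, "rrl": 2, "cgw": 3}


def summarize(flag, rows=TIGHT):
    """Average AUC and time, and count NaN runs, for flag off (0) and on (1)."""
    idx = FLAGS[flag]
    result = {}
    for state in (0, 1):
        group = [r for r in rows if r[idx] == state]
        result[state] = {
            "auc": mean(r[4] for r in group),
            "time": mean(r[7] for r in group),
            "nan_runs": sum(1 for r in group if r[5]),
        }
    return result


if __name__ == "__main__":
    for flag in FLAGS:
        print(flag, summarize(flag))
```

Averaging over all rows hides interactions between parameters (for instance, the effect of cgw depends strongly on whether rrl is enabled), so a summary like this is only a first look; restricting `rows` to a subset gives the conditional comparisons discussed in the text.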

Our conclusions are that modelmin and modelmax should not be enabled for the imdb dataset, that margin has little effect on AUC or speed, and that rrlambda, cgwindow and cgdecay are useful in preventing out-of-control calculations. It appears that the moderate epsilon experiments are better behaved with the imdb dataset than they were with ds1. Considering that the cgeps parameter is multiplied by the number of attributes, and that imdb has over twenty-five times as many attributes as ds1, we might surmise that this scaling of cgeps is too aggressive.
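To make the scaling argument concrete, here is a small sketch of how a tolerance that is multiplied by the number of attributes grows with dataset width. The function name, the cgeps value, and the attribute counts are all hypothetical placeholders chosen only so that the ratio between the two datasets is 25; they are not the actual settings or dataset sizes used in these experiments.

```python
def effective_cg_tolerance(cgeps, num_attributes):
    """Tolerance after scaling cgeps by the number of attributes (assumed form)."""
    return cgeps * num_attributes


# Hypothetical values for illustration only.
CGEPS = 0.001                       # placeholder cgeps setting
NARROW_ATTRS = 1000                 # placeholder for a ds1-like attribute count
WIDE_ATTRS = 25 * NARROW_ATTRS      # a dataset 25 times wider, as in the text

narrow_tol = effective_cg_tolerance(CGEPS, NARROW_ATTRS)
wide_tol = effective_cg_tolerance(CGEPS, WIDE_ATTRS)

if __name__ == "__main__":
    print("narrow dataset tolerance:", narrow_tol)
    print("wide dataset tolerance:  ", wide_tol)
    print("ratio:", wide_tol / narrow_tol)
```

Under this linear scaling the same cgeps setting yields a tolerance 25 times looser on the wider dataset, which is consistent with the surmise that the scaling may be too aggressive.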


Copyright 2004 Paul Komarek, komarek@cmu.edu