
## 5.1.6 Scope

In the sections and subsections that follow, we explore many ways to
improve the stability, accuracy and speed of LR computations. This
presentation is broken into two sections for our two LR parameter
estimation methods. Section 5.2 discusses
variations on IRLS with CG. We will refer to this combination simply
as IRLS. Section 5.3 discusses variations on
MLE, where CG is the numerical method used to find the optimum
parameter estimate. This combination will be called CG-MLE. We will
be using the datasets, scoring method, and computing platform
described above.

For the rest of this chapter, "parameters" will no longer refer to
the LR parameters. The parameters discussed below are
implementation parameters that control which variations of LR
computations are being used or how the computations proceed. For
example, the `modelmax` parameter makes an adjustment to the LR
expectation function, while the `cgeps` parameter is an error bound
used for termination of CG iterations. Our goal in exploring these
variations is to choose an implementation that is stable, correct,
fast and autonomous. Since an autonomous classifier cannot require
humans to micro-manage run-time parameters, we will seek default
settings which meet our stability, correctness and speed goals on a
wide variety of datasets. The six real-world datasets described in
Section 5.1.3 will be used to evaluate our
implementation and support our decisions.
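To make the distinction between LR parameters and implementation parameters concrete, the following is a minimal sketch, not the implementation evaluated in this chapter. It assumes that `modelmax` acts as an upper clamp on the logistic expectation and that `cgeps` bounds the Euclidean norm of the CG residual; the exact semantics of both parameters are given in the sections that follow. The `ImplParams` container, the function names, and the numeric values are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ImplParams:
    """Hypothetical container for implementation parameters (placeholder values)."""
    modelmax: float = 0.999  # assumption: upper clamp on the LR expectation
    cgeps: float = 1e-6      # assumption: bound on the CG residual norm


def lr_expectation(eta: float, params: ImplParams) -> float:
    """Logistic expectation 1 / (1 + exp(-eta)), clamped above at modelmax."""
    if eta >= 0:
        mu = 1.0 / (1.0 + math.exp(-eta))
    else:
        # Rewritten form avoids overflow in exp() for very negative eta.
        e = math.exp(eta)
        mu = e / (1.0 + e)
    return min(mu, params.modelmax)


def _dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def _matvec(A, v):
    return [_dot(row, v) for row in A]


def conjugate_gradient(A, b, params: ImplParams, max_iters=None):
    """Solve A x = b for symmetric positive definite A (list of rows).

    Iterations stop once the residual norm is at most cgeps, or after
    max_iters iterations as a safety limit.
    """
    n = len(b)
    if max_iters is None:
        max_iters = 10 * n
    x = [0.0] * n
    r = list(b)          # residual b - A x for the starting point x = 0
    d = list(r)          # first search direction
    rr = _dot(r, r)
    iters = 0
    while math.sqrt(rr) > params.cgeps and iters < max_iters:
        Ad = _matvec(A, d)
        alpha = rr / _dot(d, Ad)
        x = [xi + alpha * di for xi, di in zip(x, d)]
        r = [ri - alpha * adi for ri, adi in zip(r, Ad)]
        rr_new = _dot(r, r)
        d = [ri + (rr_new / rr) * di for ri, di in zip(r, d)]
        rr = rr_new
        iters += 1
    return x, iters


params = ImplParams()
solution, n_iters = conjugate_gradient([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0], params)
```

Grouping the settings in one immutable object mirrors the goal of an autonomous classifier: callers pass defaults through unchanged rather than tuning each run by hand.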

In the IRLS and CG-MLE sections we divide the implementation
parameters into three categories, according to their proposed purpose.
These categories are

- controlling the stability of computations
- controlling termination and optimality of the final solution
- enhancing speed

Many of the parameters belong to multiple categories. For example,
proper termination of CG requires numerically stable iterations, and
hence depends on stability parameters. Each parameter will be
discussed in the context that motivated its inclusion in our
experiments.

The parameters in each category will be thoroughly tested for
effectiveness. Parameters that consistently enhance performance for
all of the datasets will have default values assigned. These defaults
will be chosen after further empirical evaluation, with optimality of
the AUC score preferred over speed. Each section ends with a
summary of the useful techniques and the default values chosen for the
corresponding parameters. Our final LR implementations will be
characterized and compared in Chapter 6.
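The preference for AUC over speed when choosing defaults can be illustrated with a small selection rule. This is only a sketch under stated assumptions: the `Trial` record, the `pick_default` function, and the `auc_tol` tolerance are hypothetical names introduced here, and the numbers in the usage lines are invented for illustration, not experimental results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trial:
    """Hypothetical record of one candidate setting evaluated on a dataset."""
    value: float    # candidate value of an implementation parameter
    auc: float      # AUC score obtained with this value
    seconds: float  # running time obtained with this value


def pick_default(trials, auc_tol=0.0):
    """Return the candidate value with the best AUC, using time only as a tie-breaker.

    Candidates whose AUC is within auc_tol of the best AUC are treated as
    equally accurate; among those, the fastest one is chosen.
    """
    best_auc = max(t.auc for t in trials)
    near_best = [t for t in trials if best_auc - t.auc <= auc_tol]
    return min(near_best, key=lambda t: t.seconds).value


# Illustrative numbers only.
trials = [
    Trial(value=1e-3, auc=0.95, seconds=12.0),
    Trial(value=1e-4, auc=0.95, seconds=20.0),
    Trial(value=1e-2, auc=0.90, seconds=5.0),
]
chosen = pick_default(trials)
```

With `auc_tol=0.0` the fastest setting is ignored unless it also attains the best AUC, which matches the stated priority of score optimality over speed.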

Copyright 2004 Paul Komarek, komarek@cmu.edu