4.2 Maximum Likelihood Estimation

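Recall that the outcome $y$ is a Bernoulli random variable with mean $\mu(x, \beta)$ in the LR model. Therefore we may interpret the expectation function as the probability that $y = 1$, or equivalently that $x$ belongs to the positive class. Thus we may compute the probability of the $i$th experiment $x_i$ and outcome $y_i$ in the dataset $X, y$ as

\begin{align}
P(x_i, y_i \mid \beta) &= \begin{cases} \mu(x_i, \beta) & \text{if } y_i = 1 \\ 1 - \mu(x_i, \beta) & \text{if } y_i = 0 \end{cases} \tag{4.5} \\
&= \mu(x_i, \beta)^{y_i} \, \bigl(1 - \mu(x_i, \beta)\bigr)^{1 - y_i} \tag{4.6}
\end{align}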

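From this expression, treating the experiments as independent, we may derive the likelihood and log-likelihood of the data $X, y$ under the LR model with parameters $\beta$ as

\begin{align*}
L(X, y, \beta) &= \prod_{i} \mu(x_i, \beta)^{y_i} \, \bigl(1 - \mu(x_i, \beta)\bigr)^{1 - y_i} \\
\ln L(X, y, \beta) &= \sum_{i} \Bigl( y_i \ln \mu(x_i, \beta) + (1 - y_i) \ln \bigl(1 - \mu(x_i, \beta)\bigr) \Bigr)
\end{align*}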
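As a concrete illustration, the following minimal sketch evaluates the per-experiment probability in both its case-by-case and exponent forms, and then the likelihood and log-likelihood of a small dataset. It assumes the usual logistic form for the expectation function, $\mu(x, \beta) = 1 / (1 + e^{-x^T \beta})$. The function names and the toy data are hypothetical and chosen only for illustration.

```python
import math

def mu(x, beta):
    """Assumed logistic expectation function: 1 / (1 + exp(-x . beta))."""
    z = sum(xj * bj for xj, bj in zip(x, beta))
    return 1.0 / (1.0 + math.exp(-z))

def prob_cases(x, y, beta):
    """Case-by-case form: mu if y == 1, otherwise 1 - mu."""
    m = mu(x, beta)
    return m if y == 1 else 1.0 - m

def prob_power(x, y, beta):
    """Exponent form: mu^y * (1 - mu)^(1 - y)."""
    m = mu(x, beta)
    return m ** y * (1.0 - m) ** (1 - y)

def likelihood(X, y, beta):
    """Product of the per-experiment probabilities."""
    result = 1.0
    for xi, yi in zip(X, y):
        result *= prob_power(xi, yi, beta)
    return result

def log_likelihood(X, y, beta):
    """Sum of y_i * ln(mu_i) + (1 - y_i) * ln(1 - mu_i)."""
    total = 0.0
    for xi, yi in zip(X, y):
        m = mu(xi, beta)
        total += yi * math.log(m) + (1 - yi) * math.log(1.0 - m)
    return total

# Toy data (made up): the first column is a constant 1 for the intercept.
X = [[1.0, 0.5], [1.0, -1.2], [1.0, 2.0], [1.0, 0.1]]
y = [1, 0, 1, 0]
beta = [0.0, 1.0]

L = likelihood(X, y, beta)
ll = log_likelihood(X, y, beta)
print(L, ll)
```

The product form is convenient for derivations, but in practice the log-likelihood is usually the quantity computed, since a product of many probabilities underflows quickly on larger datasets.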

The likelihood and log-likelihood functions are nonlinear in $\beta$, and their maximizer cannot be found analytically. Therefore numerical methods are typically used to find the MLE $\hat{\beta}$. CG is a popular choice, and by some reports CG provides results as good as or better than those of any other numerical method tested to date for this task [27]. The time complexity of this approach is simply the time complexity of the numerical method used.
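To make the numerical approach concrete, here is a minimal sketch of a nonlinear conjugate gradient search for the MLE. It minimizes the negative log-likelihood using a Polak-Ribiere update with restarts and a simple backtracking line search. These choices are assumptions made for illustration and are not necessarily the CG variant evaluated in [27]. The logistic form of $\mu$, the helper names, and the toy data are likewise hypothetical.

```python
import math

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def mu(x, beta):
    """Assumed logistic expectation function, evaluated without overflow."""
    z = dot(x, beta)
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def neg_log_likelihood(X, y, beta):
    """-ln L, using the identity -ln L_i = ln(1 + e^z_i) - y_i * z_i."""
    total = 0.0
    for xi, yi in zip(X, y):
        z = dot(xi, beta)
        softplus = max(z, 0.0) + math.log1p(math.exp(-abs(z)))
        total += softplus - yi * z
    return total

def gradient(X, y, beta):
    """Gradient of -ln L: sum over i of (mu_i - y_i) * x_i."""
    g = [0.0] * len(beta)
    for xi, yi in zip(X, y):
        r = mu(xi, beta) - yi
        for j, xij in enumerate(xi):
            g[j] += r * xij
    return g

def fit_cg(X, y, iters=500, tol=1e-6):
    beta = [0.0] * len(X[0])
    g = gradient(X, y, beta)
    d = [-gj for gj in g]
    for _ in range(iters):
        if math.sqrt(dot(g, g)) < tol:
            break
        if dot(g, d) >= 0:
            d = [-gj for gj in g]  # restart with steepest descent
        # Backtracking line search with an Armijo sufficient-decrease test.
        f0 = neg_log_likelihood(X, y, beta)
        slope = dot(g, d)
        step = 1.0
        while step > 1e-12:
            trial = [b + step * dj for b, dj in zip(beta, d)]
            if neg_log_likelihood(X, y, trial) <= f0 + 1e-4 * step * slope:
                break
            step *= 0.5
        beta = [b + step * dj for b, dj in zip(beta, d)]
        g_new = gradient(X, y, beta)
        # Polak-Ribiere coefficient, clipped at zero.
        diff = [a - b for a, b in zip(g_new, g)]
        pr = max(0.0, dot(g_new, diff) / dot(g, g))
        d = [-gn + pr * dj for gn, dj in zip(g_new, d)]
        g = g_new
    return beta

# Toy data (made up, not linearly separable, so a finite MLE exists).
X = [[1.0, v] for v in (-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0)]
y = [0, 0, 1, 0, 1, 0, 1, 1]

beta_hat = fit_cg(X, y)
print(beta_hat, neg_log_likelihood(X, y, beta_hat))
```

Each iteration costs one gradient evaluation plus a few function evaluations, each linear in the number of nonzero entries of $X$, so the overall cost is governed by how many iterations the chosen numerical method needs.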