14.2 The Information on Information Criterion
Building off the previous work, one approach to model assessment examines the number of parameters in the models \(f\) and \(g\) and evaluates the tradeoff between model complexity (i.e. the number of parameters used) and the overall likelihood. Information criteria quantify this tradeoff, with the goal of determining the best approximating model.
There are several types of information criteria, but we are going to focus on two:
\[\begin{equation} AIC = -2 LL_{max} + 2 P \tag{14.1} \end{equation}\]
\[\begin{equation} BIC = -2 LL_{max} + P \ln (N). \tag{14.2} \end{equation}\] In Equations (14.1) and (14.2), \(N\) is the number of data points, \(P\) is the number of estimated parameters, and \(LL_{max}\) is the log-likelihood for the parameter set that maximized the likelihood function. In both cases, a lower value of the information criterion indicates greater support for the model from the data. Both Equations (14.1) and (14.2) show the dependence on the log-likelihood function and the number of parameters.
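To make Equations (14.1) and (14.2) concrete, here is a minimal R sketch that computes both criteria directly from a fitted model's log-likelihood. The simulated variables `x` and `y` are hypothetical stand-ins, not data from the text:

```r
# A sketch of Equations (14.1) and (14.2) on simulated data
set.seed(14)
x <- 1:50
y <- 0.5 * x + rnorm(50)          # linear trend plus noise

fit <- lm(y ~ 1 + x)              # fit a simple linear model

ll_max <- as.numeric(logLik(fit)) # log-likelihood at the maximum
p <- attr(logLik(fit), "df")      # number of estimated parameters
                                  # (intercept, slope, residual sd)
n <- length(y)                    # number of data points

aic <- -2 * ll_max + 2 * p        # Equation (14.1)
bic <- -2 * ll_max + p * log(n)   # Equation (14.2)

# These agree with R's built-in AIC() and BIC():
c(aic - AIC(fit), bic - BIC(fit))
```

Note that `p` here counts the residual standard deviation as an estimated parameter, which is how R's `logLik` and `AIC` account for parameters in a linear model.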
Let’s evaluate how the AIC and BIC compare for the global temperature data. Once we have a statistical model fit, these are fairly easy to compute.
For you R purists, you could also use the functions `AIC` or `BIC`. To apply them you need to first do the model fit (with the function `lm`)⁴:

```r
regression_formula <- globalTemp ~ 1 + yearSince1880
fit <- lm(regression_formula, data = global_temperature)
AIC(fit)
## [1] -93.83421
BIC(fit)
## [1] -85.11838
```
We can then make a table comparing the different models and their AIC:
| Model     | AIC      |
|-----------|----------|
| Linear    | -93.834  |
| Quadratic | -140.769 |
| Cubic     | -168.98  |
| Quartic   | -167.198 |
These results show that the cubic model is the best approximating model.
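A table like the one above can be assembled programmatically by fitting polynomials of increasing degree and collecting each model's AIC. The sketch below uses simulated data in place of the global temperature data, so the variable names and resulting AIC values are illustrative assumptions only:

```r
# Compare polynomial models of degree 1 through 4 by AIC,
# using simulated data (hypothetical stand-in for global_temperature)
set.seed(88)
year <- 0:140
temp <- 0.2 + 1e-4 * year^2 + rnorm(length(year), sd = 0.1)

degrees <- 1:4
aic_values <- sapply(degrees, function(d) AIC(lm(temp ~ poly(year, d))))

data.frame(degree = degrees, AIC = aic_values)

# The degree with the lowest AIC is the best approximating model:
degrees[which.min(aic_values)]
```

Because each added polynomial term costs \(2P\) in Equation (14.1), a higher-degree model is only favored when the improvement in log-likelihood outweighs that penalty.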
⁴ You can compute the log-likelihood with the function `logLik(fit)`, where `fit` is the result of your linear model fit.↩︎