Introduction to Likelihood-based methods

5779dc3bf8c8a0c9dadb7ff95c67e9e3?s=47 Corey Chivers
May 03, 2013
410

Introduction to Likelihood-based methods

5779dc3bf8c8a0c9dadb7ff95c67e9e3?s=128

Corey Chivers

May 03, 2013
Tweet

Transcript

  1. 2.

    Script Format • The script is divided into sections: ##@

    x.x @## … ...some commands... … ### -- ###
  2. 3.

    ##@ 0.1 @## • We will need to install a

    package: ##@ 0.1 @## rm(list=ls()) # Housekeeping install.packages('emdbook') # Bolker's data library(emdbook) source('http://madere.biol.mcgill.ca/cchivers/ccMLE.R') ### -- ###
  3. 4.

    The Likelihood Principle L |x∝ P X=x| • All information

    contained in data x, with respect to inference about the value of θ, is contained in the likelihood function:
  4. 6.

    The Likelihood Function L |x∝ P X=x| Where θ is(are)

    our parameter(s) of interest ex: Attack rate Fitness Mean body mass Mortality etc... L |x=f | x
  5. 8.

    The Ecologist's Quarter • 1) Before flipping, what is the

    probability that I will flip tails, given that I am flipping an ecologist's quarter (p(tail=0.6))? • 2) What is the likelihood that I am flipping an ecologist's quarter, given the flip(s) that I have observed? Px |=0.6 L=0.6| x
  6. 9.

    L |x=∏ t=1 T  ∏ h=1 H 1− The

    Ecologist's Quarter L=0.6| x=H T TH T =∏ t=1 3 0.6∏ h=1 2 0.4 =0.03456
  7. 10.

    L |x=∏ t=1 T  ∏ h=1 H 1− The

    Ecologist's Quarter L=0.6| x=H T TH T =∏ t=1 3 0.6∏ h=1 2 0.4 =0.03456 But what does this mean? 0.03456 ≠ P(θ) !!!!
  8. 11.

    ##@ 1.1-2 @## • @1.1 defines the likelihood function •

    @1.2 flips an ecologist's quarter n times, and plots the likelihood function. • Vary the number of flips (n), re-run, and see what happens. • Vary the true θ parameter and see what happens. • Show your neighbour your likelihood plot and see if they can guess θ. How about n?
  9. 12.
  10. 13.

    Likelihood So Far... • The likelihood function is just the

    product of the individual likelihoods of each observation. • The likelihood function carries all information contained in the data relevant to inference about parameters. • Likelihood is a REALATIVE measure, and gets smaller as more data is observed. • The 'best fitting' parameter value is the one that maximizes this function.
  11. 14.

    Maximum Likelihood Estimation (MLE) • MLE is a process of

    fitting model parameters to data. → “What are the parameter values which are most likely to have generated my data?” • For many models, the maximum likelihood value can not be found analytically. • In these cases, we use numerical optimization. >?optim
  12. 15.

    Computers hate really small numbers • As we have seen,

    as n gets larger, the likelihood can become very small. • Instead of maximizing the likelihood function itself, we maximize the log likelihood. log ∏ x=∑ log x
  13. 16.

    ##@ 1.3 @## >m_eco_q<-optim(par=0.5, fn=log_likelihood_ecol_quarter, lower=0,upper=1, # range of params

    to search method='Brent', # Use this method for 1D models control=list("fnscale"=-1), hessian=TRUE, q_flips=flips) Where the eff is this peak?
  14. 17.

    ##@ 1.3 @## $par [1] 0.5 $value [1] -6.931472 $counts

    function gradient NA NA $convergence [1] 0 $message NULL $hessian [,1] [1,] -40.00032 Best fit parameter value (aka the MLE) Log likelihood at the MLE If ($convergence == 0) user = :-)
  15. 18.

    ##@ 1.3 @## >confint(m_eco_q) 2.5 % 97.5 % par 1

    0.1901037 0.8098963 WARNING: Assumes that the L(θ|x) is normally distributed. Usually safe with large n.
  16. 19.

    Formulating Likelihood Functions • Okay, that was easy, but how

    can I determine the likelihood function for the way I think the world works (aka my model)? • A Model is just a mathematical representation of a process, often including both Deterministic and, Stochastic components. • The likelihood function comes straight from the specification of the model.
  17. 20.

    Example: Tadpole Predation ##@ 2.1 @## > data(ReedfrogFuncresp) # Load

    Frog predation data > ReedfrogFuncresp Initial Killed 1 5 1 2 5 2 3 10 5 4 10 6 5 15 10 6 15 9 7 20 7 8 20 10 ...
  18. 21.
  19. 22.

    Example: Tadpole Predation • Suppose tadpole predators have a Holling

    type- II functional response: • Realized number eaten is binomial with probability p: p= a 1ahN k ~Binom p, N Deterministic Stochastic
  20. 24.

    Example: Tadpole Predation #@2.3 Function Name Arguments Parameters Data binomLL2

    <- function(pars,data) { a = pars[1] h = pars[2] predprob = a/(1+a*h*data$N) sum(dbinom(data$k,prob=predprob, size=data$N, log=TRUE)) }
  21. 25.

    Example: Tadpole Predation #@2.3 Parameters binomLL2 <- function(pars,data) { a

    = pars[1] h = pars[2] predprob = a/(1+a*h*data$N) sum(dbinom(data$k,prob=predprob, size=data$N, log=TRUE)) }
  22. 26.

    Example: Tadpole Predation #@2.3 Deterministic component binomLL2 <- function(pars,data) {

    a = pars[1] h = pars[2] predprob = a/(1+a*h*data$N) sum(dbinom(data$k,prob=predprob, size=data$N, log=TRUE)) }
  23. 27.

    binomLL2 <- function(pars,data) { a = pars[1] h = pars[2]

    predprob = a/(1+a*h*data$N) sum(dbinom(data$k,prob=predprob, size=data$N, log=TRUE)) } Example: Tadpole Predation #@2.3 Stochastic Component
  24. 28.

    ##@ 2.4-6 @## • Compute the maximum likelihood values of

    the model parameters. (@2.4) • Compute the approximate confidence interval and plot the likelihood surface (@2.5) • Plot the fitted model prediction and (stochastic) confidence intervals over the data. (@2.6)
  25. 29.

    ##@ 2.4 @## >m2 #Output of MLE routine $par [1]

    0.52593924 0.01660613 $value [1] -46.72136 $counts function gradient 53 NA $convergence [1] 0 $message NULL $hessian [,1] [,2] [1,] -616.2803 7408.079 [2,] 7408.0792 -131327.243 Best fit parameter values (aka the MLE) Log likelihood at the MLE
  26. 30.

    ##@ 2.5 @## > confint(m2,parnames = c("a","h")) 2.5 % 97.5

    % a 0.386789631 0.66508885 h 0.007073913 0.02613834 > domain<-confint(m2) > surface_plot(binomLL2,domain=domain,data=frogs)
  27. 33.

    But what if this isn't the right model? • How

    would we know? • Since the model space is infinite, all we can do is propose a suite of models, and allow them to compete in the arena (octagon?) of data. Model 1 Model 2
  28. 34.

    Extension: Model Selection • Until now, we have been concerned

    with finding the best fitting parameter values, but what if we have competing models? • These competing models may not have the same number of parameters. LM |, x=?
  29. 35.

    Extension: Model Selection • Akaike Information Criteria: • k=number of

    parameters L=Maximum likelihood value • Better models have lower AIC values • Just like the likelihood itself, AIC is a RELATIVE measure. AIC=2k−2logL Hirotugo Akaike
  30. 36.

    Akaike Weights • Step 1: Compute the MLE for each

    candidate model • Step 2: Compute each model's difference, Δ i , from the 'best' model (lowest AIC value) • Step 3: Compute w i w i = exp−1/2 i  ∑ m=1 M exp−1/2 m 
  31. 37.

    ##@ 3.1-4 @## • Define the likelihood functions for two

    alternative models. (@3.1) • Try to write down the mathematical formulation of each model. • Compute the AIC for each of the three competing models. (@3.2) • Compute the model weights for each of the three competing models. (@3.3) • Which is the best fitting model? • How will these values vary if we were to add more competing models to our set? • Plot the predictions of each model (@3.4)
  32. 40.

    Summary • The likelihood function is simply the product of

    the individual likelihoods of each observation. • Likelihood is a REALATIVE measure, and gets smaller as more data is observed. • The 'best fitting' parameter value is the one that maximizes the likelihood function (MLE). • Numerical optimization can be used to find the MLE. • Competing models can be compared using AIC, which is based on each model's MLE.