
Markov Chains and Markov State Models in Molecular Dynamics

An introduction to Markov chains, and their use in molecular dynamics simulations.

Mike O'Connor

April 11, 2018

Transcript

1. Outline

We will cover:
• Markov Chains by example of the drunken walk.
• Some of the key mathematical results (for MSMs).
• Markov State Models for molecular dynamics.
• Some of the stuff I found confusing.

This will not be:
• How to make a good Markov Model.
• Proofs of all the things.

2. Reading

These slides are an agglomeration and synthesis of the following:
• Chapters 2, 3, and 4.1 of Takis Konstantopoulos's lecture notes.
• Wikipedia on Markov chains is pretty good!

Markov State Models for molecular dynamics:
• Bowman, G. R. (2014). Advances in Experimental Medicine and Biology, 797, 7–22.
• Pande, V. S., Beauchamp, K., & Bowman, G. R. (2010). Methods, 52(1), 99–105.
• Noé, F., & Fischer, S. (2008). Current Opinion in Structural Biology, 18(2), 154–162.

3. Markov Chains

• Model a stochastic process by assuming there is no (or little) memory.
• Used in lots of applications:
  • Speech recognition (Siri etc.).
  • Text prediction.
  • Molecular dynamics.
  • Google search.

4. A Drunken Walk: Discrete Space and Time

[Figure: a row of six states, labelled 0 to 5.]

Markov Property: Where the drunk goes next depends only on their current position.

5. A Drunken Walk: Transition Probabilities

[Figure: the six states, with transition probabilities 0.8, 0.2, 0.9, 0.1, 0.1, 0.9, 0.7, 0.3, 0.5, 0.5, 0.5, 0.5 labelling the edges.]

Transition Probability: Given the drunk is in state $i$ at step $n$, the probability that they will transition to another state at step $n + 1$.

Intuition check: Starting in state 0, what's the probability of state 1 one step later, then state 2 one step after that?

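By the Markov property the two steps factorize, so the answer to the intuition check is just the product of the two one-step transition probabilities:

$$P(X_1 = 1, X_2 = 2 \mid X_0 = 0) = p_{01}\, p_{12}$$

(using the $p_{ij}$ notation introduced on the next slide).
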
6. Mathematical Notation

• $X_n$: Random variable for the state at step $n$.
• $P(X_{n+1} = j)$: The probability of being in state $j$ at step $n + 1$.

Markov Property again:

$$P(X_{n+1} = x_{n+1} \mid X_n = x_n, X_{n-1} = x_{n-1}, \dots, X_0 = x_0) = P(X_{n+1} = x_{n+1} \mid X_n = x_n)$$

Shorthand (time homogeneous):

$$p_{ij} \equiv P(X_{n+1} = j \mid X_n = i)$$

7. Joint Probability and Initial Conditions

The evolution of a chain follows our intuitive picture (proof?):

$$P(X_0 = x_0, X_1 = x_1, \dots, X_{n-1} = x_{n-1}, X_n = x_n) = P(X_0 = x_0)\, P(X_1 = x_1 \mid X_0 = x_0) \cdots P(X_n = x_n \mid X_{n-1} = x_{n-1})$$

We write this as:

$$= \mu_{x_0}\, p_{x_0, x_1} \cdots p_{x_{n-1}, x_n}$$

where $\mu$ is the initial distribution.

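This chain-rule factorization is easy to evaluate numerically. A minimal sketch in NumPy, using a hypothetical two-state chain (the matrix and the path here are illustrative only, not the drunken-walk chain):

```python
import numpy as np

P = np.array([[0.5, 0.5],
              [0.1, 0.9]])   # P[i, j] = p_ij
mu0 = np.array([1.0, 0.0])   # initial distribution: definitely in state 0

def path_probability(path, P, mu0):
    """P(X_0 = x_0, ..., X_n = x_n) = mu0[x_0] * product of p_{x_k, x_{k+1}}."""
    prob = mu0[path[0]]
    for a, b in zip(path, path[1:]):
        prob *= P[a, b]
    return prob

print(path_probability([0, 1, 1], P, mu0))  # 1.0 * 0.5 * 0.9 = 0.45
```
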
8. A Drunken Walk: End to End

[Figure: the six states with the same transition probabilities as on slide 5.]

$$P(X_0 = 0, X_1 = 1, \dots, X_5 = 5)\,?$$

Assuming $\mu_0 = 1$.

9. Transition Matrix & Distributions

We'd like to ask more general questions:
• What's the probability of being in state 5 after 10 steps, if I start in state 0?
• What's the distribution of states after 100 steps, if I start in a random state?
• What happens if we leave the drunk there for an infinite amount of time?

To do that, we start making use of matrices and linear algebra.

10. Distributions

• Even more notation!
• Let $\mu$ be the distribution of states: $\mu_i(n) := P(X_n = i)$, for $i \in S$.
• Let $\mu^{(0)}$ be the initial distribution.
  • $\mu^{(0)} = [1, 0]$ – The drunk is definitely in state 0.
  • $\mu^{(0)} = [0.5, 0.5]$ – The drunk is equally likely to be in either state.

11. Distributions

• The matrix makes evaluating the joint probabilities more straightforward.
• Distribution 1 step later: $\mu^{(1)} = \mu^{(0)} P$
• Distribution $n$ steps later: $\mu^{(n)} = \mu^{(0)} P^n$

12. Distributions

Calculate $\mu^{(1)}$, $\mu^{(2)}$? What's the probability of being in state 1 after 3 steps?

[Figure: two states, 0 and 1, with self-loop probabilities 0.5 and 0.9 and cross probabilities 0.5 and 0.1.]

$$P = \begin{pmatrix} 0.5 & 0.5 \\ 0.1 & 0.9 \end{pmatrix}, \qquad \mu^{(0)} = [1, 0]$$

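A worked version of this exercise, propagating $\mu^{(n)} = \mu^{(0)} P^n$ with NumPy (the numbers in the comments follow directly from the matrix on the slide):

```python
import numpy as np

P = np.array([[0.5, 0.5],
              [0.1, 0.9]])   # transition matrix from the slide
mu = np.array([1.0, 0.0])    # mu^(0): definitely in state 0

for n in range(1, 4):
    mu = mu @ P              # mu^(n) = mu^(n-1) P
    print(f"mu^({n}) = {mu}")

# mu^(1) = [0.5, 0.5]
# mu^(2) = [0.3, 0.7]
# mu^(3) = [0.22, 0.78]  => probability of state 1 after 3 steps is 0.78
```
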
13. Stationarity

• Consider a distribution with the following property: $\pi = \pi P$
• The distribution does not change when put through the transition matrix.
• This special distribution is known as the Stationary Distribution.
• Notice this is an eigenvalue problem!

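Spelling out the eigenvalue connection:

$$\pi = \pi P \iff P^{\mathsf{T}} \pi^{\mathsf{T}} = \pi^{\mathsf{T}}$$

so the stationary distribution is a left eigenvector of $P$ with eigenvalue 1.
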
14. Skip Over Lots of Maths

• If a Markov chain is irreducible and aperiodic then: $\lim_{n \to \infty} P^n = \Pi$
• Where each row of $\Pi$ is $\pi$.
• $\pi$ represents the long-term, or equilibrium, distribution.
• Furthermore, $\pi$ is unique.
• This sounds like a molecular dynamics system!
• Expected return time for state $i$ is $1/\pi_i$.

15. Why Care? Google Search – PageRank

• Surf the web as a random process.
• Count how many links in and out each website has.
• Use that to form a transition matrix.

[Figure: three websites – BLG, BBC, NYT – as states in a chain.]

16. Why Care? Google Search – PageRank

• The stationary distribution is the probability of being on a given website.
• More important websites => more inward links.
• The real algorithm is a lot more complicated now!

[Figure: the three websites with transition probabilities 0.5, 0.5, 0.25, 0.75 on the links.]

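A minimal sketch of the PageRank idea via power iteration. The link structure among the three sites is invented for illustration (the slide's actual figure is not recoverable):

```python
import numpy as np

# Hypothetical link structure: who links to whom.
links = {"BBC": ["BLG", "NYT"], "BLG": ["BBC", "NYT"], "NYT": ["BBC"]}
sites = sorted(links)
index = {s: i for i, s in enumerate(sites)}

# Row-stochastic transition matrix: a random surfer follows one of the
# outgoing links uniformly at random.
P = np.zeros((len(sites), len(sites)))
for s, targets in links.items():
    for t in targets:
        P[index[s], index[t]] = 1.0 / len(targets)

# Power iteration towards the stationary distribution.
mu = np.full(len(sites), 1.0 / len(sites))
for _ in range(200):
    mu = mu @ P
print(dict(zip(sites, mu.round(3))))  # BBC ranks highest in this toy web
```
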
17. Calculate the Stationary Distribution

[Figure: the two-state chain from slide 12.]

$$P = \begin{pmatrix} 0.5 & 0.5 \\ 0.1 & 0.9 \end{pmatrix}, \qquad \pi P = \pi \; ? \qquad \lim_{n \to \infty} P^n \; ?$$

18. Calculate the Stationary Distribution

[Figure: the two-state chain from slide 12.]

$$P = \begin{pmatrix} 0.5 & 0.5 \\ 0.1 & 0.9 \end{pmatrix}, \qquad \pi P = \pi \; ? \qquad \lim_{n \to \infty} P^n = \begin{pmatrix} 0.167 & 0.833 \\ 0.167 & 0.833 \end{pmatrix}$$

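The same answer can be checked numerically: $\pi$ is the left eigenvector of $P$ with eigenvalue 1, so we can ask NumPy for the eigenvectors of $P^{\mathsf{T}}$:

```python
import numpy as np

P = np.array([[0.5, 0.5],
              [0.1, 0.9]])

evals, evecs = np.linalg.eig(P.T)
k = np.argmin(np.abs(evals - 1.0))   # index of the eigenvalue ~1
pi = np.real(evecs[:, k])
pi /= pi.sum()                       # normalize to a distribution
print(pi)                            # [0.1667, 0.8333]

# Equivalently, every row of P^n approaches pi for large n:
print(np.linalg.matrix_power(P, 50))
```
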
19. Markov Chains: A Summary

• A Markovian process that is discrete in time, states, or some combination.
• Markov Property (multiple ways of saying the same thing):
  • The next state depends only on the current state.
  • The process is "memoryless": it does not remember where it has been, only where it is now.
• Stationary distribution: a time-independent distribution.
  • For an irreducible, aperiodic Markov chain it is unique and equal to the limiting distribution.

20. Markov State Models for Molecular Dynamics

The high-level way of thinking about a molecular system:
• A model of N states, with rates of going between the different states.
• Could be protein folding, or a reaction.
• MSM: Use Markov chains to analyse MD in a way that gives us this high-level information.

[Figure: a two-state chain between reactant R and product P.]

21. Markov State Model Recipe

1. Run loads of MD.
2. Project onto a lower-dimensional space (contacts, dihedrals, tICA).
3. Discretize the data by clustering into N microstates (k-means, etc.).
4. Count the number of transitions between states after a lag time $\tau$.
5. Convert from counts to a transition matrix (see the sketch after this list).
6. We have a Markov model! $\mu\big((k + 1)\tau\big) = \mu(k\tau)\, T(\tau)$
7. Test that the Markov model is self-consistent.
8. Coarse-grain the model to make it more human-readable.

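A minimal sketch of steps 4 and 5, assuming `dtraj` is the discrete trajectory from step 3 (one microstate index per frame); the naive row-normalized estimator shown here is the simplest choice, and real MSM packages typically use maximum-likelihood estimators that also enforce detailed balance:

```python
import numpy as np

def transition_matrix(dtraj, n_states, lag):
    """Count transitions at lag time `lag` and row-normalize the counts."""
    counts = np.zeros((n_states, n_states))
    for a, b in zip(dtraj[:-lag], dtraj[lag:]):
        counts[a, b] += 1
    # T[i, j] = P(state j at t + lag | state i at t)
    return counts / counts.sum(axis=1, keepdims=True)

dtraj = np.array([0, 0, 1, 1, 2, 1, 0, 0, 1, 2, 2, 1])  # toy data
T = transition_matrix(dtraj, n_states=3, lag=1)
print(T)
```
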
22. Some Nice Properties

• Molecular systems should be ergodic, so the stationary distribution is the equilibrium distribution.
• Molecular systems satisfy detailed balance: $\pi_i\, p_{ij} = \pi_j\, p_{ji}$
• The flux of stuff from $i$ to $j$ matches the flux from $j$ to $i$.
• These conditions mean that the eigenvalues can be interpreted as modes.

F. Noé, S. Fischer, Current Opinion in Structural Biology, 2008.
W. Swope et al., JPC B, 2004.

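A quick numerical check of the detailed balance condition, assuming `T` and `pi` were computed as in the sketches above (the names are illustrative):

```python
import numpy as np

def satisfies_detailed_balance(T, pi, tol=1e-8):
    # flux[i, j] = pi_i * T_ij; detailed balance says this is symmetric.
    flux = pi[:, None] * T
    return np.allclose(flux, flux.T, atol=tol)
```
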
23. Eigenvectors and Modes

• Eigenvector $\psi_i$ with eigenvalue $\lambda_i$.
• $(1 - \lambda_i)$ is the probability of the transition described by $\psi_i$ occurring.
• $\lambda_1 = 1$: The equilibrium distribution.
• $\lambda_i \approx 1$: Slow modes – these are what we care about!
• $\lambda_i \approx 0$: Fast modes.
• Good review on this: F. Noé, S. Fischer, Current Opinion in Structural Biology, 2008.

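A sketch of how one might pull out the slow modes, assuming a row-stochastic `T`; this uses left eigenvectors (one common convention, probability-density modes), sorted by eigenvalue magnitude:

```python
import numpy as np

def slow_modes(T, k=3):
    # Left eigenvectors of T: solve psi T = lambda psi via eig(T.T).
    evals, evecs = np.linalg.eig(T.T)
    order = np.argsort(-np.abs(evals))      # descending |lambda|
    # order[0] is the stationary eigenvalue (lambda = 1); the next k
    # eigenpairs describe the slowest dynamical modes.
    return evals[order[1:k + 1]], evecs[:, order[1:k + 1]]
```
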
24. Lag Times

• Clustering MD straight into microstates => not Markovian.
• Have to discretize time further (observe the chain only at multiples of a lag time $\tau$) to ignore such effects.

[Figure: the six-state chain from the drunken-walk example.]

25. Implied Timescales

• Relaxation time of a mode: $t_i = -\dfrac{\tau}{\ln(\lambda_i)}$
• Chapman-Kolmogorov: $t_i$ should be constant if we substitute $k\tau$ for $\tau$.
• Implied timescale plots do this.
• They also help identify separation of timescales.

http://www.emma-project.org/v2.4/generated/pentapeptide_msm.html

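A sketch of the implied-timescale formula, assuming `T` was estimated at lag time `lag` as in the earlier sketch; in practice this is evaluated at several lag times and plotted:

```python
import numpy as np

def implied_timescales(T, lag):
    # Sort eigenvalue magnitudes in descending order; the first is the
    # stationary eigenvalue (lambda = 1), which has no finite timescale.
    evals = np.sort(np.abs(np.linalg.eigvals(T)))[::-1]
    return -lag / np.log(evals[1:])

# If the model is Markovian at this lag, implied_timescales(T, lag)
# stays roughly constant as lag increases (the plateau in the plots).
```
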
26. Conclusions

• Markov chains are a nice way of modelling stochastic systems.
• There are loads of different types and varieties. The irreducible variety described here has particularly useful properties.
• Molecular systems map onto Markov chains well.
• The process of building a Markov state model sounds straightforward, but the devil is in the detail. There are a lot of variables!
  • Discretization choice.
  • Coarse-graining.
  • Sampling problems.
• Rob's talk next week will be on this problem.