
RNA Secondary Structure Prediction

Sumin Byeon

April 25, 2012

Transcript

  1. RNA Secondary Structure Prediction
    C SC 550 - Spring 2012
    Muhammad J. Alam, Sumin Byeon
    Tuesday, April 24, 12
  2. RNA
    Ribonucleic acid
    Single-stranded molecule
    Consists of nucleotides
    Each nucleotide contains a base (A, C, G, U)
  3. RNA Structures
    Primary structure: Linear sequence of nucleotide bases
    Secondary structure: Hydrogen bonds between bases forming base pairs
  4. RNA Structures
    Hairpin loop
    Stacked pair
    Internal loop
    Bulge
    Multi-loop
  5. Problem Definition
    Input: primary structure of an RNA
    Goal: to predict the secondary structure
    Given a primary structure of an RNA, find a secondary structure that maximizes the number of base pairs
  6. Practical Applications
    Function classification
    Evolutionary studies
    Pseudogene detection
  7. Different Approaches
    Physical methods (Kim et al.): X-ray diffraction, Nuclear Magnetic Resonance (NMR)
    Chemical/enzymatic methods (Ehresmann et al.)
    Mutational analysis (Tang and Draper)
  8. Prediction with Sequence Only
    Structure prediction based on multiple RNA sequences which are structurally similar (Sankoff, Gary and Stormo)
    Structure prediction based on a single RNA sequence: Nussinov Folding Algorithm, Zuker Algorithm
  9. Assumptions
    Three base pairs (A-U, C-G, G-U)
    One base forms at most one base pair
    Pseudoknots do not occur
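
The three assumptions above can be read as a validity check on a candidate secondary structure. The following is a minimal Python sketch of that reading, not code from the talk. The function name `is_valid_structure`, the representation of a structure as a list of 0-based `(i, j)` index pairs, and the definition of a pseudoknot as two crossing pairs (i < k < j < l) are assumptions made for this illustration.

```python
# Allowed base pairs from the slide: A-U, C-G, G-U (order does not matter).
ALLOWED = {frozenset("au"), frozenset("cg"), frozenset("gu")}


def is_valid_structure(seq, pairs):
    """Return True if `pairs` satisfies the three assumptions for `seq`.

    `pairs` is a list of (i, j) tuples with 0-based positions, i < j
    (a hypothetical representation chosen for this sketch).
    """
    seq = seq.lower()
    used = set()
    for i, j in pairs:
        # Positions must be inside the sequence and ordered.
        if not (0 <= i < j < len(seq)):
            return False
        # Assumption 1: only A-U, C-G, and G-U pairs are allowed.
        if frozenset((seq[i], seq[j])) not in ALLOWED:
            return False
        # Assumption 2: one base forms at most one base pair.
        if i in used or j in used:
            return False
        used.update((i, j))
    # Assumption 3: no pseudoknots, i.e. no two pairs cross each other.
    for i, j in pairs:
        for k, l in pairs:
            if i < k < j < l:
                return False
    return True
```

For example, `is_valid_structure("gacu", [(0, 2), (1, 3)])` is rejected because the pairs g-c and a-u cross, while `is_valid_structure("gcau", [(0, 1), (2, 3)])` is accepted.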
  10. Pseudoknots
    [Figure: example sequence g a c a g u g u u c]
  12. Nussinov Folding Algorithm
    Case 1: (1) and (n) form a pair
    Case 2: There is (k) that is not crossed by any pair where 1 < k < n
  13. Nussinov Folding Algorithm
    Case 1: (1) and (n) form a pair
    V(1, n) = V(2, n-1) + δ(S[1], S[n])
  14. Nussinov Folding Algorithm
    Case 1: (1) and (n) form a pair
    V(1, n) = V(2, n-1) + δ(S[1], S[n])
    $\delta(x, y) = \begin{cases} 1, & \text{if } (x, y) \in \{(a, u), (u, a), (c, g), (g, c), (g, u), (u, g)\} \\ 0, & \text{otherwise} \end{cases}$
  15. Nussinov Folding Algorithm
    Case 1: (1) and (n) form a pair
    Case 2: There is (k) that is not crossed by any pair where 1 < k < n
    V(1, n) = V(1, k) + V(k+1, n)
  16. Nussinov Folding Algorithm
    Dynamic programming
    $V(i, j) = \max \begin{cases} V(i+1, j-1) + \delta(S[i], S[j]) \\ \max_{i \le k < j} \{ V(i, k) + V(k+1, j) \} \end{cases}$
  17. Nussinov Folding Algorithm
    Dynamic programming
    $V(i, j) = \max \begin{cases} V(i+1, j-1) + \delta(S[i], S[j]) \\ \max_{i \le k < j} \{ V(i, k) + V(k+1, j) \} \end{cases}$
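
The recurrence above can be filled in bottom-up, from short subsequences to long ones. The following is a minimal Python sketch under stated assumptions, not the presenters' demo code: the names `delta`, `nussinov_table`, and `traceback` are hypothetical, indices are 0-based, empty and single-base ranges are taken to have value 0, and no minimum hairpin loop length is enforced because the slides do not state one.

```python
# Pairs for which delta is 1, as listed on the slide.
PAIRS = {("a", "u"), ("u", "a"), ("c", "g"), ("g", "c"), ("g", "u"), ("u", "g")}


def delta(x, y):
    """1 if bases x and y can pair, else 0."""
    return 1 if (x, y) in PAIRS else 0


def _get(V, i, j):
    """V(i, j), treating ranges with fewer than two bases as 0 (assumed base case)."""
    return V[i][j] if i < j else 0


def nussinov_table(seq):
    """Fill V so that V[i][j] is the maximum number of pairs in seq[i..j]."""
    seq = seq.lower()
    n = len(seq)
    V = [[0] * n for _ in range(n)]
    for length in range(2, n + 1):          # shorter ranges first
        for i in range(n - length + 1):
            j = i + length - 1
            # Case 1: i and j are the two ends of an enclosing pair (if delta is 1).
            best = _get(V, i + 1, j - 1) + delta(seq[i], seq[j])
            # Case 2: split at some k with i <= k < j.
            for k in range(i, j):
                best = max(best, _get(V, i, k) + _get(V, k + 1, j))
            V[i][j] = best
    return V


def traceback(seq, V):
    """Recover one optimal set of pairs by replaying the recurrence."""
    seq = seq.lower()
    pairs, stack = [], [(0, len(seq) - 1)]
    while stack:
        i, j = stack.pop()
        if i >= j:
            continue
        d = delta(seq[i], seq[j])
        if _get(V, i + 1, j - 1) + d == V[i][j]:
            if d:
                pairs.append((i, j))
            stack.append((i + 1, j - 1))
            continue
        for k in range(i, j):
            if _get(V, i, k) + _get(V, k + 1, j) == V[i][j]:
                stack.append((i, k))
                stack.append((k + 1, j))
                break
    return sorted(pairs)


seq = "gggaaaccc"
V = nussinov_table(seq)
print(V[0][len(seq) - 1])   # 3
print(traceback(seq, V))    # [(0, 8), (1, 7), (2, 6)]
```

The table has O(n^2) entries and each entry scans O(n) split points, so this sketch runs in O(n^3) time and O(n^2) space.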
  18. Alternate Optimization Goal
    Find the most stable structure: Zuker Algorithm
    The hydrogen bond at a base pair tries to stabilize the structure
    Free bases inside a loop try to disrupt the structure
    The difference between these two is the destabilizing energy
    Given a primary structure of an RNA, find the secondary structure with the least total energy
  19. Destabilizing Energy Measure
    Stacked pair: eS(i, j)
      Stabilizes the structure
      eS(i, j) is negative
    Hairpin: eH(i, j)
      The bigger the loop, the more unstable the structure is
      eH(i, j) depends on |j-i+1|
  20. Destabilizing Energy Measure
    Internal loop or bulge: eL(i, j, i', j')
      The bigger the loop is and the more asymmetric the two sides are, the more unstable the structure is
      eL(i, j, i', j') depends on (|i'-i+1|+|j'-j+1|) and the asymmetry
    Multi-loop: eM(i1, j1, i2, j2, ..., ik, jk)
      The structure is more unstable if the loop size and k are large
  21. Zuker Algorithm
    Finds a secondary structure with minimum total destabilizing energy
    Uses dynamic programming
    Running time: exponential
  22. Demo

  23. Conclusion
    Summary
      An algorithm that finds a secondary structure with the maximum number of base pairs
    Future work
      Develop an algorithm that does not assume the absence of pseudoknots (Gary and Stormo)
      Develop an algorithm that addresses base triples and other types of base pairs
  24. Thank you