
Perceptron, Support Vector Machine, and Passive Aggressive Algorithm.

Original presentation at Computational Linguistics Lab, Nara Institute of Science and Technology.

Slides from a presentation given at the programming study group of the Computational Linguistics Laboratory, Nara Institute of Science and Technology (NAIST).

Sorami Hisamoto

May 14, 2013

Transcript

  1. Perceptron,
    Support Vector Machine, and
    Passive Aggressive Algorithm.
    Sorami Hisamoto
    14 May 2013, PG

  2. / 37
    2
    Disclaimer
    This material gives a brief impression of how these algorithms work.
    It may contain errors and inaccurate explanations.
    Please refer to other materials for more detailed and reliable information.

  3. / 37
    What is “Machine Learning” ?
    3
    “Field of study that gives computers the ability to learn
    without being explicitly programmed.” Arthur Samuel, 1959.

  4. / 37
    Types of Machine Learning Algorithms
    - Supervised Learning
    - Unsupervised Learning
    - Semi-supervised Learning
    - Reinforcement Learning
    - Active Learning
    - ...
    4
    by the property of the data.

  5. / 37
    Types of Machine Learning Algorithms
    - Binary Classification
    - Regression
    - Multi-class Classification
    - Sequential Labeling
    - Learning to Rank
    - ...
    5
    by the property of the problem.

  6. / 37
    Types of Machine Learning Algorithms
    - Batch Learning
    - Online Learning
    6
    by the parameter optimization strategy.

  7. / 37
    Linear binary classification
    7
    Given X, predict Y.
    Output Y is binary;
    e.g. +1 or -1.
    e.g.
    X - email.
    Y - spam or not.
    1. Get features from the input.
    2. Calculate inner product of the feature vector and weights.
    3. If result ≧ 0 output is +1, else -1.

  16. / 37
    Implementing a linear binary classifier
    8
    How do we learn
    the weights?
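
    A minimal sketch of such a linear binary classifier (assuming a bag-of-words
    feature dict; function and variable names are illustrative, not from the original
    deck):

    # Linear binary classifier: follows steps 1-3 from the earlier slide.
    def extract_features(email_text):
        # 1. Get features from the input (here: simple word counts).
        features = {}
        for word in email_text.split():
            features[word] = features.get(word, 0) + 1
        return features

    def predict(weights, features):
        # 2. Inner product of the feature vector and the weights.
        score = sum(weights.get(f, 0.0) * value for f, value in features.items())
        # 3. If the result >= 0, output +1, else -1.
        return +1 if score >= 0 else -1

    The weights themselves come from the learning algorithms on the following slides.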

  17. / 37
    9
    Perceptron
    Support
    Vector
    Machine
    Passive
    Aggressive
    Algorithm

  21. / 37
    Perceptron [Rosenblatt 1957]
    - For every sample:
    - If prediction is correct, do nothing.
    - If label=+1 and prediction=-1, add the feature vector to the weights.
    - If label=-1 and prediction=+1, subtract the feature vector from the weights.
    11
    Hinge loss
    loss(w, x, y) = max(0, -ywx)
    Stochastic gradient descent
    ∂loss(w, x, y) / ∂w = -yx when ywx < 0, otherwise 0
    x: input vector, w: weight vector, y: correct label (+1 or -1)
    The two update procedures above can be summed up as one:
    w ← w + y * x

  22. / 37
    Learning hyperplane: Illustrated
    12
    Figure from http://d.hatena.ne.jp/AntiBayesian/20111125/1322202138 .
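
    A small worked example (the numbers are illustrative, not from the figure): with
    weights w = (0, 0), a sample x = (2, 1) with label y = -1 scores 0 and is therefore
    predicted +1, a mistake; the update w ← w + y * x gives w = (-2, -1), and the same
    sample now scores -5 and is classified correctly.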

  27. / 37
    Implementing a perceptron
    13
    ∂loss(w, x, y) / ∂w = -yx when ywx < 0, otherwise 0
    loss (w, x, y) = max(0, -ywx)
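
    A minimal training-loop sketch for the loss and gradient above, assuming dense
    NumPy feature vectors (illustrative code, not the original slide code):

    import numpy as np

    def train_perceptron(samples, n_features, epochs=10):
        # samples: list of (x, y) pairs, x a NumPy array, y in {+1, -1}.
        w = np.zeros(n_features)
        for _ in range(epochs):
            for x, y in samples:
                if y * np.dot(w, x) <= 0:   # misclassified (or on the boundary)
                    w += y * x              # the single update rule: w <- w + y * x
        return w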

  28. / 37
    14
    Perceptron
    Support
    Vector
    Machine
    Passive
    Aggressive
    Algorithm

  31. / 37
    SVM [Vapnik & Cortes 1995]
    - Perceptron, plus ...
    - Margin maximizing.
    - (Kernel).
    17
    Support Vector Machine

  32. / 37
    Which one looks better, and why?
    18
    All 3 classify correctly but ...
    the middle one seems the best.

  34. / 37
    Margin maximizing
    19
    (Figure: a separating hyperplane, its support vectors, and the margin between them.)
    “Vapnik–Chervonenkis theory” suggests that maximizing the margin improves
    classification performance on unknown data.
    SVM’s loss function
    loss(w, x, y) = max(0, λ - ywx) + α * ||w||^2 / 2
    If the prediction is correct BUT the score ywx < λ, there is still a penalty.
    margin(w) = λ / ||w||
    As ||w|| becomes smaller, the margin (λ / ||w||) becomes bigger.
    x: input vector, w: weight vector, y: correct label (+1 or -1), λ & α: hyperparameters.
    For a detailed explanation, refer to other materials;
    e.g. [Suhara 2011, pp. 34-39] http://d.hatena.ne.jp/sleepy_yoshi/20110423/p1

  42. / 37
    Perceptron and SVM
    20
    Perceptron: loss(w, x, y) = max(0, -ywx),  ∂loss/∂w = -yx (when ywx < 0, otherwise 0)
    SVM: loss(w, x, y) = max(0, λ - ywx) + α * ||w||^2 / 2,  ∂loss/∂w = -yx + αw (when ywx < λ, otherwise αw)
    x: input vector, w: weight vector, y: correct label (+1 or -1), λ & α: hyperparameters.

  47. / 37
    Implementing SVM
    21
    loss(w, x, y) = max(0, λ - ywx) + α * ||w||^2 / 2
    ∂loss(w, x, y) / ∂w = -yx + αw when ywx < λ, otherwise αw

  48. / 37
    Soft-margin
    - Sometimes the data cannot be linearly separated.
    - → Soft-margin
    - Permit violations of the margin.
    - If the margin is violated, give a penalty.
    - Minimize the penalty and maximize the margin.
    22
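
    A minimal sketch of stochastic gradient descent on the regularized hinge loss from
    the previous slides, i.e. a soft-margin linear SVM (eta, lam and alpha are
    illustrative hyperparameters, not values from the deck):

    import numpy as np

    def train_svm_sgd(samples, n_features, epochs=10, eta=0.1, lam=1.0, alpha=0.01):
        # SGD on loss = max(0, lam - y*(w.x)) + alpha * ||w||^2 / 2
        w = np.zeros(n_features)
        for _ in range(epochs):
            for x, y in samples:
                if y * np.dot(w, x) < lam:  # margin violated: hinge part is active
                    w -= eta * (-y * x + alpha * w)
                else:                       # margin satisfied: only the regularizer
                    w -= eta * alpha * w
        return w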

  49. / 37
    23
    Perceptron
    Support
    Vector
    Machine
    Passive
    Aggressive
    Algorithm

  51. / 37
    PA [Crammer+ 2006]
    - Passive:
    If prediction correct, do nothing.
    - Aggressive:
    If prediction wrong, minimally update the weights to correctly classify.
    25
    Passive Aggressive Algorithm

  54. / 37
    Passive & Aggressive: Illustrated
    26
    Figure from http://kazoo04.hatenablog.com/entry/2012/12/20/000000 .
    Do nothing.
    Move minimally to classify correctly.

  57. / 37
    Implementing PA
    27
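
    A minimal sketch of one PA update as described in [Crammer+ 2006], with the margin
    fixed to 1 (illustrative code, not the original slide code):

    import numpy as np

    def pa_update(w, x, y):
        # Smallest change to w that classifies (x, y) with margin 1.
        loss = max(0.0, 1.0 - y * np.dot(w, x))
        if loss > 0.0:
            tau = loss / np.dot(x, x)   # step size; PA-I / PA-II cap or soften this
            w = w + tau * y * x
        return w

    After this step the sample satisfies y * w * x = 1, which is why PA always
    classifies the last-seen sample correctly (see the next slide).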

  58. / 37
    PA vs. Perceptron & SVM
    28
    - PA always correctly classifies the last-seen sample.
    - Perceptron & SVM do not, as their update step size is constant.
    - → PA seems more efficient, but it is more sensitive to noise than Perceptron & SVM.

  59. / 37
    PA, or MIRA?
    29
    MIRA (Margin Infused Relaxed Algorithm) [Crammer+ 2003]
    PA (Passive Aggressive Algorithm) [Crammer+ 2006]
    “... MIRA only handles linearly separable problems, and PA [2] is its extension.
    Also, most studies that claim to use MIRA are in fact using [2].”
    “... The original MIRA differs in that it minimizes the norm of the parameters after
    the update, rather than the size of the model update.”
    [Nakazawa 2009]
    https://twitter.com/taku910/status/243760585030901761

  63. / 37
    Extensions of PA
    - PA-I, PA-II
    - Confidence-Weighted Algorithm (CW) [Dredze+ 2008]
    - If a feature appeared frequently in the past, it is likely to be more reliable (hence update it less).
    - Fast convergence.
    - Adaptive Regularization of Weight Vectors (AROW) [Crammer+ 2009]
    - More tolerant to noise than CW.
    - Exact Soft Confidence-Weighted Learning (SCW) [Zhao+ 2012]
    - ...
    30
    Used for Gmailʼs
    “priority inbox”.
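
    For reference, the step sizes of PA and the PA-I / PA-II variants listed above, as
    given in [Crammer+ 2006] (C is an aggressiveness hyperparameter; a sketch, not
    original slide content):

    def pa_step_size(loss, x_norm_sq, C=1.0, variant="PA"):
        # Step size tau for PA and its PA-I / PA-II variants [Crammer+ 2006].
        if variant == "PA":
            return loss / x_norm_sq
        if variant == "PA-I":
            return min(C, loss / x_norm_sq)            # capped update: more robust to noise
        if variant == "PA-II":
            return loss / (x_norm_sq + 1.0 / (2 * C))  # softened update
        raise ValueError(variant)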

  64. / 37
    31
    Perceptron
    Support
    Vector
    Machine
    Passive
    Aggressive
    Algorithm

  65. / 37
    ... and which one should we use? (1) [Tokunaga 2012, p.286]
    1. Perceptron
    - Easiest to implement; a baseline for the other algorithms.
    2. SVM with FOBOS optimization
    - Almost the same implementation as the perceptron, but accuracy should improve.
    3. Logistic regression
    4. If that is not enough ... (next slide)
    32

  69. / 37
    ... and which one should we use? (2) [Tokunaga 2012, p.286]
    - If learning speed is not enough, try PA, CW, AROW, etc.
    - But be aware that they are sensitive to noise.
    - If accuracy is not enough, first pinpoint the cause.
    - The numbers of positive and negative examples differ greatly?
    - Give special treatment to the smaller class, e.g. a larger margin.
    - Rebuild the training data so that the positive and negative sets are about the same size.
    - Too noisy?
    - Reconsider the data and features.
    - Difficult to classify linearly?
    - Devise better features.
    - Use a non-linear classifier.
    33

  74. / 37
    Software packages
    - OLL https://code.google.com/p/oll/wiki/OllMainJa
    - Perceptron, Averaged Perceptron, Passive Aggressive, ALMA, Confidence Weighted.
    - LIBSVM http://www.csie.ntu.edu.tw/~cjlin/libsvm/
    - AROW++ https://code.google.com/p/arowpp/
    - ...
    34

  75. / 37
    For further study: Books
    - “The Technology Behind Japanese Input” (日本語入力を支える技術), Tokunaga, 2012.
    Chapter 5, Appendices 4 & 5.
    - “Introduction to Machine Learning for Natural Language Processing” (言語処理のための機械学習入門), Takamura, 2010.
    Chapter 4.
    - “Pattern Recognition Made Easy” (わかりやすいパターン認識), Ishii et al., 1998.
    Chapters 2 & 3.
    - “An Introduction to Support Vector Machines”, Cristianini & Shawe-Taylor, 2000.
    - Its Japanese translation, “サポートベクターマシン入門”, Cristianini & Shawe-Taylor.
    35

  76. / 37
    For further study: on the Web - slides and articles
    - “An explanation of SVM without using any equations” (数式を一切使用しないSVMの話), rti.
    http://prezi.com/9cozgxlearff/svmsvm/
    - “The Perceptron Algorithm” (パーセプトロンアルゴリズム), Graham Neubig.
    http://www.phontron.com/slides/nlp-programming-ja-03-perceptron.pdf
    - “Fun friends with the perceptron, popopopo-n” (perceptron and SVM tutorial slides), Suhara, 2011.
    http://d.hatena.ne.jp/sleepy_yoshi/20110423/p1
    - “Introduction to Statistical Machine Learning 5: Support Vector Machines” (統計的機械学習入門), Nakagawa.
    http://www.r.dl.itc.u-tokyo.ac.jp/~nakagawa/SML1/kernel1.pdf
    - “MIRA (Margin Infused Relaxed Algorithm)”, Nakazawa, 2009.
    http://nlp.ist.i.kyoto-u.ac.jp/member/nakazawa/pubdb/other/MIRA.pdf
    36

  77. / 37
    For further study: on the Web - blog articles
    - “A super-introduction to machine learning for text mining, night 2: the perceptron”
    (機械学習超入門 二夜目), AntiBayesian.
    http://d.hatena.ne.jp/AntiBayesian/20111125/1322202138
    - “A super-introduction to machine learning II: learn the PA method used in Gmail's Priority Inbox in 30 minutes”
    (機械学習超入門II), EchizenBlog-Zwei.
    http://d.hatena.ne.jp/echizen_tm/20110120/1295547335
    - “A super-introduction to machine learning III: learn the basics by building a perceptron in 30 minutes”
    (機械学習超入門III), EchizenBlog-Zwei.
    http://d.hatena.ne.jp/echizen_tm/20110606/1307378609
    - “A super-introduction to machine learning IV: build an SVM (support vector machine) in 30 minutes too”
    (機械学習超入門IV), EchizenBlog-Zwei.
    http://d.hatena.ne.jp/echizen_tm/20110627/1309188711
    37
