
Image Generation with Diffusion Models (CVIM Tutorial)

mi141
November 15, 2023


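These are the slides from a tutorial talk given at the IPSJ (Information Processing Society of Japan) CVIM workshop in May 2023.
The talk explains image generation with diffusion models in three parts: "Principles of diffusion models and how they are trained", "Controlling the generation process (applications to image translation and editing)", and "Accelerating the generation process (the connection to differential equations)".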


Transcript




  8. $p(x)$, $\{x_i\}_{i=1}^{N}$, $x \sim p(x)$




  9. $p(z)$, $z \sim p(z)$, $\{x_i\}_{i=1}^{N}$, $x = f(z)$





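  16. $q(x_t \mid x_{t-1}) := N\left(x_t;\ \sqrt{1-\beta_t}\,x_{t-1},\ \beta_t \mathbf{I}\right)$
    $x_t = \sqrt{1-\beta_t}\,x_{t-1} + \sqrt{\beta_t}\,\epsilon$
    $q(x_{1:T} \mid x_0) = \prod_{t=1}^{T} q(x_t \mid x_{t-1})$
    $\{x_t\}_{t=1}^{T}$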

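  17. • $x_0$, $x_t$

    $$\begin{aligned}
    x_t &= \sqrt{1-\beta_t}\,x_{t-1} + \sqrt{\beta_t}\,\epsilon \\
    &= \sqrt{1-\beta_t}\left(\sqrt{1-\beta_{t-1}}\,x_{t-2} + \sqrt{\beta_{t-1}}\,\epsilon'\right) + \sqrt{\beta_t}\,\epsilon \\
    &= \sqrt{1-\beta_t}\,\sqrt{1-\beta_{t-1}}\,x_{t-2} + \sqrt{1-(1-\beta_t)(1-\beta_{t-1})}\,\epsilon'' \\
    &= \dots = \sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon, \qquad \bar{\alpha}_t = \prod_{i=1}^{t}(1-\beta_i)
    \end{aligned}$$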
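The closed form $x_t = \sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon$ can be checked numerically. The following is a minimal sketch in plain Python for scalar data, assuming a constant $\beta_t = 0.1$ over $T = 10$ steps; the schedule and the helper names `make_alpha_bar`, `q_sample`, and `q_step_by_step` are illustrative and do not come from the slides. It compares the one-shot sample with the result of applying the one-step kernel $t$ times.

```python
import math
import random

def make_alpha_bar(betas):
    """Cumulative products alpha_bar_t = prod_{i<=t} (1 - beta_i), for t = 1..T."""
    alpha_bar, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        alpha_bar.append(prod)
    return alpha_bar

def q_sample(x0, t, alpha_bar, rng):
    """Draw x_t directly from x_0 (t is 1-indexed, x0 is a scalar)."""
    ab = alpha_bar[t - 1]
    eps = rng.gauss(0.0, 1.0)
    return math.sqrt(ab) * x0 + math.sqrt(1.0 - ab) * eps

def q_step_by_step(x0, t, betas, rng):
    """Draw x_t by applying the one-step forward kernel t times."""
    x = x0
    for b in betas[:t]:
        x = math.sqrt(1.0 - b) * x + math.sqrt(b) * rng.gauss(0.0, 1.0)
    return x

# Assumed toy schedule: T = 10 steps with a constant beta, chosen only for the demo.
betas = [0.1] * 10
alpha_bar = make_alpha_bar(betas)
rng = random.Random(0)
n = 20000
direct = [q_sample(2.0, 10, alpha_bar, rng) for _ in range(n)]
stepped = [q_step_by_step(2.0, 10, betas, rng) for _ in range(n)]
mean_direct = sum(direct) / n
mean_stepped = sum(stepped) / n
```

Both sample means should land close to $\sqrt{\bar{\alpha}_{10}} \cdot 2$, which is why training can jump to any $t$ without simulating the whole chain.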

  18. • $x_0$, $x_t$

    $x_t = \sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon$, with $\bar{\alpha}_t = \prod_{i=1}^{t}(1-\beta_i)$
    $\beta_t$, $\bar{\alpha}_t$






  20. [Figure: one forward step from $x_{t-1}$ to $x_t$, with mean $\sqrt{1-\beta_t}\,x_{t-1}$ and variance $\beta_t$]


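  21. [Figure: forward step from $x_{t-1}$ to $x_t$ (mean $\sqrt{1-\beta_t}\,x_{t-1}$, variance $\beta_t$) and the reverse step from $x_t$ to $x_{t-1}$]
    $p_\theta(x_{t-1} \mid x_t) = N\left(x_{t-1};\ \mu_\theta(x_t, t),\ \mathbf{\Sigma}_\theta(x_t, t)\right)$
    $\mathbf{\Sigma}_\theta(x_t, t) = \sigma_t^2 \mathbf{I}$, $\sigma_t^2 = \beta_t$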

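  22. import numpy as np

    # estimate_mean(model, xt, t) is assumed to be defined elsewhere
    def sample(model, beta, T, H, W):
        # Initialization: x_T is pure Gaussian noise
        xt = np.random.normal(0, 1, (3, H, W))
        # Reverse diffusion process, t = T, ..., 1
        for t in range(T, 0, -1):
            # estimate mean of p_theta(x_{t-1} | x_t)
            mu = estimate_mean(model, xt, t)
            # use fixed sigma (beta is indexed by t = 1..T)
            sigma = beta[t] ** 0.5
            # sample x_{t-1}
            xt = mu + sigma * np.random.normal(0, 1, (3, H, W))
        # return x_0
        return xt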


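  25. • $p_\theta(x)$, $p_{\mathrm{data}}(x)$

    $$\begin{aligned}
    \theta^* &= \arg\min_\theta D_{KL}\left(p_{\mathrm{data}}(x)\,\|\,p_\theta(x)\right) \\
    &= \arg\min_\theta \left\{ \mathbb{E}_{p_{\mathrm{data}}(x)}\left[\log p_{\mathrm{data}}(x)\right] - \mathbb{E}_{p_{\mathrm{data}}(x)}\left[\log p_\theta(x)\right] \right\} \\
    &= \arg\min_\theta \mathbb{E}_{p_{\mathrm{data}}(x)}\left[-\log p_\theta(x)\right]
    \end{aligned}$$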


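  26. $$\begin{aligned}
    -\log p_\theta(x_0) &= -\log \int p_\theta(x_{0:T})\,\mathrm{d}x_{1:T} \\
    &= -\log \int q(x_{1:T} \mid x_0)\,\frac{p_\theta(x_{0:T})}{q(x_{1:T} \mid x_0)}\,\mathrm{d}x_{1:T} \\
    &\le -\int q(x_{1:T} \mid x_0)\,\log \frac{p_\theta(x_{0:T})}{q(x_{1:T} \mid x_0)}\,\mathrm{d}x_{1:T} \\
    &= \mathbb{E}_{q(x_{1:T} \mid x_0)}\left[\log \frac{q(x_{1:T} \mid x_0)}{p_\theta(x_{0:T})}\right]
    \end{aligned}$$
    (Jensen's inequality: $\log \mathbb{E}_q[V] \ge \mathbb{E}_q[\log V]$)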

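  27. • $q(x_{1:T} \mid x_0) = \prod_{t=1}^{T} q(x_t \mid x_{t-1})$

    $$\begin{aligned}
    \mathbb{E}_{q(x_{1:T} \mid x_0)}\left[\log \frac{q(x_{1:T} \mid x_0)}{p_\theta(x_{0:T})}\right]
    &= D_{KL}\left(q(x_T \mid x_0)\,\|\,p_\theta(x_T)\right) - \mathbb{E}_{q(x_1 \mid x_0)}\left[\log p_\theta(x_0 \mid x_1)\right] \\
    &\quad + \sum_{t=2}^{T} \mathbb{E}_{q(x_t \mid x_0)}\left[D_{KL}\left(q(x_{t-1} \mid x_t, x_0)\,\|\,p_\theta(x_{t-1} \mid x_t)\right)\right]
    \end{aligned}$$
    $p_\theta(x_{t-1} \mid x_t) = N\left(x_{t-1};\ \mu_\theta(x_t, t),\ \sigma_t^2 \mathbf{I}\right)$
    $q(x_{t-1} \mid x_t, x_0) = N\left(x_{t-1};\ \tilde{\mu}_t(x_t, x_0),\ \tilde{\beta}_t \mathbf{I}\right)$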

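  28. • $q(x_{1:T} \mid x_0) = \prod_{t=1}^{T} q(x_t \mid x_{t-1})$

    $$\begin{aligned}
    \mathbb{E}_{q(x_{1:T} \mid x_0)}\left[\log \frac{q(x_{1:T} \mid x_0)}{p_\theta(x_{0:T})}\right]
    &= D_{KL}\left(q(x_T \mid x_0)\,\|\,p_\theta(x_T)\right) - \mathbb{E}_{q(x_1 \mid x_0)}\left[\log p_\theta(x_0 \mid x_1)\right] \\
    &\quad + \sum_{t=2}^{T} \mathbb{E}_{q(x_t \mid x_0)}\left[\frac{1}{2\sigma_t^2}\left\|\tilde{\mu}_t(x_t, x_0) - \mu_\theta(x_t, t)\right\|^2\right] + \text{const.}
    \end{aligned}$$
    $x_t$, $\tilde{\mu}_t$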



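  29. $$\begin{aligned}
    \tilde{\mu}_t(x_t, x_0) &= c_0 x_0 + c_1 x_t \\
    &= \left(\frac{c_0}{\sqrt{\bar{\alpha}_t}} + c_1\right) x_t - \frac{c_0 \sqrt{1-\bar{\alpha}_t}}{\sqrt{\bar{\alpha}_t}}\,\epsilon
    \end{aligned}$$
    $c_0 = \dfrac{\beta_t \sqrt{\bar{\alpha}_{t-1}}}{1-\bar{\alpha}_t}$, $c_1 = \dfrac{\sqrt{1-\beta_t}\,(1-\bar{\alpha}_{t-1})}{1-\bar{\alpha}_t}$
    $x_t = \sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon$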
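The rewrite of $\tilde{\mu}_t$ in terms of $\epsilon$ follows from substituting $x_0 = (x_t - \sqrt{1-\bar{\alpha}_t}\,\epsilon)/\sqrt{\bar{\alpha}_t}$. As a sanity check, here is a small sketch that evaluates both forms on arbitrary numbers. It assumes the standard DDPM posterior coefficient $c_0 = \beta_t\sqrt{\bar{\alpha}_{t-1}}/(1-\bar{\alpha}_t)$; the toy schedule and the name `posterior_coeffs` are illustrative only.

```python
import math

def posterior_coeffs(betas, t):
    """Return (c0, c1, alpha_bar_t) for 2 <= t <= T, with 1-indexed t."""
    ab_prev = math.prod(1.0 - b for b in betas[: t - 1])  # alpha_bar_{t-1}
    ab = ab_prev * (1.0 - betas[t - 1])                   # alpha_bar_t
    c0 = betas[t - 1] * math.sqrt(ab_prev) / (1.0 - ab)
    c1 = math.sqrt(1.0 - betas[t - 1]) * (1.0 - ab_prev) / (1.0 - ab)
    return c0, c1, ab

betas = [0.01 * (i + 1) for i in range(5)]  # assumed toy schedule
t, x0, eps = 4, 0.7, -1.3                   # arbitrary test values
c0, c1, ab = posterior_coeffs(betas, t)
xt = math.sqrt(ab) * x0 + math.sqrt(1.0 - ab) * eps

# Form 1: written with x_0; form 2: written with eps.
mean_from_x0 = c0 * x0 + c1 * xt
mean_from_eps = (c0 / math.sqrt(ab) + c1) * xt \
    - c0 * math.sqrt(1.0 - ab) / math.sqrt(ab) * eps
```

With these coefficients the factor multiplying $x_t$ simplifies to $1/\sqrt{1-\beta_t}$, so the only unknown left in the mean is $\epsilon$.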

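  30. • $q(x_{1:T} \mid x_0) = \prod_{t=1}^{T} q(x_t \mid x_{t-1})$

    $$\begin{aligned}
    \mathbb{E}_{q(x_{1:T} \mid x_0)}\left[\log \frac{q(x_{1:T} \mid x_0)}{p_\theta(x_{0:T})}\right]
    &= D_{KL}\left(q(x_T \mid x_0)\,\|\,p_\theta(x_T)\right) - \mathbb{E}_{q(x_1 \mid x_0)}\left[\log p_\theta(x_0 \mid x_1)\right] \\
    &\quad + \sum_{t=2}^{T} \mathbb{E}_{q(x_t \mid x_0)}\left[\frac{1}{2\sigma_t^2}\left\|\tilde{\mu}_t(x_t, x_0) - \mu_\theta(x_t, t)\right\|^2\right] + \text{const.}
    \end{aligned}$$
    $\epsilon$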

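  31. • $\epsilon$

    $\mathbb{E}_{t,\epsilon}\left[\left\|\epsilon - \epsilon_\theta\left(\sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon,\ t\right)\right\|^2\right]$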




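  32. [Figure: training step. $t \sim U(1, T)$, training image $x_0$, $\epsilon \sim N(0, 1)$; the noisy input $\sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon$ is passed to $\epsilon_\theta$]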
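The training step on this slide (draw $t$, draw $\epsilon$, noise $x_0$, and compare $\epsilon$ with the network output) can be sketched as a Monte Carlo estimate of the loss. This is a toy illustration for scalar data, not the author's implementation: the constant schedule, the data distribution $x_0 \sim N(0, 1)$, and the two stand-in "models" are all assumptions made so the example runs without a neural network.

```python
import math
import random

def training_loss(eps_model, x0_batch, alpha_bar, rng):
    """Monte Carlo estimate of E_{t,eps} |eps - eps_model(x_t, t)|^2 for scalar data."""
    T = len(alpha_bar)
    total = 0.0
    for x0 in x0_batch:
        t = rng.randint(1, T)              # t ~ U(1, T), both ends inclusive
        eps = rng.gauss(0.0, 1.0)          # eps ~ N(0, 1)
        ab = alpha_bar[t - 1]
        xt = math.sqrt(ab) * x0 + math.sqrt(1.0 - ab) * eps
        total += (eps - eps_model(xt, t)) ** 2
    return total / len(x0_batch)

# Toy setup (assumed): constant beta = 0.1, T = 10, data x_0 ~ N(0, 1).
alpha_bar = [0.9 ** t for t in range(1, 11)]
rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(20000)]

def zero_model(xt, t):
    return 0.0                             # ignores its input

def ideal_model(xt, t):
    # For x_0 ~ N(0, 1), E[eps | x_t] = sqrt(1 - alpha_bar_t) * x_t.
    return math.sqrt(1.0 - alpha_bar[t - 1]) * xt

loss_zero = training_loss(zero_model, data, alpha_bar, rng)
loss_ideal = training_loss(ideal_model, data, alpha_bar, rng)
```

In a real setup `eps_model` would be a trained network and the loss would be minimized by gradient descent; here the closed-form predictor simply shows that the objective rewards recovering the injected noise.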




  33. • $\beta_t$
    • $\beta_1 = 10^{-4}$, $\beta_T = 0.02$
    • $\sigma_t$
    • $\sigma_t^2 = \beta_t$
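To make these hyperparameters concrete, the sketch below builds a schedule between the two endpoint values on the slide. The linear spacing and $T = 1000$ are assumptions (they match the common DDPM setting but are not stated in the transcript), and `linear_beta_schedule` is an illustrative name.

```python
import math

def linear_beta_schedule(T, beta_1=1e-4, beta_T=0.02):
    """beta_t for t = 1..T, linearly spaced between beta_1 and beta_T."""
    return [beta_1 + (beta_T - beta_1) * i / (T - 1) for i in range(T)]

T = 1000  # assumed number of steps
betas = linear_beta_schedule(T)
sigmas = [math.sqrt(b) for b in betas]   # fixed sigma_t with sigma_t^2 = beta_t

alpha_bar_T = 1.0
for b in betas:
    alpha_bar_T *= 1.0 - b               # prod_t (1 - beta_t)
```

Under these assumptions $\bar{\alpha}_T$ is very close to zero, so $x_T$ is almost pure Gaussian noise, which is what lets sampling start from $N(0, \mathbf{I})$.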

  34. [Figure: network diagram with blocks $N_h$, $N_m$, $N_l$, $N_b$, $N_l$, $N_m$, $N_h$ and time input $t$]






  51. $p_\theta(x_{t-1} \mid x_t)$





  55. • $t_0$, $x_{t_0}$
    [Figure: chain between $x_T$ and $x_0$ with an intermediate step $t_0$; $x_{t_0}$ and $x_{t_0-1}$ are marked]





  57. $y = Hx + n$



  58. $x \sim p(x)$; $y$; $x \sim p(x \mid y)$


  59. [Figure: chain between $x_T$ and $x_0$ with the step from $x_t$ to $x_{t-1}$, and the observation $y = Hx + n$]


  60. • $x_t$
    [Figure: chain between $x_T$ and $x_0$ with the step from $x_t$ to $x_{t-1}$, and the observation $y = Hx$]





  62. [Figure: $x_t$, $\mu_\theta$, $x_{t-1}$, and $x_0$, with the operator $H$ and the observation $y$]



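  69. import numpy as np

    # estimate_mean(model, xt, t) is assumed to be defined elsewhere
    def sample(model, beta, T, H, W):
        # Initialization: x_T is pure Gaussian noise
        xt = np.random.normal(0, 1, (3, H, W))
        # Reverse diffusion process, t = T, ..., 1
        for t in range(T, 0, -1):
            # estimate mean of p_theta(x_{t-1} | x_t)
            mu = estimate_mean(model, xt, t)
            # use fixed sigma (beta is indexed by t = 1..T)
            sigma = beta[t] ** 0.5
            # sample x_{t-1}
            xt = mu + sigma * np.random.normal(0, 1, (3, H, W))
        # return x_0
        return xt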

  70. [Figure: two chains over $x_t$, $x_{t-1}$, $x_{t-2}$]





  74. [Figure: latent variables $z_0$, $z_1$, $z_T$; image size $H \times W \times 3$, latent size $\frac{H}{n} \times \frac{W}{n} \times c$]







  78. [Figure: two chains between $x_T$ and $x_0$, each with the step from $x_t$ to $x_{t-1}$]



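  80. • $\tau_1, \dots, \tau_S$ ($\tau_i \in [1, T]$)
    $\beta$: $\beta_{\tau_i} = 1 - \prod_{j=\tau_{i-1}+1}^{\tau_i} (1 - \beta_j)$
    [Figure: chains over $x_t$, $x_{t-1}$, $x_{t-2}$, with the steps $\tau_{i-1}$ and $\tau_i$ marked]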
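The merged noise level for a shortened step sequence can be computed directly from the product formula above. The sketch below is a minimal illustration: it assumes $\tau_0 = 0$ so that the first product starts at $j = 1$, and the toy schedule, the choice of every fourth step, and the names `strided_betas` and `total_alpha_bar` are hypothetical.

```python
def strided_betas(betas, taus):
    """beta_{tau_i} = 1 - prod_{j = tau_{i-1}+1}^{tau_i} (1 - beta_j), with tau_0 = 0.

    `betas` holds beta_1..beta_T; `taus` is an increasing list of 1-indexed steps.
    """
    new_betas, prev = [], 0
    for tau in taus:
        prod = 1.0
        for j in range(prev + 1, tau + 1):
            prod *= 1.0 - betas[j - 1]
        new_betas.append(1.0 - prod)
        prev = tau
    return new_betas

def total_alpha_bar(bs):
    """Product of (1 - beta) over a whole schedule."""
    prod = 1.0
    for b in bs:
        prod *= 1.0 - b
    return prod

# Assumed example: keep every 4th step out of T = 20.
betas = [0.001 * (j + 1) for j in range(20)]
taus = [4, 8, 12, 16, 20]
short = strided_betas(betas, taus)
```

Because each new $\beta_{\tau_i}$ absorbs the skipped steps, the cumulative product over the short schedule matches the original $\bar{\alpha}_T$, so the noise level at each retained step is unchanged.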





  83. [Figure: $x$ against time from $t = 0$ to $t = T$, with points $x_0$, $x_1$, $x_{T-1}$, $x_T$]
    $x_{t+1} = \sqrt{1-\beta_{t+1}}\,x_t + \sqrt{\beta_{t+1}}\,\epsilon$


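  84. [Figure: $x$ against continuous time from $t = 0$ to $t = 1$, with endpoints $x_0$ and $x_1$]
    $\mathrm{d}x = -\frac{1}{2}\beta(t)\,x\,\mathrm{d}t + \sqrt{\beta(t)}\,\mathrm{d}w$, where $w$ is a Wiener process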
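The forward SDE can be simulated with the Euler-Maruyama method, which makes its relation to the discrete update $x_{t+1} = \sqrt{1-\beta_{t+1}}\,x_t + \sqrt{\beta_{t+1}}\,\epsilon$ easier to see. The sketch below is for scalar $x$ and assumes a linear $\beta(t)$ running from 0.1 to 20 on $[0, 1]$; these values, the step count, and the function names are illustrative choices, not taken from the slides.

```python
import math
import random

def beta_fn(t, beta_min=0.1, beta_max=20.0):
    """Assumed linear noise rate beta(t) on t in [0, 1]."""
    return beta_min + t * (beta_max - beta_min)

def simulate_forward_sde(x0, n_steps, rng):
    """Euler-Maruyama for dx = -1/2 beta(t) x dt + sqrt(beta(t)) dw, from t = 0 to t = 1."""
    dt = 1.0 / n_steps
    x = x0
    for k in range(n_steps):
        b = beta_fn(k * dt)
        # drift term plus a Gaussian increment with variance beta(t) * dt
        x += -0.5 * b * x * dt + math.sqrt(b * dt) * rng.gauss(0.0, 1.0)
    return x

rng = random.Random(0)
samples = [simulate_forward_sde(3.0, 200, rng) for _ in range(4000)]
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
```

Starting every path at $x_0 = 3$, the samples at $t = 1$ should have a mean near 0 and a variance near 1: the SDE forgets the data and ends close to a standard normal, just like the discrete chain.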



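  85. $N\left(x_t;\ \sqrt{1-\beta_t}\,x_{t-1},\ \beta_t \mathbf{I}\right)$
    $N\left(x_{t-1};\ \mu_\theta(x_t, t),\ \sigma_t^2 \mathbf{I}\right)$
    $\mathrm{d}x = -\frac{1}{2}\beta(t)\,x\,\mathrm{d}t + \sqrt{\beta(t)}\,\mathrm{d}w$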





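  87. [Figure: $x$ against continuous time from $t = 0$ to $t = 1$, with endpoints $x_0$ and $x_1$]
    $\mathrm{d}x = -\beta(t)\left[\frac{1}{2}x + \nabla_x \log q_t(x)\right]\mathrm{d}t + \sqrt{\beta(t)}\,\mathrm{d}\bar{w}$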




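  88. $\mathrm{d}x = -\beta(t)\left[\frac{1}{2}x + \nabla_x \log q_t(x)\right]\mathrm{d}t + \sqrt{\beta(t)}\,\mathrm{d}\bar{w}$, $x_1 \sim q_1(x)$
    $s_\theta(x) = -\dfrac{\epsilon_\theta(x, t)}{\sqrt{1-\bar{\alpha}_t}}$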

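  89. $\nabla_x \log q_t(x)$, $\epsilon_\theta(x_t, t)$
    $\mathbb{E}_{q_t(x)}\left[\frac{1}{2}\left\|s_\theta(x) - \nabla_x \log q_t(x)\right\|^2\right]$
    $\mathbb{E}_{x_0 \sim q_0(x),\ \epsilon \sim N(0, \mathbf{I})}\left[\left\|\epsilon_\theta\left(\sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon,\ t\right) - \epsilon\right\|^2\right]$
    $s_\theta(x) = -\dfrac{\epsilon_\theta(x, t)}{\sqrt{1-\bar{\alpha}_t}}$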
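The conversion between a noise prediction and a score estimate is a single rescaling. The sketch below checks it on a toy case where the true score is known: it assumes scalar data $x_0 \sim N(0, 1)$, for which $q_t = N(0, 1)$ at every $t$, the true score is $-x$, and the ideal noise prediction is $\sqrt{1-\bar{\alpha}_t}\,x$. The function name and the numbers are illustrative.

```python
import math

def score_from_eps(eps_pred, alpha_bar_t):
    """s_theta(x) = -eps_theta(x, t) / sqrt(1 - alpha_bar_t)."""
    return -eps_pred / math.sqrt(1.0 - alpha_bar_t)

# Toy check (assumed): data x_0 ~ N(0, 1), so q_t = N(0, 1) and the true score is -x.
alpha_bar_t = 0.3
x = 1.7
ideal_eps = math.sqrt(1.0 - alpha_bar_t) * x   # E[eps | x_t = x] for this toy data
score = score_from_eps(ideal_eps, alpha_bar_t)
```

This is why a network trained with the noise-prediction loss can be plugged into the reverse-time equations wherever $\nabla_x \log q_t(x)$ appears.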



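  90. $N\left(x_t;\ \sqrt{1-\beta_t}\,x_{t-1},\ \beta_t \mathbf{I}\right)$
    $N\left(x_{t-1};\ \mu_\theta(x_t, t),\ \sigma_t^2 \mathbf{I}\right)$
    $\mathrm{d}x = -\frac{1}{2}\beta(t)\,x\,\mathrm{d}t + \sqrt{\beta(t)}\,\mathrm{d}w$
    $\mathrm{d}x = -\beta(t)\left[\frac{1}{2}x - \dfrac{\epsilon_\theta(x, t)}{\sqrt{1-\bar{\alpha}_t}}\right]\mathrm{d}t + \sqrt{\beta(t)}\,\mathrm{d}\bar{w}$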



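  91. $\mathrm{d}x = -\beta(t)\left[\frac{1}{2}x + \nabla_x \log q_t(x)\right]\mathrm{d}t + \sqrt{\beta(t)}\,\mathrm{d}\bar{w}$, $x_1 \sim q_1(x)$
    $\mathrm{d}x = -\beta(t)\left[\frac{1}{2}x + \frac{1}{2}\nabla_x \log q_t(x)\right]\mathrm{d}t$, $x_1 \sim q_1(x)$
    $x_1$, $x_0$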

  92. [Figure: two panels showing $x$ against time from $t = 0$ to $t = 1$, each with the marginals $q_0(x)$, $q_t(x)$, $q_1(x)$]

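  93. • $\Delta t$, $x$, $x_1$, $x_0$
    $x(t - \Delta t) = x_t - \Delta t \left.\dfrac{\mathrm{d}x}{\mathrm{d}t}\right|_{x = x_t}$
    $\mathrm{d}x = -\beta(t)\left[\frac{1}{2}x + \frac{1}{2}\nabla_x \log q_t(x)\right]\mathrm{d}t$
    [Figure: $x$ against time from $t = 0$ to $t = 1$, with points $x_0$, $x_{\Delta t}$, $x_{1-\Delta t}$, $x_1$ separated by steps of $\Delta t$]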
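The Euler update $x(t - \Delta t) = x_t - \Delta t \cdot \mathrm{d}x/\mathrm{d}t$ applied to the deterministic ODE can be tried on a toy problem where the score is known exactly. The sketch below assumes scalar Gaussian data $q_0 = N(2, 0.5^2)$ and a linear $\beta(t)$ from 0.1 to 20; the exact score stands in for a trained network, and all names and constants are illustrative.

```python
import math
import random

BETA_MIN, BETA_MAX = 0.1, 20.0   # assumed linear beta(t) on [0, 1]
M, S = 2.0, 0.5                  # toy data distribution q_0 = N(M, S^2)

def beta_fn(t):
    return BETA_MIN + t * (BETA_MAX - BETA_MIN)

def alpha_bar_fn(t):
    """exp(-integral_0^t beta(u) du): continuous-time analogue of alpha_bar_t."""
    return math.exp(-(BETA_MIN * t + 0.5 * (BETA_MAX - BETA_MIN) * t * t))

def score(x, t):
    """Exact grad_x log q_t(x) for the Gaussian toy data (stands in for a learned model)."""
    ab = alpha_bar_fn(t)
    var = S * S * ab + (1.0 - ab)
    return -(x - math.sqrt(ab) * M) / var

def dx_dt(x, t):
    """Right-hand side of the deterministic ODE: -beta(t) [x/2 + score/2]."""
    return -beta_fn(t) * (0.5 * x + 0.5 * score(x, t))

def solve_ode(x1, n_steps):
    """Euler steps x(t - dt) = x_t - dt * dx/dt, from t = 1 down to t = 0."""
    dt = 1.0 / n_steps
    x = x1
    for k in range(n_steps):
        t = 1.0 - k * dt
        x = x - dt * dx_dt(x, t)
    return x

rng = random.Random(0)
# x_1 is drawn from N(0, 1), which is very close to q_1 for this beta(t).
samples = [solve_ode(rng.gauss(0.0, 1.0), 400) for _ in range(1000)]
mean = sum(samples) / len(samples)
std = math.sqrt(sum((s - mean) ** 2 for s in samples) / len(samples))
```

The resulting samples should have a mean near 2 and a standard deviation near 0.5, so the noise-free ODE alone carries $q_1$ back to $q_0$. Fewer steps (larger $\Delta t$) make sampling faster at the cost of a larger discretization error.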

  94. • $\Delta t$

  95. [Figure: quantities plotted or compared against $t$: $\sigma'^{1/\rho}$, $\log\frac{\alpha}{1-\alpha}$, $t^{1/\rho}$]


  97. [Figure: quantities plotted or compared against $t$: $\sigma'^{1/\rho}$, $\log\frac{\alpha}{1-\alpha}$, $t^{1/\rho}$; also $x$ and $\Delta t$]
