
NLU Architecture and ML Model Management in Clova

2019/3/7 Machine Learning Production Pitch #1
Yuki Matoba

LINE Developers

March 07, 2019

Transcript

  1. NLU Architecture and ML Model Management in Clova

    Machine Learning Production Pitch #1, March 7th, 2019. Yuki Matoba (LINE Corporation)
  2. Self-introduction

    • Yuki Matoba
    • LINE Clova VA development team
    • Clova NLU system
    • Rekcurd
    • GitHub: @yuki-mt
  3. Clova: the user's speech data is sent to Speech Recognition

    Example utterance: "Hey Clova, what's the weather in Shinjuku right now?"
    Example response: "The weather in Shinjuku right now is sunny."
  4. Clova: Speech Recognition returns a recognized text list

    user's speech data → Speech Recognition → recognized text list ("What's the weather in Shinjuku right now?")
  5. Clova: the recognized text list is passed to NLU / DM

    user's speech data → Speech Recognition → recognized text list → NLU / DM
  6. Clova: NLU / DM produces the NLU result

    Key        Value
    Domain     Weather
    Intention  Inform
    Main Goal  General
    Place      Shinjuku
    Time       Present
  7. Clova: the NLU result is sent to the Weather web API

    NLU / DM → NLU result → Weather web API → generated text
  8. Clova: the generated text is passed to Speech Synthesis

    Weather web API → generated text → Speech Synthesis
  9. Clova: Speech Synthesis returns synthesized speech data

    Full flow: user's speech data → Speech Recognition → recognized text list → NLU / DM → NLU result → Weather web API → generated text → Speech Synthesis → synthesized speech data
  10. What the NLU system does (recap)

    1. Receive the utterance as a string (e.g. the recognized text "What's the weather in Shinjuku right now?")
    2. Return structured data (the NLU result: Domain = Weather, Intention = Inform, Main Goal = General, Place = Shinjuku, Time = Present)
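The two steps in the recap can be pictured as a function from a string to a structured record. The following is a minimal sketch, not Clova's implementation: the `NLUResult` class, the `understand` function, and the keyword matching are hypothetical names and logic chosen only to mirror the Key/Value table on the slide.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class NLUResult:
    """Structured output mirroring the Key/Value table on the slide."""
    domain: str
    intention: str
    main_goal: str
    place: str
    time: str


def understand(utterance: str) -> NLUResult:
    """Toy stand-in for the NLU system: string in, structured data out.

    The keyword check below is a hypothetical placeholder; the real system
    combines rule-based and ML-based components.
    """
    text = utterance.lower()
    if "weather" in text and "shinjuku" in text:
        return NLUResult("Weather", "Inform", "General", "Shinjuku", "Present")
    raise ValueError(f"utterance not covered by this sketch: {utterance!r}")


result = understand("What's the weather in Shinjuku right now?")
print(asdict(result))
```

Keeping the result as a typed record rather than a free-form dictionary makes it easier for downstream components, such as the weather web API caller, to rely on a fixed set of keys.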
  11. Why changing the NLU system's specification is hard

    Example: to newly support the utterance "Tell me when the rainy season is" in the weather domain, the items to verify include:
    • Does the new model classify "Tell me when the rainy season is" according to the new specification?
    • How many phrasings other than "Tell me when the rainy season is" are covered?
    • Do the covered phrasings include any that should be classified under a different Domain, Intention, or Main Goal?
    • Does this change affect other features?
    • If it affects other features, which feature takes priority for which utterance in which context?
    • …
  12. Why changing the NLU system's specification is hard (conclusion)

    → The change affects a wide area and there are many items to verify, so it takes a long time to reach release.
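One way to make a checklist like this repeatable is a table of utterances with expected labels that is re-run against every new model. The sketch below assumes a hypothetical `classify_domain` function standing in for the model and a hypothetical `Music` domain as the neighbouring feature; neither comes from the talk.

```python
from typing import Callable, List, Tuple

# (utterance, expected domain) pairs. The "Music" domain is a hypothetical
# neighbouring feature used to check that the change does not leak into it.
REGRESSION_CASES: List[Tuple[str, str]] = [
    ("Tell me when the rainy season is", "Weather"),   # the newly supported utterance
    ("When does the rainy season start?", "Weather"),  # another phrasing
    ("Play a song about rain", "Music"),               # must not be pulled into Weather
]


def classify_domain(utterance: str) -> str:
    """Keyword-based stand-in for the new model (an assumption for this sketch)."""
    text = utterance.lower()
    if "play" in text or "song" in text:
        return "Music"
    if "rainy season" in text or "weather" in text:
        return "Weather"
    return "Unknown"


def find_regressions(
    classify: Callable[[str], str],
    cases: List[Tuple[str, str]],
) -> List[Tuple[str, str, str]]:
    """Return (utterance, expected, actual) for every case the model gets wrong."""
    failures = []
    for utterance, expected in cases:
        actual = classify(utterance)
        if actual != expected:
            failures.append((utterance, expected, actual))
    return failures


failures = find_regressions(classify_domain, REGRESSION_CASES)
print("regressions:", failures)
```

A table like this does not answer the priority question between features in a given context, but it does turn the first four checks into something that can run automatically on each retrained model.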
  13. (Part of) the NLU system architecture

    Components shown in the diagram (legend: processing order):
    • Global Pattern NLU
    • Domain Detector
    • weather domain-specific NLU: weather rule-based NLU, weather ML-based NLU
    • other domain rule-based NLU1, other domain ML-based NLU1
    • other domain rule-based NLU2, other domain ML-based NLU2
  14. (Part of) the NLU system architecture: Global Pattern NLU

    • Has the highest priority → it is not affected by the other components
    • Mainly implements business-critical domains precisely with rules
  15. (Part of) the NLU system architecture: Domain Detector

    • Responsible for domain classification
    • Hands processing over to the domain-specific NLU for the detected domain
    • Only needs to concentrate on multiclass classification
  16. (Part of) the NLU system architecture: domain-specific NLU (DSNLU)

    • Each DSNLU handles exactly one domain
    • Responsible for extracting the information NLU must return other than the Domain
    • Easy to change, because a change has no effect on other domains
  17. (Part of) the NLU system architecture: domain-specific rule-based NLU

    • Handles fine-grained specifications
    • Because it is rule-based, it can be implemented exactly as specified
  18. (Part of) the NLU system architecture: domain-specific ML-based NLU

    • Handles utterances that the rule-based NLU could not catch
    • Less precise than the rule-based NLU, but covers a wide range of utterances
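The component descriptions suggest a cascade: global patterns first, then domain detection, then the rule-based and ML-based NLU of the detected domain. The sketch below illustrates that reading under stated assumptions; every function name, pattern, and keyword is hypothetical, and the exact ordering in Clova is only inferred from the slides.

```python
from typing import Callable, Dict, List, Optional

NLUResult = Dict[str, str]
Handler = Callable[[str], Optional[NLUResult]]


def global_pattern_nlu(utterance: str) -> Optional[NLUResult]:
    """Highest-priority rules. The "stop" pattern is a hypothetical example."""
    if utterance.strip().lower() == "stop":
        return {"domain": "System", "intention": "Stop", "source": "global-pattern"}
    return None


def detect_domain(utterance: str) -> str:
    """Stand-in for the multiclass Domain Detector (keyword-based here)."""
    text = utterance.lower()
    return "weather" if ("weather" in text or "rainy" in text) else "other"


def weather_rule_based(utterance: str) -> Optional[NLUResult]:
    """Precise rule for one exactly specified utterance."""
    if utterance.strip().lower() == "what's the weather in shinjuku right now?":
        return {"domain": "Weather", "place": "Shinjuku", "source": "weather-rule"}
    return None


def weather_ml_based(utterance: str) -> Optional[NLUResult]:
    """Broad fallback; a real model would also extract slots such as the place."""
    return {"domain": "Weather", "intention": "Inform", "source": "weather-ml"}


# Per domain: rule-based handlers come before ML-based ones.
DSNLU: Dict[str, List[Handler]] = {
    "weather": [weather_rule_based, weather_ml_based],
}


def run_nlu(utterance: str) -> Optional[NLUResult]:
    """Global patterns, then domain detection, then the domain's handlers in order."""
    result = global_pattern_nlu(utterance)
    if result is not None:
        return result
    domain = detect_domain(utterance)
    for handler in DSNLU.get(domain, []):
        result = handler(utterance)
        if result is not None:
            return result
    return None


print(run_nlu("What's the weather in Shinjuku right now?"))
```

Because each domain owns its own handler list, adding or changing a rule for the weather domain touches only the `"weather"` entry, which matches the slide's point that a DSNLU change does not affect other domains.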
  19. Detailed steps of machine learning model management

    • format data
    • train a new model
    • evaluate the model
    • share the results with team members
    • implement an API to use the model
    • estimate machine resources
    • set machine resources
    • implement the client side
    • deploy to the development environment
    • set the endpoint for the dev API
    • confirm it works as expected
    • deploy to the staging environment
    • set the endpoint for the staging API
    • QA
    • minor fixes
    • fix some training data
    • train a new model
    • deploy the fixed model to the staging environment
    • set the endpoint for the staging API
    • deploy to the production environment
    • blue-green deployment
    • set the endpoint for the production API
    • A/B test
    • monitoring
    • troubleshooting
    • rollback
    • …
  20. What is Rekcurd

    An OSS that makes it easy to serve, manage, and operate machine learning modules. https://github.com/rekcurd
    • Easy serving → Rekcurd
    • Easy management and operation → Rekcurd Dashboard (x Kubernetes)
    • Easy integration into existing systems → Rekcurd Client
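  21. What is Airflow

    • An OSS tool for workflow management
    • Workflows are defined as a graph called a DAG (directed acyclic graph)
    • DAGs are defined in Python code
    • The GUI can visualize DAGs, run them, and schedule runs
    • The Python code and the logs can also be checked from the GUI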
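To show what "a workflow as a DAG, defined in Python" means without depending on Airflow itself, the sketch below uses only the standard library's `graphlib` (Python 3.9+). It is not Airflow's API; the task names are borrowed from the step list earlier in the talk and the dependency layout is an assumption.

```python
from graphlib import CycleError, TopologicalSorter

# Each key is a task; its value is the set of tasks that must finish first.
dependencies = {
    "format_data": set(),
    "train_model": {"format_data"},
    "evaluate_model": {"train_model"},
}

# A valid execution order respects every dependency edge.
order = list(TopologicalSorter(dependencies).static_order())
print(order)

# "Acyclic" matters: a dependency loop has no valid execution order.
cyclic = dict(dependencies, format_data={"evaluate_model"})
try:
    list(TopologicalSorter(cyclic).static_order())
    has_cycle = False
except CycleError:
    has_cycle = True
print("cycle detected:", has_cycle)
```

A workflow manager adds scheduling, retries, logging, and a GUI on top of this ordering, which is the part the slide attributes to Airflow.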
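  22. Example DAG

    Train → upload the model to Rekcurd Dashboard → use it in the sandbox environment →
    (in parallel) upload the evaluation data → evaluate the model's performance → output the results →
    if the performance is good: use the model in the development environment (to move on to the step of confirming that it works) →
    else: delete the model from Rekcurd Dashboard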
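The branching at the end of this DAG can be sketched as plain Python control flow. This is an illustration only: it calls neither Airflow nor Rekcurd, the `run_workflow` function and its log messages are hypothetical, the numeric `threshold` is an assumed stand-in for "the performance is good", and the evaluation-data upload that runs in parallel on the slide is sequential here for simplicity.

```python
from typing import Callable, List


def run_workflow(
    train: Callable[[], str],
    evaluate: Callable[[str], float],
    threshold: float,
) -> List[str]:
    """Run the steps in order and return a log of what happened."""
    log: List[str] = []
    model = train()
    log.append(f"trained {model}")
    log.append("uploaded model to dashboard")
    log.append("serving model in sandbox")
    log.append("uploaded evaluation data")  # parallel branch on the slide
    score = evaluate(model)
    log.append(f"evaluation score: {score:.2f}")
    if score >= threshold:
        # Good enough: move on to the works-as-expected check in development.
        log.append("promoted model to development")
    else:
        # Not good enough: clean up so the dashboard holds no failed model.
        log.append("deleted model from dashboard")
    return log


good_run = run_workflow(lambda: "model-a", lambda m: 0.92, threshold=0.80)
bad_run = run_workflow(lambda: "model-b", lambda m: 0.55, threshold=0.80)
print(good_run[-1])
print(bad_run[-1])
```

Passing `train` and `evaluate` in as callables keeps the branch logic testable with stubs, independent of how long real training takes.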
  23. Detailed steps of machine learning model management

    The step list from slide 19, shown again.
  24. Detailed steps of machine learning model management

    The step list from slide 19, with a legend marking the steps that were automated or simplified with Airflow and Rekcurd. The slide's highlighting does not survive in this transcript.
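  25. Summary: how we manage machine learning models

    • Introduced Airflow for training machine learning models
    • Model training is almost fully automated
    • The training process and logs are visualized
    • Model serving is automated by accessing Rekcurd from Airflow
    • Model creation → upload & serving runs non-stop
    • Model performance evaluation is also run
    • Introduced Rekcurd for operating machine learning models
    • Management tasks such as switching models can be done from the GUI
    • Machine learning services run on k8s in an HA configuration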