Japan PredictionIO User Group Meetup #01

A Slide for JPIOUG (Japan PredictionIO User Group) Meetup #01
Presentation slides for the 1st PredictionIO study meetup.
https://d-cube.connpass.com/event/48590/

takahiro-hagino

January 22, 2017

Transcript

  1. Topics
    • Introduction to Apache PredictionIO: what is PIO?
    • System Architecture: the architecture of PIO
    • Quick Start: getting PIO up and running
    • Implementation of Engine Template: how to build an engine template
  2. Apache PredictionIO?
    Apache PredictionIO (incubating) is an open source Machine Learning Server built on top of a state-of-the-art open source stack for developers and data scientists to create predictive engines for any machine learning task.
  3. Apache PredictionIO lets you
    • quickly build and deploy an engine as a web service on production with customizable templates;
    • respond to dynamic queries in real-time once deployed as a web service;
  4. Apache PredictionIO lets you
    • evaluate and tune multiple engine variants systematically;
    • unify data from multiple platforms in batch or in real-time for comprehensive predictive analytics;
  5. Apache PredictionIO lets you
    • speed up machine learning modeling with systematic processes and pre-built evaluation measures;
    • support machine learning and data processing libraries such as Spark MLlib and OpenNLP;
  6. Apache PredictionIO lets you
    • implement your own machine learning models and seamlessly incorporate them into your engine;
    • simplify data infrastructure management.
  7. The Story Behind the Frog
    Frogs (like ants and ladybugs) can predict earthquakes from changes in the climate. That is how the PredictionIO frog came to join the OSS zoo.
    “I end up finding out other animals like ants, frogs, ladybugs etc having the ability to predict various attributes from temperature change to earthquake, and finally settled on the frog.”
    “PredictionIO’s logo (the Frog) joins a veritable zoo of other famous open-source logos featuring animals.”
    Source: The Story Behind the Frog, blog.prediction.io/story-behind-frog-logo/
  8. Initial Committers
    • Pat Ferrell (ActionML)
    • Tamas Jambor (Channel4)
    • Justin Yip (independent)
    • Xusen Yin (USC)
    • Lee Moon Soo (NFLabs)
    • Donald Szeto (Salesforce)
  9. Versions
    • Latest Release Version: v0.9.6
    • Current Version: v0.10.0-incubating
    • Road Map: issues.apache.org/jira/browse/PIO/?selectedTab=com.atlassian.jira.jira-projects-plugin:roadmap-panel
  10. PIO CLI
    • status: Displays status information about PredictionIO
    • version: Displays the version of this command line console
    • template: Creates a new engine based on an engine template
  11. PIO CLI
    • build: Build an engine at the current directory
    • train: Kick off a training using an engine
    • deploy: Deploy an engine as an engine server
  12. PIO CLI
    • accesskey: Manage app access keys
    • export: Export events from the Event Server
    • run: Launch a driver program
    • eval: Kick off an evaluation using an engine
    • dashboard: Launch an evaluation dashboard
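The core subcommands from the CLI slides are normally run in the order build, train, deploy. The following is a minimal dry-run sketch that only assembles those command lines; the helper names `pio_commands` and `render` are hypothetical, and no flags are passed because the slides do not list any.

```python
import shlex

# Order in which the build/train/deploy subcommands from the slides are run.
WORKFLOW = ["build", "train", "deploy"]


def pio_commands(steps=WORKFLOW):
    """Return one argv list per step, e.g. ["pio", "build"]."""
    return [["pio", step] for step in steps]


def render(commands):
    """Render argv lists as shell-safe strings for display or logging."""
    return [shlex.join(argv) for argv in commands]


if __name__ == "__main__":
    for line in render(pio_commands()):
        print(line)
```

To execute the steps rather than print them, each argv list could be passed to `subprocess.run(argv, check=True, cwd=engine_dir)`, where `engine_dir` is an assumed path to the engine directory.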
  13. System Architecture
    • Apache Hadoop up to 2.7.2 (required only if YARN and HDFS are needed)
    • Apache HBase up to 1.2.4
    • Apache Spark up to 1.6.3 for Hadoop 2.6 (not Spark 2.x)
    • Elasticsearch up to 1.7.5 (not Elasticsearch 2.x)
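  14. Pros - Spark
    • Caching
      • Data is kept in memory
      • Data that does not fit in memory is spilled to disk
      • Matrix data of the kind used in machine learning often does fit
    • RDD (Resilient Distributed Dataset)
      • An abstraction over the dataset being processed
      • Designed to be resilient: it holds the information needed to trace the data back to storage when a failure occurs
      • Represented as an immutable collection in Scala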
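The resilience idea from the RDD slide can be illustrated with a toy model in plain Python. This is not Spark's API: `LineageDataset`, `evict`, and the `computations` counter are hypothetical names, the sketch caches every result automatically (Spark only caches on request), and spilling to disk is not modelled.

```python
class LineageDataset:
    """Toy stand-in for an RDD: immutable, and able to rebuild itself."""

    def __init__(self, source=None, parent=None, fn=None):
        self._source = source    # callable that re-reads stable storage
        self._parent = parent    # upstream dataset, if this one is derived
        self._fn = fn            # per-element transformation
        self._cache = None       # in-memory copy, which may be lost
        self.computations = 0    # how many times the data was (re)built

    def map(self, fn):
        # A transformation returns a new dataset; the parent is never mutated.
        return LineageDataset(parent=self, fn=fn)

    def collect(self):
        if self._cache is None:
            self.computations += 1
            if self._parent is None:
                data = self._source()
            else:
                data = [self._fn(x) for x in self._parent.collect()]
            self._cache = tuple(data)
        return list(self._cache)

    def evict(self):
        """Simulate losing the cached data, e.g. after a node failure."""
        self._cache = None


if __name__ == "__main__":
    base = LineageDataset(source=lambda: [1, 2, 3])
    doubled = base.map(lambda x: x * 2)
    print(doubled.collect())
    doubled.evict()            # the cached result is gone
    print(doubled.collect())   # rebuilt from the recorded lineage
```

Because each dataset remembers its parent and its transformation, losing the cached copy costs a recomputation but never the data.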
  15. DASE (D-A-S-E)
    • D: Data Source and Data Preparator
    • A: Algorithm
    • S: Serving
    • E: Evaluation Metrics
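How the four DASE components hand data to each other can be sketched in plain Python. This is an illustration of the division of responsibilities only, not PredictionIO's Scala API; all function and class names, the sample events, and the majority-vote algorithm are assumptions made for the example.

```python
from collections import Counter


def data_source():
    """D: read raw events (hard-coded here instead of an event store)."""
    return [("sunny", "walk"), ("sunny", "walk"), ("sunny", "read"),
            ("rainy", "read"), ("rainy", "read")]


def preparator(events):
    """D: reshape raw events into the structure the algorithm expects."""
    prepared = {}
    for feature, label in events:
        prepared.setdefault(feature, []).append(label)
    return prepared


class MajorityAlgorithm:
    """A: learn the most frequent label for each feature value."""

    def train(self, prepared):
        return {feature: Counter(labels).most_common(1)[0][0]
                for feature, labels in prepared.items()}

    def predict(self, model, query):
        return model.get(query["feature"])


def serving(query, predictions):
    """S: combine the predictions of all algorithms; here, take the first."""
    return {"label": predictions[0]}


def accuracy(algorithm, model, labelled):
    """E: fraction of labelled examples that are predicted correctly."""
    hits = sum(algorithm.predict(model, {"feature": f}) == y
               for f, y in labelled)
    return hits / len(labelled)


if __name__ == "__main__":
    algo = MajorityAlgorithm()
    model = algo.train(preparator(data_source()))
    query = {"feature": "sunny"}
    print(serving(query, [algo.predict(model, query)]))
    print(accuracy(algo, model, [("sunny", "walk"), ("rainy", "walk")]))
```

Keeping the components separate means an algorithm or a metric can be swapped without touching how data is read or how results are served.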
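  16. Algorithm
    • Implement train()
      • Responsible for training the predictive model
      • Invoked by the pio train command
      • The trained model is stored in HDFS (or LocalFS)
    • Implement predict()
      • Called in real time for each query after deployment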
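The split between a batch train() step and a per-query predict() step can be sketched as follows. This is a plain-Python analogy, not PredictionIO code: `MeanAlgorithm`, `train_and_store`, and `load_model` are hypothetical names, and a local pickle file stands in for the model storage.

```python
import os
import pickle
import tempfile


class MeanAlgorithm:
    def train(self, ratings):
        """Batch step: compute the mean rating per item."""
        totals = {}
        for item, rating in ratings:
            total, count = totals.get(item, (0.0, 0))
            totals[item] = (total + rating, count + 1)
        return {item: total / count for item, (total, count) in totals.items()}

    def predict(self, model, query):
        """Online step: answer one query from the already-trained model."""
        return model.get(query["item"], 0.0)


def train_and_store(algorithm, ratings, path):
    """What a training run does in this sketch: train, then persist."""
    model = algorithm.train(ratings)
    with open(path, "wb") as f:
        pickle.dump(model, f)


def load_model(path):
    """What a deployed server does once at start-up in this sketch."""
    with open(path, "rb") as f:
        return pickle.load(f)


if __name__ == "__main__":
    algo = MeanAlgorithm()
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "model.pkl")
        train_and_store(algo, [("a", 3), ("a", 5), ("b", 2)], path)
        model = load_model(path)
        print(algo.predict(model, {"item": "a"}))
```

Training touches the whole dataset once, while each query only reads the stored model, which is what makes real-time responses feasible.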
  17. Algorithm
    • P2LAlgorithm
      • The model is serialized and saved automatically
    • PAlgorithm
      • For cases where the model contains RDDs
      • The model extends IPersistentModel
      • Implement save() (write)
      • Implement apply() on the companion object (read)
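The save()/apply() pairing for models that manage their own persistence can be mirrored in plain Python. This sketch is only an analogy to the Scala pattern on the slide: `SimilarityModel`, its JSON file layout, and the `load` classmethod (standing in for apply() on the companion object) are all assumptions.

```python
import json
import os
import tempfile


class SimilarityModel:
    """Hypothetical model that writes and reads its own state."""

    def __init__(self, scores):
        self.scores = scores

    def save(self, model_id, base_dir):
        """Write side: persist the model explicitly and report success."""
        with open(os.path.join(base_dir, model_id + ".json"), "w") as f:
            json.dump(self.scores, f)
        return True

    @classmethod
    def load(cls, model_id, base_dir):
        """Read side: rebuild the model from what save() wrote."""
        with open(os.path.join(base_dir, model_id + ".json")) as f:
            return cls(json.load(f))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        SimilarityModel({"a": 0.5}).save("m1", tmp)
        print(SimilarityModel.load("m1", tmp).scores)
```

Explicit persistence of this kind is useful when the model's state cannot simply be serialized as one object, which is the situation the slide describes for models that contain RDDs.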