Introduction to Data Science for PHP Users

PHP Conference 2013: "Introduction to Data Science for PHPers" #phpcon2013

Sotaro Karasawa

September 14, 2013

Transcript

  1. [Diagram labels: TD, Web Server, Web Server, fluentd, S3, Hadoop, Client, Hive, MySQL etc..., Result]

    Data accumulates on the remote side; when you submit a query, Hadoop starts up there and returns the result.
  2. foreach (file('app.log') as $line) {
         $column = explode("\t", trim($line));
         $time = $column[0];
         $status = $column[1];
         ...
     }

    Note: in practice you would not really do this in PHP; you would use sed or awk.
  3. tags:
         - { name: kernel.event_listener, event: kernel.response }

     public function onKernelResponse(FilterResponseEvent $event)
     {
         $request = $event->getRequest();
         $response = $event->getResponse();
         // build some array
         $data = $this->onAccess($request, $response);
         // log data
         $this->logger->post("access", $data);
     }

    Note: in practice it is set up so that multiple Listeners and Loggers can be registered.
  4. Basic schema: time, status, uri, ua, referrer
     Additional schema: whatever, and so on

    You can give specific records a special meaning, and without affecting any other records.
  5. Example of transaction analysis

     SELECT item_id, ref, COUNT(*)
     FROM access
     WHERE key_action = 'shop:buy:completed'
     GROUP BY item_id, ref

    Note: v[] is omitted for lack of space.
  6. Collecting and analyzing logs is hard
       → Use Fluentd and Hadoop
       → Use Treasure Data

    What kind of logs should you collect?
       → One denormalized log record per access
       → Design the log format itself
       → Take advantage of schemalessness
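
The log design summarized in the slides (one denormalized record per access, a basic schema, and optional schemaless keys that give some records a special meaning) can be sketched in plain PHP roughly as follows. This is only an illustration under stated assumptions: the function names `buildAccessRecord` and `countByItemAndRef` and all sample values are hypothetical and do not come from the talk; the keys `key_action`, `item_id`, and `ref` are taken from the query on slide 5. The second function mimics that `GROUP BY` query in memory, whereas the talk runs it in Hive on Treasure Data.

```php
<?php
// Hypothetical helper (not from the talk): build one denormalized record per access.
// The base keys follow the basic schema on slide 4; $extra holds optional,
// schemaless keys that only some records carry.
function buildAccessRecord(array $base, array $extra = []): array
{
    $defaults = [
        'time' => null, 'status' => null, 'uri' => null,
        'ua' => null, 'referrer' => null,
    ];
    // Keep only known base keys, fill missing ones with null, then append extras.
    return array_merge($defaults, array_intersect_key($base, $defaults), $extra);
}

// Hypothetical helper: count records per (item_id, ref) for one key_action,
// an in-memory analogue of the GROUP BY query on slide 5.
function countByItemAndRef(array $records, string $keyAction): array
{
    $counts = [];
    foreach ($records as $record) {
        // Records without the extra key are simply skipped.
        if (($record['key_action'] ?? null) !== $keyAction) {
            continue;
        }
        $key = $record['item_id'] . "\t" . $record['ref'];
        $counts[$key] = ($counts[$key] ?? 0) + 1;
    }
    return $counts;
}

// Made-up sample data for illustration only.
$records = [
    buildAccessRecord(['time' => 1379116800, 'status' => 200, 'uri' => '/items/10']),
    buildAccessRecord(
        ['time' => 1379116860, 'status' => 200, 'uri' => '/buy/complete'],
        ['key_action' => 'shop:buy:completed', 'item_id' => 10, 'ref' => 'top']
    ),
    buildAccessRecord(
        ['time' => 1379116920, 'status' => 200, 'uri' => '/buy/complete'],
        ['key_action' => 'shop:buy:completed', 'item_id' => 10, 'ref' => 'top']
    ),
    buildAccessRecord(
        ['time' => 1379116980, 'status' => 200, 'uri' => '/buy/complete'],
        ['key_action' => 'shop:buy:completed', 'item_id' => 11, 'ref' => 'search']
    ),
];

$counts = countByItemAndRef($records, 'shop:buy:completed');
print_r($counts);
```

Because the extra keys live only on the records that need them, the plain page view in the sample data is unaffected by the purchase-specific fields, which is the point the slides make about schemaless logs.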