Make your data Grab-and-Go

Yuichiro Someya

May 17, 2017

Transcript

  1. ‣ Giving someone the same data that others have is trouble enough
       # it may span multiple types of data sources
     ‣ Data sometimes needs to be strictly identical
  2. Document in a notebook (sets of scripts, manual ops)
     ‣ Bothersome (to document / to use)
     ‣ Prone to human error
  3. keras.datasets.mnist
     Document in a notebook (sets of scripts, manual ops)
     ‣ Bothersome (to document / to use)
     ‣ Prone to human error
  4. keras.datasets.mnist
     ‣ Easy, can be reproduced instantly
     Document in a notebook (sets of scripts, manual ops)
     ‣ Bothersome (to document / to use)
     ‣ Prone to human error
  5. keras.datasets.mnist
     ‣ Easy, can be reproduced instantly
     ‣ Less likely to be usable in real work
     Document in a notebook (sets of scripts, manual ops)
     ‣ Bothersome (to document / to use)
     ‣ Prone to human error
  6. keras.datasets.mnist
     ‣ Easy, can be reproduced instantly
     ‣ Less likely to be usable in real work
     Document in a notebook (sets of scripts, manual ops)
     ‣ Bothersome (to document / to use)
     ‣ Prone to human error
  7. Fetch, Load, Preprocessing, Batching
     ‣ Fetch: download the data and put it in a specific place
     ‣ Load: load the data into a script (or any other training dev)
     ‣ Preprocessing: convert, reshape, split, …
  8. akagi
     ‣ Makes it easier to access multiple types of data sources
       # MySQL, Amazon Redshift, Amazon S3, Google Spreadsheets, FTP servers, …
     ‣ Specify the data with runnable Python code
       # use and document at the same time
  9. akagi
     ‣ akagi introduces an abstraction layer over data
       # has the potential to apply common operations over the data
       # Data registry?
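
The idea on the last slides (data specified by runnable code, with an abstraction layer that allows common operations across sources) can be illustrated with a minimal sketch. This is not akagi's actual API: `DataSource`, `InMemoryCSVSource`, `ListSource`, and `batches` are hypothetical names invented for this example, and the two concrete classes are in-memory stand-ins for remote sources such as an S3 object or a database query.

```python
import csv
import io
from abc import ABC, abstractmethod
from typing import Iterator, List


class DataSource(ABC):
    """Hypothetical abstraction layer: every source yields rows the same way."""

    @abstractmethod
    def fetch(self) -> Iterator[List[str]]:
        """Yield rows as lists of strings."""

    def __iter__(self) -> Iterator[List[str]]:
        return self.fetch()

    def batches(self, size: int) -> Iterator[List[List[str]]]:
        """A common operation, written once and shared by all sources."""
        batch: List[List[str]] = []
        for row in self:
            batch.append(row)
            if len(batch) == size:
                yield batch
                batch = []
        if batch:  # emit the final, possibly shorter, batch
            yield batch


class InMemoryCSVSource(DataSource):
    """Stand-in for a file-like remote source; parses CSV text."""

    def __init__(self, text: str) -> None:
        self._text = text

    def fetch(self) -> Iterator[List[str]]:
        yield from csv.reader(io.StringIO(self._text))


class ListSource(DataSource):
    """Stand-in for a query result; converts each value to a string."""

    def __init__(self, rows) -> None:
        self._rows = rows

    def fetch(self) -> Iterator[List[str]]:
        for row in self._rows:
            yield [str(value) for value in row]


# The dataset is specified by code, so these lines both document and load it.
csv_source = InMemoryCSVSource("1,cat\n2,dog\n3,bird\n")
list_source = ListSource([(4, "fish"), (5, "frog")])


def to_int_label(row: List[str]):
    """Preprocessing step: parse the id column into an int."""
    return int(row[0]), row[1]


all_rows = [to_int_label(r) for src in (csv_source, list_source) for r in src]
first_batches = list(csv_source.batches(2))
```

Because both sources share one interface, the preprocessing and batching steps do not need to know where the rows came from. A real implementation would replace the stand-ins with classes that actually connect to the listed services.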