Make your data Grab-and-Go


Yuichiro Someya

May 17, 2017

Transcript

  1. 9.

    ‣ Giving someone the same data that others have is troublesome enough # It may span multiple types of data sources
    ‣ Data sometimes needs to be strictly identical
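    One way to check that two copies of a dataset are strictly identical is to compare a cryptographic digest of the files. The following is a minimal sketch using only the Python standard library; the helper name `file_sha256` and the sample file contents are assumptions made for illustration and do not come from the talk.

```python
import hashlib
import tempfile
from pathlib import Path


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Illustration only: two byte-identical copies and one altered copy.
with tempfile.TemporaryDirectory() as tmp:
    mine = Path(tmp) / "mine.csv"
    yours = Path(tmp) / "yours.csv"
    altered = Path(tmp) / "altered.csv"
    mine.write_bytes(b"id,label\n1,cat\n2,dog\n")
    yours.write_bytes(b"id,label\n1,cat\n2,dog\n")
    altered.write_bytes(b"id,label\n1,cat\n2,bird\n")

    same = file_sha256(mine) == file_sha256(yours)
    different = file_sha256(mine) != file_sha256(altered)

print(same, different)
```

    Comparing digests avoids sending the whole dataset around just to confirm that both sides are working on the same bytes.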
  2. 10.
  3. 11.
  4. 12.

    Document on notebook (Sets of scripts, manual ops)
    ‣ Bothersome (to document / to use)
    ‣ Human-error prone
  5. 13.
  6. 14.
  7. 15.

    keras.datasets.mnist

    Document on notebook (Sets of scripts, manual ops)
    ‣ Bothersome (to document / to use)
    ‣ Human-error prone
  8. 16.

    keras.datasets.mnist
    ‣ Easy, can instantly be reproduced

    Document on notebook (Sets of scripts, manual ops)
    ‣ Bothersome (to document / to use)
    ‣ Human-error prone
  9. 17.

    keras.datasets.mnist
    ‣ Easy, can instantly be reproduced
    ‣ Less chance to be used in real work

    Document on notebook (Sets of scripts, manual ops)
    ‣ Bothersome (to document / to use)
    ‣ Human-error prone
  10. 19.

  11. 20.

    Fetch, Load, Preprocessing, Batching
    ‣ Fetch: Download the data and put it in a specific place
    ‣ Load: Load the data into the script (or any other training dev)
    ‣ Preprocessing: Convert, Reshape, Split, …
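    The four stages above can be sketched as four small functions. This is a rough illustration using only the Python standard library, not code from the talk: the function names, the synthetic CSV written in place of a real download, and the 80/20 split are all assumptions.

```python
import csv
import random
import tempfile
from pathlib import Path


def fetch(dest_dir):
    """Fetch: put the raw data in a specific place.

    A real script would download here; this sketch writes a small
    synthetic CSV instead so that it runs offline.
    """
    path = Path(dest_dir) / "raw.csv"
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["x1", "x2", "label"])
        for i in range(10):
            writer.writerow([i, i * 2, i % 2])
    return path


def load(path):
    """Load: read the fetched file into the script as rows of strings."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        next(reader)  # skip the header row
        return list(reader)


def preprocess(rows, train_ratio=0.8, seed=0):
    """Preprocessing: convert types, shuffle, and split into train/test."""
    samples = [([float(r[0]), float(r[1])], int(r[2])) for r in rows]
    random.Random(seed).shuffle(samples)
    cut = int(len(samples) * train_ratio)
    return samples[:cut], samples[cut:]


def batches(samples, batch_size):
    """Batching: yield consecutive slices of at most batch_size samples."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]


with tempfile.TemporaryDirectory() as tmp:
    rows = load(fetch(tmp))

train, test = preprocess(rows)
train_batches = list(batches(train, batch_size=3))
print(len(train), len(test), [len(b) for b in train_batches])
```

    Every project tends to rewrite some version of these steps, which is where the documentation burden and the room for human error come from.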
  12. 26.
  13. 27.
  14. 28.

    akagi
    ‣ Make it easier to access multiple types of data sources # MySQL, Amazon Redshift, Amazon S3, Google Spreadsheets, FTP servers, …
    ‣ Specify the data with runnable Python code # Use and document at the same time
  15. 29.

    akagi
    ‣ akagi introduces an abstraction layer over data # Has the potential to apply common operations over them # Data registry?
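    To make the idea of an abstraction layer concrete, here is a small hypothetical sketch. It is not akagi's actual API: the class names `DataSource`, `InMemorySource`, and `LocalFileSource`, the methods `checksum` and `rows`, and the `registry` dict are all invented for illustration. The point is only that once every source exposes the same interface, common operations can be written once.

```python
import csv
import hashlib
import io
import os
import tempfile
from abc import ABC, abstractmethod


class DataSource(ABC):
    """Hypothetical common interface; not akagi's actual API."""

    @abstractmethod
    def read_bytes(self):
        """Return the raw content of the data source."""

    def checksum(self):
        # A common operation written once and shared by every source type.
        return hashlib.sha256(self.read_bytes()).hexdigest()

    def rows(self):
        # Another common operation: parse the content as CSV rows.
        text = self.read_bytes().decode("utf-8")
        return list(csv.reader(io.StringIO(text)))


class InMemorySource(DataSource):
    """Stand-in for a remote source such as an object store."""

    def __init__(self, content):
        self._content = content

    def read_bytes(self):
        return self._content


class LocalFileSource(DataSource):
    """Data stored in a file on the local disk."""

    def __init__(self, path):
        self._path = path

    def read_bytes(self):
        with open(self._path, "rb") as f:
            return f.read()


content = b"id,label\n1,cat\n2,dog\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "labels.csv")
    with open(path, "wb") as f:
        f.write(content)

    # A tiny "data registry": names mapped to runnable definitions of data.
    registry = {
        "labels_memory": InMemorySource(content),
        "labels_file": LocalFileSource(path),
    }
    checksums = {name: src.checksum() for name, src in registry.items()}
    rows = registry["labels_file"].rows()

print(checksums["labels_memory"] == checksums["labels_file"])
print(rows)
```

    Because the registry entries are ordinary Python objects, the same definition serves as both the way to use the data and the record of where it lives.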