Make your data Grab-and-Go

Make your data Grab-’n’-Go ayemos @ Cookpad Inc.

`whoami` ‣ ‘Yuichiro Someya’.split.last.reverse.downcase ‣ github.com/ayemos ‣ twitter.com/ayemos_y ‣
www.ayemos.me

medium.com/@ayemos ‣ ayemos.me

github.com/ayemos/akagi

Make your Data Grab-’n’-Go

Make your Data Grab-’n’-Go *Data Reproducibility*

*Data Reproducibility* ‣ Important ‣ Not easy to achieve

‣ Giving others the same data you have is troublesome enough
# It may span multiple types of data sources
‣ Data sometimes needs to be strictly identical
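
Whether two copies of a dataset are strictly identical can be checked mechanically instead of by eye. A minimal sketch, assuming the data can be serialized to ordered rows of strings; the `fingerprint` helper and the sample rows are hypothetical and not taken from the talk or from akagi:

```python
import hashlib


def fingerprint(rows):
    """Return a SHA-256 hex digest over an ordered sequence of rows.

    Each row is a sequence of strings. Row and field lengths are
    prefixed so that ("ab", "c") and ("a", "bc") hash differently.
    """
    digest = hashlib.sha256()
    for row in rows:
        digest.update(f"{len(row)}:".encode("utf-8"))
        for field in row:
            encoded = field.encode("utf-8")
            digest.update(f"{len(encoded)}:".encode("utf-8"))
            digest.update(encoded)
    return digest.hexdigest()


# Hypothetical sample data.
mine = [("1", "tomato"), ("2", "onion")]
yours = [("1", "tomato"), ("2", "onion")]
reordered = [("2", "onion"), ("1", "tomato")]

same = fingerprint(mine) == fingerprint(yours)           # identical copies
different = fingerprint(mine) != fingerprint(reordered)  # row order matters
```

Comparing one short digest is easier to share than the data itself, which helps when the data spans several sources.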

Document on notebook (sets of scripts, manual ops)
‣ Bothersome (to document / to use)
‣ Prone to human error

keras.datasets.mnist
‣ Easy, can be reproduced instantly
‣ Less likely to be used in real work

Levels of Data Abstraction

Preprocessing Batching Fetch Load
‣ Fetch: download the data and put it in a specific place
‣ Load: load the data into the script (or any other training dev)
‣ Preprocessing: convert, reshape, split, …
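
The four levels can be sketched as four small functions. This is only an illustration under stated assumptions: the function names, the toy CSV, and `fake_download` (a stand-in for a real download) are all hypothetical, and none of this is akagi's API.

```python
import csv
import tempfile
from pathlib import Path


def fetch(download, cache_dir, name):
    """Fetch: put the raw bytes in a specific place, once."""
    path = Path(cache_dir) / name
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(download())
    return path


def load(path):
    """Load: read the cached file into Python objects."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


def preprocess(records, test_ratio=0.25):
    """Preprocessing: convert types, then split into train and test."""
    samples = [(float(r["x"]), int(r["label"])) for r in records]
    n_test = int(len(samples) * test_ratio)
    return samples[n_test:], samples[:n_test]


def batches(samples, batch_size):
    """Batching: yield fixed-size chunks (the last one may be shorter)."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]


def fake_download():
    # Stand-in for a real download (HTTP, S3, a database query, ...).
    lines = ["x,label"] + [f"{i}.5,{i % 2}" for i in range(8)]
    return "\n".join(lines).encode("utf-8")


with tempfile.TemporaryDirectory() as cache_dir:
    path = fetch(fake_download, cache_dir, "toy.csv")
    train, test = preprocess(load(path))

train_batches = list(batches(train, batch_size=4))
```

Keeping the levels separate makes it possible to reuse only the lower ones (Fetch, Load) and leave preprocessing and batching to the training code.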

Preprocessing Batching Fetch Load

Preprocessing Batching Fetch Load keras.datasets.mnist

Preprocessing Batching Fetch Load keras.datasets.mnist What I (or we) need

Preprocessing Batching Fetch Load akagi

There might be a demo

akagi
‣ Makes it easier to access multiple types of data sources
# MySQL, Amazon Redshift, Amazon S3, Google Spreadsheets, FTP servers, …
‣ Specify the data with runnable Python code
# Use and document at the same time
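
The idea of specifying data with runnable code can be illustrated without akagi itself. The sketch below is an assumption-laden stand-in, not akagi's API: `QuerySpec`, the `recipes` table, and its rows are hypothetical, and an in-memory SQLite database plays the role of MySQL or Redshift.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class QuerySpec:
    """Hypothetical spec: which rows to take, stated as code."""
    description: str
    sql: str

    def rows(self, connection):
        # Running the spec yields exactly the rows it documents.
        return connection.execute(self.sql).fetchall()


RECENT_RECIPES = QuerySpec(
    description="Recipes with id >= 2, ordered by id",
    sql="SELECT id, title FROM recipes WHERE id >= 2 ORDER BY id",
)

# In-memory SQLite stands in for MySQL or Redshift in this sketch.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE recipes (id INTEGER, title TEXT)")
connection.executemany(
    "INSERT INTO recipes VALUES (?, ?)",
    [(1, "miso soup"), (2, "curry"), (3, "gyoza")],
)

rows = RECENT_RECIPES.rows(connection)
connection.close()
```

The same object serves as the documentation of the dataset and as the way to obtain it, so the two cannot drift apart.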

akagi
‣ akagi introduces an abstraction layer over data
# Has the potential to apply common operations across data sources
# Data registry?
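
What an abstraction layer with common operations and a registry might look like can be sketched as follows. Every name here (`DataSource`, `CsvTextSource`, `InMemorySource`, `count_rows`, `REGISTRY`) is hypothetical and chosen for illustration; akagi's actual classes may differ.

```python
import csv
import io
from abc import ABC, abstractmethod


class DataSource(ABC):
    """Hypothetical common interface: every source yields rows."""

    @abstractmethod
    def __iter__(self):
        ...


class CsvTextSource(DataSource):
    def __init__(self, text):
        self.text = text

    def __iter__(self):
        for row in csv.reader(io.StringIO(self.text)):
            yield tuple(row)


class InMemorySource(DataSource):
    def __init__(self, rows):
        self.rows = rows

    def __iter__(self):
        return iter(self.rows)


def count_rows(source):
    """A common operation: works on any DataSource."""
    return sum(1 for _ in source)


# A very small "data registry": names mapped to sources.
REGISTRY = {
    "from_csv": CsvTextSource("1,tomato\n2,onion\n"),
    "in_memory": InMemorySource([("1", "tomato"), ("2", "onion"), ("3", "leek")]),
}

counts = {name: count_rows(source) for name, source in REGISTRY.items()}
```

Because `count_rows` depends only on the shared interface, any operation written this way applies to every registered source regardless of where its data lives.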

Yuichiro Someya
