Slide 1

Slide 1 text

Make your data Grab-’n’-Go
ayemos @ Cookpad Inc.

Slide 2

Slide 2 text

`whoami` ‣ 'Yuichiro Someya'.split.last.reverse.downcase ‣ github.com/ayemos ‣ twitter.com/ayemos_y ‣ www.ayemos.me

Slide 3

Slide 3 text

medium.com/@ayemos ‣ ayemos.me

Slide 4

Slide 4 text

medium.com/@ayemos ‣ ayemos.me

Slide 5

Slide 5 text

github.com/ayemos/akagi

Slide 6

Slide 6 text

Make your Data Grab-’n’-Go

Slide 7

Slide 7 text

Make your Data Grab-’n’-Go *Data Reproducibility*

Slide 8

Slide 8 text

*Data Reproducibility* ‣ Important ‣ Not easy to achieve

Slide 9

Slide 9 text

‣ Giving others the same data you have is troublesome enough # it may span multiple types of data sources
‣ Data sometimes needs to be strictly identical
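One way to check that two copies of a dataset are strictly identical is to compare a content hash. The following is a minimal sketch using only the standard library; the `fingerprint` helper and the sample records are hypothetical illustrations, not something taken from the talk or from akagi.

```python
import hashlib


def fingerprint(records):
    """Return a SHA-256 hex digest over an iterable of byte strings."""
    digest = hashlib.sha256()
    for record in records:
        # Length-prefix each record so that (b"ab", b"c") and (b"a", b"bc")
        # do not collapse to the same digest.
        digest.update(len(record).to_bytes(8, "big"))
        digest.update(record)
    return digest.hexdigest()


# Hypothetical sample data: two identical copies and one reordered copy.
mine = [b"1,apple", b"2,banana"]
theirs = [b"1,apple", b"2,banana"]
reordered = [b"2,banana", b"1,apple"]

same = fingerprint(mine) == fingerprint(theirs)
order_matters = fingerprint(mine) != fingerprint(reordered)
print(same, order_matters)
```

Because the digest depends on record order, this treats a reordered dataset as different, which is the strict reading of "identical".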

Slide 10

Slide 10 text

 

Slide 11

Slide 11 text

 

Slide 12

Slide 12 text

Document on notebook (sets of scripts, manual ops) ‣ Bothersome (to document / to use) ‣ Prone to human error

Slide 13

Slide 13 text

 

Slide 14

Slide 14 text

 

Slide 15

Slide 15 text

keras.datasets.mnist
Document on notebook (sets of scripts, manual ops) ‣ Bothersome (to document / to use) ‣ Prone to human error

Slide 16

Slide 16 text

keras.datasets.mnist ‣ Easy, can be reproduced instantly
Document on notebook (sets of scripts, manual ops) ‣ Bothersome (to document / to use) ‣ Prone to human error

Slide 17

Slide 17 text

keras.datasets.mnist ‣ Easy, can be reproduced instantly ‣ Unlikely to be usable in real work
Document on notebook (sets of scripts, manual ops) ‣ Bothersome (to document / to use) ‣ Prone to human error

Slide 18

Slide 18 text

Levels of Data Abstraction

Slide 19

Slide 19 text

keras.datasets.mnist ‣ Easy, can be reproduced instantly ‣ Unlikely to be usable in real work
Document on notebook (sets of scripts, manual ops) ‣ Bothersome (to document / to use) ‣ Prone to human error

Slide 20

Slide 20 text

Preprocessing Batching Fetch Load
‣ Fetch: Download the data and put it in a specific place
‣ Load: Load the data into a script (or any other training dev)
‣ Preprocessing: Convert, Reshape, Split, …
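The four stages named on this slide can be sketched as small composable functions. This is only an illustration under stated assumptions: the function names, the in-memory `RAW_CSV` stand-in for a remote source, and the (features, label) shape are all hypothetical and not taken from the talk.

```python
import csv
import io

# Hypothetical in-memory stand-in for a remote data source.
RAW_CSV = "label,value\n0,1.0\n1,2.0\n0,3.0\n1,4.0\n"


def fetch():
    """Fetch: obtain the raw data (here, just wrap the in-memory text)."""
    return io.StringIO(RAW_CSV)


def load(handle):
    """Load: parse the raw data into Python objects."""
    return list(csv.DictReader(handle))


def preprocess(rows):
    """Preprocessing: convert types and reshape into (features, label)."""
    return [([float(r["value"])], int(r["label"])) for r in rows]


def batches(samples, batch_size):
    """Batching: yield fixed-size chunks; the last one may be smaller."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]


result = list(batches(preprocess(load(fetch())), batch_size=3))
print(result)
```

Keeping each stage as its own function makes it possible to swap only the fetch step (for example, a database instead of a file) while the rest stays unchanged.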

Slide 21

Slide 21 text

  Preprocessing Batching Fetch Load

Slide 22

Slide 22 text

Preprocessing Batching Fetch Load keras.datasets.mnist

Slide 23

Slide 23 text

Preprocessing Batching Fetch Load keras.datasets.mnist What I (or we) need

Slide 24

Slide 24 text

Preprocessing Batching Fetch Load akagi

Slide 25

Slide 25 text

 There might be a demo 

Slide 26

Slide 26 text

 

Slide 27

Slide 27 text

 

Slide 28

Slide 28 text

akagi
‣ Makes it easier to access multiple types of data sources # MySQL, Amazon Redshift, Amazon S3, Google Spreadsheets, FTP servers, …
‣ Specify the data with runnable Python code # Use and document at the same time
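To illustrate the idea of specifying data with runnable code across several kinds of sources, here is a minimal sketch. It deliberately does not use akagi's API: `SQLSource` and `CSVSource` are hypothetical classes built on the standard library, with an in-memory SQLite database standing in for a real server and made-up sample rows.

```python
import csv
import io
import sqlite3


class SQLSource:
    """Hypothetical source: rows produced by a SQL query."""

    def __init__(self, connection, query):
        self.connection = connection
        self.query = query

    def __iter__(self):
        cursor = self.connection.execute(self.query)
        columns = [col[0] for col in cursor.description]
        for row in cursor:
            yield dict(zip(columns, row))


class CSVSource:
    """Hypothetical source: rows parsed from CSV text."""

    def __init__(self, text):
        self.text = text

    def __iter__(self):
        for row in csv.DictReader(io.StringIO(self.text)):
            yield dict(row)


# In-memory SQLite database standing in for a real database server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE recipes (id INTEGER, title TEXT)")
conn.execute("INSERT INTO recipes VALUES (1, 'curry')")
conn.commit()

# The data specification is itself runnable code.
sources = [
    SQLSource(conn, "SELECT id, title FROM recipes"),
    CSVSource("id,title\n2,ramen\n"),
]
titles = [row["title"] for source in sources for row in source]
print(titles)
```

Since both classes are plain iterables of dictionaries, the consuming code does not need to know which kind of source it is reading from.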

Slide 29

Slide 29 text

akagi
‣ akagi introduces an abstraction layer over data # Has the potential to apply common operations over it # Data registry?
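What a data registry with common operations might look like can be sketched as follows. This is speculative and not part of akagi as described in the talk: the `Registry` class, the `take` helper, and the registered names are all hypothetical.

```python
import itertools


class Registry:
    """Hypothetical registry mapping dataset names to source factories."""

    def __init__(self):
        self._factories = {}

    def register(self, name, factory):
        self._factories[name] = factory

    def open(self, name):
        # Call the factory so every caller gets a fresh iterable.
        return self._factories[name]()


def take(source, n):
    """Common operation: the first n records of any iterable source."""
    return list(itertools.islice(source, n))


registry = Registry()
registry.register("digits", lambda: iter(range(10)))
registry.register("letters", lambda: iter("abc"))

preview = {name: take(registry.open(name), 2) for name in ("digits", "letters")}
print(preview)
```

Because every registered source is just an iterable, an operation such as `take` is written once and works for all of them.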