Building and Operating a Machine Learning Platform with Google Cloud ML / pepabo_ml_infrastructure_starchart
monochromegane
January 28, 2017
Technology
GCPUG Fukuoka 5th ~Machine Learning Matsuri (festival)~
https://gcpugfukuoka.connpass.com/event/46049/
Transcript
Yusuke Miyake / Pepabo R&D Institute, GMO Pepabo, Inc. 2017.01.28 GCPUG Fukuoka 5th ~Machine Learning Matsuri~
Building and Operating a Machine Learning Platform with Google Cloud ML
Principal Engineer Yusuke Miyake (@monochromegane), Researcher, Pepabo R&D Institute. http://blog.monochromegane.com
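Today's topics
• Pepabo R&D Institute and the "Smooth System"
• A machine learning platform that realizes the Smooth System
• Operating a machine learning platform with Google Cloud ML and StarChart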
https://icons8.com/
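Pepabo R&D Institute
Pepabo R&D Institute (abbreviated "Pepa-ken") is an organization that carries out research and development under the concept of the "Smooth System" in order to create technology that can differentiate the business. About Pepabo R&D Institute: http://rand.pepabo.com/
Smooth System
A Smooth System is one in which each element of the system recognizes characteristics without going through explicit operations and, based on those characteristics and their relationships, provides the service best suited to the situation at hand. http://rand.pepabo.com/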
Machine learning and the Smooth System
For example:
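Predicting access counts for a web service
• When operating pay-as-you-go virtual resources, predicting the optimal resource demand leads to cost cuts.
• A web service's resource demand should correlate with the number of requests processed, that is, with the access count.
• Because scaling resources up or down takes a certain amount of time, predicting access counts at fixed intervals, rather than in real time, is considered sufficient.
Predicting access counts for a web service
Once access counts can be predicted, estimates sized for peak time can be replaced by (presumably) optimal estimates for each time slot.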
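The gap between peak-based and per-slot sizing can be sketched as follows. This is a minimal illustration only: the predicted access counts and the per-instance capacity are made-up values, not figures from the talk.

```python
import math

# Hypothetical predicted access counts per hour (illustrative, not real data).
predicted_access = [1200, 800, 400, 300, 900, 2500, 4000, 3100]
# Hypothetical number of requests one instance can serve per hour.
CAPACITY_PER_INSTANCE = 1000


def instances_needed(access_count, capacity=CAPACITY_PER_INSTANCE):
    """Smallest number of instances covering the access count (at least one)."""
    return max(1, math.ceil(access_count / capacity))


# Peak-based estimate: size every hour for the busiest predicted hour.
peak_plan = [instances_needed(max(predicted_access))] * len(predicted_access)
# Per-hour estimate: size each hour for its own predicted load.
hourly_plan = [instances_needed(count) for count in predicted_access]

print("peak-based instance-hours:", sum(peak_plan))
print("per-hour instance-hours:", sum(hourly_plan))
```

The difference between the two totals is the saving that access prediction is expected to unlock under these assumed numbers.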
LSTM: Long Short-Term Memory
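Predicting access counts for a web service
Access prediction using LSTM
• Input: access counts for a past period, plus calendar information
• Output: the predicted access count for a future period
• To predict further ahead, the previous prediction is included in the input and the prediction is repeated up to the desired horizon
• The figure on the left shows prediction repeated over a week-scale span
Source: an article by monochromegane on http://rand.pepabo.com/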
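The repeated prediction described above (feeding each prediction back into the input) can be sketched like this. It is a minimal sketch, not the author's model: `mean_model` is a hypothetical stand-in for a trained LSTM, and the boolean holiday flag is an assumed form of calendar information.

```python
def forecast(history, calendar, predict_one, steps):
    """Predict `steps` periods ahead, feeding each prediction back as input.

    history: list of past access counts (the model's input window).
    calendar: calendar features, one entry per future period.
    predict_one: callable(window, calendar_feature) -> next access count.
    """
    window = list(history)
    predictions = []
    for step in range(steps):
        next_value = predict_one(window, calendar[step])
        predictions.append(next_value)
        # Slide the window: drop the oldest value, append the new prediction.
        window = window[1:] + [next_value]
    return predictions


def mean_model(window, is_holiday):
    """Hypothetical stand-in for a trained LSTM: window mean, halved on holidays."""
    mean = sum(window) / len(window)
    return mean * 0.5 if is_holiday else mean


result = forecast([100.0, 200.0, 300.0], [False, False, True], mean_model, 3)
print(result)
```

Because each step consumes earlier predictions, errors can accumulate as the horizon grows, which is one reason to re-predict at fixed intervals.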
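Pepa-ken (Pepabo R&D Institute)
We conduct research that pursues novelty, effectiveness, and reliability at an academic standard, and we contribute to the business by implementing and delivering the technology we develop as real systems. About Pepabo R&D Institute: http://rand.pepabo.com/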
Service meets ML. It only counts once it is actually used.
What is required of a machine learning platform
For example, using access prediction in a service
Diagram labels: Users, Service, log, rack-bigfoot, Bigfoot Activity, ML Platform (training dataset, train, input data, prediction, tune), access count prediction and scheduled scaling.
For example, using access prediction in a service (the same diagram, annotated)
• Recognize precisely what characteristics each element has
• Impose no explicit operations on people
• Provide the service best suited to the situation at hand
OK, smooth.
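Requirements for a machine learning platform
• It can integrate with service assets such as logs and databases
• Models can be built and tried out with relative ease
• It provides an API as the means of using training results
• Even better if training results can also be used locally
• All of the above must be scalable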
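Considering Google Cloud ML
• Input and output go through Cloud Storage
• TensorFlow is adopted for the training program
• The online prediction service turns the model into an API
• Training results are stored in Cloud Storage and can also be used locally
• Distributed training infrastructure, integrated with a load-balancing service
* Details of the evaluation: http://rand.pepabo.com/article/2017/01/18/pepabo-ml-platform-and-workflow/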
Operating the machine learning platform
Training results are not permanent
1. Build a model and expose it (create, expose)
2. Improve the model as the service's assets change (tune)
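Is that model safe to serve? (create, expose, tune)
We want to prevent unintended results from being used because retrained content was applied without being checked.
• "I retrained the model, so the API results have changed."
• "Does the accuracy seem lower?"
• "Have the output dimensions changed?"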
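Is that model safe to serve? (create, expose, tune)
We want to prevent unintended results from being used because retrained content was applied without being checked.
-> Version management for models
• "I trained the model and am creating a new version. Is it OK to switch?"
• "It is handy that nothing changes suddenly, but I cannot tell what changed." (default)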
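Is that model safe to serve? (create, expose, tune)
We want to prevent unintended results from being used because retrained content was applied without being checked.
-> Managing models as code
• "Here is a new version of the model. Please review the changes!"
• "LGTM!!! Switching versions now!!" (default management)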
StarChart https://github.com/monochromegane/starchart
StarChart
StarChart is a tool to manage Google Cloud Machine Learning training programs and model versions.
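StarChart
Diagram labels: StarChart (train, expose, apply); Cloud ML (train job, model default version); GitHub (training programs and model versions).
• The training program, parameters, and job information that serve as the criteria for deciding a version switch are all managed as code
• Improves small usability issues in Cloud ML around obtaining the job ID used for training, the Cloud Storage path, and the parameter information tied to a version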
Let’s try
DCGAN on Cloud ML using StarChart
DCGAN
Generating idol face images with a DCGAN implemented in TensorFlow: http://memo.sugyan.com/entry/20160516/1463359395
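Managing the training program in Git

    .
    └── dcgan
        ├── setup.py
        └── trainer
            ├── __init__.py
            ├── dcgan.py
            └── task.py

• Place the training program under the model name so that it forms a package structure
• If there are dependency packages, prepare a setup.py
• This time, dcgan.py and main.py from face-generator are used: https://github.com/sugyan/face-generator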
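Registering the training program as a job

    # -m: MODEL_NAME, -M: MODULE_NAME, arguments after "--": YOUR_TRAIN_PARAMS
    $ starchart train \
        -m dcgan \
        -M trainer.task \
        -- \
        --train_dir=TRAIN_PATH/model \
        --images_dir=TRAIN_PATH/images \
        --data_dir=gs://$BUCKET_NAME/data/dcgan

• Runs packaging, upload to Cloud Storage, and job registration
• `TRAIN_PATH` is interpreted as the name of a directory created per job on Cloud Storage
• It is convenient to specify the project ID, region, and credentials as environment variables via direnv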
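The `TRAIN_PATH` substitution can be pictured with the sketch below. The function name and signature are hypothetical and this is not StarChart's actual code; the `gs://PROJECT_ID-ml/MODEL_NAME/<timestamp>` layout is an assumption based on the paths that appear in the model file later in this deck.

```python
def expand_train_path(args, project_id, model_name, job_timestamp):
    """Replace the TRAIN_PATH placeholder with a per-job Cloud Storage path.

    Hypothetical helper for illustration; the bucket layout is an assumption.
    """
    job_dir = "gs://{}-ml/{}/{}".format(project_id, model_name, job_timestamp)
    return [arg.replace("TRAIN_PATH", job_dir) for arg in args]


expanded = expand_train_path(
    ["--train_dir=TRAIN_PATH/model", "--images_dir=TRAIN_PATH/images"],
    project_id="PROJECT_ID",
    model_name="dcgan",
    job_timestamp="20170125191521",
)
for arg in expanded:
    print(arg)
```

Keeping the placeholder in the command means the same arguments can be reused for every job while each run still writes to its own directory.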
Waiting for the job to run

    $ starchart state -m dcgan
    jobId: dcgan_20170125191521 (FAILED)

• Check the job ID and status
• Log display is not implemented yet
FAILED??
Adapting the training program to Cloud ML
• File I/O during job execution targets Cloud Storage
• With the tensorflow.python.lib.io.file_io package, local paths and Cloud Storage paths (gs://) can be handled transparently
• It is convenient to implement paths so that they can be passed as command-line arguments

Before (standard library):

    import os
    os.listdir()
    os.mkdir(path)
    os.path.exists()
    with open(filename, 'wb') as f:

After (file_io):

    from tensorflow.python.lib.io import file_io
    file_io.list_directory()
    file_io.create_dir(path)
    file_io.file_exists()
    with file_io.FileIO(filename, 'w') as f:
Registering the training program as a job

    # -m: MODEL_NAME, -M: MODULE_NAME, arguments after "--": YOUR_TRAIN_PARAMS
    $ starchart train \
        -m dcgan \
        -M trainer.task \
        -- \
        --train_dir=TRAIN_PATH/model \
        --images_dir=TRAIN_PATH/images \
        --data_dir=gs://$BUCKET_NAME/data/dcgan
Waiting for the job to run

    $ starchart state -m dcgan
    jobId: dcgan_20170125194440 (SUCCESSED)
    jobId: dcgan_20170125191521 (FAILED)

• Check the job ID and status
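Exposing the model

    $ starchart expose -m dcgan

• Registers a model based on the successful job
• When the model is registered, a version is registered and exposed as a prediction service API
• The version name is "v" + the job name
Diagram: model on Cloud ML & Storage with version v20170125194440 (default)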
Not working…
How the prediction service API works (pseudocode)

    import json
    import numpy as np
    import tensorflow as tf

    request_params = {'instances': [{'sample_inputs': np.zeros((1, 40)).tolist()}]}

    def feed_from_request(request, tensor_keys):
        feed = {}
        request_keys = request['instances'][0].keys()
        for key in request_keys:
            feed[tensor_keys[key]] = [instance[key] for instance in request['instances']]
        return feed

    with tf.Session() as sess:
        # Restore the MetaGraph of the training result.
        new_saver = tf.train.import_meta_graph('TRAIN_PATH/model/export.meta')
        new_saver.restore(sess, 'TRAIN_PATH/model/export')
        # Map the tensors in the 'inputs' collection to the API request parameters.
        tensor_keys = json.loads(tf.get_collection('inputs')[0])
        feed = feed_from_request(request_params, tensor_keys)
        # Run the tensors in the 'outputs' collection as the operation, passing the mapped feed.
        op = json.loads(tf.get_collection('outputs')[0])
        result = sess.run(op, feed_dict=feed)
        print(result)
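The request-to-feed mapping in the pseudocode can be tried without TensorFlow. In this minimal sketch the tensor name `Placeholder:0` and the two-instance request are assumptions for illustration; a plain list stands in for the graph's `inputs` collection.

```python
import json

# Stand-in for tf.get_collection('inputs'): a JSON string that maps
# request keys to tensor names. "Placeholder:0" is an assumed name.
inputs_collection = [json.dumps({"sample_inputs": "Placeholder:0"})]


def feed_from_request(request, tensor_keys):
    """Build a feed dict: tensor name -> values batched across all instances."""
    feed = {}
    for key in request["instances"][0].keys():
        feed[tensor_keys[key]] = [inst[key] for inst in request["instances"]]
    return feed


request = {
    "instances": [
        {"sample_inputs": [[0.0, 0.0]]},
        {"sample_inputs": [[1.0, 1.0]]},
    ]
}
tensor_keys = json.loads(inputs_collection[0])
feed = feed_from_request(request, tensor_keys)
print(feed)
```

Each request key becomes one entry in the feed, and the per-instance values are stacked into a batch, which is why the input placeholder needs a leading batch dimension.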
Adapting the training program to the API

    saver.save(sess, os.path.join(FLAGS.train_dir, 'export'))

• Export the final training result so that the API can use it
• The export name must be `export`
• With StarChart, it must be written out as `TRAIN_PATH/model/export`
Adapting the training program to the API
• Add the input tensor to a collection. This specifies the parameter passed as feed_dict when the output tensor is run
• Add the output tensor to a collection. This specifies the operation run when the prediction service API is called
• Using `tf.image.encode_jpeg` for the output tensor resulted in an Internal Server Error, so tf.reshape(tf.squeeze(image, [0]), [1, -1]) was used instead

    # Input
    sample_inputs = tf.placeholder(tf.float32, shape=(None, 1, dcgan.z_dim))
    tf.add_to_collection('inputs', json.dumps({'sample_inputs': sample_inputs.name}))
    # Output
    sample_outputs = dcgan.sample_image_vectors(1, 1, inputs=sample_inputs[0])
    tf.add_to_collection('outputs', json.dumps({'sample_outputs': sample_outputs.name}))
Registering the training program as a job

    # -m: MODEL_NAME, -M: MODULE_NAME, arguments after "--": YOUR_TRAIN_PARAMS
    $ starchart train \
        -m dcgan \
        -M trainer.task \
        -- \
        --train_dir=TRAIN_PATH/model \
        --images_dir=TRAIN_PATH/images \
        --data_dir=gs://$BUCKET_NAME/data/dcgan
Waiting for the job to run

    $ starchart state -m dcgan
    jobId: dcgan_20170125201233 (SUCCESSED)
    jobId: dcgan_20170125194440 (SUCCESSED)
    jobId: dcgan_20170125191521 (FAILED)

• Check the job ID and status
Exposing the model

    $ starchart expose -m dcgan

Diagram: model on Cloud ML & Storage with versions v20170125194440 (default) and v20170125201233
Managing the model file in Git

    .
    ├── dcgan
    │   ├── setup.py
    │   └── trainer
    │       ├── __init__.py
    │       ├── dcgan.py
    │       └── task.py
    └── dcgan.json

The result of expose is saved to `MODEL_NAME.json`. Because it is used to switch the default version, this file is also managed in Git.
Managing the model file in Git

    {
      "model": "MODEL_NAME",
      "versions": [
        {
          "version": {
            "name": "projects/PROJECT_ID/models/MODEL_NAME/versions/v20170111170842",
            "deploymentUri": "gs://PROJECT_ID-ml/MODEL_NAME/20170111170842/model",
            "createTime": "2017-01-11T09:12:54Z",
            "job": {
              "jobId": "MODEL_NAME_20170111170842",
              "trainingInput": {
                "packageUris": [
                  "gs://PROJECT_ID-ml/MODEL_NAME/20170111170842/packages/trainer-0.0.0.tar.gz"
                ],
                "pythonModule": "trainer.task",
                "args": [
                  "--model_dir=gs://PROJECT_ID-ml/MODEL_NAME/20170111170842/model",
                  "--train_dir=gs://PROJECT_ID-ml/MODEL_NAME/20170111170842/train"
                ],
                "region": "us-central1"
              },
              "createTime": "2017-01-11T08:08:49Z",
              "startTime": "2017-01-11T08:13:55Z",
              "endTime": "2017-01-11T08:40:55Z",
              "state": "SUCCEEDED",
              "trainingOutput": {
                "consumedMLUnits": 0.45
              }
            },
            "isDefault": true
          }
        }
      ]
    }

The file shows the job and runtime parameters tied to each API version, and whether that version is the default.
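Because the model file is plain JSON, a reviewer can inspect it with a few lines of code. The sketch below is a hypothetical helper, not part of StarChart; it assumes the `versions[].version` structure shown in the model file and uses an abbreviated two-version sample with placeholder arguments.

```python
import json

# Abbreviated sample in the same shape as the model file; values are placeholders.
MODEL_FILE_TEXT = """
{
  "model": "dcgan",
  "versions": [
    {"version": {"name": "projects/PROJECT_ID/models/dcgan/versions/v20170125194440",
                 "job": {"jobId": "dcgan_20170125194440",
                         "trainingInput": {"args": ["--train_dir=TRAIN_DIR_A"]}},
                 "isDefault": true}},
    {"version": {"name": "projects/PROJECT_ID/models/dcgan/versions/v20170125201233",
                 "job": {"jobId": "dcgan_20170125201233",
                         "trainingInput": {"args": ["--train_dir=TRAIN_DIR_B"]}}}}
  ]
}
"""


def default_version(model_file):
    """Return the version entry marked isDefault, or None if there is none."""
    for entry in model_file["versions"]:
        version = entry["version"]
        if version.get("isDefault"):
            return version
    return None


model_file = json.loads(MODEL_FILE_TEXT)
current = default_version(model_file)
print(current["name"])
print(current["job"]["jobId"], current["job"]["trainingInput"]["args"])
```

Reading the job ID and arguments next to the version name is what makes the file usable as review material for a version switch.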
Using the prediction service API

    import numpy as np
    import tensorflow as tf
    from googleapiclient import discovery, errors
    from oauth2client.client import GoogleCredentials
    from tensorflow.python.lib.io import file_io

    project = 'project-123456'
    model = 'dcgan'
    version = 'v20170125194440'

    credentials = GoogleCredentials.get_application_default()
    ml = discovery.build('ml', 'v1beta1', credentials=credentials)
    body = {'instances': [{'sample_inputs': np.zeros((1, 40)).tolist()}]}
    request = ml.projects().predict(
        name='projects/{}/models/{}/versions/{}'.format(project, model, version),
        body=body)
    try:
        response = request.execute()
        output = response['predictions'][0]['sample_outputs']
        # The API returns a flat pixel vector; reshape it and encode it as JPEG locally.
        with tf.Session() as sess:
            image = sess.run(tf.image.encode_jpeg(
                tf.reshape(tf.constant(output, dtype=tf.uint8), [96, 96, 3])))
        with file_io.FileIO('out.jpg', 'w') as f:
            f.write(image)
    except errors.HttpError as err:
        print(err._get_reason())
It works !!
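Changing the model's default version

    $ starchart apply -m dcgan

Diagram: model on Cloud ML & Storage with versions v20170125194440 and v20170125201233 (default)
• Edit the model file and set `isDefault` to true on the version that should become the default
• Open a pull request with the training program and the model file as of that point
• If the review finds that the criteria for a default version are met, merge and apply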
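The edit to the model file before opening the pull request can be sketched as below. `set_default` is a hypothetical helper, not a StarChart command. Writing an explicit `isDefault: false` on the other versions is an assumption, since the deck does not show how non-default versions are recorded.

```python
import copy


def set_default(model_file, version_id):
    """Return a copy of the model file with only `version_id` marked as default.

    Hypothetical helper; raises ValueError if the version is not in the file.
    """
    updated = copy.deepcopy(model_file)
    found = False
    for entry in updated["versions"]:
        version = entry["version"]
        is_target = version["name"].endswith("/versions/" + version_id)
        version["isDefault"] = is_target
        found = found or is_target
    if not found:
        raise ValueError("unknown version: " + version_id)
    return updated


model_file = {
    "model": "dcgan",
    "versions": [
        {"version": {"name": "projects/PROJECT_ID/models/dcgan/versions/v20170125194440",
                     "isDefault": True}},
        {"version": {"name": "projects/PROJECT_ID/models/dcgan/versions/v20170125201233"}},
    ],
}
proposed = set_default(model_file, "v20170125201233")
for entry in proposed["versions"]:
    print(entry["version"]["name"], entry["version"]["isDefault"])
```

The function returns a new structure rather than mutating its input, so the diff between the old and proposed files is exactly what the reviewer sees in the pull request.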
Recap: operations with StarChart
• Manage the training program for each model in Git
• Repeat train -> expose -> review -> apply
Easy & Useful
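Summary
• We evaluated a machine learning platform in order to use machine learning in services
• We tried building and operating one with Google Cloud ML
• We addressed the inconvenient points with StarChart
• Let's improve services with machine learning!!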
The End
Pepabo College is recruiting a new cohort: for web application engineers who want to thrive in Fukuoka! Check the latest hiring information -> @pb_recruit