Slide 1

Slide 1 text

C-LIS CO., LTD.

Slide 2

Slide 2 text

C-LIS CO., LTD. Keiji ARIYAMA (C-LIS CO., LTD.), Android app developer. Photo by Koji MORIGUCHI (MORIGCHOWDER). I have done a little machine learning. I'm not on Twitter.

Slide 3

Slide 3 text

Sakura Machine Learning Night
NSFW image detection with TensorFlow

Slide 4

Slide 4 text

TensorFlow (announced in November 2015): a computation framework for machine intelligence. Latest version as of this talk (late 2017).

Slide 5

Slide 5 text

Let's hold a study session!

Slide 6

Slide 6 text

Google Developer Group

Slide 7

Slide 7 text

https://gdgkobe.doorkeeper.jp/events

Slide 8

Slide 8 text

I want to automatically collect images I like from the Internet.

Slide 9

Slide 9 text

Megane girl (眼鏡っ娘) illustration (© illustrator credit)

Slide 10

Slide 10 text

Megane girl classification: 1 / 0

Slide 11

Slide 11 text

Dataset (as of the talk): megane girl: … images; non-megane girl: … images. Breakdown: megane girl / non-megane girl / false detections.

Slide 12

Slide 12 text

{ "generator": "Region Cropper", "file_name": "haruki_g17.png", "regions": [ { "probability": 1.0, "label": 2, "rect": { "left": 97.0, "top": 251.0, "right": 285.0, "bottom": 383.0 } }, { "probability": 1.0, "label": 2, "rect": { "left": 536.0, "top": 175.0, "right": 730.0, "bottom": 321.0 } } ] } Region Cropper: https://github.com/keiji/region_cropper

Slide 13

Slide 13 text

Architecture: Downloader → dataset → Region + Label annotation → rsync

Slide 14

Slide 14 text

The ideal architecture: Downloader pulls timeline media; Face Detection and Megane Detection produce recognition results; those are reviewed and corrected into the dataset, which feeds training with TensorFlow; data is synced with rsync.

Slide 15

Slide 15 text

Documenting the whole endeavor in a doujinshi (self-published book)

Slide 16

Slide 16 text

Various problems: once the dataset exceeded … GB, syncing it to local machines became difficult; setting regions (Region) and assigning labels took far more effort than expected.

Slide 17

Slide 17 text

The image count passed the ten-thousand mark; organizing the data became an urgent problem.

Slide 18

Slide 18 text

Re-confirming the goal

Slide 19

Slide 19 text

I want to automatically collect megane-girl images I like from the Internet.

Slide 20

Slide 20 text

The previous architecture: Downloader → dataset → Region + Label annotation → rsync

Slide 21

Slide 21 text

Region + Label

Slide 22

Slide 22 text

The new architecture: Downloader → dataset → Tag assignment

Slide 23

Slide 23 text

Tags: megane, girl

Slide 24

Slide 24 text

Megane-girl classification model: the model outputs 1.00 / 0.00.

Slide 25

Slide 25 text

Dataset Manager for Android

Slide 26

Slide 26 text

Demo

Slide 27

Slide 27 text

https://twitter.com/35s_00/status/930366666973757441

Slide 28

Slide 28 text

https://twitter.com/_meganeco

Slide 29

Slide 29 text

NSFW (Not Safe For Work)

Slide 30

Slide 30 text

NSFW images

Slide 31

Slide 31 text

Various risks: noise in the workflow, mental strain, and legal risk.

Slide 32

Slide 32 text

Detecting NSFW images

Slide 33

Slide 33 text

Training dataset (NSFW). Positive examples: … ← NSFW images; negative examples: ….

Slide 34

Slide 34 text

Training / learning

Slide 35

Slide 35 text

Supervised learning: examples labeled ○ / ×; the model outputs 1.00 / 0.00.

Slide 36

Slide 36 text

Model architecture (diagram):
input 256x256
conv 3x3x64 stride 1 → ReLU → conv 3x3x64 stride 1 → ReLU → bn → max_pool 2x2 stride 2
conv 3x3x128 stride 1 → ReLU → conv 3x3x128 stride 1 → ReLU → bn → max_pool 2x2 stride 2
conv 3x3x256 stride 1 → ReLU → conv 3x3x256 stride 1 → ReLU → bn → max_pool 2x2 stride 2
conv 3x3x64 stride 1 → ReLU → fc 768 → ReLU → output 1 → Sigmoid

Slide 37

Slide 37 text

Sigmoid
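For reference (not stated on the slide): the sigmoid maps the single output logit x to a value in (0, 1), sigma(x) = 1 / (1 + exp(-x)), which can be read as the probability that the image is NSFW.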

Slide 38

Slide 38 text

# Model definition
NUM_CLASSES = 1
NAME = 'model3'
IMAGE_SIZE = 256
CHANNELS = 3


def prepare_layers(image, training=False):
    with tf.variable_scope('inference'):
        conv1 = tf.layers.conv2d(image, 64, [3, 3], [1, 1], padding='SAME',
                                 activation=tf.nn.relu, use_bias=False,
                                 trainable=training, name='conv1_1')
        conv1 = tf.layers.conv2d(conv1, 64, [3, 3], [1, 1], padding='VALID',
                                 activation=tf.nn.relu, use_bias=False,
                                 trainable=training, name='conv1_2')
        conv1 = tf.layers.batch_normalization(conv1, trainable=training, name='bn_1')
        # pool1 is assumed here: 2x2 max pooling with stride 2, matching the
        # pool2/pool3 pattern on the following slides (the pooling line for
        # this block does not appear in the slide text).
        pool1 = tf.layers.max_pooling2d(conv1, [2, 2], [2, 2])

Slide 39

Slide 39 text

        conv2 = tf.layers.conv2d(pool1, 128, [3, 3], [1, 1], padding='VALID',
                                 activation=tf.nn.relu, use_bias=False,
                                 trainable=training, name='conv2_1')
        conv2 = tf.layers.conv2d(conv2, 128, [3, 3], [1, 1], padding='VALID',
                                 activation=tf.nn.relu, use_bias=False,
                                 trainable=training, name='conv2_2')
        conv2 = tf.layers.batch_normalization(conv2, trainable=training, name='bn_2')
        pool2 = tf.layers.max_pooling2d(conv2, [2, 2], [2, 2])

Slide 40

Slide 40 text

        conv3 = tf.layers.conv2d(pool2, 256, [3, 3], [1, 1], padding='VALID',
                                 activation=tf.nn.relu, use_bias=False,
                                 trainable=training, name='conv4_1')
        conv3 = tf.layers.conv2d(conv3, 256, [3, 3], [1, 1], padding='VALID',
                                 activation=tf.nn.relu, use_bias=False,
                                 trainable=training, name='conv4_2')
        conv3 = tf.layers.batch_normalization(conv3, trainable=training, name='bn_4')
        pool3 = tf.layers.max_pooling2d(conv3, [2, 2], [2, 2])

        conv = tf.layers.conv2d(pool3, 64, [1, 1], [1, 1], padding='VALID',
                                activation=tf.nn.relu, use_bias=True,
                                trainable=training, name='conv')
        return conv

Slide 41

Slide 41 text

def output_layers(prev, batch_size, keep_prob=0.8, training=False):
    flatten = tf.reshape(prev, [batch_size, -1])
    fc1 = tf.layers.dense(flatten, 768, trainable=training,
                          activation=tf.nn.relu, name='fc1')
    # Note: tf.layers.dropout's `rate` is the fraction of units dropped, so
    # passing keep_prob here drops 80% of the units rather than keeping 80%.
    fc1 = tf.layers.dropout(fc1, rate=keep_prob, training=training)
    output = tf.layers.dense(fc1, NUM_CLASSES, trainable=training,
                             activation=None, name='output')
    return output

Slide 42

Slide 42 text

Loss function and optimization algorithm

def _loss(logits, labels, batch_size, positive_ratio):
    cross_entropy = tf.nn.sigmoid_cross_entropy_with_logits(
        labels=labels, logits=logits)
    loss = tf.reduce_mean(cross_entropy)
    return loss


def _init_optimizer(learning_rate):
    return tf.train.AdamOptimizer(learning_rate=learning_rate)
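The slides do not show how these pieces are assembled into a training step; below is a minimal sketch in TensorFlow 1.x, assuming the functions and constants defined on the surrounding slides (the batch size and learning rate are placeholders, since the actual values are not given here).

# Hypothetical training-step wiring; builds on prepare_layers/output_layers,
# _loss and _init_optimizer from the slides. Not taken verbatim from the talk.
import tensorflow as tf

BATCH_SIZE = 32        # placeholder; the actual batch size is not shown here
LEARNING_RATE = 1e-4   # placeholder; the actual learning rate is not shown here

images = tf.placeholder(tf.float32,
                        [BATCH_SIZE, IMAGE_SIZE, IMAGE_SIZE, CHANNELS])
labels = tf.placeholder(tf.float32, [BATCH_SIZE, NUM_CLASSES])

features = prepare_layers(images, training=True)
logits = output_layers(features, BATCH_SIZE, training=True)

loss = _loss(logits, labels, BATCH_SIZE, positive_ratio=None)
optimizer = _init_optimizer(LEARNING_RATE)

# tf.layers.batch_normalization registers its moving-average updates in
# UPDATE_OPS when run in training mode, so they are executed together with
# the optimization step.
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    train_op = optimizer.minimize(loss)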

Slide 43

Slide 43 text

Tricks for making training go well

Slide 44

Slide 44 text

The ratio of positive to negative examples. Positives: … ← NSFW images; negatives: ….

Slide 45

Slide 45 text

Hard Negative Mining

def _hard_negative_mining(loss, labels, batch_size):
    positive_count = tf.reduce_sum(labels)
    positive_count = tf.reduce_max((positive_count, 1))

    negative_count = positive_count * HARD_SAMPLE_MINING_RATIO
    negative_count = tf.reduce_max((negative_count, 1))
    negative_count = tf.reduce_min((negative_count, batch_size))

    positive_losses = loss * labels
    negative_losses = loss - positive_losses
    top_negative_losses, _ = tf.nn.top_k(negative_losses,
                                         k=tf.cast(negative_count, tf.int32))

    loss = (tf.reduce_sum(positive_losses / positive_count)
            + tf.reduce_sum(top_negative_losses / negative_count))
    return loss
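The slides do not show how _hard_negative_mining is fed; a hedged sketch, assuming the per-example sigmoid cross entropy (before the reduce_mean in _loss above) is what gets passed in:

# Hypothetical wiring of _hard_negative_mining (an assumption, not shown in
# the talk): use the per-example cross entropy instead of the batch mean.
def _loss_with_hard_negative_mining(logits, labels, batch_size):
    cross_entropy = tf.nn.sigmoid_cross_entropy_with_logits(
        labels=labels, logits=logits)
    per_example_loss = tf.reshape(cross_entropy, [batch_size])
    flat_labels = tf.reshape(labels, [batch_size])
    return _hard_negative_mining(per_example_loss, flat_labels, batch_size)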

Slide 46

Slide 46 text

Training environment (Sakura's "Koukaryoku" high-power computing): CPU: Xeon, …-core × …; memory: … GB; SSD: … GB; GeForce GTX TITAN X (Pascal architecture), … GB × …; GeForce GTX …Ti (Pascal architecture), … GB × ….

Slide 47

Slide 47 text

Training settings: loss function: cross entropy; optimizer: Adam; learning rate: …; batch size: ….

Slide 48

Slide 48 text

Slide 49

Slide 49 text

Slide 50

Slide 50 text

Running inference on the existing dataset (diagram: Downloader → dataset → Tag assignment; inference with the trained model → training dataset → trainer).
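A minimal sketch of what running inference over existing images could look like with the model functions from the earlier slides; the checkpoint path and the load_image() helper are assumptions, not from the talk.

# Hypothetical inference script; builds on prepare_layers/output_layers from
# the earlier slides. Checkpoint path and load_image() are assumptions.
import numpy as np
import tensorflow as tf

image_ph = tf.placeholder(tf.float32, [1, IMAGE_SIZE, IMAGE_SIZE, CHANNELS])
features = prepare_layers(image_ph, training=False)
logits = output_layers(features, batch_size=1, training=False)
nsfw_prob = tf.sigmoid(logits)  # single sigmoid output in (0, 1)

saver = tf.train.Saver()
with tf.Session() as sess:
    saver.restore(sess, '/path/to/model.ckpt')  # hypothetical checkpoint path
    image = load_image('example.png')  # hypothetical helper: 256x256x3 float array
    prob = sess.run(nsfw_prob, feed_dict={image_ph: image[np.newaxis]})
    print('NSFW probability: %.3f' % prob[0][0])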

Slide 51

Slide 51 text

Inference results: NSFW vs. general images. NSFW: 8.6%.

Slide 52

Slide 52 text

Training dataset (NSFW). Positive examples: … → …; negative examples: … → ….

Slide 53

Slide 53 text

Computation time needed for training

Slide 54

Slide 54 text

Slide 55

Slide 55 text

Demo: comparing GPU and CPU

Slide 56

Slide 56 text

CPU vs. GPU speed comparison (batch size …): TITAN X: … sec/step; Xeon, …-core: … sec/step. For training this model, the TITAN X is …× faster!

Slide 57

Slide 57 text

CPU vs. GPU speed comparison (batch size …): TITAN X + GTX …Ti: … sec/step; Xeon, …-core: … sec/step. For training this model, the GPU ×… setup is …× faster!

Slide 58

Slide 58 text

Future work

Slide 59

Slide 59 text

Improving the reliability of the dataset server

Slide 60

Slide 60 text

Securing compute resources for inference (diagram: Downloader → dataset → Tag assignment; inference with the trained model → training dataset → trainer).

Slide 61

Slide 61 text

Labels (tags)

TAGS = [
    'original_art', 'nsfw', 'like', 'photo', 'illust',
    'comic', 'face', 'girl', 'megane', 'school_uniform',
    'blazer_uniform', 'sailor_uniform', 'gl', 'kemono', 'boy',
    'bl', 'cat', 'dog', 'food', 'dislike',
]
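The slides do not show how a set of tags becomes a training target; one natural encoding, assuming one output per tag (20 outputs, matching the model on a later slide), is a multi-hot vector over this list:

# Hypothetical tag-to-label conversion for a multi-label model with one
# output per tag (20 tags in the TAGS list above).
import numpy as np

def tags_to_label_vector(tags):
    label = np.zeros(len(TAGS), dtype=np.float32)
    for tag in tags:
        label[TAGS.index(tag)] = 1.0
    return label

# Example: an illustration of a megane girl
print(tags_to_label_vector(['illust', 'girl', 'megane']))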

Slide 62

Slide 62 text

Movidius

Slide 63

Slide 63 text

Migrating inference to Movidius (diagram: Downloader → dataset → Tag assignment; trainer → trained model; training dataset; inference runs on Movidius).

Slide 64

Slide 64 text

Class (tag) classification model (diagram): the same layer stack as the binary model, but ending in output 20:
conv 3x3x64 stride 1 → ReLU → conv 3x3x64 stride 1 → ReLU → bn → max_pool 2x2 stride 2
conv 3x3x128 stride 1 → ReLU → conv 3x3x128 stride 1 → ReLU → bn → max_pool 2x2 stride 2
conv 3x3x256 stride 1 → ReLU → conv 3x3x256 stride 1 → ReLU → bn → max_pool 2x2 stride 2
conv 3x3x64 stride 1 → ReLU → fc 768 → ReLU → output 20 → Sigmoid
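The only structural change this slide implies relative to the binary NSFW model is the width of the output layer; a minimal sketch, assuming everything else stays as in the earlier output_layers code:

# Hypothetical multi-label variant of output_layers(): 20 sigmoid outputs,
# one per tag, instead of a single NSFW logit.
import tensorflow as tf

NUM_TAGS = 20

def tag_output_layers(prev, batch_size, training=False):
    flatten = tf.reshape(prev, [batch_size, -1])
    fc1 = tf.layers.dense(flatten, 768, activation=tf.nn.relu,
                          trainable=training, name='fc1')
    logits = tf.layers.dense(fc1, NUM_TAGS, activation=None,
                             trainable=training, name='output')
    return tf.sigmoid(logits)  # independent probability per tag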

Slide 65

Slide 65 text

C-LIS CO., LTD. This material is the copyrighted work of C-LIS CO., LTD. Reproduction of this material, in whole or in part, without written permission from the author is prohibited. The Android Studio icon is reproduced or modified from work created and shared by Google and used according to terms described in the Creative Commons Attribution License. Product names, brand names, and company names are generally trademarks or registered trademarks of their respective companies; the ©, ®, and ™ marks are omitted in this material. The Android robot is reproduced or modified from work created and shared by Google and used according to terms described in the Creative Commons Attribution License. https://speakerdeck.com/keiji/20171209-sakura-ml-night