Museum of Machine Arts

Beyond Face Rotation

MOMA (Museum of Machine Arts) - Sourabh Bajaj

Hello • I’m Sourabh • I’m a software engineer • I tweet @sb2nov • I eat gummy bears

Generative Neural Networks
The network learns to map input to output by seeing many examples.
• Input: Random Noise, Topic, Text, Image, Video, Music, Audio, 3D, Actions, Other
• Network: Generation / Translation Network
• Output: Text, Image, Video, Music, Audio, 3D, Actions, Other

Object Detection and Recognition (ImageNet)
googleresearch.blogspot.com/2014/09/building-deeper-understanding-of-images.html (Szegedy et al., GoogLeNet)
Live: • VGG • YOLO • YOLO v2 • LeCun
Concurrence, Localization, Occlusion, Out of context, Counting, Tracking

Automatic Colorization
• Larsson et al., people.cs.uchicago.edu/~larsson/colorization
• Iizuka et al., hi.cs.waseda.ac.jp/~iizuka/projects/colorization/en
• Web interface: demos.algorithmia.com/colorize-photos (richzhang.github.io/colorization)
Figure panels: Ground Truth, Input, Output

Denoising, Super-Resolution and Inpainting
• Denoising auto-encoders (e.g. Chollet)
• Recursive convolutional net for super-resolution, Kim et al.
  • Code: github.com/alexjc/neural-enhance
  • Deepsense.ai blogpost
  • letsenhance.io
• Context encoders for inpainting, Pathak et al.
  • Globally and Locally Consistent Image Completion
  • Adobe DeepFill
  • Contextual Attention - demo

Inceptionism: Deep Dreams
Mordvintsev et al., googleresearch.blogspot.com/2015/06/inceptionism-going-deeper-into-neural.html (GoogLeNet)
Pareidolia: Perceiving a familiar pattern where none exists.
Johnny 5

Deep Dreams
• Code: github.com/google/deepdream (see also: bat-county)
• Online TensorFlow example
• Reddit: reddit.com/r/deepdream
• Web interface:
  • dreamdeeply.com
  • deepdreamgenerator.com
  • psychic-vr-lab.com/deepdream
  • deepdream.akkez.ru
• Captions: www.cs.toronto.edu/~rkiros/inceptionism_captions.html
• Video:
  • Fear and Loathing in Las Vegas
  • Forest Trail

Artistic Style Transfer
Gatys et al., arxiv.org/abs/1508.06576 (VGG)
• Features from object recognition! (*)
• Web/Mobile interfaces:
  • dreamscopeapp.com, app, paintbrush (app, video)
  • ostagram.ru
  • deepart.io (app, video, fixed colors)
  • photopaint.us
  • instapainting.com/assets
  • aristo (app, video)
  • prisma-ai.com (app, video)
  • lucid (app, video)
  • deepdreamgenerator.com
  • deeparteffects.com (app)
• Kogan (Alice video, Cubist Mirror)
• 2001: Odyssey
• Video: 1 2 3

Style Transfer
• Li and Wand, (code) arxiv.org/abs/1601.04589
• Luan et al., (code) arxiv.org/abs/1703.07511

Neural Doodle: Semantic Style Transfer
Champandard, arxiv.org/abs/1603.01768
• Online demo: dmitryulyanov.github.io/feed-forward-neural-doodle
• Code: github.com/alexjc/neural-doodle
• Blog: nucl.ai/blog/neural-doodles
• See also: Wentz, github.com/awentzonline/image-analogies

Image to Image Translation With Conditional Adversarial Networks (PatchGAN)
Isola et al., phillipi.github.io/pix2pix
• Interactive: affinelayer.com/pixsrv
• Guide: ml4a.github.io/guides/Pix2Pix
• fotogenerator.npocloud.nl

Unsupervised Image to Image Translation (DiscoGAN/CycleGAN/DualGAN)
• Kim et al., arxiv.org/abs/1703.05192
• Zhu et al., junyanz.github.io/CycleGAN
• Yi et al., arxiv.org/abs/1704.02510
Face-off

Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks (StackGAN)
• Zhang et al., arxiv.org/abs/1612.03242
  • Code: github.com/hanzhanggit/StackGAN
• Cha et al., arxiv.org/abs/1708.09321
• StackGAN++: Zhang et al., arxiv.org/abs/1710.10916
  • Code: github.com/hanzhanggit/StackGAN-v2

AttnGAN Xu et al., arxiv.org/abs/1711.10485 [supp]

AttnGAN – real results (github.com/taoxugit/AttnGAN)
• The sun is setting into the sea
• A man riding a Horse
• The cat is sitting on the couch

Video Prediction and Generation
• Deep multi-scale video prediction beyond mean square error (Mathieu et al.)
• Generating Videos with Scene Dynamics (Vondrick et al.)
• Learning to Generate Long-term Future via Hierarchical Prediction (Villegas et al.)
• Attentive Semantic Video Generation using Captions (Marwah et al.)
• Video Generation from Text (Li et al.)
• Visual to Sound: Generating Natural Sound for Videos in the Wild (Zhou et al.)
• Imagine This! Scripts to Compositions to Videos (Gupta et al.)

Word Embeddings (Word2Vec)
Mikolov et al., arxiv.org/abs/1301.3781
Semantically: King + (Woman – Man) = Queen
Algebraically: King - Man + Woman = Queen
Figure: King, Queen, Man, Woman plotted on X and Y axes
Web demos:
• rare-technologies.com/word2vec-tutorial
• bionlp-www.utu.fi/wv_demo
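
The King - Man + Woman = Queen arithmetic can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the two-dimensional vectors in `TOY_VECTORS` are hand-picked, hypothetical values (not output from a trained Word2Vec model), and the helper names `cosine` and `analogy` are made up for this sketch. Real embeddings are learned from text and typically have hundreds of dimensions.

```python
import math

# Hypothetical toy embeddings (NOT real Word2Vec output).
# Axis 0 loosely stands for "royalty", axis 1 for "maleness".
TOY_VECTORS = {
    "king": (0.9, 0.9),
    "queen": (0.9, 0.1),
    "man": (0.1, 0.9),
    "woman": (0.1, 0.1),
    "prince": (0.7, 0.8),
    "girl": (0.05, 0.2),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(a, b, c, vectors=TOY_VECTORS):
    """Return the word whose vector is closest to vec(a) - vec(b) + vec(c).

    The three query words are excluded from the candidates, as is
    common practice when evaluating word analogies.
    """
    target = tuple(
        x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])
    )
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(target, vectors[w]))


# King - Man + Woman
print(analogy("king", "man", "woman"))  # prints "queen" for these toy vectors
```

With a trained model the same query is usually a nearest-neighbour search over the whole vocabulary, and the answer is only approximately equal to the target vector rather than an exact match as in this toy setup.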

Word Embeddings (Word2Vec)

Sentence Embeddings (sequence-to-sequence encoder-decoder LSTM)
Sutskever et al., arxiv.org/abs/1409.3215

Continuous Sentence Representation with Variational Autoencoders
• Bowman et al., Generating Sentences from a Continuous Space (1511.06349)
• Semeniuta et al., A Hybrid Convolutional Variational Autoencoder for Text Generation (1702.02390)
• Web demo: robinsloan.com/voyages-in-sentence-space

Thanks
