Slide 1

Slide 1 text

AI’s Impact on Art & Design
DevFest Surrey
Margaret Maynard-Reid, 10/26/2024

Slide 2

Slide 2 text

About me
AI/ML GDE (Google Developer Expert), 3D artist, fashion designer, instructor of MSIS at UW Foster, ex Bing, MSR & MS Design Studio
margaretmz.art

Slide 3

Slide 3 text

Content
● Intro to GenAI
● GenAI use cases
● AI’s impact on art & design
● Future trends

Slide 4

Slide 4 text

Intro to GenAI

Slide 5

Slide 5 text

What are Generative Models?
“Generative models: take a machine, observe many samples from a distribution and generate more samples from that same distribution.”
- Ian Goodfellow, 2016

Slide 6

Slide 6 text

Types of Generative Models
● 2014 Generative Adversarial Networks (GANs)
● 2016 Autoregressive models
● 2019 Variational autoencoders (VAEs)
● Flow-based models
● 2020 Diffusion models
● 2022 Diffusion Transformer
Source: Lilian Weng blog (link)

Slide 7

Slide 7 text

Generative Adversarial Networks (GANs)
A GAN has at least two networks, typically a generator and a discriminator, which compete against each other ...
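A minimal PyTorch sketch of this adversarial setup (not from the talk; network sizes and hyperparameters are illustrative): the generator maps random noise to images, the discriminator learns to separate real images from generated ones, and each network is updated against the other.

```python
# Minimal GAN training loop sketch (illustrative sizes; real_images is a [batch, 784] tensor).
import torch
import torch.nn as nn

latent_dim, image_dim = 64, 784  # e.g. flattened 28x28 images

generator = nn.Sequential(
    nn.Linear(latent_dim, 256), nn.ReLU(),
    nn.Linear(256, image_dim), nn.Tanh(),
)
discriminator = nn.Sequential(
    nn.Linear(image_dim, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1),  # real/fake logit
)

opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(real_images):
    batch = real_images.size(0)
    # 1) Train the discriminator to tell real images from generated ones.
    fake_images = generator(torch.randn(batch, latent_dim)).detach()
    d_loss = bce(discriminator(real_images), torch.ones(batch, 1)) + \
             bce(discriminator(fake_images), torch.zeros(batch, 1))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # 2) Train the generator to fool the discriminator.
    fake_images = generator(torch.randn(batch, latent_dim))
    g_loss = bce(discriminator(fake_images), torch.ones(batch, 1))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()
```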

Slide 8

Slide 8 text


Slide 9

Slide 9 text

Diffusion Models
1. Gradually add Gaussian noise to the training data (forward image diffusion).
2. Learn how to reverse the process to generate images from noise (generative reverse denoising).
Source: NVIDIA developer blog (link)
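A minimal PyTorch sketch of these two steps, assuming the standard DDPM formulation with a linear noise schedule (the schedule values and the `model(x_t, t)` noise-prediction interface are illustrative assumptions, not code from the talk):

```python
# Sketch of the DDPM forward (noising) step and the training objective.
import torch
import torch.nn.functional as F

T = 1000
betas = torch.linspace(1e-4, 0.02, T)           # linear noise schedule
alphas_bar = torch.cumprod(1.0 - betas, dim=0)  # cumulative signal retention

def add_noise(x0, t):
    """Forward diffusion: x_t = sqrt(a_bar_t) * x_0 + sqrt(1 - a_bar_t) * noise (x0 is NCHW)."""
    noise = torch.randn_like(x0)
    a_bar = alphas_bar[t].view(-1, 1, 1, 1)
    return a_bar.sqrt() * x0 + (1 - a_bar).sqrt() * noise, noise

def training_loss(model, x0):
    """The network learns to predict the added noise, which is what lets it reverse the process."""
    t = torch.randint(0, T, (x0.size(0),))
    x_t, noise = add_noise(x0, t)
    return F.mse_loss(model(x_t, t), noise)
```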

Slide 10

Slide 10 text

CLIP: Contrastive Language-Image Pre-training
CLIP is a bridge between NLP and computer vision, connecting text and images. It has a text encoder and an image encoder, trained on 400 million image-text pairs.
● DALL-E, DALL-E 2
● Stable Diffusion
● Imagen, Imagen 2, Imagen 3
Paper: Learning Transferable Visual Models From Natural Language Supervision
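As a quick illustration of the text encoder / image encoder pairing, the sketch below scores captions against an image with the publicly released CLIP checkpoint via the Hugging Face transformers library (the local file name and captions are placeholders; this is not code from the talk):

```python
# Requires: pip install transformers pillow torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("dress_moodboard.png")  # any local image
captions = ["a 1950s haute couture dress", "a sports car", "a bowl of fruit"]

inputs = processor(text=captions, images=image, return_tensors="pt", padding=True)
outputs = model(**inputs)                        # runs both the text and image encoders
probs = outputs.logits_per_image.softmax(dim=1)  # similarity of the image to each caption
print(dict(zip(captions, probs[0].tolist())))
```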

Slide 11

Slide 11 text

Diffusion Transformer
Paper: Scalable Diffusion Models with Transformers
SoTA models using the diffusion transformer:
● PixArt-α
● Sora
● Stable Diffusion 3

Slide 12

Slide 12 text

The U-Net
U-Net architecture (image source: U-Net paper)
2015 paper for medical image segmentation. Used in many popular GenAI models:
● Pix2Pix
● CycleGAN
● Diffusion models (DDPM)
● DALL-E
● Midjourney
● Stable Diffusion …
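A toy sketch of the U-Net idea (channel sizes are illustrative, not the configuration from the 2015 paper): features are downsampled to a bottleneck, upsampled again, and concatenated with the matching encoder features via skip connections so fine detail survives.

```python
# Tiny U-Net sketch: downsample, upsample, and concatenate skip connections.
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU())

class TinyUNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc1 = conv_block(3, 32)
        self.enc2 = conv_block(32, 64)
        self.pool = nn.MaxPool2d(2)
        self.up = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec1 = conv_block(64, 32)   # 64 channels = 32 upsampled + 32 from the skip
        self.out = nn.Conv2d(32, 3, 1)

    def forward(self, x):
        s1 = self.enc1(x)                          # encoder features kept for the skip
        bottleneck = self.enc2(self.pool(s1))
        up = self.up(bottleneck)
        return self.out(self.dec1(torch.cat([up, s1], dim=1)))  # skip connection

print(TinyUNet()(torch.randn(1, 3, 64, 64)).shape)  # torch.Size([1, 3, 64, 64])
```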

Slide 13

Slide 13 text

U-Net vs Diffusion Transformer
U-Net
- Not crucial to the good performance of diffusion models
- Struggles to capture long-range dependencies and global context in the input data
Diffusion Transformer
- More flexible
- Scales to more training data and larger model parameter counts
- Models long-range dependencies without deep networks or large filters, thanks to the self-attention mechanism (see the sketch below)
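A small illustration of that last point (shapes are illustrative, not from the talk): a single self-attention layer lets every patch token attend to every other token, so global context is available without stacking many convolutional layers.

```python
# One self-attention layer over a grid of image patch tokens.
import torch
import torch.nn as nn

patches = torch.randn(1, 256, 768)              # 256 patch tokens (e.g. a 16x16 grid), dim 768
attn = nn.MultiheadAttention(embed_dim=768, num_heads=12, batch_first=True)
out, weights = attn(patches, patches, patches)  # each token mixes information from all 256 tokens
print(out.shape, weights.shape)                 # (1, 256, 768) and (1, 256, 256)
```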

Slide 14

Slide 14 text

Generative AI timeline
Source: Sora paper

Slide 15

Slide 15 text

GenAI Use Cases In Art & Design

Slide 16

Slide 16 text

GenAI Use Cases & Tools in Art & Design
● Generate 2D images
● Generate 3D visuals
● Generate animations / videos
● Generate music
User friendly -> low code -> dev tools
ChatGPT + DALL-E 3

Slide 17

Slide 17 text

Imagen 3
● Text-to-image model
  ○ High quality
  ○ Photorealistic
  ○ Great text rendering
● Released end of August 2024
● Many options to try Imagen 3:
  ○ Try it on Vertex AI
  ○ Try it on the Gemini app
  ○ Try it on ImageFX
https://deepmind.google/technologies/imagen-3/
Imagen 3 paper: https://arxiv.org/abs/2408.07009

Slide 18

Slide 18 text

Why is Imagen 3 better?
● Aesthetics
  a. Improved photorealism compared to DALL-E and Stable Diffusion
  b. Outstanding generation of images with multiple people
● Fewer defects
  a. Better generation of detailed objects and features (e.g. hands)
  b. Higher prompt alignment
● Lower latency
  a. Imagen 3 Fast generates 4 images in less than 4 seconds, faster on average than competitors
● Text on images
  a. Improved generation of text on images

Slide 19

Slide 19 text

Imagen 3 on Gemini
● Go to the Gemini app
  ○ web browser: https://gemini.google.com/app
  ○ or mobile app
● Describe what you’d like in a prompt
● After the image is generated, ask Gemini to refine it

Slide 20

Slide 20 text

Refine image generation with Gemini
“A mood board with a few 1950's dresses with glamour and haute couture in bright colors”
“Add the text ‘Era of 1950s’ to the bottom of the image”
“Move the text to the top of the image”

Slide 21

Slide 21 text

Imagen 3 on ImageFX
● Sign up at labs.google
● Choose ImageFX
● Generate images
● User-friendly UI
  ○ Change camera
  ○ Change style, etc.
● Edit images

Slide 22

Slide 22 text

Edit - choose the image to edit

Slide 23

Slide 23 text

Edit - change earrings to blue color

Slide 24

Slide 24 text

Edit - earrings changed

Slide 25

Slide 25 text

Imagen 3 on Vertex AI Vision
● Image generation
● Image editing
● Can specify image size
● Fine-tuning

Slide 26

Slide 26 text

Vertex AI Image Generation
“Generate an image of a mood board with a few 1950's dresses with glamour and haute couture in bright colors of red, orange, yellow, pink and gold”

Slide 27

Slide 27 text

Edit - change foreground
Click on “Extract / People”
Prompt = “Change the 3 dresses to soft pastel colors”

Slide 28

Slide 28 text

Image Generation
Prompt = “A floral fabric pattern with beautiful cherry blossom”
My 3D fashion design

Slide 29

Slide 29 text

Vertex AI Image Editing: Box Mask

Slide 30

Slide 30 text

Demo: Diamond earrings generated!

Slide 31

Slide 31 text

Imagen 3 in Colab
1. Install the Vertex AI SDK for Python
2. Restart the runtime
3. Authenticate your notebook environment
4. Set Google Cloud project information and initialize the Vertex AI SDK
5. Load the image generation model: Imagen 3 or Imagen 3 Fast (sketched below)
…
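A condensed sketch of these steps as a Colab cell, using the Vertex AI SDK for Python; the project ID, region, prompt, and the exact Imagen 3 model name are assumptions to check against the current Vertex AI documentation:

```python
# Step 1 (in Colab): !pip install --upgrade google-cloud-aiplatform   then restart the runtime (step 2)
# Step 3 (in Colab): from google.colab import auth; auth.authenticate_user()

import vertexai
from vertexai.preview.vision_models import ImageGenerationModel

# Step 4: set project info and initialize the SDK (placeholder project/region).
vertexai.init(project="your-project-id", location="us-central1")

# Step 5: load Imagen 3 (model name may differ; Imagen 3 Fast has its own model ID).
model = ImageGenerationModel.from_pretrained("imagen-3.0-generate-001")

images = model.generate_images(
    prompt="A mood board with a few 1950s dresses in bright colors",
    number_of_images=4,
)
images[0].save(location="moodboard.png")
```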

Slide 32

Slide 32 text

NotebookLM
● Powered by Gemini 1.5
● Can take multiple inputs: PDF files, URLs, videos, etc.
● Suggests questions for chatting with the LLM
● Can generate a podcast as a two-person conversation
● A great way to learn a topic
  ○ Create study guides
  ○ Briefing docs
  ○ Podcasts

Slide 33

Slide 33 text

NotebookLM - demo
Study Charles Frederick Worth, the Father of Haute Couture (high fashion)
● Go to http://notebooklm.google/
● Create a new notebook
● Enter a URL to a website or YouTube video
● Use the suggested prompts
● Type your own prompt
● Click Generate under Notebook Guide / Audio Overview

Slide 34

Slide 34 text

VideoPoet
A large language model for zero-shot video generation
Multimodal:
● Text-to-video
● Image-to-video
● Stylization
● Outpainting
● Video-to-audio
Source: VideoPoet – Google Research

Slide 35

Slide 35 text

Lumiere
Project page: https://lumiere-video.github.io/
● Text-to-video
● Image-to-video
● Stylized generation
● Video stylization
● Cinemagraphs
● Inpainting
Read the paper: [2401.12945] Lumiere: A Space-Time Diffusion Model for Video Generation
Video source: Lumiere

Slide 36

Slide 36 text

Veo & VideoFX
● Announced at I/O 2024
● Generates video at
  ○ 1080p resolution
  ○ More than 60 seconds long
● https://deepmind.google/technologies/veo/
● Sign up for the preview to try Veo in VideoFX

Slide 37

Slide 37 text

AI’s Impact on Art & Design

Slide 38

Slide 38 text

How does AI negatively impact creatives?
● Job losses
● Copyright issues
● Privacy issues
● The AI ‘fatigue’

Slide 39

Slide 39 text

“Is generative AI taking our jobs?”

Slide 40

Slide 40 text

“AI won't take your job. It's somebody using AI that will take your job.”
- Economist Richard Baldwin, 2023 World Economic Forum's Growth Summit

Slide 41

Slide 41 text

AI’s Impact on Jobs
● Creative: inspirations
● Helper: assistant, co-pilot
● Partner: complementing, augmenting
● Competitor: displace and replace

Slide 42

Slide 42 text

Top 10 occupations by image-generator exposure score
Source: Occupational Heterogeneity in Exposure to Generative AI

Slide 43

Slide 43 text

Copyright issues

Slide 44

Slide 44 text

The AI Fatigue
10/16/2024 - Some Adobe MAX attendees are getting tired of the relentless focus on AI | Creative Bloq
● AI overload: Adobe MAX attendees are weary of the event’s dominant focus on AI, seeing it as overtaking traditional design content at a design-centered conference.
● Mixed reactions: Some welcome AI's efficiency; others worry it might overshadow creative originality and craftsmanship.
● Concerns: Creators fear AI could impact creative control, intellectual property, and job security.
● Potential benefits: Supporters suggest AI could enhance productivity by handling repetitive tasks, freeing up time for creative work.

Slide 45

Slide 45 text

Future Trends

Slide 46

Slide 46 text

Future of AI in art and design
● Multimodal, not limited to just text or images
  ○ Animation
  ○ Video
  ○ 3D objects
● Multi-agent
● Integration of GenAI models into applications
● Efficient, smaller, and on-device
● Easier customization by artists and designers, not just devs

Slide 47

Slide 47 text

Thank you and keep in touch!
Connect with me to learn more about AI, art & design! @margaretmz