AI's Impact on Art and Design - DevFest Surrey 2024

This is a talk presented at DevFest Surrey 2024.

Generative AI is rapidly transforming the landscape of art and design, challenging the traditional notion of creativity. The talk demonstrates how we can use the latest multimodal GenAI tools, such as Gemini, Imagen 3, Vertex AI and AI Studio, throughout our design process. We will discuss both the exciting new opportunities and the complex implications of GenAI's growing role in the creative fields. This talk is for anyone interested in the intersection of AI, art and design.

Margaret Maynard-Reid

October 26, 2024

Transcript

  1. About me
    • AI/ML GDE (Google Developer Expert)
    • 3D artist
    • Fashion designer
    • Instructor of MSIS, UW Foster
    • Ex Bing, MSR & MS Design Studio
    • margaretmz.art
  2. Content
    • Intro to GenAI
    • GenAI use cases
    • AI’s impact on art & design
    • Future trends
  3. @margaretmz #DevFestSurrey #BuildWithAI
    What are Generative Models?
    “Generative models: take a machine, observe many samples from a distribution and generate more samples from that same distribution”. - Ian Goodfellow, 2016
  4. Types of Generative Models
    • 2014 Generative Adversarial Networks (GANs)
    • 2016 Autoregressive models
    • 2019 Variational autoencoders (VAEs)
    • Flow-based models
    • 2020 Diffusion models
    • 2022 Diffusion Transformer
    Source: Lilian Weng blog (link)
  5. Generative Adversarial Networks (GANs)
    GANs have at least two network models which compete against each other ...
  6. Diffusion Models
    1. Gradually add Gaussian noise to training data.
    2. Learn how to reverse the process to generate images from noise.
    Figure labels: forward image diffusion; generative reverse denoise.
    Source: Nvidia developer blog (link)
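The forward (noising) half of the process above can be sketched in a few lines. This is a minimal illustration, not the method of any specific model in the talk: it assumes the linear noise schedule and closed-form sampling from the DDPM formulation, and the names `linear_beta_schedule`, `cumulative_alphas` and `add_noise`, the 1000-step schedule and the four-value "image" are all hypothetical.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Per-step noise variances, rising linearly (an assumed DDPM-style schedule)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """Running product of (1 - beta): the fraction of signal variance left at each step."""
    alpha_bars = []
    running = 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0, t, alpha_bars, rng):
    """Jump straight to step t: x_t = sqrt(a_bar_t) * x0 + sqrt(1 - a_bar_t) * noise."""
    signal = math.sqrt(alpha_bars[t])
    noise_scale = math.sqrt(1.0 - alpha_bars[t])
    return [signal * v + noise_scale * rng.gauss(0.0, 1.0) for v in x0]


rng = random.Random(0)
betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)

pixels = [0.5, -0.25, 1.0, 0.0]                        # toy stand-in for an image
slightly_noisy = add_noise(pixels, 10, alpha_bars, rng)
mostly_noise = add_noise(pixels, 999, alpha_bars, rng)
```

Step 2 on the slide, learning the reverse process, would train a network to predict the added noise from `x_t` and `t`; that part is omitted here.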
  7. CLIP: Contrastive Language-Image Pre-training
    CLIP is a bridge between NLP and computer vision, connecting text and images. It has a text encoder and an image encoder, trained with 400 million image-text pairs.
    • DALL-E, DALL-E 2
    • Stable Diffusion
    • Imagen, Imagen 2, Imagen 3
    Paper: Learning Transferable Visual Models From Natural Language Supervision
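To make the "contrastive" part concrete, here is a small sketch of the kind of symmetric loss CLIP-style training uses: matching image-text pairs should score higher than every mismatched pair in the batch. The embeddings below are hand-made toy vectors rather than encoder outputs, the function names are hypothetical, and the fixed temperature of 0.07 is an assumption for illustration.

```python
import math


def normalize(vector):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def similarity_matrix(image_embs, text_embs, temperature=0.07):
    """Entry [i][j] is the scaled cosine similarity of image i and text j."""
    images = [normalize(v) for v in image_embs]
    texts = [normalize(v) for v in text_embs]
    return [[sum(a * b for a, b in zip(img, txt)) / temperature for txt in texts]
            for img in images]


def cross_entropy_diagonal(logits):
    """Average cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for row_index, row in enumerate(logits):
        peak = max(row)  # subtract the max for numerical stability
        log_sum = peak + math.log(sum(math.exp(x - peak) for x in row))
        total += log_sum - row[row_index]
    return total / len(logits)


def contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric loss: image-to-text and text-to-image directions, averaged."""
    logits = similarity_matrix(image_embs, text_embs, temperature)
    transposed = [list(column) for column in zip(*logits)]
    return 0.5 * (cross_entropy_diagonal(logits) + cross_entropy_diagonal(transposed))


image_embs = [[1.0, 0.0], [0.0, 1.0]]
matching_texts = [[0.9, 0.1], [0.1, 0.9]]     # text i describes image i
swapped_texts = [[0.1, 0.9], [0.9, 0.1]]      # captions attached to the wrong images

aligned_loss = contrastive_loss(image_embs, matching_texts)
misaligned_loss = contrastive_loss(image_embs, swapped_texts)
```

Training pushes the two encoders toward the low-loss, aligned case, which is what lets text and images share one embedding space.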
  8. Diffusion Transformer
    Paper: Scalable Diffusion Models with Transformers
    SoTA models using diffusion transformer:
    • PixArt-α
    • Sora
    • Stable Diffusion 3
  9. The U-Net
    U-Net architecture (image source: U-Net paper)
    2015 paper for medical imaging segmentation.
    Used in many popular GenAI models:
    • Pix2Pix
    • CycleGAN
    • Diffusion Models (DDPM)
    • DALL-E
    • Midjourney
    • Stable Diffusion…
  10. U-Net vs Diffusion Transformer
    U-Net
    - Not crucial to the good performance of the diffusion model
    - Struggles to capture long-range dependencies and global context in the input data
    Diffusion Transformer
    - More flexible
    - Can use more training data and larger model parameters
    - Transformers can model long-range dependencies without the need for deep networks or large filters, because of the self-attention mechanism
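The last point can be illustrated with a bare-bones scaled dot-product self-attention. This is a simplified sketch, not the Diffusion Transformer itself: it assumes queries, keys and values are the raw tokens (no learned projections, no multiple heads), and the function names and toy tokens are hypothetical.

```python
import math


def softmax(row):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Every token scores every other token directly, regardless of distance."""
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    weights = [
        softmax([sum(q * k for q, k in zip(query, key)) / scale for key in tokens])
        for query in tokens
    ]
    outputs = [
        [sum(w * value[d] for w, value in zip(row, tokens)) for d in range(dim)]
        for row in weights
    ]
    return weights, outputs


# Four toy "patches": the first and the last are identical but far apart.
tokens = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0]]
weights, outputs = self_attention(tokens)
```

Here `weights[0][3]` equals `weights[0][0]`: the first patch attends to the distant last patch as strongly as to itself, in a single layer. A convolution would need a large filter or many stacked layers to connect the two.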
  11. GenAI Use Cases & Tools in Art & Design
    • Generate 2D images
    • Generate 3D visuals
    • Generate animations / videos
    • Generate music
    User friendly -> low code -> dev tools
    ChatGPT + DALL-E 3
  12. Imagen 3
    • Text-to-image model
      ◦ High quality
      ◦ Photorealistic
      ◦ Great text rendering
    • Released end of August 2024
    • Many options for you to try Imagen 3:
      ◦ Try it on Vertex AI
      ◦ Try it on the Gemini app
      ◦ Try it on ImageFX
    https://deepmind.google/technologies/imagen-3/
    Imagen 3 Paper: https://arxiv.org/abs/2408.07009
  13. Why is Imagen 3 better?
    • Aesthetics
      a. Improved photorealism compared to DALL-E and Stable Diffusion
      b. Outstanding generation of images with multiple people
    • Lower defects
      a. Better generation of detailed objects and features (e.g. hands)
      b. Higher prompt alignment
    • Lower latency
      a. Imagen 3 Fast generates images in less than 4 seconds (for 4 images) and faster on average. This is faster than competitors.
    • Text on images
      a. Improved generation of text on images
  14. Imagen 3 on Gemini
    • Go to the Gemini app
      ◦ Web browser: https://gemini.google.com/app
      ◦ Or the mobile app
    • Describe what you’d like in a prompt
    • After the image is generated, ask Gemini to refine it
  15. Refine image generation with Gemini
    • “A mood board with a few 1950's dresses with glamour and haute couture in bright colors”
    • “Add text ‘Era of 1950s’ to the bottom of the image”
    • “Move the text to top of the image”
  16. Imagen 3 on ImageFX
    • Sign up at labs.google
    • Choose ImageFX
    • Generate images
    • User friendly UI
      ◦ Change camera
      ◦ Change style etc.
    • Edit images
  17. Imagen 3 on Vertex AI Vision
    • Image generation
    • Image editing
    • Can specify image size
    • Fine tuning
  18. Vertex AI Image Generation
    “Generate an image of a mood board with a few 1950's dresses with glamour and haute couture in bright colors of red, orange, yellow, pink and gold”
  19. Edit - change foreground
    Click on “Extract / People”
    Prompt = “Change the 3 dresses to soft pastel colors”
  20. Image Generation
    Prompt = “A floral fabric pattern with beautiful cherry blossom”
    My 3D fashion design
  21. Imagen 3 in Colab
    1. Install Vertex AI SDK for Python
    2. Restart runtime
    3. Authenticate your notebook environment
    4. Set Google Cloud project information and initialize Vertex AI SDK
    5. Load image generation model: Imagen 3 or Imagen 3 Fast
    …
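The Colab steps above might look roughly like the sketch below. It is a hedged outline rather than the notebook shown in the talk: the project ID and region are hypothetical placeholders, the model identifiers are the ones published for Imagen 3 on Vertex AI around the time of the talk and may have changed, and the helper name `generate_images` is made up for this example. Steps 1 to 3 appear as comments because they are notebook commands, not plain Python.

```python
# Step 1 (notebook cell):  %pip install --upgrade google-cloud-aiplatform
# Step 2: restart the Colab runtime so the new package is picked up.
# Step 3 (notebook cell):  from google.colab import auth; auth.authenticate_user()

# Step 4: project information. Hypothetical placeholders, replace with your own.
PROJECT_ID = "your-project-id"
LOCATION = "us-central1"

# Step 5: model identifiers (assumed; check the current Vertex AI model list).
IMAGEN_MODELS = {
    "standard": "imagen-3.0-generate-001",
    "fast": "imagen-3.0-fast-generate-001",
}


def generate_images(prompt, variant="fast", count=1):
    """Initialize the SDK, load the chosen Imagen 3 model and request images."""
    # Imported here so the file still loads where the SDK is not installed.
    import vertexai
    from vertexai.preview.vision_models import ImageGenerationModel

    vertexai.init(project=PROJECT_ID, location=LOCATION)
    model = ImageGenerationModel.from_pretrained(IMAGEN_MODELS[variant])
    return model.generate_images(
        prompt=prompt,
        number_of_images=count,
        aspect_ratio="1:1",
    )
```

With credentials and a billing-enabled project in place, a call such as `generate_images("A floral fabric pattern with beautiful cherry blossom")` should return a response whose `images` attribute holds the generated results.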
  22. NotebookLM
    • Powered by Gemini 1.5
    • Can take multiple inputs: PDF files, URL links, video etc.
    • Suggests questions for chatting with the LLM
    • Can generate a podcast with a 2-person conversation
    • A great way to learn a topic
      ◦ Create study guides
      ◦ Brief docs
      ◦ Podcasts
  23. NotebookLM - demo
    Study Charles Frederick Worth - Father of Haute Couture (High Fashion)
    • Go to http://notebooklm.google/
    • Create a new notebook
    • Enter a URL to a website or YouTube video
    • Use suggested prompts
    • Type your own prompt
    • Click on Generate under Notebook Guide/Audio Overview
  24. VideoPoet
    A large language model for zero-shot video generation
    Multimodal:
    • Text to video
    • Image to video
    • Stylization
    • Outpainting
    • Video to audio
    Source: VideoPoet – Google Research
  25. Lumiere
    Project page: https://lumiere-video.github.io/
    • Text-to-Video
    • Image-to-Video
    • Stylized generation
    • Video stylization
    • Cinemagraphs
    • Inpainting
    Read the paper → [2401.12945] Lumiere: A Space-Time Diffusion Model for Video Generation
    Video source: Lumiere
  26. Veo & VideoFX
    • Announced at I/O 2024
    • Generate video of
      ◦ 1080p resolution
      ◦ More than 60 seconds
    • https://deepmind.google/technologies/veo/
    • Sign up for preview to try Veo in VideoFX
  27. How does AI negatively impact creatives?
    • Job losses
    • Copyright issues
    • Privacy issues
    • The AI ‘fatigue’
  28. “AI won't take your job. It's somebody using AI that will take your job”
    - Economist Richard Baldwin, 2023 World Economic Forum's Growth Summit
  29. AI’s Impact on Jobs
    • Creative: inspirations
    • Helper: assistant, co-pilot
    • Partner: complementing, augmenting
    • Competitor: displace and replace
  30. The AI Fatigue
    10/16/2024 - Some Adobe MAX attendees are getting tired of the relentless focus on AI | Creative Bloq
    • AI Overload: Adobe MAX attendees are weary of the event’s dominant focus on AI, seeing it as overtaking traditional design content at a design-centered conference.
    • Mixed Reactions: Some welcome AI's efficiency; others worry it might overshadow creative originality and craftsmanship.
    • Concerns: Creators fear AI could impact creative control, intellectual property, and job security.
    • Potential Benefits: Supporters suggest AI could enhance productivity by handling repetitive tasks, freeing up time for creative work.
  31. Future of AI in art and design
    • Multimodal, not limited to just text or images
      ◦ Animation
      ◦ Video
      ◦ 3D objects
    • Multi-agent
    • Integration of GenAI models into applications
    • Efficient, smaller and on-device
    • Easier customization by artists and designers, not just devs
  32. Thank you and keep in touch!
    Connect with me to learn more about AI, art & design! @margaretmz