Mehdi
January 26, 2018

Slides, thesis dissertation defense, deep generative neural networks for novelty generation

In recent years, significant advances in deep neural networks have enabled the creation of groundbreaking technologies such as self-driving cars and voice-enabled personal assistants. Almost all successes of deep neural networks concern prediction, whereas the initial breakthroughs came from generative models. Today, although we have very powerful deep generative modeling techniques, these techniques are essentially used for prediction or for generating known objects (i.e., good-quality images of known classes): any generated object that is a priori unknown is considered a failure mode (Salimans et al., 2016) or spurious (Bengio et al., 2013b). In other words, when prediction seems to be the only possible objective, novelty is seen as an error that researchers have been trying hard to eliminate. This thesis defends the point of view that, instead of trying to eliminate these novelties, we should study them and the generative potential of deep nets to create useful novelty, especially given the economic and societal importance of creating new objects in contemporary societies. The thesis sets out to study novelty generation in relation to the data-driven knowledge models produced by deep generative neural networks. Our first key contribution is the clarification of the importance of representations and their impact on the kinds of novelty that can be generated: a key consequence is that a creative agent might need to re-represent known objects to access various kinds of novelty. We then demonstrate that traditional objective functions of statistical learning theory, such as maximum likelihood, are not necessarily the best theoretical framework for studying novelty generation, and we propose several alternatives at the conceptual level. A second key result is the confirmation that current models, trained with traditional objective functions, can indeed generate unknown objects, showing that even though objectives like maximum likelihood are designed to eliminate novelty, practical implementations do generate it. Through a series of experiments, we study the behavior of these models and the novelty they generate. In particular, we propose a new task setup and metrics for selecting good generative models. Finally, the thesis concludes with a series of experiments clarifying the characteristics of models that can exhibit novelty: they show that sparsity, the noise level, and restricting the capacity of the net eliminate novelty, and that models that are better at recognizing novelty are also better at generating it.

Transcript

1. Deep generative neural networks for novelty generation: a foundational framework, metrics and experiments
   Mehdi Cherti, LAL/CNRS, Université Paris Saclay
   Supervised by: Balázs Kégl (LAL/CNRS, Université Paris Saclay) and Akın Kazakçı (Mines ParisTech)

2. Design theory
   • Early work: (Simon, 1969, 1973), design as "problem solving" (i.e., moving from an initial state to a desired state)
   • C-K theory: (Hatchuel et al., 2003), design as joint expansion of knowledge and concepts
   • Various formalisms of knowledge: set theory (Hatchuel et al., 2007), graphs (Kazakci et al., 2010), matroids (Le Masson et al., 2017)

3. Design theory
   • Through C-K, it acknowledges that knowledge is central
   • But it lacks computer-based experimental tools

4. Computational creativity
   • Enables experimentation, but the end goal is the object itself rather than studying the generative process
   • Fitness function barrier
   • No representation learning
   • Generation and evaluation are disconnected

5. ...but these powerful models are used to regenerate objects that we can easily relate to known objects…

6. • Although trained to generate what we know, some models can generate unrecognizable objects
   • However, these models and samples are considered spurious (Bengio et al., 2013) or a failure (Salimans et al., 2016)

7. • Goal of the thesis: study the generative potential of deep generative networks (DGNs) for novelty generation
   • Research questions:
     • What novelty can be generated by a DGN?
     • How to evaluate the generative potential of a DGN?
     • What are the general characteristics of DGNs that can generate novelty?
   • Method: we use computer-based simulations with deep generative models because
     • they offer a rich and powerful set of existing techniques
     • they can learn representations of objects
     • their generative potential has not been studied systematically

8. Outline
   1. Introduction
   2. The impact of representations on novelty generation
   3. Results
      3.a. Studying the generative potential of a deep net
      3.b. Evaluating the generative potential of deep nets
      3.c. Characteristics of models that can generate novelty
   4. Conclusion and perspectives

9. 2. The impact of representations on novelty generation
   In the design literature, it has been acknowledged that objects can be represented in multiple ways (Reich, 1995). What effect do representations have on novelty generation?

10. 2. The impact of representations on novelty generation
   • Suppose we have a dataset of 16 letters

11. 2. The impact of representations on novelty generation
   • Suppose we represent images in pixel space
   • We generate pixels uniformly at random (sketched below)
   • Everything is new, but there is no structure

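As a minimal illustration of this point (not from the deck; the batch size and the 16×16 image size are assumed placeholders), uniform pixel generation is a few lines of Python:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Draw every pixel independently and uniformly in [0, 1]: each sample
# is "new" (never seen before), but none of them carries any structure.
images = rng.uniform(0.0, 1.0, size=(10, 16, 16))  # 10 random 16x16 images
```
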
12. 2. The impact of representations on novelty generation
   • Suppose we re-represent each letter using strokes
   • For instance: [stroke decomposition example shown on the slide]

13. 2. The impact of representations on novelty generation
   Pixel space vs. stroke space: representations change what you can generate.

14. 2. The impact of representations on novelty generation
   • How do we choose a "useful" representation for novelty generation?
   • Machine learning, and deep generative models in particular, provides ways to learn representations from data
   Q: Can we use those learned representations to generate novelty even if these models are not designed to do so?

15. 2. The impact of representations on novelty generation
   Summary:
   • Noise vs. novelty
   • Likelihood
   • Compression of representations

16. Research questions:
   • What novelty can be generated by deep generative nets (DGNs)?
   • How to evaluate the generative potential of a DGN?
   • What are the general characteristics of DGNs that can generate novelty?

17. 3.a. Studying the generative potential of a deep net
   • We observed that some models could generate novelty although they were not designed to do so
   • Thus, deep generative models have an unused generative potential
   • Can we demonstrate this more systematically?

18. 3.a. Studying the generative potential of a deep net (Kazakci, Cherti, Kégl, 2016)
   [Diagram: train data → learn → generative model → generate → ??]

19. 3.a. Studying the generative potential of a deep net
   We use a convolutional sparse autoencoder as the model; the training objective is to minimize the reconstruction error, under a sparsity constraint.

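The deck does not spell out the architecture, so the following PyTorch sketch is only one plausible reading: a small convolutional autoencoder whose code is made sparse by keeping the top-k feature activations at each spatial location, trained to minimize the reconstruction error. The layer sizes, the value of k, and the exact sparsity mechanism are assumptions.

```python
import torch
import torch.nn as nn

class SparseConvAE(nn.Module):
    """Convolutional autoencoder with a k-sparse code (illustrative sizes)."""

    def __init__(self, channels=32, keep=8):
        super().__init__()
        self.keep = keep
        self.enc = nn.Sequential(
            nn.Conv2d(1, channels, 5, padding=2), nn.ReLU(),
            nn.Conv2d(channels, channels, 5, padding=2),
        )
        self.dec = nn.Sequential(
            nn.Conv2d(channels, 1, 5, padding=2), nn.Sigmoid(),
        )

    def sparsify(self, h):
        # One possible sparsity constraint: at each spatial location,
        # keep the `keep` largest feature activations, zero out the rest.
        top, _ = h.topk(self.keep, dim=1)
        return h * (h >= top[:, -1:, :, :]).float()

    def forward(self, x):
        return self.dec(self.sparsify(self.enc(x)))

model = SparseConvAE()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

x = torch.rand(64, 1, 25, 25)               # toy batch of 25x25 (dim-625) images
loss = nn.functional.mse_loss(model(x), x)  # objective: reconstruction error
optimizer.zero_grad()
loss.backward()
optimizer.step()
```
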
20. 3.a. Studying the generative potential of a deep net
   [Diagram: input (dim 625) → encode → bottleneck → decode → reconstruction]
   Deep autoencoder with a bottleneck, from Hinton, G. E., & Salakhutdinov, R. R. (2006).

21. 3.a. Studying the generative potential of a deep net
   • We use an iterative method to generate new images
   • Start with a random image
   • Force the network to construct (i.e., interpret) it by applying f(x) = dec(enc(x))
   • Iterate until convergence (see the sketch below)

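In code, the procedure could look like the sketch below, reusing the autoencoder sketch above; the tolerance and the iteration cap are assumptions, since the slide only says "until convergence":

```python
import torch

def iterative_generate(model, shape=(1, 1, 25, 25), tol=1e-5, max_iters=100):
    """Start from a random image and repeatedly apply f(x) = dec(enc(x))."""
    x = torch.rand(shape)  # random starting image
    with torch.no_grad():
        for _ in range(max_iters):
            y = model(x)                   # f(x) = dec(enc(x))
            if (y - x).abs().max() < tol:  # stop near a fixed point of f
                break
            x = y
    return x
```
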
22. 3.a. Studying the generative potential of a deep net
   Our interpretation of the results:
   • Known: training digits
   • Representable: "combinations of strokes"

23. 3.a. Studying the generative potential of a deep net
   • Known: training digits
   • Representable: all digits that the model can generate
   • Valuable: all recognizable digits

24. 3.a. Studying the generative potential of a deep net
   • Known: training digits
   • Representable: "combinations of strokes"
   • Valuable: human selection

25. 3.b. Evaluating the generative potential of deep nets
   • We have one example of a deep generative model that can indeed generate novelty
   • Can we go further by automatically finding models that can generate novelty?

26. 3.b. Evaluating the generative potential of deep nets
   We designed a new setup and a set of metrics to find models that are capable of generating novelty.

27. 3.b. Evaluating the generative potential of deep nets
   Idea: simulate the unknown by
   • training on known classes
   • testing on classes known to the experimenter but unknown to the model
   Proposed setup: train on digits and test on letters, where letters are used as a proxy for evaluating the capacity of models to generate novelty.

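The deck does not name the datasets, but one way to realize this setup today (an assumption on our part) is MNIST for the digits and the "letters" split of EMNIST for the letters:

```python
from torchvision import datasets, transforms

to_tensor = transforms.ToTensor()

# Known classes: the generative models are trained on digits only.
digits = datasets.MNIST("data", train=True, download=True,
                        transform=to_tensor)

# Simulated unknown: letters are known to the experimenter but are never
# shown to the generative models; they only serve to evaluate the samples.
letters = datasets.EMNIST("data", split="letters", train=True,
                          download=True, transform=to_tensor)
```
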
28. 3.b. Evaluating the generative potential of deep nets
   [Diagram: learn → generative model → generated samples]
   Q: How many of those are letters?

29. 3.b. Evaluating the generative potential of deep nets
   To count letters, we learn a discriminator with 36 classes = 10 for digits + 26 for letters.
   [Diagram: learn → discriminator]

30. 3.b. Evaluating the generative potential of deep nets
   We then use the discriminator to score the models:
   [Diagram: discriminator → predict → number of letters]

31. 3.b. Evaluating the generative potential of deep nets
   The "number of letters" score is a proxy for finding models that generate images that are:
   • non-trivial
   • not recognizable as digits

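A sketch of the scoring step, assuming the generated samples come as a tensor and `clf` is the trained 36-way discriminator, with outputs 0-9 for the digit classes and 10-35 for the letter classes (this class layout is our assumption):

```python
import torch

def letter_count(clf, samples):
    """Count how many generated samples the discriminator labels as letters.

    Assumes classes 0-9 are digits and classes 10-35 are letters.
    """
    with torch.no_grad():
        predictions = clf(samples).argmax(dim=1)
    return int((predictions >= 10).sum())
```

A symmetric digit-count score (`predictions < 10`) gives the memorization-prone selection criterion discussed a few slides later.
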
32. 3.b. Evaluating the generative potential of deep nets
   • We run a large-scale experiment in which we train ~1000 models (autoencoders, GANs), varying their hyperparameters
   • From each model we generate 1000 images, then evaluate the model using our proposed metrics (the selection loop is sketched below)
   • Question we tried to answer: can we find models that can generate novelty?

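The selection loop might look like the following sketch; `hyperparameter_grid`, `train_model`, and `sample` are hypothetical stand-ins for the experiment pipeline, which the deck does not detail:

```python
def rank_models(hyperparameter_grid, train_model, sample, clf):
    """Train one generative model per configuration, generate 1000 images
    from each, and rank the models by their letter-count score.

    `train_model(hp)` and `sample(model, n)` are hypothetical helpers.
    """
    results = []
    for hp in hyperparameter_grid:           # ~1000 configurations
        model = train_model(hp)              # autoencoders, GANs, ...
        samples = sample(model, n=1000)      # 1000 images per model
        results.append((letter_count(clf, samples), hp))
    # High letter counts flag candidate novelty generators (see next slide).
    return sorted(results, key=lambda r: r[0], reverse=True)
```
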
33. 3.b. Evaluating the generative potential of deep nets
   • Selecting models by letter count leads to models that can generate novelty
   • Selecting models by digit count leads to models that memorize the training classes

34. 3.b. Evaluating the generative potential of deep nets
   • Known: training digits
   • Representable: "combinations of strokes"
   • Valuable: letters

35. 3.b. Evaluating the generative potential of deep nets
   We have shown that we can automatically find models that can generate novelty, as well as other models that cannot.

36. 3.b. Evaluating the generative potential of deep nets
   • Can we characterize the difference between models that can generate novelty and models that cannot?
   • We study a particular model architecture through a series of experiments

37. 3.c. Characteristics of models that can generate novelty
   • We study the effect of different ways of restricting the capacity of the representation on the same architecture
   • We find that restricting the capacity of the representation hurts the models' ability to generate novelty: more capacity, more novelty

38. Conclusion
   Main contributions:
   • Importance of representation for novelty generation
   • Current models can generate novelty even though they were not designed for that
   • We propose a new setup and a set of metrics to assess the capacity of models to generate novelty
   • We show that constraining the capacity of the representation can be harmful for novelty generation

39. Perspectives: immediate next steps
   • Explain why existing models can generate novelty
   • Propose an explicit training criterion to learn a representation suitable for novelty generation
   • Propose alternative generative procedures to random sampling
   • Experiment on more complex datasets and domains

40. Perspectives: future
   • Agent evolving over time: dynamic knowledge and value function
   • Multi-agent system so that agents get/give feedback and cooperate