Can Neural Networks Make Me a Better Parent?

Andrew Hao
October 04, 2019

When nighttime descends, our household becomes a battleground as we fight over sleep with our toddler (a total bummer!). How can building a TensorFlow-powered, cry-detecting baby monitor help me understand my little one?

If you are a beginner or just curious about machine learning, this talk is for you. Together, we’ll go on an ML discovery journey to train a TensorFlow model to recognize and quantify the cries of my little one. We’ll see how it works by walking through a keyword-spotting CNN described in a Google research paper, then see how it’s deployed on a Raspberry Pi. In the process, you’ll see how simple it is to train and deploy an audio recognition model!

Will the model keep its promise to deliver sleep training insights to this sleep-deprived parent? And what can training a model on human inputs teach us about building production models in the real world?

Given at PyGotham 2019

Transcript

  1. Can Neural Networks Make
    Me a Better Parent?
    A tale in three acts
    Andrew Hao
    @andrewhao

  2. Act I
    Night & Chaos

  3. put the little one down

  6. helplessness
    frustration
    anger
    despair

  7. I need help
    I need data

  8. sanity needed
    Photo by Joshua Sortino on Unsplash

  9. Act II
    Enter the Machine

  10. Parts List
    • Raspberry Pi
    • USB microphone
    • DHT22 temp/humidity sensor

  11. $ arecord --device=hw:1,0 --format S16_LE --rate 22050 -c1 -d 10 "${RECORDING_FILE}"
    $ sox -V3 ${RECORDING_FILE} -n stats 2>&1 | grep dB
    Pk lev dB -57.05
    RMS lev dB -67.10
    RMS Pk dB -64.94
    RMS Tr dB -68.26
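
The `sox ... stats` figures on this slide are levels in decibels relative to full scale. As a rough sketch of where such numbers come from, the snippet below computes a peak and an overall RMS level from signed 16-bit samples. The function name `levels_db` is hypothetical, dividing by 32768 is an assumption about the scaling, and sox's windowed "RMS Pk" and "RMS Tr" values are not reproduced here.

```python
import math

FULL_SCALE = 32768.0  # assumed reference: magnitude of the most negative S16_LE sample


def levels_db(samples):
    """Return (peak_db, rms_db) for signed 16-bit samples, relative to full scale."""
    if not samples:
        raise ValueError("need at least one sample")

    def to_db(value):
        # A level of exactly zero has no finite decibel value.
        return 20.0 * math.log10(value) if value > 0 else float("-inf")

    normalized = [s / FULL_SCALE for s in samples]
    peak = max(abs(x) for x in normalized)
    rms = math.sqrt(sum(x * x for x in normalized) / len(normalized))
    return to_db(peak), to_db(rms)


if __name__ == "__main__":
    peak_db, rms_db = levels_db([46, -30, 12, -5])
    print(f"Pk lev dB {peak_db:.2f}, RMS lev dB {rms_db:.2f}")
```

Under this scaling, a peak sample of about 46 out of 32768 works out to roughly -57 dB, which suggests the recording measured on this slide was of a very quiet room.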

  13. Deep learning &
    neural nets

  15. convolutional neural network
    https://www.fastcompany.com/3037882/how-flickrs-deep-learning-algorithms-see-whats-in-your-photos

  16. CNNs
    Karpathy, Andrej "Convolutional Neural Networks (CNNs / ConvNets)" (http://cs231n.github.io/convolutional-networks/)

  17. audio spectrogram = image
    [Three spectrograms, each with frequency and time axes: “yes”, “Sheila”, “wah, wah, wah”]
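
To make the "spectrogram = image" idea concrete, here is a minimal sketch that turns a 1-D audio signal into a 2-D grid of magnitudes (time frames by frequency bins). It uses a naive DFT from the standard library purely for illustration; the frame length, hop size, and the function name `spectrogram` are assumptions, and this is not the feature pipeline the TensorFlow training script uses.

```python
import cmath
import math


def spectrogram(samples, frame_len=64, hop=32):
    """Return a list of frames; each frame holds one magnitude per frequency bin."""
    # Hann window to reduce leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_len)]
        row = []
        # Naive DFT over the non-negative frequencies (O(n^2), fine for a demo).
        for k in range(frame_len // 2 + 1):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            row.append(abs(acc))
        frames.append(row)
    return frames


if __name__ == "__main__":
    rate, tone = 8000, 1000  # hypothetical sample rate and test tone, in Hz
    signal = [math.sin(2 * math.pi * tone * i / rate) for i in range(256)]
    grid = spectrogram(signal)
    loudest_bin = max(range(len(grid[0])), key=grid[0].__getitem__)
    print(len(grid), "frames x", len(grid[0]), "bins; loudest bin:", loudest_bin)
```

Each row of the result can be read as one column of pixels, which is why an image classifier such as a CNN can be applied to audio.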

  19. (fingerprint_input)
    v
    [Conv2D]<-(weights)
    v
    [BiasAdd]<-(bias)
    v
    [Relu]
    v
    [MaxPool]
    v
    [Conv2D]<-(weights)
    v
    [BiasAdd]<-(bias)
    v
    [Relu]
    v
    [MaxPool]
    v
    [MatMul]<-(weights)
    v
    [BiasAdd]<-(bias)
    v
    ‘cnn-trad-fpool3’
    architecture
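
One way to read the diagram is to follow how the tensor shape changes from layer to layer. The sketch below traces shapes through two Conv2D/BiasAdd/Relu/MaxPool blocks and the final MatMul. Every number in it is an assumption chosen for illustration (a 98 x 40 input, 64 filters per convolution, stride-1 convolutions with SAME padding, 2 x 2 pooling); none of these values are stated on the slide.

```python
import math


def same_conv(shape, filters):
    """Stride-1 convolution with SAME padding: height/width kept, channels replaced."""
    height, width, _ = shape
    return (height, width, filters)


def same_pool(shape, size=2):
    """Max pooling with SAME padding and stride == size: dimensions round up."""
    height, width, channels = shape
    return (math.ceil(height / size), math.ceil(width / size), channels)


def trace_shapes(time_frames, freq_bins, label_count, filters=64):
    """Return (list of (layer, shape), flattened size fed into the final MatMul)."""
    shape = (time_frames, freq_bins, 1)  # the spectrogram as a one-channel image
    trace = [("fingerprint_input", shape)]
    for block in (1, 2):
        shape = same_conv(shape, filters)
        trace.append((f"Conv2D+BiasAdd+Relu #{block}", shape))
        shape = same_pool(shape)
        trace.append((f"MaxPool #{block}", shape))
    flat = shape[0] * shape[1] * shape[2]
    trace.append(("MatMul+BiasAdd", (label_count,)))  # one score per label
    return trace, flat


if __name__ == "__main__":
    layers, flat = trace_shapes(time_frames=98, freq_bins=40, label_count=5)
    for name, shape in layers:
        print(f"{name:24s} {shape}")
    print("values flattened into the final MatMul:", flat)
```

The takeaway is that pooling shrinks the "image" while the convolutions add feature channels, and the last layer maps everything to one score per label.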

  20. The Secret Life of ML Models
    Get data → Clean data → Label data → Train model → Test model → Deploy model

  21. Get data
    • Record data continuously in a cronjob
    • Upload to the cloud

  22. Clean data
    • Align to a frame
    • Normalize volume
    # Concatenate
    sox crying/**/*.wav tmp/batch-crying.wav

  23. Clean data
    • Align to a frame
    • Normalize volume
    # Split
    ffmpeg -i tmp/batch-crying.wav -f segment -segment_time 10.1 -c

  24. Clean data
    • Align to a frame
    • Normalize volume
    # Trim to exactly 10000 ms, boost 45 dB, resample @ 22050 Hz
    sox $in data/crying/$out vol 45 dB trim 0 10 rate 22050
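
The "align to a frame" and "normalize volume" steps can be sketched in plain Python to show the arithmetic involved: a gain in decibels becomes a linear multiplier of 10^(dB/20), and every clip is forced to the same number of samples. The helper names `db_to_gain` and `fit_clip` are hypothetical, zero-padding short clips is an added assumption (the `sox trim` command only shortens), and resampling is left out.

```python
def db_to_gain(db):
    """Convert a gain in decibels to a linear amplitude multiplier."""
    return 10.0 ** (db / 20.0)


def fit_clip(samples, sample_rate=22050, seconds=10, gain_db=45.0):
    """Trim or zero-pad to sample_rate * seconds samples, then apply the gain."""
    target = sample_rate * seconds
    fitted = list(samples[:target]) + [0] * max(0, target - len(samples))
    gain = db_to_gain(gain_db)
    # Clamp to the signed 16-bit range so loud samples clip instead of wrapping.
    return [max(-32768, min(32767, int(round(s * gain)))) for s in fitted]


if __name__ == "__main__":
    quiet = [40, -35, 20] * 100000  # stand-in for a quiet room recording
    clip = fit_clip(quiet)
    print(len(clip), "samples, peak", max(abs(s) for s in clip))
```

A 45 dB boost multiplies amplitudes by roughly 178, which is why a recording with peaks around -57 dB needs clamping headroom after the boost.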


  25. Label data
    Using EchoML to classify data

  26. Label data
    • Bucket each audio sample in a folder by label
    crying  dog_barking  city_noise  whining
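
With one folder per label, the directory layout itself is the labeled dataset. As a small sketch of that idea, the hypothetical helper below walks a data directory and maps each sub-folder name to the `.wav` files inside it; the `./data` path in the usage example is an assumption.

```python
from pathlib import Path


def index_by_label(data_dir):
    """Map each sub-folder name (the label) to the sorted .wav file names in it."""
    index = {}
    for label_dir in sorted(Path(data_dir).iterdir()):
        if label_dir.is_dir():  # stray files next to the label folders are skipped
            index[label_dir.name] = sorted(p.name for p in label_dir.glob("*.wav"))
    return index


if __name__ == "__main__":
    data_dir = Path("./data")  # hypothetical location of the label folders
    if data_dir.is_dir():
        for label, files in index_by_label(data_dir).items():
            print(f"{label}: {len(files)} clips")
```

Counting clips per label this way is also a quick check for class imbalance before training.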

  27. Train model
    python app/train.py \
        --data_url= \
        --data_dir=./data \
        --wanted_words=room_empty,whining,crying \
        --sample_rate=22050 \
        --clip_duration_ms=10000 \
        --how_many_training_steps=400,50 \
        --train_dir=./training

  28. Train model

  29. Train model

  30. Test model

  31. Deploy model
    Freeze the model
    $ python app/freeze.py \
        --start_checkpoint=./training/conv.ckpt-450 \
        --output_file=./graph.pb \
        --clip_duration_ms=10000 \
        --sample_rate=22050 \
        --wanted_words=white_noise,room_empty,crying \
        --data_dir=./data
    $ cp training/conv_labels.txt .

  32. Deploy model
    Execute the model in prod
    $ arecord --format S16_LE --rate 22050 -c1 -d 10 $wav
    $ python app/label_wav.py \
        --graph=./graph.pb \
        --labels=./conv_labels.txt \
        --wav=$wav
    > crying (score = 0.95350)
    room_empty (score = 0.01888)
    _silence_ (score = 0.01746)
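
To turn the score lines on this slide into a yes/no signal for a monitor, the text output can be parsed and thresholded. The sketch below assumes the `label (score = 0.12345)` line format shown on this slide; the helper names `parse_scores` and `is_crying`, the tolerated leading `>`, and the 0.8 threshold are all hypothetical choices.

```python
import re

# Matches lines such as "crying (score = 0.95350)", with an optional leading ">".
SCORE_LINE = re.compile(r"^\s*>?\s*(\S+) \(score = ([0-9.]+)\)\s*$")


def parse_scores(output):
    """Return {label: score} for every score line found in the text."""
    scores = {}
    for line in output.splitlines():
        match = SCORE_LINE.match(line)
        if match:
            scores[match.group(1)] = float(match.group(2))
    return scores


def is_crying(output, threshold=0.8):
    """True when the 'crying' label scores at or above the threshold."""
    return parse_scores(output).get("crying", 0.0) >= threshold


if __name__ == "__main__":
    sample = """> crying (score = 0.95350)
room_empty (score = 0.01888)
_silence_ (score = 0.01746)"""
    print(parse_scores(sample), is_crying(sample))
```

Logging the result of each ten-second clip with a timestamp would give a simple timeline of crying through the night.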

  33. Will it work?

  34. Yes!

  35. ..yes?

  36. hmm.

  38. Act III
    The Illusion of Insight

  40. We just looked at each other

  42. Systems, Models, Metrics, Funnels

  43. When we lose insights about the humans we serve,
    it’s ultimately bad for business

  44. In a Big Data world,
    let’s re-emphasize the human experience

  45. Systems, Models, Metrics, Funnels

  46. Are our assumptions right?

    How do we find out?

  47. Human-centered
    UX research
    Anthropology
    Lean Startup: Customer interviews

  48. Human-centered ML?

  49. Personas
    Jenna (Photo by Toa Heftiba on Unsplash)
    - Customer
    - Recent college graduate
    - Aspiring indie electronica producer
    - Wants to pay off college loans
    - Challenges: …
    Reeve (Photo by Philip Martin on Unsplash)
    - Internal stakeholder
    - Director, Digital Marketing
    - Concerned about ad budget spend
    Lydia (Photo by Jhon David on Unsplash)
    - Customer
    - Civil engineer
    - Mother of one & caretaker of her aging parents
    - Wants to save for her daughter’s college fund
    - Challenges: …

  50. We need human-centered
    design at all stages of the ML
    lifecycle

  51. Let’s start by connecting with
    people first

  52. Thank you
    github.com/andrewhao/babblefish
    Andrew Hao
    @andrewhao