Building smarter apps with machine learning, from magic to reality

“Any sufficiently advanced technology is indistinguishable from magic.”
— Arthur C. Clarke

Machine learning can indeed look like magic, but you don't need to be a data scientist or an ML researcher to build with it.

So, how can you make your solution smarter without any expertise in AI? With pre-trained models and a few lines of code, machine learning APIs can analyze your data. AutoML techniques can then go further and extract insights tailored to your specific needs.

In this session, you’ll see how to transform or extract information from text, images, audio, and video with the latest ML APIs, how to train a custom model with AutoML, and you’ll take part in a live demo. Don't put your smartphone in airplane mode!


Laurent Picard

April 25, 2020

Transcript

  1. 1.

    Building smarter solutions with no expertise in machine learning. Remote Python Pizza, April 25, 2020. Laurent Picard (@PicardParis)
  2. 2.

    Who are we? ◦ Developer Advocate, Google Cloud ◦ Ebook pioneer ◦ Cofounder of Bookeen ◦ Developers? ◦ Machine learning users? ◦ Cloud users?
  3. 6.

    How does deep learning work? Origin: trying to mimic how (we think) our brain works. How: using many examples to find answers. Result: solving problems without explicitly knowing the answer.
  4. 8.

    Exponential use of deep learning at Google. Used across products: Android, Apps, Gmail, Maps, Photos, Speech, Search, Translation, YouTube, ... (chart of unique project directories using deep learning)
  5. 11.

    Ready-to-use models:
    ◦ Vision API: image → info
    ◦ Video Intelligence API: video → info
    ◦ Natural Language API: text → info
    ◦ Translation API: text → translation
    ◦ Speech-To-Text API: speech → text
    ◦ Text-To-Speech API: text → speech
  6. 13.

    Label detection
    "labelAnnotations": [
      { "description": "Nature", "mid": "/m/05h0n", "score": 0.9516123 },
      { "description": "Flower", "mid": "/m/0c9ph5", "score": 0.91467637 },
      { "description": "Garden", "mid": "/m/0bl0l", "score": 0.903375 },
      …
    ]
  7. 14.

    Landmark detection
    "landmarkAnnotations": [
      {
        "boundingPoly": {…},
        "description": "Hobbiton Movie Set",
        "locations": [ { "latLng": { "latitude": -37.8723441, "longitude": 175.6833613 } } ],
        "mid": "/m/012r3jqg",
        "score": 0.61243546
      }
    ]
  8. 15.

    Object detection
    "localizedObjectAnnotations": [
      { "boundingPoly": {…}, "mid": "/m/01g317", "name": "Person", "score": 0.90216154 },
      { "boundingPoly": {…}, "mid": "/m/01g317", "name": "Person", "score": 0.88069034 },
      { "boundingPoly": {…}, "mid": "/m/01g317", "name": "Person", "score": 0.86947715 },
      …
    ]
  9. 16.

    Face detection
    "faceAnnotations": [{
      "detectionConfidence": 0.93634903,
      "boundingPoly": {…},
      "fdBoundingPoly": {…},
      "landmarkingConfidence": 0.18798567,
      "landmarks": [{ "type": "LEFT_EYE", "position": {…} },…],
      "panAngle": -1.7626401,
      "rollAngle": 7.024975,
      "tiltAngle": 9.038818,
      "angerLikelihood": "LIKELY",
      "joyLikelihood": "VERY_UNLIKELY",
      "sorrowLikelihood": "VERY_UNLIKELY",
      "surpriseLikelihood": "VERY_UNLIKELY",
      "headwearLikelihood": "VERY_UNLIKELY",
      "blurredLikelihood": "VERY_UNLIKELY",
      "underExposedLikelihood": "VERY_UNLIKELY"
    }]
  10. 17.

    Text detection
    "fullTextAnnotation": {
      "pages": […],
      "text": " J.R.R. Tolkien > Quotes > Quotable Quote \"Three Rings for the Elven-kings under the… Seven for the Dwarf-lords in their halls of… Nine for Mortal Men, doomed to die, One for the Dark Lord on his dark throne In the Land of Mordor where the Shadows lie. One Ring to rule them all, One Ring to find… One Ring to bring them all and in the darkn… In the Land of Mordor where the Shadows lie.\" - J. R. R. Tolkien, The Lord of the Rings "
    }
  11. 18.

    Text detection
    "fullTextAnnotation": {
      "pages": […],
      "text": " J.R.R. Tolkien > Quotes > Quotable Quote “Three Rings for the Elven-kings under the… Seven for the Dwarf-lords in their halls of… Nine for Mortal Men, doomed to die, One for the Dark Lord on his dark throne In the Land of Mordor where the Shadows lie. One Ring to rule them all, One Ring to find… One Ring to bring them all and in the darkn… In the Land of Mordor where the Shadows lie.' - J. R. R. Tolkien, The Lord of the Rings "
    }
  12. 19.

    Handwriting detection
    "fullTextAnnotation": {
      "pages": […],
      "text": " The Lord of the Rings Three Rings for the Elven-kings under the… Seven for the Dwarf-lords in their halls… Nine for Mortal Men doomed to die One for the Dark Lord on his dark throne In the 'Land of Mordor where the shadows lie. One Ring to rule them all, One Ring to find… One Ring to bring them all and in the shadovs… In the Land of Mordor where the shadows lie\" "
    }
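    A minimal sketch of the corresponding call, again assuming the pre-2.0 Vision client (the image URI is a hypothetical placeholder); document_text_detection handles dense or handwritten text, while text_detection targets sparse text in photos:

    from google.cloud import vision

    client = vision.ImageAnnotatorClient()
    image = vision.types.Image()
    image.source.image_uri = 'gs://my-bucket/handwritten_quote.jpg'  # hypothetical image
    response = client.document_text_detection(image=image)
    print(response.full_text_annotation.text)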
  13. 20.

    Web entity detection and image matching
    "webDetection": {
      "bestGuessLabels": [ { "label": "jrr tolkien", "languageCode": "es" } ],
      "webEntities": [
        { "entityId": "/m/041h0", "score": 14.976, "description": "J. R. R. Tolkien" },…
      ],
      "partialMatchingImages": [ { "url": "http://e00-elmundo.uecdn.es/…jpg" },… ],
      "pagesWithMatchingImages": […],
      "visuallySimilarImages": […]
    }
  14. 21.

    OSS client libraries
    from google.cloud import vision

    uri_base = 'gs://cloud-vision-codelab'
    pics = ('face_surprise.jpg', 'face_no_surprise.png')

    client = vision.ImageAnnotatorClient()
    image = vision.types.Image()
    for pic in pics:
        image.source.image_uri = f'{uri_base}/{pic}'
        response = client.face_detection(image=image)
        for face in response.face_annotations:
            likelihood = vision.enums.Likelihood(face.surprise_likelihood)
            vertices = [f'({v.x},{v.y})' for v in face.bounding_poly.vertices]
            print(f'Face surprised: {likelihood.name}')
            print(f'Face bounds: {",".join(vertices)}')

    Python package: pypi.org/project/google-cloud-vision
    Tutorial code: codelabs.developers.google.com/codelabs/cloud-vision-api-python
  15. 24.

    Benefits ◦ Label Detection: detect entities within the video, such as "dog", "flower" or "car". ◦ Enable Video Search: you can now search your video catalog the same way you search text documents. ◦ Insights from Videos: extract actionable insights from video files without requiring any machine learning or computer vision knowledge. ◦ More… Detect sequences. Detect adult content. Automatic transcription of video content in English (BETA, more languages to come).
  16. 27.

    OSS client libraries
    from google.cloud import videointelligence
    from google.cloud.videointelligence import enums, types

    def track_objects(video_uri, segments=None):
        video_client = videointelligence.VideoIntelligenceServiceClient()
        features = [enums.Feature.OBJECT_TRACKING]
        context = types.VideoContext(segments=segments)
        print(f'Processing video "{video_uri}"...')
        operation = video_client.annotate_video(input_uri=video_uri,
                                                features=features,
                                                video_context=context)
        return operation.result()

    Python package: pypi.org/project/google-cloud-videointelligence
    Tutorial code: codelabs.developers.google.com/codelabs/cloud-video-intelligence-python3
  17. 29.

    Syntax analysis: Tolkien was a British writer, poet, philologist, and university professor who is best known as the author of the classic high-fantasy works The Hobbit, The Lord of the Rings, and The Silmarillion.
  18. 30.

    Syntax analysis: Tolkien was a British writer, poet, philologist, and university professor who is best known as the author of the classic high-fantasy works The Hobbit, The Lord of the Rings, and The Silmarillion. → { "language": "en" }
  19. 31.

    Entity detection: Tolkien was a British writer, poet, philologist, and university professor who is best known as the author of the classic high-fantasy works The Hobbit, The Lord of the Rings, and The Silmarillion.
  20. 32.

    Entity detection: Tolkien was a British writer, poet, philologist, and university professor who is best known as the author of the classic high-fantasy works The Hobbit, The Lord of the Rings, and The Silmarillion.
  21. 33.

    Entity detection: Tolkien was a British writer, poet, philologist, and university professor who is best known as the author of the classic high-fantasy works The Hobbit, The Lord of the Rings, and The Silmarillion.
    { "name": "British", "type": "LOCATION", "metadata": { "mid": "/m/07ssc", "wikipedia_url": "https://en.wikipedia.org/wiki/United_Kingdom" } }
    { "name": "Tolkien", "type": "PERSON", "metadata": { "mid": "/m/041h0", "wikipedia_url": "https://en.wikipedia.org/wiki/J._R._R._Tolkien" } }
    { "name": "The Silmarillion", "type": "WORK_OF_ART", "metadata": { "mid": "/m/07c4l", "wikipedia_url": "https://en.wikipedia.org/wiki/The_Silmarillion" } }
  22. 34.

    Content classification: Tolkien was a British writer, poet, philologist, and university professor who is best known as the author of the classic high-fantasy works The Hobbit, The Lord of the Rings, and The Silmarillion.
    { "categories": [
      { "name": "/Books & Literature", "confidence": 0.97 },
      { "name": "/People & Society/Subcultures…", "confidence": 0.66 },
      { "name": "/Hobbies & Leisure", "confidence": 0.58 }
    ] }
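    A minimal sketch of the entity detection and content classification calls, assuming the same pre-2.0 google-cloud-language client shown two slides later (the text is the slide's example sentence):

    from google.cloud import language
    from google.cloud.language import enums, types

    client = language.LanguageServiceClient()
    text = ('Tolkien was a British writer, poet, philologist, and university professor '
            'who is best known as the author of the classic high-fantasy works '
            'The Hobbit, The Lord of the Rings, and The Silmarillion.')
    document = types.Document(content=text, type=enums.Document.Type.PLAIN_TEXT)

    for entity in client.analyze_entities(document=document).entities:
        print(entity.name, enums.Entity.Type(entity.type).name)

    for category in client.classify_text(document=document).categories:
        print(category.name, category.confidence)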
  23. 35.

    Sentiment analysis: 2 example reviews of “The Hobbit”, a positive one from The New York Times (1938) and a negative one from Goodreads.
  24. 36.

    OSS client libraries
    from google.cloud import language
    from google.cloud.language import enums, types

    def analyze_text_sentiment(text):
        client = language.LanguageServiceClient()
        document = types.Document(content=text, type=enums.Document.Type.PLAIN_TEXT)
        response = client.analyze_sentiment(document=document)
        sentiment = response.document_sentiment
        results = [('text', text),
                   ('score', sentiment.score),
                   ('magnitude', sentiment.magnitude)]
        for k, v in results:
            print('{:10}: {}'.format(k, v))

    Python package: pypi.org/project/google-cloud-language
    Tutorial code: codelabs.developers.google.com/codelabs/cloud-natural-language-python3
  25. 38.

    Benefits ◦ Translate Many Languages: 100+ languages, from Afrikaans to Zulu, which in combination enable translation between thousands of language pairs. ◦ Language Detection: the Translation API can automatically identify languages with high accuracy. ◦ Simple Integration: an easy-to-use Google REST API; no need to extract text from your documents, just send HTML and get back translated text. ◦ High Quality Translations: translations that push the boundary of machine translation, updated constantly to seamlessly improve quality and introduce new languages and language pairs.
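    A minimal sketch of the language detection benefit, assuming the same translate.Client (v2) used on the next slide; the sample sentence is arbitrary:

    from google.cloud import translate

    translate_client = translate.Client()
    detection = translate_client.detect_language('Trois anneaux pour les rois elfes sous le ciel')
    print('Language: {}'.format(detection['language']))      # 'fr'
    print('Confidence: {}'.format(detection['confidence']))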
  26. 39.

    OSS client libraries
    from google.cloud import translate

    def translate_text(target, text):
        """Translates text into the target language."""
        translate_client = translate.Client()
        # Text can also be a sequence of strings, in which case this method
        # will return a sequence of results for each text.
        result = translate_client.translate(text, target_language=target)
        print('Text: {}'.format(result['input']))
        print('Translation: {}'.format(result['translatedText']))
        print('Detected source language: {}'.format(result['detectedSourceLanguage']))

    Sample from Python open source client library: github.com/GoogleCloudPlatform/python-docs-samples
  27. 41.

    Benefits ◦ Speech Recognition: recognizes 120 languages and variants, powered by deep learning neural networks. ◦ Real-Time Results: can stream text results, returning partial recognition results as they become available; can also run on buffered or archived audio files. ◦ Noise Robustness: no need for signal processing or noise cancellation before calling the API; handles noisy audio from a variety of environments. ◦ More... Customized recognition (BETA), language auto-detection, speaker diarization, auto punctuation, ...
  28. 42.

    Speech timestamps
    "transcript": "Hello Python Pizza…",
    "confidence": 0.96596134,
    "words": [
      { "startTime": "1.400s", "endTime": "1.800s", "word": "Hello" },
      { "startTime": "1.800s", "endTime": "2.300s", "word": "Python" },
      …
    ]
  29. 43.

    OSS client libraries
    from google.cloud import speech_v1 as speech

    def speech_to_text(config, audio):
        client = speech.SpeechClient()
        response = client.recognize(config, audio)
        for result in response.results:
            best_alternative = result.alternatives[0]
            print(f'Transcript: {best_alternative.transcript}')
            print(f'Confidence: {best_alternative.confidence:.0%}')

    config = {'language_code': 'fr-FR',
              'enable_automatic_punctuation': True,
              'enable_word_time_offsets': True}
    audio = {'uri': 'gs://cloud-samples-data/speech/corbeau_renard.flac'}
    speech_to_text(config, audio)
    """
    Transcript: Maître corbeau sur un arbre perché tenait en son bec un fromage...
    Confidence: 93%
    """

    Python package: pypi.org/project/google-cloud-speech
    Tutorial code: codelabs.developers.google.com/codelabs/cloud-speech-text-python3
  30. 48.

    OSS client libraries
    from google.cloud import texttospeech
    from google.cloud.texttospeech import enums, types

    def save_to_wav(filename, audio_content):
        # LINEAR16 responses already include a WAV header, so the bytes can be written as-is.
        with open(filename, 'wb') as wav_file:
            wav_file.write(audio_content)

    def text_to_wav(voice_name, text):
        language_code = '-'.join(voice_name.split('-')[:2])
        text_input = types.SynthesisInput(text=text)
        voice = types.VoiceSelectionParams(language_code=language_code, name=voice_name)
        audio_config = types.AudioConfig(audio_encoding=enums.AudioEncoding.LINEAR16)
        client = texttospeech.TextToSpeechClient()
        response = client.synthesize_speech(text_input, voice, audio_config)
        save_to_wav(f'{language_code}.wav', response.audio_content)

    text_to_wav('en-AU-Wavenet-A', 'What is the temperature in Sydney?')
    text_to_wav('en-GB-Wavenet-B', 'What is the temperature in London?')
    text_to_wav('en-IN-Wavenet-C', 'What is the temperature in Delhi?')

    Python package: pypi.org/project/google-cloud-texttospeech
    Tutorial code: codelabs.developers.google.com/codelabs/cloud-text-speech-python3
  31. 52.

    Cloud AutoML ◦ Train: with your training data ◦ Deploy and Serve: your custom model behind a REST API, or your custom edge model (TF Lite on mobile, TF.js in the browser, a container anywhere)
  32. 57.

    Auto-generate a custom model from your data:
    ◦ AutoML Vision (image): custom classification, object detection
    ◦ AutoML Natural Language (text): custom classification, entity extraction, sentiment analysis
    ◦ AutoML Translation (text): custom translation
    ◦ AutoML Video Intelligence (video): custom classification
    ◦ AutoML Tables (structured data): custom classification, metrics prediction
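    Once an AutoML model is deployed, it can be called much like the ready-to-use APIs. A minimal sketch, assuming the google-cloud-automl (v1) Python client; the project, model ID, and file name are hypothetical placeholders:

    from google.cloud import automl_v1 as automl

    prediction_client = automl.PredictionServiceClient()
    model_name = 'projects/my-project/locations/us-central1/models/ICN1234567890'  # hypothetical
    with open('selfie.jpg', 'rb') as image_file:
        payload = {'image': {'image_bytes': image_file.read()}}
    response = prediction_client.predict(model_name, payload)
    for result in response.payload:
        print(result.display_name, result.classification.score)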
  33. 58.

    What are your emotions? ◦ Ready-to-use model: I want to detect faces and general emotions (joy, surprise, sorrow, anger) ◦ Custom model: I want to detect new custom expressions (tongue out, yawning, sleeping)
  34. 59.

    Stache Club demo, serverless architecture (diagram): user and admin, the Stache Club App, source selfies, face detection, custom detection, selfie processing, and the resulting Stache Club selfies (steps 1 to 4).
  35. 61.

    Evaluation: results vs expectations (diagram). Model positives are the results returned by the model, model negatives the results it does not return; each group splits into results we expect and results we don't expect.
  36. 62.

    Model precision. Precision = expected results returned / all results returned. Precision can be seen as a measure of exactness or quality. High precision means that the model returns substantially more expected results than unexpected ones.
  37. 63.

    Model recall. Recall = expected results returned / all expected results. Recall can be seen as a measure of completeness or quantity. High recall means that the model returns most of the expected results.
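    A tiny worked example with made-up numbers, just to tie the two formulas together:

    # Say the model returns 8 results: 6 expected (true positives) and
    # 2 unexpected (false positives), and misses 4 expected results (false negatives).
    true_positives, false_positives, false_negatives = 6, 2, 4

    precision = true_positives / (true_positives + false_positives)  # 6/8 = 0.75
    recall = true_positives / (true_positives + false_negatives)     # 6/10 = 0.60
    print(f'precision={precision:.2f} recall={recall:.2f}')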
  38. 64.

    AutoML under the hood ◦ Transfer learning: build on existing models ◦ ML for ML: models to identify optimal model architectures ◦ Hyperparameter auto-tuning: algorithms for finding the best hyperparameters for your model and data
  39. 65.

    Transfer learning (diagram): a model trained on a lot of data keeps its hidden layers; the output is updated using your training data.
  40. 66.

    ML for ML: finding the optimal architecture. A controller proposes ML models (layers, learning rate), the candidate models are trained and evaluated about 20K times, and the process iterates to find the most accurate model. Neural Architecture Search paper: bit.ly/nas-paper
  41. 67.

    Hyperparameter tuning • Hyperparameters: any value that affects the accuracy of an algorithm but is not directly learned by it • HyperTune: Google-developed algorithm to find the best hyperparameter combinations for your model (illustrated by an objective surface over two hyperparameters: we want to find the global optimum, not the local ones)
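    HyperTune itself is not shown here; as a rough illustration of the search problem only, a naive random search over two hyperparameters (the objective function is a made-up stand-in for "train the model and measure accuracy"):

    import random

    def objective(learning_rate, layers):
        # Made-up stand-in for training a model and measuring its accuracy.
        return 1 - abs(learning_rate - 0.01) * 10 - abs(layers - 4) * 0.05

    best_score, best_params = None, None
    for _ in range(50):
        params = {'learning_rate': random.uniform(0.001, 0.1), 'layers': random.randint(1, 8)}
        score = objective(**params)
        if best_score is None or score > best_score:
            best_score, best_params = score, params
    print(best_score, best_params)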
  42. 71.

    How can I build smarter solutions? ◦ ML APIs: hours of work, no dataset or ML expertise needed, developer skills are enough ◦ AutoML: days of work, requires your own dataset ◦ ML expertise: weeks of work, requires a dataset, a neural network design, and more.
  43. 72.

    Resources ◦ Ready-to-use machine learning models ◦ Train a custom model with your own data ◦ Build your own model from scratch