Let’s start ML on GCP with AutoML

sakajunquality

March 23, 2019

Transcript

  1. Let’s start ML on GCP with AutoML @sakajunquality 19.03.23 #gcpug #機械学習名古屋 (Machine Learning Nagoya)
  2. Who am I? - @sakajunquality - Jun Sakata - GDE, Cloud - SRE @ Ubie, inc. - Usually… - #GKE #Kubernetes #containers #DevOps etc. - Not an ML person...
  3. Agenda - AutoML - Kubernetes Docs Translation with AutoML Translation - Points for ML on GCP
  4. AutoML

  5. AutoML - State-of-the-art performance - Get up and running fast - Generate high-quality training data (from the official website...)
  6. AutoML https://cloud.google.com/automl/

  7. AutoML - Prepare the data - Train - Evaluate - Use as an API
  8. AutoML - Vision - Natural Language - Translation

  9. AutoML - Vision - Natural Language - Translation

  10. Kubernetes Docs Translation JA with AutoML Translation

  11. About AutoML Translation - Create a “domain-specific” translation model - Over 100 languages
  12. #kubernetes-docs-ja - A community project translating the Kubernetes docs into Japanese - https://kubernetes.io/ja/docs/home/ - Slack - http://slack.k8s.io/ - #kubernetes-docs-ja channel
  13. #kubernetes-docs-ja https://kubernetes.io/docs/concepts/overview/what-is-kubernetes/

  14. #kubernetes-docs-ja https://kubernetes.io/ja/docs/concepts/overview/what-is-kubernetes/

  15. Motivation - Active development and frequent releases in Kubernetes - The documentation is also updated frequently
  16. #kubernetes-docs-ja

  17. Prepare the dataset - Use the already translated Japanese and the original English - Translated sentence pairs (en/ja) -> K8s-specific translation model
  18. Prepare the dataset

  19. Prepare the dataset - Use the already translated Japanese and the original English - Some amendments, e.g. - Make 1:1 pairs of sentences - Use the same terms
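The two amendments on this slide (1:1 sentence pairs, consistent terms) could be sketched roughly as follows. This is an illustration only, not the tooling used in the talk: `TERM_FIXES`, `normalize_terms`, and `make_pairs` are hypothetical names, and the glossary entry is a made-up example.

```python
from typing import Dict, List, Tuple

# Hypothetical glossary mapping a variant spelling to the preferred term.
# The single entry here is an invented example, not the project's real list.
TERM_FIXES: Dict[str, str] = {"ポッド": "Pod"}


def normalize_terms(sentence: str, fixes: Dict[str, str] = TERM_FIXES) -> str:
    """Replace known term variants so the same term is used everywhere."""
    for variant, preferred in fixes.items():
        sentence = sentence.replace(variant, preferred)
    return sentence.strip()


def make_pairs(en: List[str], ja: List[str]) -> List[Tuple[str, str]]:
    """Zip English and Japanese sentences into strict 1:1 pairs.

    Raises ValueError if the two lists differ in length, because a
    mismatch means the sentences are no longer aligned one to one.
    """
    if len(en) != len(ja):
        raise ValueError(f"misaligned: {len(en)} en vs {len(ja)} ja sentences")
    pairs = []
    for source, target in zip(en, ja):
        source, target = source.strip(), normalize_terms(target)
        if source and target:  # skip pairs where either side is empty
            pairs.append((source, target))
    return pairs
```

Failing loudly on a length mismatch is a deliberate choice here: silently truncating would pair sentences with the wrong translations.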
  20. Prepare the dataset

  21. Prepare the dataset - Not enough sentences...

  22. Official Document: Preparing Training Data https://cloud.google.com/translate/automl/docs/prepare?hl=en

  23. Prepare the dataset - At least 100 sentences each for - Train - Validation - Test
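A minimal sketch of splitting sentence pairs into the three sets and checking the 100-sentence minimum stated on this slide. The function names are hypothetical, and the 10% validation and test fractions are an assumption mirroring the default split mentioned later in the deck.

```python
import random
from typing import List, Sequence, Tuple

Pair = Tuple[str, str]
MIN_PER_SET = 100  # minimum per set, as stated on the slide


def split_pairs(
    pairs: Sequence[Pair],
    val_frac: float = 0.1,   # assumed fraction for validation
    test_frac: float = 0.1,  # assumed fraction for test
    seed: int = 0,
) -> Tuple[List[Pair], List[Pair], List[Pair]]:
    """Shuffle deterministically, then cut into train/validation/test."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * val_frac)
    n_test = int(len(shuffled) * test_frac)
    val = shuffled[:n_val]
    test = shuffled[n_val:n_val + n_test]
    train = shuffled[n_val + n_test:]
    return train, val, test


def enough_data(train, val, test, minimum: int = MIN_PER_SET) -> bool:
    """True only if every set reaches the minimum size."""
    return all(len(s) >= minimum for s in (train, val, test))
```

With a 10% split, at least 1,000 pairs are needed before the validation and test sets each reach 100 sentences.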
  24. Prepare the dataset - Not enough sentences yet in the project - Use some sentences from the GKE docs - https://cloud.google.com/kubernetes-engine/docs/ - Topics and terms are quite similar - With some terms amended
  25. Prepare the dataset - Change of plan... - Kubernetes Docs - Custom Model - GKE Docs
  26. Prepare the dataset - Export sentence pairs as TSV - Upload them to AutoML Translation
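The TSV export might look like the sketch below, assuming one pair per line with the source sentence, a tab, then the target sentence. `clean`, `write_tsv`, and `to_tsv_text` are hypothetical helper names, not part of any Google tooling.

```python
import io
from typing import Iterable, TextIO, Tuple


def clean(text: str) -> str:
    """Collapse tabs, newlines, and repeated spaces into single spaces,
    so a sentence can never break the one-pair-per-line TSV layout."""
    return " ".join(text.split())


def write_tsv(pairs: Iterable[Tuple[str, str]], out: TextIO) -> int:
    """Write 'source<TAB>target' lines and return how many were written."""
    count = 0
    for source, target in pairs:
        source, target = clean(source), clean(target)
        if not source or not target:
            continue  # skip incomplete pairs
        out.write(f"{source}\t{target}\n")
        count += 1
    return count


def to_tsv_text(pairs: Iterable[Tuple[str, str]]) -> str:
    """Convenience wrapper that returns the TSV content as a string."""
    buf = io.StringIO()
    write_tsv(pairs, buf)
    return buf.getvalue()
```

To write a file instead, open it with `encoding="utf-8"` and pass the handle to `write_tsv`.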
  27. Create the dataset - Choose languages...

  28. Create the dataset - Need to upload separately when there is little data

  29. Prepare the dataset

  30. Train - Just click “START TRAINING” - Can choose a base model - Google NMT (default) - https://ai.google/research/pubs/pub45610 - Another AutoML model
  31. Training...

  32. Wait for approximately 3 hours…

  33. Prediction - After training is finished, the model can be used for prediction.
  34. Prediction

  35. Prediction - Looks good!

  36. Prediction - Also looks good

  37. Prediction - Some are not quite...

  38. API - The model can be used via the API
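As a rough illustration of calling a trained model, the sketch below only builds the URL and JSON body for a prediction request and sends nothing. It assumes the v1beta1 REST shape of the AutoML API (a `models/...:predict` endpoint taking a `textSnippet` payload); check the current API reference before relying on it. `build_predict_request` and the IDs in the usage note are hypothetical.

```python
import json
from typing import Tuple


def build_predict_request(
    project_id: str,
    model_id: str,
    text: str,
    location: str = "us-central1",  # assumed region
) -> Tuple[str, str]:
    """Return (url, json_body) for a translation prediction request.

    Assumption: endpoint and payload follow the AutoML v1beta1 REST API.
    Authentication (an OAuth bearer token header) is left out on purpose.
    """
    url = (
        "https://automl.googleapis.com/v1beta1/"
        f"projects/{project_id}/locations/{location}/"
        f"models/{model_id}:predict"
    )
    body = json.dumps(
        {"payload": {"textSnippet": {"content": text}}},
        ensure_ascii=False,
    )
    return url, body
```

For example, `build_predict_request("my-project", "TRL123", "A Pod is the smallest deployable unit.")` yields a URL and body that could be sent with any HTTP client as a POST with a `Content-Type: application/json` header.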

  39. Result - Results are available with scores - Refer to “Evaluating Model” - https://cloud.google.com/translate/automl/docs/evaluate
  40. Interpretation - Evaluation Scores https://cloud.google.com/translate/automl/docs/evaluate

  41. Training Result

  42. Considerations - Need more samples? - By default, 10% of the sentences are used for validation and 10% for test. - More datasets for training? - The model trained only on GKE datasets scores quite high - Is Google’s official translation better than the community one?
  43. Points for ML on GCP

  44. Live Demo (if demanded...)

  45. Takeaways - You can start ML easily with AutoML - Creating a model - Serving a model - Some updates at Next SF ’19?