Machine Learning in Java: The Visual Recognition API (JSR 381), DWX 2021

Machine learning has reached the mainstream. As a result, Java Specification Request (JSR) 381, developed under the Java Community Process (JCP), adopted a standardized set of APIs for classifying and detecting objects in images.

This talk shows implementation options using examples and also looks at possible alternatives.

Dennis Kieselhorst

June 30, 2021

Transcript

  1. Machine Learning in Java: The Visual Recognition API. Java Specification Request (JSR) 381

    Dennis Kieselhorst, Sr. Solutions Architect

    © 2021, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
  2. Table of contents

    • Intro Machine Learning, Artificial Intelligence and Deep Learning
    • Visual recognition
    • Why in Java?
    • Java Specification Request (JSR) 381
    • Implementations
    • Code samples
    • Alternatives
  3. The reach of ML is growing

    • INCREASED SPENDING: By 2024, global spending on artificial intelligence will reach $110 billion (IDC)
    • FROM PILOTING TO OPERATIONALIZING: By the end of 2024, 75% of enterprises will shift from piloting to operationalizing AI (Gartner)
    • AI TRANSFORMATION: 57% said that AI would transform their organization in the next three years (Deloitte)
  4. Artificial Intelligence, Machine Learning and Deep Learning

    • Artificial intelligence (AI): the broadest of the three terms
    • Machine learning (ML): subset of AI that uses machines to search for patterns in data to build logic models automatically
    • Deep learning (DL): subset of ML composed of deeply multi-layered neural networks that perform tasks like speech and image recognition
  5. Visual recognition
  6. Source: https://github.com/awslabs/djl/blob/master/examples/src/test/resources/dog-cat.jpg
  7. Source: https://github.com/awslabs/djl/blob/master/examples/docs/img/cat_dog_detected.jpg
  8. Source: https://github.com/awslabs/djl/blob/master/examples/docs/img/detected-dogs.jpg
  9. Source: https://github.com/awslabs/djl/blob/master/examples/docs/img/detected-dog_bike_car.png
  10. Why in Java?

    • According to indexes such as TIOBE and PYPL, Java is trending slightly down but is still one of the most popular languages
    • Great language with a huge ecosystem and millions of developers
    • Existing APIs weren't Java-friendly (complex, "C flavor")

    Source: https://wiki.openjdk.java.net/display/duke/Gallery
  11. Java Specification Request (JSR) 381: Goals

    Goals from the Java Visual Recognition Specification doc (version 1.0):

    • Describe a standard, easy-to-use and flexible set of high-level APIs
    • Offer high-level abstractions for sustainable development of ML products and services
    • Have well-defined APIs essential for robust system architecture
    • Offer ease of development and portability
    • Provide thorough information to help create alternative implementations
    • Offer the ability to create custom Classifiers in addition to pre-trained Classifiers

    Source: https://jcp.org/en/jsr/detail?id=381
  12. Java Specification Request (JSR) 381: Architecture

    Source: https://jcp.org/en/jsr/detail?id=381
  13. Implementations

    • Deep Netts (reference implementation)
    • DeepJavaLibrary (DJL)
    • Experimental:
        • Deeplearning4j
        • Open Intelligent Multimedia Analysis for Java (OpenIMAJ)
        • IBM Watson
  14. Code to train a model

        // Configure and train a classifier; build() runs the training
        ImageClassifier<BufferedImage> classifier = NeuralNetImageClassifier.builder()
            .inputClass(BufferedImage.class)
            .imageHeight(128).imageWidth(128)
            .labelsFile(new File("labels.txt"))
            .trainingFile(new File("train.txt"))
            .networkArchitecture(new File("arch.json"))
            .exportModel(Paths.get("trained_model.dnet"))
            .maxError(0.03f)
            .maxEpochs(1000)
            .learningRate(0.01f)
            .build();

        // Classify a new image; the map holds one confidence score per label
        BufferedImage image = ImageIO.read(new File("another_cat.png"));
        Map<String, Float> results = classifier.classify(image);
  15. Confidence scores

    CONFIDENCE THRESHOLD

    Confidence scores let you choose the best tradeoff between precision and recall for your use case. Raising the threshold generally favors precision; lowering it generally favors recall.
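The tradeoff on this slide can be made concrete with a small calculation. The following is a minimal sketch, not part of the talk: the class name `ThresholdTradeoff`, its method names, and the sample scores are hypothetical and chosen only for illustration. It computes precision and recall for a single class at several confidence thresholds.

```java
final class ThresholdTradeoff {

    /** Precision = TP / (TP + FP), counted over predictions accepted at the threshold. */
    static double precision(float[] confidences, boolean[] positives, float threshold) {
        int tp = 0;
        int fp = 0;
        for (int i = 0; i < confidences.length; i++) {
            if (confidences[i] >= threshold) {
                if (positives[i]) {
                    tp++;
                } else {
                    fp++;
                }
            }
        }
        // Convention chosen for this sketch: nothing accepted means no false positives.
        return (tp + fp == 0) ? 1.0 : (double) tp / (tp + fp);
    }

    /** Recall = TP / (TP + FN), counted over all truly positive samples. */
    static double recall(float[] confidences, boolean[] positives, float threshold) {
        int tp = 0;
        int fn = 0;
        for (int i = 0; i < confidences.length; i++) {
            if (positives[i]) {
                if (confidences[i] >= threshold) {
                    tp++;
                } else {
                    fn++;
                }
            }
        }
        // Convention chosen for this sketch: no positive samples means nothing was missed.
        return (tp + fn == 0) ? 1.0 : (double) tp / (tp + fn);
    }

    public static void main(String[] args) {
        // Hypothetical scores for the label "cat" and whether each image really shows a cat.
        float[] confidences = {0.95f, 0.80f, 0.60f, 0.55f, 0.30f, 0.20f};
        boolean[] positives = {true, true, false, true, false, true};

        for (float threshold : new float[] {0.25f, 0.5f, 0.9f}) {
            System.out.printf("threshold=%.2f precision=%.2f recall=%.2f%n",
                    threshold,
                    precision(confidences, positives, threshold),
                    recall(confidences, positives, threshold));
        }
    }
}
```

With this sample data, a threshold of 0.5 yields precision 0.75 and recall 0.75, while a threshold of 0.9 yields precision 1.0 but recall of only 0.25: fewer false alarms, more missed cats.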
  16. Code to use an existing model

        // Load a previously trained model instead of training a new one
        ImageClassifier<BufferedImage> classifier = NeuralNetImageClassifier.builder()
            .inputClass(BufferedImage.class)
            .imageHeight(128).imageWidth(128)
            .importModel(Paths.get("trained_model.dnet"))
            .build();

        // Classify an image; the map holds one confidence score per label
        BufferedImage image = ImageIO.read(new File("another_cat.png"));
        Map<String, Float> results = classifier.classify(image);

    More sample code: https://github.com/JavaVisRec/visrec-api/wiki/Getting-Started-Guide#Examples
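The `classify` call on this slide returns a `Map<String, Float>` of labels and confidence scores. One way to post-process such a map is sketched below using only the JDK. The class `ClassificationResults`, its method names, and the sample scores are hypothetical and are not part of the JSR 381 API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

final class ClassificationResults {

    /** Returns the label with the highest score, if that score reaches the threshold. */
    static Optional<String> bestLabel(Map<String, Float> results, float threshold) {
        return results.entrySet().stream()
                .filter(e -> e.getValue() >= threshold)
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey);
    }

    /** Returns all labels at or above the threshold, ordered from highest to lowest score. */
    static Map<String, Float> atOrAbove(Map<String, Float> results, float threshold) {
        return results.entrySet().stream()
                .filter(e -> e.getValue() >= threshold)
                .sorted(Map.Entry.<String, Float>comparingByValue().reversed())
                .collect(Collectors.toMap(
                        Map.Entry::getKey,
                        Map.Entry::getValue,
                        (a, b) -> a,            // keys are unique, so this merge is never used
                        LinkedHashMap::new));   // keeps the sorted order
    }

    public static void main(String[] args) {
        // Hypothetical scores standing in for the output of classifier.classify(image).
        Map<String, Float> results = Map.of("cat", 0.91f, "dog", 0.07f, "bird", 0.02f);

        System.out.println(bestLabel(results, 0.5f).orElse("unknown"));
        System.out.println(atOrAbove(results, 0.05f));
    }
}
```

Returning an `Optional` makes the "no label is confident enough" case explicit, so the caller decides whether to fall back to a default or ask for human review.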
  17. What about Spring Boot?

    • Spring Boot Starter project for DJL (Deep Java Library), not specifically for JSR 381
    • Simplifies dependency management
    • Allows auto-configuration for use cases like object detection (see application.yml sample)

        djl:
          # Define application type
          application-type: OBJECT_DETECTION
          # Define input data type, a model may accept multiple input data types
          input-class: java.awt.image.BufferedImage
          # Define output data type, a model may generate different outputs
          output-class: ai.djl.modality.cv.output.DetectedObjects
          arguments:
            threshold: 0.5 # Display all results with probability of 0.5 and above

    Blog post: https://aws.amazon.com/blogs/opensource/adopting-machine-learning-in-your-microservices-with-djl-deep-java-library-and-spring-boot/
  18. The AWS ML Stack: Broadest and most complete set of machine learning capabilities

    [Diagram: the AWS ML stack in three layers]

    • AI SERVICES (Vision, Speech, Text, Search, Chatbots, Personalization, Forecasting, Fraud, Contact centers, Industrial AI, Code and DevOps, Healthcare AI, Anomaly detection): Amazon Rekognition, Amazon Polly, Amazon Transcribe (+Medical), Amazon Lex, Amazon Personalize, Amazon Forecast, Amazon Comprehend (+Medical), Amazon Textract, Amazon Kendra, Amazon CodeGuru, Amazon Fraud Detector, Amazon Translate, Amazon DevOps Guru, Voice ID for Amazon Connect, Contact Lens, Amazon Monitron, AWS Panorama + Appliance, Amazon Lookout for Vision, Amazon Lookout for Equipment, Amazon Lookout for Metrics, Amazon HealthLake
    • ML SERVICES (Amazon SageMaker, including SageMaker Studio IDE, SageMaker JumpStart and SageMaker Edge Manager): label data, aggregate and prepare data, store and share features, Auto ML, Spark/R, detect bias, visualize in notebooks, pick algorithm, train models, tune parameters, debug and profile, deploy in production, manage and monitor, CI/CD, human review
    • FRAMEWORKS & INFRASTRUCTURE: Deep Learning AMIs & Containers, GPUs & CPUs, Elastic Inference, Trainium, Inferentia, FPGA, DeepGraphLibrary
  19. The Amazon ML stack: Broadest & deepest set of capabilities

    • AI SERVICES: Easily add intelligence to applications without machine learning skills. Vision | Documents | Speech | Language | Chatbots | Forecasting | Recommendations | Fraud detection | Enterprise Search | Code Review
    • ML SERVICES: Build, train, and deploy machine learning models fast. Data labeling | Pre-built algorithms & notebooks | One-click training and deployment
    • ML FRAMEWORKS & INFRASTRUCTURE: Flexibility & choice, highest-performing infrastructure. Support for ML frameworks | Compute options purpose-built for ML
  20. The AWS ML Stack: Amazon Rekognition

    Automate your image and video analysis using machine learning

    [Diagram: the AWS ML stack overview from slide 18, repeated]
  21. Amazon Rekognition: Image and video analysis

    • Labels (objects, scenes, and activities)
    • Face detection and analysis
    • Text in image
    • Unsafe image and video detection
    • Pathing
    • Face search
    • Real-time video analysis
    • Celebrity recognition
  22. Amazon Rekognition: Deep-learning-based image and video analysis
  23. Code (with AWS SDK for Java v2)

    API: https://docs.aws.amazon.com/rekognition/latest/dg/API_DetectLabels.html
    SDK code: https://docs.aws.amazon.com/code-samples/latest/catalog/javav2-rekognition-src-main-java-com-example-rekognition-DetectLabels.java.html

    Maven dependencies (pom.xml):

        <dependencyManagement>
          <dependencies>
            <dependency>
              <groupId>software.amazon.awssdk</groupId>
              <artifactId>bom</artifactId>
              <version>2.16.90</version>
              <type>pom</type>
              <scope>import</scope>
            </dependency>
          </dependencies>
        </dependencyManagement>
        <dependencies>
          <dependency>
            <groupId>software.amazon.awssdk</groupId>
            <artifactId>rekognition</artifactId>
          </dependency>
        </dependencies>

    Java code:

        import java.io.File;
        import java.io.FileInputStream;
        import java.io.InputStream;
        import software.amazon.awssdk.core.SdkBytes;
        import software.amazon.awssdk.regions.Region;
        import software.amazon.awssdk.services.rekognition.RekognitionClient;
        import software.amazon.awssdk.services.rekognition.model.DetectLabelsRequest;
        import software.amazon.awssdk.services.rekognition.model.DetectLabelsResponse;
        import software.amazon.awssdk.services.rekognition.model.Image;

        // Create a client for the Frankfurt region
        RekognitionClient rekognitionClient = RekognitionClient.builder()
            .region(Region.EU_CENTRAL_1)
            .build();

        // Read the local image; sourceImageName is the file path, defined elsewhere
        InputStream sourceStream = new FileInputStream(
            new File(sourceImageName));
        SdkBytes sourceBytes = SdkBytes.fromInputStream(sourceStream);
        Image sourceImage = Image.builder()
            .bytes(sourceBytes)
            .build();

        // Ask for at most 10 labels
        DetectLabelsRequest detectLabelsRequest = DetectLabelsRequest.builder()
            .image(sourceImage)
            .maxLabels(10)
            .build();
        DetectLabelsResponse labelsResponse =
            rekognitionClient.detectLabels(detectLabelsRequest);
  24. Blog post: https://aws.amazon.com/blogs/machine-learning/automatically-detecting-personal-protective-equipment-on-persons-in-images-using-amazon-rekognition/

    API: https://docs.aws.amazon.com/rekognition/latest/dg/API_DetectProtectiveEquipment.html
  25. Another use case: Media enrichment

    • Ingest: Petabytes of images and video assets
    • Store: Centralized storage and global registry
    • Analyze: Metadata enrichment through deep learning
    • Deliver: Enhanced value and search experience
    • Augment (feedback loop): Utilize humans to perform validation and tasks ML cannot yet do, include 3rd party datasets from Amazon Data Exchange

    https://aws.amazon.com/machine-learning/ml-use-cases/media-intelligence/
  26. Summary

    • Lots of potential use cases for visual recognition
    • No need to be a Data Scientist to use it
    • The current JSR-381 spec is a starting point but needs further refinement and evolution, as stated in the final notes of the first release:
        • The long term vision of this JSR and specification is to provide a standard API to build machine learning applications using Java.
        • One of our major goals is to allow Java application developers to incorporate machine learning features without necessarily becoming experts in machine learning.
    • Please try it out and provide feedback using the GitHub issue tracker https://github.com/JavaVisRec or the mailing list https://groups.io/g/visrec

    Source: https://jcp.org/en/jsr/detail?id=381
  27. Recommended reading: Dive into Deep Learning (D2L)

    • 150+ runnable Jupyter Notebooks from model architectures to applications (CV, NLP, etc.)
    • Adopted as a textbook or reference book at UC Berkeley, CMU, MIT, and 70+ universities worldwide
    • Wide theoretical coverage: statistics, optimization, machine learning basics, GPU parallel training, etc.

    "Dive Into Deep Learning is an excellent text on deep learning and deserves attention from anyone who wants to learn why deep learning has ignited the AI revolution, the most powerful technology force of our time." (Jensen Huang, CEO of NVIDIA)

    https://d2l.djl.ai