
Processing camera input on Android

Processing and analysing images from the camera in an Android application has always been complicated. Despite Google's big overhaul of the camera APIs with camera2, developers still struggled with even the simplest tasks, such as scanning QR codes or performing text detection.

At Google I/O 2019, we got a new library from Google to make our work easier. The new library is called CameraX and significantly simplifies the common tasks we have when working with the camera on Android. In this session, we will learn how to use the CameraX API for any kind of processing and computer vision. After this session, you'll feel much more confident implementing camera processing the way you want, without having to resort to heavy libraries that rely on the legacy APIs.

Erik Hellman

October 03, 2019
Transcript

  1. COMPUTER VISION ON ANDROID
     > ML Kit
     > ZXing
     > OpenCV
     > OpenGL ES
     > RenderScript
     > Play Services Vision API
  2. DEPRECATED APIS
     > ML Kit
     > ZXing
     > OpenCV
     > OpenGL ES
     > RenderScript
     > Play Services Vision API
  3. GRADLE DEPENDENCIES
     dependencies {
       // CameraX Core
       implementation "androidx.camera:camera-core:$camerax_version"
       // If you want to use Camera2 extensions
       implementation "androidx.camera:camera-camera2:$camerax_version"
       // Add Firebase ML Kit for easy computer vision functions!
       implementation "com.google.firebase:firebase-ml-vision:$firebaseVersion"
     }
  4. SETUP CAMERA PREVIEW
     val previewConfig = PreviewConfig.Builder()
       .setLensFacing(CameraX.LensFacing.BACK)
       .build()
     val preview = Preview(previewConfig)
     val viewFinder = findViewById<TextureView>(R.id.viewFinder)
     preview.setOnPreviewOutputUpdateListener { previewOutput ->
       viewFinder.surfaceTexture = previewOutput.surfaceTexture
     }
     CameraX.bindToLifecycle(this, preview)
  5. SETUP ML KIT FOR BARCODE DETECTION
     val options = FirebaseVisionBarcodeDetectorOptions.Builder()
       .setBarcodeFormats(FirebaseVisionBarcode.FORMAT_ALL_FORMATS)
       .build()
     val detector = FirebaseVision.getInstance()
       .getVisionBarcodeDetector(options)
  6. SETUP CAMERAX IMAGE ANALYZER
     val imageAnalysisConfig = ImageAnalysisConfig.Builder()
       .setImageReaderMode(ImageAnalysis.ImageReaderMode.ACQUIRE_LATEST_IMAGE)
       .setTargetResolution(Size(1024, 768))
       .build()
     val imageAnalysis = ImageAnalysis(imageAnalysisConfig)
     imageAnalysis.setAnalyzer { image, rotationDegrees ->
       // TODO Perform image analysis...
     }
     CameraX.bindToLifecycle(this, imageAnalysis)
  7. ANALYSE IMAGES WITH ML KIT
     imageAnalysis.setAnalyzer { image, rotationDegrees ->
       val imageRotation = degreesToFirebaseRotation(rotationDegrees)
       image?.image?.let {
         val visionImage = FirebaseVisionImage.fromMediaImage(it, imageRotation)
         detector.detectInImage(visionImage)
           .addOnSuccessListener { barcodes -> displayDetectedBarcode(barcodes) }
       }
     }
  8. GOOGLE CAMERAX SAMPLES (SIMPLIFIED)
     val matrix = Matrix()
     val centerX = viewFinderSize.width / 2f
     val centerY = viewFinderSize.height / 2f
     val bufferRatio = previewSize.height / previewSize.width.toFloat()
     val scaledHeight = viewFinderSize.width
     val scaledWidth = Math.round(viewFinderSize.width * bufferRatio)
     val xScale = scaledWidth / viewFinderSize.width.toFloat()
     val yScale = scaledHeight / viewFinderSize.height.toFloat()
     matrix.preScale(xScale, yScale, centerX, centerY)
  9. SCALE AND CROP! (SIMPLIFIED)
     val matrix = Matrix()
     val centerX = viewFinderSize.width / 2f
     val centerY = viewFinderSize.height / 2f
     val previewRatio = previewSize.width / previewSize.height.toFloat()
     val viewFinderRatio = viewFinderSize.width / viewFinderSize.height.toFloat()
     // Assume view finder is wider than its height
     matrix.postScale(1.0f, viewFinderRatio * previewRatio, centerX, centerY)
  10. [Diagram: the U-V color plane, with U on the horizontal axis and V on the vertical axis, both ranging from −.4 to +.4]
  11. CONVERT YUV TO RGB (FROM WIKIPEDIA)
     void YUVImage::yuv2rgb(uint8_t yValue, uint8_t uValue, uint8_t vValue,
                            uint8_t *r, uint8_t *g, uint8_t *b) const {
       int rTmp = yValue + (1.370705 * (vValue - 128));
       int gTmp = yValue - (0.698001 * (vValue - 128)) - (0.337633 * (uValue - 128));
       int bTmp = yValue + (1.732446 * (uValue - 128));
       *r = clamp(rTmp, 0, 255);
       *g = clamp(gTmp, 0, 255);
       *b = clamp(bTmp, 0, 255);
     }
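The same conversion ports directly to Kotlin. A minimal sketch, using the coefficients from the C++ snippet above, with `coerceIn` playing the role of `clamp` (the `yuv2rgb` name and `Triple` return type are just for illustration):

```kotlin
// Convert one YUV pixel (each component in 0..255) to RGB.
// Coefficients are the same as in the Wikipedia C++ version.
fun yuv2rgb(y: Int, u: Int, v: Int): Triple<Int, Int, Int> {
    val r = (y + 1.370705 * (v - 128)).toInt().coerceIn(0, 255)
    val g = (y - 0.698001 * (v - 128) - 0.337633 * (u - 128)).toInt().coerceIn(0, 255)
    val b = (y + 1.732446 * (u - 128)).toInt().coerceIn(0, 255)
    return Triple(r, g, b)
}
```

With u = v = 128 the chroma terms vanish, so grayscale input maps straight through: `yuv2rgb(128, 128, 128)` gives `(128, 128, 128)`.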
  12. EXTRACT GRAYSCALE PIXELS
     fun extractGrayscaleFromImage(image: Image): Pair<ByteBuffer, Size> {
       val size = Size(image.width, image.height)
       val yBuffer = image.planes[0].buffer
       val grayPixels = ByteBuffer.allocate(yBuffer.capacity())
       grayPixels.put(yBuffer)
       grayPixels.flip()
       return grayPixels to size
     }
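The copy-and-flip idiom above is plain `java.nio`, so it can be exercised off-device without an `Image`. A minimal JVM-only sketch (the `copyPlane` helper name is made up for illustration):

```kotlin
import java.nio.ByteBuffer

// Copy the remaining bytes of a source buffer (e.g. the luminance plane)
// into a fresh buffer, then flip it so it is ready for reading at position 0.
fun copyPlane(src: ByteBuffer): ByteBuffer {
    val copy = ByteBuffer.allocate(src.remaining())
    copy.put(src)
    copy.flip() // limit = bytes written, position = 0
    return copy
}
```

Without the `flip()`, the returned buffer's position would sit at its limit and any subsequent read would see zero remaining bytes.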
  13. CAMERAX ANALYZER MUST BE SYNCHRONOUS!*
     imageAnalysis.setAnalyzer { image, rotationDegrees ->
       // This will crash - analysis must be synchronous!
       // See https://issuetracker.google.com/issues/139207716
       GlobalScope.launch {
         analyzeImage(image)
       }
     }
     * Except when converting to FirebaseVisionImage!
  14. CAMERAX ANALYZER CAN'T USE IMAGEWRITER!
     val imageWriter = ImageWriter.newInstance(renderScriptSurface, 1)
     imageAnalysis.setAnalyzer { image, rotationDegrees ->
       // This will crash - CameraX will call Image.close()!
       // See https://issuetracker.google.com/issues/139207716
       val mediaImage = image?.image
       if (mediaImage != null) {
         imageWriter.queueInputImage(mediaImage)
       }
     }
  15. ANALYZE IMAGES IN PARALLEL
     // Thread pool with 5 threads
     val executor = Executors.newFixedThreadPool(5)
     val imageAnalysisConfig = ImageAnalysisConfig.Builder()
       .setImageReaderMode(ImageAnalysis.ImageReaderMode.ACQUIRE_NEXT_IMAGE)
       .setImageQueueDepth(5)
       .setTargetResolution(Size(1024, 768))
       .build()
     val imageAnalysis = ImageAnalysis(imageAnalysisConfig)
     imageAnalysis.setAnalyzer { image, rotationDegrees ->
       // Analysis can run in parallel
     }
     CameraX.bindToLifecycle(this, imageAnalysis)
  16. AUTO-DOWNLOAD REQUIRED ML MODELS (OPTIONAL)
     <application ...>
       ...
       <meta-data
         android:name="com.google.firebase.ml.vision.DEPENDENCIES"
         android:value="ocr" />
       <!-- To use multiple models: android:value="ocr,model2,model3" -->
     </application>
  17. CONVERT ROTATION DEGREES TO FIREBASE ROTATION
     fun degreesToFirebaseRotation(degrees: Int): Int = when (degrees) {
       0 -> FirebaseVisionImageMetadata.ROTATION_0
       90 -> FirebaseVisionImageMetadata.ROTATION_90
       180 -> FirebaseVisionImageMetadata.ROTATION_180
       270 -> FirebaseVisionImageMetadata.ROTATION_270
       else -> throw Exception("Rotation must be 0, 90, 180, or 270.")
     }
  18. TEXT RECOGNITION
     val detector = FirebaseVision.getInstance().onDeviceTextRecognizer
     val image = FirebaseVisionImage.fromMediaImage(mediaImage, rotation)
     val result = detector.processImage(image)
       .addOnSuccessListener { resultText ->
         // Task completed successfully
       }
       .addOnFailureListener {
         // Task failed with an exception
       }
  19. TEXT RECOGNITION
     val resultText = result.text
     for (block in result.textBlocks) {
       val blockText = block.text
       val blockConfidence = block.confidence
       val blockLanguages = block.recognizedLanguages
       val blockCornerPoints = block.cornerPoints
       val blockFrame = block.boundingBox
       for (line in block.lines) {
         val lineText = line.text
         val lineConfidence = line.confidence
         val lineLanguages = line.recognizedLanguages
         val lineCornerPoints = line.cornerPoints
         val lineFrame = line.boundingBox
         for (element in line.elements) {
           val elementText = element.text
           val elementConfidence = element.confidence
           val elementLanguages = element.recognizedLanguages
           val elementCornerPoints = element.cornerPoints
           val elementFrame = element.boundingBox
         }
       }
     }
  20. CONFIGURE FIREBASE-HOSTED MODEL
     val conditions = FirebaseModelDownloadConditions.Builder()
       .requireWifi()
       .build()
     val remoteModel = FirebaseRemoteModel.Builder("my_remote_model")
       .enableModelUpdates(true)
       .setInitialDownloadConditions(conditions)
       .setUpdatesDownloadConditions(conditions)
       .build()
     FirebaseModelManager.getInstance().registerRemoteModel(remoteModel)
  21. PROCESS IMAGE
     labeler.processImage(image)
       .addOnSuccessListener { labels ->
         for (label in labels) {
           val text = label.text
           val confidence = label.confidence
         }
       }
  22. CONCLUSIONS
     > CameraX is still in alpha, but ready to use today
     > CameraX analysis must be synchronous
     > View finder transform is tricky
     > ML Kit covers most use cases