Now smile! (also in 3D): Exploring AR, ML and Camera-related APIs on Android

When it comes to camera APIs, Android has come a long way since its early days, which is good not only for platform consistency above Lollipop, but also because we can now extend their potential with AR and ML integrations such as ARCore, ML Kit and MediaPipe.

However, it can be a bit confusing if you're not familiar with the available APIs: which one to use for specific scenarios, and how to approach Camera2 and CameraX in general.

During this talk, you will learn the basics of camera APIs on Android and how to extend them to different use cases to enhance your app's user experience!


This talk was presented during Droidcon London 2023, DevFest Algarve 2023 and the December edition of the Dutch Android User Group meetup!

You can find a video of the Droidcon London 2023 presentation here:

Walmyr Carvalho

December 06, 2023


  1. Exploring AR, ML and Camera-related APIs on Android: Now smile (also in 3D)! 🤖📸✨
     Walmyr Carvalho, Lead Android Engineer @ Polaroid, Google Developer Expert for Android
  2. It’s such a powerful integration to have in our apps, so we should be mindful of what we’re trying to build.
  3. Although they’re connected devices, Polaroid cameras are analog by design. But we do have some cool app features using custom camera APIs, like photo scanning with frame recognition and custom AR pictures! 🪄🦄 (Photo Scanner)
  4. If you’re planning to have only basic photo and video functionality in your app, you should just call the regular system Intents for it!
  5. // Photo capture intent
     val REQUEST_IMAGE_CAPTURE = 1

     private fun dispatchTakePictureIntent() {
         val takePictureIntent = Intent(MediaStore.ACTION_IMAGE_CAPTURE)
         try {
             startActivityForResult(takePictureIntent, REQUEST_IMAGE_CAPTURE)
         } catch (e: ActivityNotFoundException) {
             // Handle error
         }
     }
  6. // Video capture intent
     val REQUEST_VIDEO_CAPTURE = 1

     private fun dispatchTakeVideoIntent() {
         Intent(MediaStore.ACTION_VIDEO_CAPTURE).also { takeVideoIntent ->
             takeVideoIntent.resolveActivity(packageManager)?.also {
                 startActivityForResult(takeVideoIntent, REQUEST_VIDEO_CAPTURE)
             } ?: run {
                 // Handle error
             }
         }
     }
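A note on the two intent snippets above: `startActivityForResult()` is deprecated on modern Android. A minimal sketch of the same photo flow using the Activity Result API instead (the class name `CaptureActivity` and the callback body are assumptions for illustration):

```kotlin
import android.graphics.Bitmap
import androidx.activity.result.contract.ActivityResultContracts
import androidx.appcompat.app.AppCompatActivity

class CaptureActivity : AppCompatActivity() {

    // TakePicturePreview delivers a small thumbnail Bitmap (or null on cancel),
    // replacing the request-code/onActivityResult dance.
    private val takePicture =
        registerForActivityResult(ActivityResultContracts.TakePicturePreview()) { bitmap: Bitmap? ->
            if (bitmap != null) {
                // Use the thumbnail (e.g. show it in an ImageView)
            } else {
                // Capture was cancelled or failed
            }
        }

    private fun dispatchTakePictureIntent() {
        takePicture.launch(null)
    }
}
```

For a full-resolution file there is a sibling contract, `ActivityResultContracts.TakePicture`, which writes to a `Uri` you provide.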
  7. For a more custom usage of camera APIs, today we have three options: CameraX, Camera2 and Camera.
  8. For a more custom usage of camera APIs, today we have three (ish) options: CameraX, Camera2 and Camera (deprecated).
  9. dependencies {
         def camerax_version = "1.4.0"
         implementation "androidx.camera:camera-core:${camerax_version}"
         implementation "androidx.camera:camera-camera2:${camerax_version}"
         implementation "androidx.camera:camera-lifecycle:${camerax_version}"
         implementation "androidx.camera:camera-video:${camerax_version}"
         implementation "androidx.camera:camera-view:${camerax_version}"
         implementation "androidx.camera:camera-extensions:${camerax_version}"
     }
  10. // Permissions required (declare the permissions in the way you prefer)
      <uses-feature android:name="android.hardware.camera.any" />
      <uses-permission android:name="android.permission.CAMERA" />
      <uses-permission android:name="android.permission.RECORD_AUDIO" />
      <uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE" />
  11. // Set up a ProcessCameraProvider
      cameraProvider = ProcessCameraProvider.getInstance(requireContext()).await()

      // Select lensFacing depending on the available cameras
      lensFacing = when {
          hasBackCamera() -> CameraSelector.LENS_FACING_BACK
          hasFrontCamera() -> CameraSelector.LENS_FACING_FRONT
          else -> throw IllegalStateException("Back and front camera are unavailable")
      }

      // If you want to switch between cameras, you need a CameraSelector
      val cameraSelector = CameraSelector.Builder().requireLensFacing(lensFacing).build()
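The slide assumes `hasBackCamera()`/`hasFrontCamera()` helpers. A sketch of what they typically look like, using `ProcessCameraProvider.hasCamera()` (the nullable `cameraProvider` property is an assumption about the surrounding class):

```kotlin
import androidx.camera.core.CameraSelector
import androidx.camera.lifecycle.ProcessCameraProvider

// Returns true if the provider reports a device-default back camera
private fun hasBackCamera(): Boolean =
    cameraProvider?.hasCamera(CameraSelector.DEFAULT_BACK_CAMERA) ?: false

// Returns true if the provider reports a device-default front camera
private fun hasFrontCamera(): Boolean =
    cameraProvider?.hasCamera(CameraSelector.DEFAULT_FRONT_CAMERA) ?: false
```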
  12. // Preview
      preview = Preview.Builder()
          .setTargetAspectRatio(screenAspectRatio)
          .setTargetRotation(rotation)
          .build()

      // ImageCapture
      imageCapture = ImageCapture.Builder()
          .setCaptureMode(ImageCapture.CAPTURE_MODE_MINIMIZE_LATENCY)
          .setTargetAspectRatio(screenAspectRatio)
          .setTargetRotation(rotation)
          .build()
  13. try {
          // Unbind use cases before rebinding
          cameraProvider.unbindAll()

          // Bind use cases to camera
          cameraProvider.bindToLifecycle(this, cameraSelector, preview, imageCapture)
      } catch (exception: Exception) {
          Log.e(TAG, "Use case binding failed", exception)
      }
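One step the binding slide leaves implicit: the Preview use case only renders once it is connected to a PreviewView in your layout. A minimal sketch, assuming a `PreviewView` with the id `previewView`:

```kotlin
// After bindToLifecycle(), attach the Preview use case to the PreviewView
// from the layout so frames actually show up on screen.
preview.setSurfaceProvider(previewView.surfaceProvider)
```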
  14. // Get a stable reference of the modifiable image capture use case
      val imageCapture = imageCapture ?: return

      // Create time-stamped name and MediaStore entry
      val name = SimpleDateFormat(FILENAME_FORMAT, Locale.US)
          .format(System.currentTimeMillis())
      val contentValues = ContentValues().apply {
          put(MediaStore.MediaColumns.DISPLAY_NAME, name)
          put(MediaStore.MediaColumns.MIME_TYPE, "image/jpeg")
          if (Build.VERSION.SDK_INT > Build.VERSION_CODES.P) {
              put(MediaStore.Images.Media.RELATIVE_PATH, "Pictures/CameraX-Image")
          }
      }
  15. // Create output options object which contains file + metadata
      val outputOptions = ImageCapture.OutputFileOptions
          .Builder(contentResolver, MediaStore.Images.Media.EXTERNAL_CONTENT_URI, contentValues)
          .build()
  16. // Set up image capture listener to get the photo action callback
      imageCapture.takePicture(
          outputOptions,
          ContextCompat.getMainExecutor(this),
          object : ImageCapture.OnImageSavedCallback {
              override fun onError(exc: ImageCaptureException) {
                  Log.e(TAG, "Photo capture failed: ${exc.message}", exc)
              }

              override fun onImageSaved(output: ImageCapture.OutputFileResults) {
                  val msg = "Photo capture succeeded: ${output.savedUri}"
                  Toast.makeText(baseContext, msg, Toast.LENGTH_SHORT).show()
                  Log.d(TAG, msg)
              }
          }
      )
  17. These examples and more are available in the official codelab: developer.android.com/codelabs/camerax-getting-started
  18. CameraX supports many other use cases, such as VideoCapture, ImageAnalysis, dual concurrent camera and more, so please try it out! 🙏
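The ImageAnalysis use case mentioned above is the usual bridge between CameraX and ML. A hedged sketch of setting it up (the analyzer body is a placeholder; in practice you would hand the frame to ML Kit or your own model):

```kotlin
import androidx.camera.core.ImageAnalysis
import androidx.core.content.ContextCompat

// ImageAnalysis delivers camera frames to an analyzer callback.
// KEEP_ONLY_LATEST drops stale frames so a slow analyzer never backs up the camera.
val imageAnalysis = ImageAnalysis.Builder()
    .setBackpressureStrategy(ImageAnalysis.STRATEGY_KEEP_ONLY_LATEST)
    .build()
    .also { analysis ->
        analysis.setAnalyzer(ContextCompat.getMainExecutor(this)) { imageProxy ->
            // Inspect imageProxy here (e.g. feed it to an ML model), then
            // ALWAYS close it so the next frame can be delivered.
            imageProxy.close()
        }
    }

// Bind it together with the other use cases:
// cameraProvider.bindToLifecycle(this, cameraSelector, preview, imageCapture, imageAnalysis)
```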
  19. "Augmented reality (AR) is an interactive experience that combines the real world and computer-generated content. The content can span multiple sensory modalities, including visual, auditory, haptic, somatosensory and olfactory." - Wikipedia
  20. OK, so what is the best way to build AR experiences on Android? 🤔
  21. ARCore tools & capabilities:
      • Motion tracking for relative positioning to your surroundings;
      • Environmental awareness for anchor and object placing support;
      • Depth awareness for tracking and distance measurements;
      • Lighting awareness for illumination checks and color corrections;
      • Unity and Adobe Aero support;
      • Many more!
      Source: Google ARCore
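The depth and lighting awareness bullets above map directly onto ARCore's session configuration. A minimal sketch, assuming ARCore availability and camera permission have already been checked:

```kotlin
import com.google.ar.core.Config
import com.google.ar.core.Session

// Create an ARCore session for this Activity/Context
val session = Session(this)

val config = Config(session).apply {
    // Depth awareness (distance measurements, occlusion), if supported
    if (session.isDepthModeSupported(Config.DepthMode.AUTOMATIC)) {
        depthMode = Config.DepthMode.AUTOMATIC
    }
    // Lighting awareness for illumination estimation and color correction
    lightEstimationMode = Config.LightEstimationMode.ENVIRONMENTAL_HDR
}
session.configure(config)
```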
  22. ML Kit overview:
      • Common machine learning tools, optimised for mobile usage;
      • Easy-to-implement Vision APIs, such as text, pose and face detection, image labelling and barcode scanning;
      • Natural language functionalities, such as language identification, translation and more;
      Source: Google Codelabs (mlkit-android)
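Given the talk's title, a fitting example of the Vision APIs listed above is face detection wired to a CameraX frame. A hedged sketch (the success-listener body is a placeholder):

```kotlin
import androidx.annotation.OptIn
import androidx.camera.core.ExperimentalGetImage
import androidx.camera.core.ImageProxy
import com.google.mlkit.vision.common.InputImage
import com.google.mlkit.vision.face.FaceDetection
import com.google.mlkit.vision.face.FaceDetectorOptions

// Configure a fast on-device face detector
val detector = FaceDetection.getClient(
    FaceDetectorOptions.Builder()
        .setPerformanceMode(FaceDetectorOptions.PERFORMANCE_MODE_FAST)
        .build()
)

// Called from an ImageAnalysis analyzer; the opt-in is required to
// access the underlying media Image from an ImageProxy.
@OptIn(ExperimentalGetImage::class)
fun analyze(imageProxy: ImageProxy) {
    val mediaImage = imageProxy.image ?: run { imageProxy.close(); return }
    val input = InputImage.fromMediaImage(mediaImage, imageProxy.imageInfo.rotationDegrees)
    detector.process(input)
        .addOnSuccessListener { faces -> /* faces detected: now smile! */ }
        .addOnCompleteListener { imageProxy.close() } // release the frame either way
}
```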
  23. MediaPipe overview:
      • Cross-platform (mobile, web, desktop) solution for common ML tasks;
      • Powers some of the ML Kit functionalities under the hood;
      • More powerful tools to create custom ML use cases;
      • Geared towards real-time perception tasks with a low-latency experience;
      • Low-code APIs to create custom ML solutions.
      Source: MediaPipe
  24. TensorFlow Lite overview:
      • Really powerful tool to create and train deep learning ML models;
      • High-level API integrations, like Keras;
      • Android, iOS, embedded Linux and microcontroller support;
      • Diverse language support (Java/Kotlin, Swift, Objective-C, C++ and Python);
      • Powerful tooling (monitoring, deployment, etc.) powered by its big brother, TensorFlow.
      Source: TensorFlow Lite (Google)
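To make the TensorFlow Lite bullets concrete, a hedged on-device inference sketch: the asset name `model.tflite` and the tensor shapes (a 224x224 RGB classifier with 1000 output scores) are assumptions for illustration, not from the talk:

```kotlin
import java.nio.MappedByteBuffer
import org.tensorflow.lite.Interpreter
import org.tensorflow.lite.support.common.FileUtil

// Load a model bundled in the app's assets and create an interpreter
val model: MappedByteBuffer = FileUtil.loadMappedFile(context, "model.tflite")
val interpreter = Interpreter(model)

// Example shapes: one 224x224 RGB image in, 1000 class scores out
val input = Array(1) { Array(224) { Array(224) { FloatArray(3) } } }
val output = Array(1) { FloatArray(1000) }

// Run inference; `output` is filled in place
interpreter.run(input, output)
```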
  25. [Architecture diagram: camera hardware + preview 🤳, ML Kit Object Detection + TensorFlow Lite offline model, Google Cloud Vision API image classification model, and an AR environment renderer]
  26. Key takeaways to bring home:
      1. CameraX is the way to go when it comes to using the camera (and its customisations) on Android;
      2. ML Kit, ARCore and TensorFlow Lite are a really powerful combination;
      3. Cloud Vision APIs can extend the capabilities of your project by fetching larger, up-to-date models;
      4. You can use camera APIs, AR and ML for any purpose, be it a game or a tool to help people engage with your business!
      5. There’s plenty of content available to learn and explore, especially Codelabs;
      6. Have fun and… smile! 👋 📸
  27. Walmyr Carvalho, Lead Android Engineer @ Polaroid, Google Developer Expert for Android.
      Thank you so much! ❤ The slides, links and sources will be available soon, please reach out on Twitter!