Revolutionizing ANR Detection at Sentry

In this talk, we'll dive into the ANR (Application Not Responding) detection mechanism of the Sentry Android SDK. We'll compare the existing approaches, such as watchdog and native signal handler, with the new ApplicationExitInfo API available from Android 11 onwards.

We'll explore the challenges faced while building the new implementation, such as enriching ANRs with data from the previous app run, making sure the previous app session is finished properly, and parsing ANR thread dumps into backend-friendly formats.

Roman Zavarnitsyn

July 06, 2023

Transcript

  1. What is Sentry? (getsentry/sentry, @getsentry, sentry.io)
    • Application Monitoring in Production
    • Developer Tool by devs for devs
    • Open-Source
    • Free*
  2. ANR Types
    • Input dispatching timeout (Activity): app doesn’t respond to an input event for 5 seconds
    • Executing Service:
      • Foreground: not calling startForeground() for 10 seconds
      • Regular: not finishing onCreate() / onStartCommand() / onBind() within 20 seconds
    • Broadcast Receiver: not finishing onReceive() within 10 seconds
  4. User-perceived ANR
    [Timeline diagram: User Input arrives while the Main Thread is Blocked; ticks at 1s, 2s, 3s, 4s, 5s; then ANR! and Application.onAnrDetected()]
  5. ANR Detection Mechanisms
    • Watchdog thread: periodically send a task to the main thread and check if it gets executed within 5 seconds
    • Native Signal Handler: install native SIGQUIT handler and wire it over JNI
    • ApplicationExitInfo: use system API to retrieve ANRs on next app launch
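
The watchdog idea from this slide can be sketched in plain Kotlin. This is a minimal illustration, not Sentry's implementation: `AnrWatchdog`, `checkOnce`, and the `onAnr` callback are hypothetical names, and a plain `Executor` stands in for Android's main-thread `Handler` so the sketch runs on any JVM.

```kotlin
import java.util.concurrent.CountDownLatch
import java.util.concurrent.Executor
import java.util.concurrent.TimeUnit

/**
 * Watchdog sketch: posts a no-op "tick" to the monitored thread and reports
 * a suspected ANR when the tick has not run within [timeoutMs].
 * [mainExecutor] is an assumption standing in for the Android main looper.
 */
class AnrWatchdog(
    private val mainExecutor: Executor,
    private val timeoutMs: Long = 5_000,
    private val onAnr: (timeoutMs: Long) -> Unit,
) {
    /** Runs one check cycle; returns true when the monitored thread responded in time. */
    fun checkOnce(): Boolean {
        val tick = CountDownLatch(1)
        // The tick only runs once everything queued before it has finished.
        mainExecutor.execute { tick.countDown() }
        val responded = tick.await(timeoutMs, TimeUnit.MILLISECONDS)
        if (!responded) onAnr(timeoutMs)
        return responded
    }
}
```

A real watchdog would call `checkOnce()` in a loop on its own background thread. Because it only observes that the queue did not drain in time, it cannot tell which kind of ANR the system would raise, which is one source of the false positives mentioned in the talk.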
  6. Watchdog
    Pros:
    • Almost real-time reporting
    • Ability to enrich ANR events with current state of the device (battery level, available memory, screenshot, etc.)
    • Available on all Android versions
    Cons:
    • A lot of false positives
    • No Service or BroadcastReceiver ANR detection
    • Little thread info (no deadlock detection)
  7. Native Signal Handler
    Pros:
    • Almost real-time reporting
    • Ability to enrich ANR events with current state of the device (battery level, available memory, screenshot, etc.)
    • Available on all Android versions
    Cons:
    • False positives
    • No background ANR detection
    • Little thread info (no deadlock detection)
  8. ApplicationExitInfo
    Pros:
    • Most accurate ANR detection coming from the OS
    • Full stacktrace and thread information (with deadlock detection)
    • Background/Foreground ANRs
    Cons:
    • Only available on Android 11 and above
    • Not possible to access device dynamic data retroactively (battery, current memory, etc.)
  9. Sentry ANR Detection V1 (Watchdog)
    [Flow diagram: ANR! → Watchdog triggered → Collect dynamic data → Collect static data → Send ANR Event → Terminate process (or not)]
  10. Sentry ANR Detection V2 (AppExitInfo)
    [Flow diagram labels: ANR! · Retrieve dynamic data from disk · Collect static data · Send ANR Event (async) · Terminate process · Continuously persist dynamic data to disk]
  11. New ANR detection internals

        // null = own package, 0 = any pid, 0 = no limit on the number of records
        val appExitInfoList = activityManager.getHistoricalProcessExitReasons(null, 0, 0)
        for (exitInfo in appExitInfoList) {
            // look for ANRs
            if (exitInfo.reason == ApplicationExitInfo.REASON_ANR) {
                // do not report duplicate events
                if (exitInfo.timestamp > lastReportedAnr) {
                    val event = SentryEvent().apply {
                        level = FATAL
                        timestamp = exitInfo.timestamp
                        // enrich with contexts, breadcrumbs, etc.
                    }
                    Sentry.captureEvent(event)
                }
            }
        }
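
The filtering and de-duplication step of this loop can be modelled without the Android SDK. The sketch below is an assumption-laden illustration: `ExitRecord`, `AnrScan`, and `selectUnreportedAnrs` are hypothetical names, `ExitRecord` stands in for `ApplicationExitInfo`, and the constant mirrors the value of `ApplicationExitInfo.REASON_ANR`.

```kotlin
/** Minimal stand-in for android.app.ApplicationExitInfo (hypothetical model). */
data class ExitRecord(val reason: Int, val timestamp: Long)

/** Mirrors the value of ApplicationExitInfo.REASON_ANR. */
val REASON_ANR = 6

/** ANRs that still need reporting, plus the marker to store for the next launch. */
data class AnrScan(val toReport: List<ExitRecord>, val newMarker: Long)

/**
 * Keeps only ANR exits newer than [lastReportedAnr] and returns them oldest
 * first, so events are sent in the order they happened.
 */
fun selectUnreportedAnrs(history: List<ExitRecord>, lastReportedAnr: Long): AnrScan {
    val fresh = history
        .filter { it.reason == REASON_ANR && it.timestamp > lastReportedAnr }
        .sortedBy { it.timestamp }
    // If nothing new was found, the previous marker stays in place.
    val marker = fresh.lastOrNull()?.timestamp ?: lastReportedAnr
    return AnrScan(fresh, marker)
}
```

Persisting `newMarker` after a successful send is what keeps the same ANR from being reported again on later launches, since the system keeps returning historical records.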
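  15. Preserving Contexts and Breadcrumbs V2
    [Diagram: ActivityIntegration, TimberIntegration and OkHttpIntegration feed Sentry, which forwards each scope change to an observer that persists it]
    • addBreadcrumb → observer.addBreadcrumb → Persist Breadcrumb
    • addTag → observer.addTag → Persist Tag
    • addContext → observer.addContext → Persist Context
    • Persisted data is later used to: Enrich ANR event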
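
The observer flow on this slide could look roughly like the following in plain Kotlin. This is a hedged sketch rather than the SDK's code: `ScopeObserver`, `PersistingScopeObserver`, `readPersistedBreadcrumbs`, the file names, and the line-based text format are all assumptions made for illustration.

```kotlin
import java.io.File

/** Hypothetical observer interface, modelled on the observer.addBreadcrumb / observer.addTag calls on the slide. */
interface ScopeObserver {
    fun addBreadcrumb(message: String)
    fun addTag(key: String, value: String)
}

/** Writes every scope change to disk right away so it survives the process being killed. */
class PersistingScopeObserver(
    private val dir: File,
    private val maxBreadcrumbs: Int = 100,
) : ScopeObserver {
    private val breadcrumbs = ArrayDeque<String>()
    private val tags = LinkedHashMap<String, String>()

    @Synchronized
    override fun addBreadcrumb(message: String) {
        // Drop the oldest entry once the cap is reached.
        if (breadcrumbs.size == maxBreadcrumbs) breadcrumbs.removeFirst()
        breadcrumbs.addLast(message)
        File(dir, "breadcrumbs.txt").writeText(breadcrumbs.joinToString("\n"))
    }

    @Synchronized
    override fun addTag(key: String, value: String) {
        tags[key] = value
        File(dir, "tags.txt").writeText(tags.entries.joinToString("\n") { "${it.key}=${it.value}" })
    }
}

/** On the next launch: read back what the previous run persisted. */
fun readPersistedBreadcrumbs(dir: File): List<String> {
    val file = File(dir, "breadcrumbs.txt")
    return if (file.exists()) file.readLines() else emptyList()
}
```

The one-line-per-entry format breaks for messages containing newlines, and rewriting the whole file on every change is simple but not cheap; a production version would likely use a structured format and move the writes off the calling thread.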
  16. Preserving Contexts and Breadcrumbs

        // null = own package, 0 = any pid, 0 = no limit on the number of records
        val appExitInfoList = activityManager.getHistoricalProcessExitReasons(null, 0, 0)
        for (exitInfo in appExitInfoList) {
            // look for ANRs
            if (exitInfo.reason == ApplicationExitInfo.REASON_ANR) {
                // do not report duplicate events
                if (exitInfo.timestamp > lastReportedAnr) {
                    val event = SentryEvent().apply {
                        level = FATAL
                        timestamp = exitInfo.timestamp
                        breadcrumbs = diskCache.breadcrumbs
                        contexts = diskCache.contexts
                        // enrich with threads and stacktraces
                    }
                    Sentry.captureEvent(event)
                }
            }
        }
  17. Preserving Contexts and Breadcrumbs

        // null = own package, 0 = any pid, 0 = no limit on the number of records
        val appExitInfoList = activityManager.getHistoricalProcessExitReasons(null, 0, 0)
        for (exitInfo in appExitInfoList) {
            // look for ANRs
            if (exitInfo.reason == ApplicationExitInfo.REASON_ANR) {
                // do not report duplicate events
                if (exitInfo.timestamp > lastReportedAnr) {
                    val event = SentryEvent().apply {
                        level = FATAL
                        timestamp = exitInfo.timestamp
                        breadcrumbs = diskCache.breadcrumbs
                        contexts = diskCache.contexts
                        // parse the thread dump the OS captured for this ANR
                        val systraceParser = SystraceParser()
                        threads = systraceParser.parse(exitInfo.getTraceInputStream())
                    }
                    Sentry.captureEvent(event)
                }
            }
        }
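
To give a feel for what parsing the trace stream involves, here is a simplified parser sketch. It is not the `SystraceParser` from the slide: `ParsedThread` and `parseThreadDump` are hypothetical names, and the line shapes it recognises (a quoted thread name followed by `prio=`, `tid=` and a state, then indented `at ...` frames) are assumptions based on typical ART thread dumps.

```kotlin
/** One thread from the dump: name, thread id, state, and its Java frames (top first). */
data class ParsedThread(val name: String, val tid: Int, val state: String, val frames: List<String>)

// Header such as: "main" prio=5 tid=1 Blocked
val headerPattern = Regex(""""(.+)" (?:daemon )?prio=\d+ tid=(\d+) (\w+)""")

// Java frame such as:   at com.example.Cart.checkout(Cart.kt:42)
val framePattern = Regex("""\s+at (.+)""")

fun parseThreadDump(dump: String): List<ParsedThread> {
    val threads = mutableListOf<ParsedThread>()
    var header: MatchResult? = null
    var frames = mutableListOf<String>()

    // Emits the thread collected so far, if any, and resets the accumulators.
    fun flush() {
        val h = header ?: return
        threads.add(ParsedThread(h.groupValues[1], h.groupValues[2].toInt(), h.groupValues[3], frames))
        header = null
        frames = mutableListOf()
    }

    for (line in dump.lines()) {
        val h = headerPattern.matchEntire(line)
        if (h != null) {
            flush()
            header = h
            continue
        }
        // Attribute lines ("| ..."), lock lines ("- ...") and native frames are skipped here.
        if (header != null) {
            framePattern.matchEntire(line)?.let { frames.add(it.groupValues[1]) }
        }
    }
    flush()
    return threads
}
```

The skipped lock lines are where a fuller parser would find the "waiting to lock ... held by thread N" information needed for the deadlock detection mentioned earlier in the talk.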
  18. What’s Next
    • Use Watchdog together with AppExitInfo
      • Can be useful if the main thread is blocked for less than 5s
      • Notifies if the user clicked “Wait” on ANR dialog and the app process continued
      • Gives more context data:
        • Screenshots
        • Current memory
        • Current battery
  19. What’s Next
    • Persist profiles across app launches
      • Know which methods exactly were called prior to ANR
  20. References
    • Sentry Android SDK - https://tinyurl.com/styjv
    • ApplicationExitInfo - https://developer.android.com/reference/android/app/ApplicationExitInfo
    • ANRWatchdog - https://github.com/SalomonBrys/ANR-WatchDog