billdigman
January 22, 2012

Resin | Application Server Health System | Java Monitoring

Because your site’s reliability is important, Resin polls its internal sensor network every 60 seconds, recording your server’s memory, CPU, network, database and cluster status. It monitors JVM metrics, Java application server metrics and OS metrics. The Resin health system saves this data so you can adjust your servers based on load and analyze problems after they occur, and so Resin can identify trends and anomalies.

Transcript

  1. Caucho Home | Contact Us | Caucho Blog | Wiki

    | Application Server / Web Server Copyright (c) 1998-2012 Caucho Technology, Inc. All rights reserved. caucho®, resin® and quercus® are registered trademarks of Caucho Technology, Inc.

    Resin Health System: Beyond Java Monitoring and Server Monitoring. Health Checks, Watchdog and Snapshot Report
  2. Gartner names Caucho in "Cool Vendors in Platform and Integration Middleware"
     • Java EE Certified
  3. Resin Health System (RHS) Overview
     • Resin Health System (RHS)
     • Goes beyond just monitoring the server and JVM
     • Can respond to conditions with actions
     • Actions can remediate problems
     • If the server is about to go down
       • due to a bug, denial of service, or spike
       • RHS triggers diagnostics, then restarts
     • Resin Application Server keeps running
  4. RHS: Reliability and System Transparency
  5. RHS born from need
     • Idea for RHS came from doing Resin support
     • Thread lock? Can you do a thread dump when you see the problem?
     • Running out of memory? Can you do a heap dump?
     • How is your machine configured? What version?
     • What OS?
  6. RHS By Engineers for Engineers
  7. Major features of Resin Health System (RHS)
     • Ability to respond to problems
       • Detect JVM and OS issues
       • Avoid zombie processes
       • Restarts Resin if there are major problems
     • Internal monitoring
       • Resin Internal WatchDog Thread
       • Watches internal meters for problems
       • Periodic Thread
     • External Monitoring
       • Resin WatchDog Process
       • Uses process control, socket connection and periodic ping to determine uptime status
     • Advanced Reporting PDF
       • Post-mortem analysis
       • Thread Dump/Log Dump
       • Meters and Graphs
       • Heap Dump
  8. RHS Tracks Metrics
     • Metrics are things like Available Memory, Number of Requests Per Minute, Garbage Collection Time, CPU Load, etc.
     • Metrics can be graphed
     • Tracks Historical Data for Trends
     • Can determine Anomalies
     • Can determine Trends
     • Can compare current data with baseline data
  9. Visualization
     • You can view data that Health System collects
     • Resin Web Admin
     • Watchdog Report
       • Post mortem PDF Report
     • Snapshot Report
       • PDF Report you can generate anytime
       • Trigger: CLI, REST, Through Web Admin
  10. RHS and Web Admin
  11. RHS: Health Checks
      • RHS is highly configurable
      • Similar to Resin's "URL Rewrite" rules
      • Rules are configurable
        • checks,
        • conditions,
        • actions
      • Internal Watchdog periodic checks
  12. Watchdog process
      • Lightweight process: used to stop and start Resin instances
      • Can restart an instance if Java monitoring, server monitoring or a health check finds an issue
      • Parent process of Resin Server
      • Opens socket to Resin Server
      • Sends "are you alive?" ping
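The socket-based liveness check on this slide can be approximated in a few lines of Java. This is a minimal sketch and not Resin's watchdog code: the class name `WatchdogPingSketch`, the method `isAlive`, and the host and port in `main` are all hypothetical, and the sketch only tests whether a TCP connection can be opened rather than exchanging a real ping message.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical sketch; this is not Resin's watchdog implementation.
class WatchdogPingSketch {
  // Returns true if a TCP connection to host:port succeeds within timeoutMillis.
  static boolean isAlive(String host, int port, int timeoutMillis) {
    try (Socket socket = new Socket()) {
      socket.connect(new InetSocketAddress(host, port), timeoutMillis);
      return true;
    } catch (IOException e) {
      return false; // refused, unreachable or timed out
    }
  }

  public static void main(String[] args) {
    // Host and port are placeholders for wherever the monitored server listens.
    System.out.println("alive: " + isAlive("127.0.0.1", 8080, 1000));
  }
}
```

A supervising process could call such a check on a timer and restart the child process after a run of failures.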
  13. Watchdog Non Stop Mode
  14. Watchdog Non Stop Mode
      • Resin is resilient
      • If a denial of service, unexpected spike or bug knocks down the JVM, Resin restarts
      • Beyond that, Resin can detect critical problems and run critical diagnostics so DevOps and developers can get to the root of the problem
      • Resin has long been the product of choice for embedded devices, network appliances and large deployments
      • Non Stop mode makes Resin perfect for cloud deployments
  15. Watchdog Process (diagram labels: Resin Watch-Dog)
  16. Watchdog Process (diagram labels: Resin Watch-Dog)
  17. Watchdog Process (diagram labels: Resin Watch-Dog, Resin Process, Ownership, TCP Link, Starting Resin)
  18. Watchdog Process (diagram labels: Resin Watch-Dog, Resin Process, Ownership, TCP Link, Non-Stop Up State)
  19. Watchdog Process (diagram labels: Resin Watch-Dog, Resin Process, Ownership, TCP Link, Non-Stop Up State)
  20. Watchdog Process (diagram labels: Resin Watch-Dog, Non-Stop Up State)
  21. Internal Watchdog Thread Inside of Resin (diagram labels: Resin Process, Watchdog Process, Resin Health System Watchdog Thread)
  22. Internal Watchdog Thread Inside of Resin (diagram labels: Resin Process, Watchdog Process, Resin Health System Watchdog Thread)
  23. Internal Watchdog Health Thread (Resin Health System Watchdog Thread)
      • Runs inside of Resin Server
      • Runs periodically
      • Collects data
      • Collects baseline data
      • Executes a series of checks
      • Rechecks failed conditions
      • Performs actions when conditions are critical or fatal
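One pass of the check, recheck and act cycle listed on this slide might look like the Java sketch below. Everything in it is hypothetical and none of it is a Resin class: the `HealthCycleSketch` name, the `Status` enum (only "critical" and "fatal" come from the slide), and the `maxRechecks` parameter are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical sketch; none of these types are Resin classes.
class HealthCycleSketch {
  enum Status { OK, WARNING, CRITICAL, FATAL }

  // Runs every check once, rechecks any check that did not come back OK up to
  // maxRechecks times, and returns the names of the checks that are still
  // CRITICAL or FATAL, i.e. the ones an action should fire for.
  static List<String> runCycle(Map<String, Supplier<Status>> checks, int maxRechecks) {
    List<String> needAction = new ArrayList<>();
    for (Map.Entry<String, Supplier<Status>> entry : checks.entrySet()) {
      Status status = entry.getValue().get();
      for (int i = 0; i < maxRechecks && status != Status.OK; i++) {
        status = entry.getValue().get(); // recheck a failed condition
      }
      if (status == Status.CRITICAL || status == Status.FATAL) {
        needAction.add(entry.getKey());
      }
    }
    return needAction;
  }

  public static void main(String[] args) {
    Map<String, Supplier<Status>> checks = new LinkedHashMap<>();
    checks.put("memory", () -> Status.OK);
    checks.put("deadlock", () -> Status.CRITICAL);
    System.out.println(runCycle(checks, 2)); // prints [deadlock]
  }
}
```

In a long-running server, a method like `runCycle` could be scheduled with `ScheduledExecutorService.scheduleAtFixedRate` so that it runs periodically on its own thread.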
  24. Resin Java CDI / CanDI and Resin Conf based
      • RHS configuration extends Resin configuration file resin.xml
      • RHS uses CanDI (Resin’s Java CDI)
        • create and update Java objects,
        • XML tags exactly match either a Java class or a Java property
      • CanDI means classes and config are in JavaDocs
        • Use HealthSystem JavaDoc
        • Use JavaDoc of the various checks, actions,
  25. Java Doc / XML conf of RHS
      • Startup delay: wait for baseline data before recording
      • Period: how often to check metrics
      • Recheck period: once some level has been crossed, how often RHS rechecks to see whether things have improved
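How the three timing settings on this slide relate to each other can be sketched in Java. The class `HealthTimingSketch` and its `nextDelay` method are hypothetical, the durations in `main` are arbitrary example values rather than Resin defaults, and the real settings live in Resin's XML configuration, not in code like this.

```java
import java.time.Duration;

// Hypothetical sketch of how the three timing settings could interact.
class HealthTimingSketch {
  final Duration startupDelay;  // wait before the first round of checks
  final Duration period;        // normal interval between rounds
  final Duration recheckPeriod; // shorter interval once a level has been crossed

  HealthTimingSketch(Duration startupDelay, Duration period, Duration recheckPeriod) {
    this.startupDelay = startupDelay;
    this.period = period;
    this.recheckPeriod = recheckPeriod;
  }

  // How long to wait before the next round of checks.
  Duration nextDelay(boolean firstRun, boolean levelCrossed) {
    if (firstRun) {
      return startupDelay;
    }
    return levelCrossed ? recheckPeriod : period;
  }

  public static void main(String[] args) {
    // Example values only; they are not taken from the slides.
    HealthTimingSketch timing = new HealthTimingSketch(
        Duration.ofMinutes(15), Duration.ofMinutes(5), Duration.ofSeconds(30));
    System.out.println(timing.nextDelay(false, true)); // prints PT30S
  }
}
```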
  26. Types of Health Checks
  27. Health Checks produce Status
  28. Resin Checks and Responds
  29. Health System Actions
  30. Actions based on condition
      • Actions can be grouped
      • If in critical state for two minutes, perform a group of actions
        • Dump JMX values, Dump Threads, Dump Heap, CPU Profile, Restart
      • If actions take longer than 10 minutes, restart
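The "critical for two minutes" rule on this slide can be sketched as a small state tracker in Java. The class `SustainedCriticalSketch` and its `observe` method are hypothetical names, not part of Resin; the sketch covers only the duration condition and leaves out the 10-minute limit on the actions themselves.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical sketch; Resin expresses this kind of rule in configuration.
class SustainedCriticalSketch {
  private final Duration threshold;
  private Instant criticalSince; // null while the last observation was healthy

  SustainedCriticalSketch(Duration threshold) {
    this.threshold = threshold;
  }

  // Records one check result taken at time now. Returns true once the state
  // has been continuously critical for at least the threshold.
  boolean observe(boolean critical, Instant now) {
    if (!critical) {
      criticalSince = null; // any healthy result resets the timer
      return false;
    }
    if (criticalSince == null) {
      criticalSince = now;
    }
    return Duration.between(criticalSince, now).compareTo(threshold) >= 0;
  }

  public static void main(String[] args) {
    SustainedCriticalSketch rule = new SustainedCriticalSketch(Duration.ofMinutes(2));
    Instant start = Instant.parse("2012-01-22T00:00:00Z");
    System.out.println(rule.observe(true, start));                  // prints false
    System.out.println(rule.observe(true, start.plusSeconds(120))); // prints true
  }
}
```

When `observe` returns true, the caller would run the grouped actions in order, for example dumps first and a restart last.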
  31. Collect data needed to diagnose
      • When something goes wrong
        • Denial of Service Attack
        • Application Bug
        • Unexpected Spike
      • RHS collects the metrics you need to diagnose the problem
      • Without collection, you are flying blind
  32. Actions better than just watching
  33. Watchdog report (PDF)
      • Post Mortem Report
      • Environment Info
      • Server Metrics
      • JVM Metrics
      • Thread Dump
      • Heap Dump
      • Metrics Graph
  34. Watchdog report (PDF)
      • Post Mortem Report
      • Environment Info
      • Server Metrics
      • JVM Metrics
      • Thread Dump
      • Heap Dump
      • Metrics Graph
  35. Environment Data
      • Collect critical information about environment
        • When,
        • What OS,
        • What version of Resin
        • How Resin was started
        • And much more
  36. Health Status
      • Status of Health Checks in Report
  37. Recent Errors and Warnings
  38. Anomalies
      • Health Checking stores baseline
      • Anomalies are configurable triggers based on large changes from the expected baseline
      • Anomaly detection is configurable and can trigger actions
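One simple way to flag a "large change from the expected baseline" is to compare the current value against the mean and standard deviation of stored samples. The Java sketch below illustrates that general idea only; the slides do not say which algorithm Resin uses, and `AnomalySketch`, `isAnomaly` and the `maxDeviations` threshold are hypothetical.

```java
// Hypothetical sketch of baseline comparison; not Resin's anomaly algorithm.
class AnomalySketch {
  // Returns true when current lies more than maxDeviations standard deviations
  // away from the mean of the baseline samples. The baseline must be non-empty.
  static boolean isAnomaly(double[] baseline, double current, double maxDeviations) {
    double mean = 0;
    for (double value : baseline) {
      mean += value;
    }
    mean /= baseline.length;

    double variance = 0;
    for (double value : baseline) {
      variance += (value - mean) * (value - mean);
    }
    double stdDev = Math.sqrt(variance / baseline.length);

    return Math.abs(current - mean) > maxDeviations * stdDev;
  }

  public static void main(String[] args) {
    // Made-up request counts per minute, used as the stored baseline.
    double[] baseline = {100, 102, 98, 101, 99};
    System.out.println(isAnomaly(baseline, 101, 3.0)); // prints false
    System.out.println(isAnomaly(baseline, 150, 3.0)); // prints true
  }
}
```

A positive result could then feed the same action mechanism that the health checks use.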
  39. Understanding Anomaly Detection
  40. Understanding Anomaly Detection
  41. Types of Metric Graphs in Report
      • Cluster Status
      • Request Count
      • Request Time
      • HTTP Request Errors
      • Log Warnings
      • Threads
      • CPU Usage
      • Database Connection Active
      • Database Query Time
      • NetStat
      • JVM Memory
      • Heap Used
      • Tenured Used
      • PermGen Used
      • GC Time
  42. Sample Graphs: Memory and GC Time
  43. Sample Metric Graphs: Request
  44. GC and Memory Metrics
  45. Heap Dump
      • Heap dump is critical for tracking down memory leaks
      • Also generates an hprof file that can be analyzed by many third-party tools
  46. CPU Profile / Thread Dumps
      • Critical for debugging thread deadlock issues
  47. Snapshot report
      • Reports the same type of data as the watchdog report
      • Watchdog report is a post-mortem analysis
      • Snapshots can be taken whenever you like
        • e.g., during a stress test
        • trigger via REST, CLI and Web Admin
  48. Conclusion
  49. More Background Info About Health System
      • Resin Health System : Java Monitoring and Server Monitoring built into Resin Application Server
      • Resin Health System : Current and Into the Future
      • Resin Application Server Fulfills Vision of Cloud Computing
      • Resin Health System Enhancements
  50. More Info
      • Caucho Technology | Home Page
      • Resin | Application Server
      • Resin | Java EE Web Profile Application Server
      • Resin - Cloud Support | 3G - Java Clustering
      • Resin | Java CDI | Dependency Injection / IoC
      • Resin - Health System | Java Monitoring and Server Monitoring
      • Download Resin | Application Server
      • Watch Resin | Application Server Featured Video
  51. Resin Java Application Server