I gave a presentation to PHP London and Scalecamp UK on the techniques used at Assanka to aggregate and debug live runtime errors in large scale PHP applications.
  • Error reporting set too low
  • Open source projects often riddled with them (e.g. WordPress)
§ Standard error output can be difficult to debug
§ Displaying raw errors to end users is really bad
§ Finding error patterns on live applications running across multiple servers requires tedious manual aggregation of logs
§ Matching errors to user feedback can be a pain
§ Finding evidence in logs can be very tough
every serious error
§ Display something sensible to the end user that they'll understand
§ React appropriately depending on the environment, dev vs live
  • In dev, stop on EVERY error
§ Provide as much debug information as possible to help the developer solve the problem quickly
and file
§ Code context (three lines either side of the error line)
§ Variable context (variables and objects defined in error scope)
§ Globals and superglobals
§ HTTP request details (GET, POST, headers)
§ Backtrace
Paths may vary for what is basically the same error
§ Serialise
§ Hash it
  • 8 character string of CRC32
  • MD5 was too long
  • Customer needs to be able to read it out over the phone
  • No significant collisions yet (25,000 hashes recorded)
§ Same error on different servers will produce same hash
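A minimal sketch of the serialise-and-hash step, assuming the hash is built from the error level, message, file base name and line. The function name, the choice of fields, and the use of basename() to remove server-specific paths are assumptions, not details from the slides.

```php
<?php
// Hypothetical helper: build a short hash that a customer could read out over the phone.
// Which fields go into the hash is an assumption made for illustration.
function errorHash($errno, $message, $file, $line)
{
    $signature = array(
        'errno'   => $errno,
        'message' => $message,
        // Keep only the base name, since install paths may differ between servers.
        'file'    => basename($file),
        'line'    => $line,
    );
    // Serialise, then hash. The 'crc32b' algorithm yields 8 hexadecimal characters.
    return hash('crc32b', serialize($signature));
}

// The same error raised from two differently rooted installs.
$a = errorHash(E_WARNING, 'Division by zero', '/var/www/site-a/lib/calc.php', 42);
$b = errorHash(E_WARNING, 'Division by zero', '/srv/app/lib/calc.php', 42);
```

Both calls return the same 8-character string, because only the path prefix differs between them.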
know:
  • User's cookies
  • Screen size
  • Viewport size
  • Installed plugins
  • Presence of proxy/firewall
  • Browser and OS
  • What they did
§ Often you need to know this stuff to resolve an issue that is not firing any errors (e.g. a layout issue)
§ Diagnostics app collects the data and files it with the bug report
you need to
  • Aggregate when necessary
§ Log locally, pull logs into central logging store on a schedule
§ Set up your own centralised remote logging service and log to it directly
  • Third party tools include Splunk (www.splunk.com)
§ Use a third party remote logging service
  • Loggly (www.loggly.com)
  • Gmail? (for the financially challenged! But has great search)
execution)
  • Customer facing or developer facing?
  • HTML or plain text?
  • Has any output already been sent to the browser?
§ Log it
  • Locally or remotely?
§ Report it
  • Send debug data to a bug tracker
  • error_log directive
  • Or: DIY solution to log more detailed info
  • We write one file per error occurrence, plus a summary line
§ Sending to a remote service
  • Useful for multi-server setups
  • Aggregate error occurrences from lots of servers
  • UDP good for this - fire and forget
  • Listen on more than one logging server for HA
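A rough sketch of fire-and-forget logging over UDP, assuming a JSON payload and a list of "host:port" logging servers. The function name and payload format are hypothetical; the slides do not show the actual sender.

```php
<?php
// Hypothetical sender: push one error summary to each logging server over UDP.
// The payload format (JSON) and the "host:port" address strings are assumptions.
function sendErrorDatagram(array $servers, array $summary)
{
    $payload = json_encode($summary);
    $sent = 0;
    foreach ($servers as $server) {
        // UDP is connectionless, so nothing here waits for the receiver to answer.
        $socket = stream_socket_client('udp://' . $server, $errno, $errstr, 1);
        if ($socket === false) {
            continue; // A logging failure should never break the application.
        }
        if (fwrite($socket, $payload) !== false) {
            $sent++;
        }
        fclose($socket);
    }
    // Number of servers the datagram was handed off to (not a delivery guarantee).
    return $sent;
}
```

Passing two or more addresses in $servers corresponds to the "more than one logging server" point: each server receives its own copy of the datagram.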
$context=array()) {
    // If the error has been suppressed using the @ operator, return
    if (error_reporting() == 0) return;
    // If there are no actions defined for this kind of error, return
    if (!($errno & (self::$action['log'] | self::$action['stop'] | self::$action['index'] | self::$action['report']))) return;
    $backtrace = debug_backtrace();
§ This is only called for errors, not Exceptions
§ Exceptions pass only an object, so we need to extract the data from the Exception object
§ Default Exception class does not capture context
§ So we need a custom Exception to capture context
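The bitmask test in the fragment above could be wired up roughly as follows. The class name, the mask values and the $handled list are assumptions for illustration; only the idea of mapping each action to a bitmask of error levels comes from the slide.

```php
<?php
// Hypothetical class name and mask values; only the bitmask idea mirrors the slide.
class ErrorHandlerSketch
{
    // Each action maps to a bitmask of the error levels that trigger it.
    public static $action = array(
        'log'    => E_ALL,
        'stop'   => E_USER_ERROR,
        'index'  => 0,
        'report' => E_WARNING | E_USER_ERROR,
    );

    // Records what the handler decided, so the sketch has something observable.
    public static $handled = array();

    public static function reportError($errno, $errstr, $errfile = '', $errline = 0)
    {
        // Skip levels excluded by error_reporting (this also covers @ suppression).
        if (!(error_reporting() & $errno)) {
            return false;
        }
        // Collect the actions whose mask includes this error level.
        $actions = array();
        foreach (self::$action as $name => $mask) {
            if ($errno & $mask) {
                $actions[] = $name;
            }
        }
        if (!$actions) {
            return false; // Nothing configured: fall through to PHP's own handler.
        }
        self::$handled[] = array('errno' => $errno, 'message' => $errstr, 'actions' => $actions);
        return true; // Tell PHP the error has been dealt with.
    }
}

set_error_handler(array('ErrorHandlerSketch', 'reportError'));
trigger_error('Cache unavailable', E_USER_WARNING);
restore_error_handler();
```

With these example masks, the E_USER_WARNING above matches only the 'log' action. Note that from PHP 8.0 the @ operator no longer makes error_reporting() return 0 inside the handler, which is why the sketch uses a mask check rather than the comparison with 0 shown on the slide.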
method_exists($context, 'getTrace')) {
    $backtrace = $context->getTrace();
} else {
    $backtrace = debug_backtrace();
}
§ Entire Exception object passed to the error handler as the context
§ Backtrace must be captured from the standard Exception method
§ Our context data is still within the custom Exception, and will be enumerated with the rest of the vars in the debug report
§ Standard Exceptions thereby also supported (without context)
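The slides do not show the custom Exception itself, so the following is only a sketch of one way it might capture context. The class name ContextException, its constructor signature and the loadUser() function are all hypothetical.

```php
<?php
// Hypothetical class: an Exception that carries the variables from the throwing scope.
class ContextException extends Exception
{
    private $context;

    public function __construct($message, array $context = array(), $code = 0)
    {
        parent::__construct($message, $code);
        $this->context = $context;
    }

    public function getContext()
    {
        return $this->context;
    }
}

// Hypothetical function that fails and records its local variables.
function loadUser($id)
{
    $source = 'cache';
    // get_defined_vars() snapshots the variables visible in the current scope.
    throw new ContextException('User not found', get_defined_vars());
}

try {
    loadUser(7);
} catch (ContextException $e) {
    $context = $e->getContext();  // Variables from inside loadUser()
    $backtrace = $e->getTrace();  // Standard Exception method, as on the slide
}
```

A plain Exception would still provide getTrace(), but it has nothing equivalent to getContext(), which matches the "without context" caveat above.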
set_error_handler
§ Use register_shutdown_function to call a function before exit
§ This is called even if the exit is due to a fatal error
public static function fatalErrorShutdownHandler() {
    $error = error_get_last();
    if (!empty($error) and $error['type'] === E_ERROR) {
        self::reportError(E_ERROR, $error['message'], $error['file'], $error['line']);
    }
}
$_GET, $_SESSION, backtrace, context, globals etc
§ Iterate over it recursively
§ Abbreviate and simplify
  • Objects become arrays
  • Shorten long values (and large arrays)
§ Identify references
  • Set _errorhandler_objid property of all arrays and objects as they are processed
  • If it’s already set, this is a reference to a value we’ve already indexed.
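A simplified sketch of the recursive walk. It swaps in spl_object_id() to spot objects that have already been indexed, rather than the _errorhandler_objid marker property described above, and it covers objects only (not array references). The function name and the limits are assumptions.

```php
<?php
// Hypothetical simplifier. Repeated objects are detected with spl_object_id()
// (PHP 7.2+) instead of the marker property the slide describes.
function simplifyValue($value, array &$seen, $depth = 0)
{
    $maxLen = 20;   // Assumed limit for string length
    $maxItems = 3;  // Assumed limit for array size
    if ($depth > 10) {
        return '*too deep*'; // Guard against self-referencing arrays
    }
    if (is_object($value)) {
        $id = spl_object_id($value);
        if (isset($seen[$id])) {
            return '*already indexed: object #' . $id . '*';
        }
        $seen[$id] = true;
        // Objects become arrays (public properties only in this sketch).
        $value = get_object_vars($value);
    }
    if (is_array($value)) {
        $out = array();
        foreach ($value as $key => $item) {
            if (count($out) >= $maxItems) {
                $out['...'] = (count($value) - $maxItems) . ' more items';
                break;
            }
            $out[$key] = simplifyValue($item, $seen, $depth + 1);
        }
        return $out;
    }
    if (is_string($value) && strlen($value) > $maxLen) {
        return substr($value, 0, $maxLen) . '...';
    }
    return $value;
}

$user = new stdClass();
$user->name = 'Ann';
$user->bio = str_repeat('x', 50);
$data = array('user' => $user, 'again' => $user, 'ids' => array(1, 2, 3, 4, 5));
$seen = array();
$simple = simplifyValue($data, $seen);
```

In the result, 'user' becomes an array with a shortened 'bio', 'again' becomes a marker string because the same object was already indexed, and 'ids' keeps its first three items plus a note about the remaining two.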
JSON
§ Save to a file
§ Fork a process (popen) to upload it to the bug tracker
  • Ours sleeps if one is already running, waits for a gap
§ Upload using HTTP POST (cURL)
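The save-and-fork steps might look roughly like this on a Unix-like host. The function names, the file naming scheme and the uploader script are hypothetical, and the uploader itself (the cURL POST and the wait-for-a-gap logic) is not shown.

```php
<?php
// Hypothetical helpers; the directory, file naming and uploader script are assumptions.
function saveReport(array $report, $dir)
{
    // One file per occurrence: hash plus a unique suffix.
    $path = $dir . '/' . $report['hash'] . '-' . uniqid() . '.json';
    if (file_put_contents($path, json_encode($report)) === false) {
        return false;
    }
    return $path;
}

function startUploader($uploaderScript, $reportPath)
{
    // The trailing & backgrounds the process so the request is not held up
    // (Unix shells only). Output is discarded.
    $cmd = 'php ' . escapeshellarg($uploaderScript) . ' ' . escapeshellarg($reportPath)
         . ' > /dev/null 2>&1 &';
    $handle = popen($cmd, 'r');
    if ($handle === false) {
        return false;
    }
    pclose($handle);
    return true;
}

$path = saveReport(array('hash' => 'abcd1234', 'message' => 'Example'), sys_get_temp_dir());
```

A call such as startUploader('/path/to/upload.php', $path) would then hand the file to a separate process; that script path is a placeholder, not something from the slides.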
reporting
§ Don’t use @ to suppress errors
§ Handle unexpected inputs with in-application errors; don’t fall back to an error handler
§ Use trigger_error to fire errors intentionally (e.g. for use of deprecated code)
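As an illustration of firing an error intentionally, a deprecated wrapper can raise E_USER_DEPRECATED before delegating to its replacement. The function names here are hypothetical.

```php
<?php
// Hypothetical functions: getArticle() is the old name, fetchArticle() its replacement.
function fetchArticle($id)
{
    return array('id' => $id);
}

function getArticle($id)
{
    // Fire an error on purpose so that use of the old function shows up in the logs.
    trigger_error('getArticle() is deprecated, use fetchArticle()', E_USER_DEPRECATED);
    return fetchArticle($id);
}

// Collect the notices with a handler limited to E_USER_DEPRECATED.
$notices = array();
set_error_handler(function ($errno, $errstr) use (&$notices) {
    $notices[] = $errstr;
    return true;
}, E_USER_DEPRECATED);

$article = getArticle(5);
restore_error_handler();
```

The old function keeps working, while each call leaves a notice that the error handler can log or report.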
Unavailability of third party web services
§ File system permissions
§ Out of memory
§ Unexpected input / inadequate input validation
§ Manually triggered alerts