Lessons Learned: Auto-Remediation & Event Driven Automation
At StackStorm we have had the chance to speak with many different organizations about auto-remediation and some of the difficulties they encountered when trying to implement it. This talk was an overview of some of the most common pitfalls.
using system events from your environment to initiate automatic responses. [Diagram, axis: Complexity. Labels: Handler Scripts Fired from Monitoring; Complex Long-Running Workflows.] • Opaque • Hard to standardize • High visibility • Workflow patterns
Outages and failures are automatically remediated based on events emitted by various systems within the environment. [Diagram labels: Host/Service; Alerting; Diagnostic Workflow; Remediation Workflow.]
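As a rough illustration of the flow on this slide (an alert from a host or service triggers a diagnostic workflow, and its findings decide which remediation runs), here is a minimal Python sketch. It is not StackStorm's API: the names Event, diagnose, remediate and handle_event, the disk-usage scenario, and the 90% threshold are all hypothetical, chosen only to show the shape of the idea.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    """An alert emitted by a monitoring system (hypothetical shape)."""
    host: str
    check: str
    details: dict = field(default_factory=dict)


def diagnose(event: Event) -> str:
    """Diagnostic workflow: look at the event and classify the failure."""
    # The 90% threshold is an illustrative assumption, not a recommendation.
    if event.check == "disk_usage" and event.details.get("percent_used", 0) >= 90:
        return "disk_full"
    return "unknown"


def remediate(event: Event, diagnosis: str) -> str:
    """Remediation workflow: choose an action for a known diagnosis."""
    actions = {"disk_full": f"rotate and compress logs on {event.host}"}
    # Anything that cannot be classified is escalated to a human, not guessed at.
    return actions.get(diagnosis, f"escalate {event.check} on {event.host} to on-call")


def handle_event(event: Event) -> str:
    """Chain the two workflows: diagnose first, then remediate."""
    return remediate(event, diagnose(event))


if __name__ == "__main__":
    print(handle_event(Event("web01", "disk_usage", {"percent_used": 97})))
```

Keeping diagnosis and remediation as separate steps means the diagnostic half can run on its own while the team is still deciding how much of the remediation it is ready to automate.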
organizations are open to the concept of event-driven automation at a high level. Yet Another Tool/Process • Well-established teams do not want to add yet another tool/process to their already full plate (Just one more wiki article to maintain).
Everyone has a #badauto story (automation run amok). Fear of the Change (in process) • Status quo is comfortable Fear of becoming Obsolete • There is a common misconception that automating IT Operations tasks will eliminate the need for certain jobs.
trust the automation, not suspect it first. Automation needs to earn our trust. How? • Facilitated Troubleshooting • Full Collaboration with the end users • Human Controls (a.k.a. Nuclear Launch Codes) • Audit Audit Audit
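The "Human Controls" and "Audit Audit Audit" points can be sketched together as an approval gate that records every decision. This is a minimal Python sketch under stated assumptions: run_with_approval, audit and AUDIT_LOG are hypothetical names, the in-memory list stands in for whatever durable audit store you actually use, and nothing here reflects a specific StackStorm feature.

```python
import time

AUDIT_LOG = []  # in-memory stand-in for a durable audit store (assumption)


def audit(actor, action, outcome):
    """Record who did what, when, and how it ended."""
    entry = {"ts": time.time(), "actor": actor, "action": action, "outcome": outcome}
    AUDIT_LOG.append(entry)
    return entry


def run_with_approval(action, run, approver=None, destructive=False):
    """Call run() unless the action is destructive and no human has approved it."""
    if destructive and approver is None:
        # The "launch codes" are missing: record the attempt and do nothing.
        audit("automation", action, "blocked: awaiting human approval")
        return None
    result = run()
    audit(approver or "automation", action, "executed")
    return result


if __name__ == "__main__":
    run_with_approval("restart nginx", lambda: "restarted")
    run_with_approval("wipe volume", lambda: "wiped", destructive=True)
    run_with_approval("wipe volume", lambda: "wiped", approver="alice", destructive=True)
    for entry in AUDIT_LOG:
        print(entry["actor"], "|", entry["action"], "|", entry["outcome"])
```

Blocked attempts are logged as well as executed ones, so the audit trail shows what the automation wanted to do as well as what it did.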
the right tools. • Peer Review Everything • Don't Overcomplicate Things • Don't throw away your old tools, but be ready to throw away (most of) your process
good workflow engine, with an easy-to-use DSL. • Inventory and service discovery will be critical in workflow construction • All tools must allow you to version control their content and configuration. Version control the world!
version controlled makes peer review much easier • Just like code, all processes need to be peer reviewed. Formalized reviews allow organizations to iterate on their processes and improve much faster than they otherwise could
to throw away (most of) your process • Your new fancy workflow tool isn't meant to replace your existing tools (configuration management for example) • Let the workflow orchestrate the tools you already have • Your processes will change.
providing a unified control plane for all of your tools. • Simple Workflow construction • Easy integration with existing scripts & tools (1,000+ integrations in the StackStorm community repos) • http://docs.stackstorm.com • IRC: Freenode#stackstorm