Chat Bot: A Practical Walkthrough of the Power of Elixir/Erlang/OTP

Jeff Weiss
April 24, 2015

Published with speaker notes, because they contain hints about the audience interaction.

Transcript

  1. Chat Bot: A Practical Walkthrough of the Power of Elixir/Erlang/OTP

    Jeff Weiss @jeffweiss Puppet Labs, Inc. Hi, I’m Jeff. I work at Puppet. I’m here to talk about how I’ve used a side project to work through some of the more powerful features of the Erlang runtime system and their use from Elixir. I know this has to be the longest, most boring title ever, so…
  2. Beware: Live Demos Ahead

    Hang in there. I have live demos later. And those *never* go wrong. But first a few caveats…
  3. I still consider myself an Elixir and Erlang beginner.

    I will probably tell you things that are incorrect. I’m sorry; it’s not intentional. I thought I understood a lot of the mechanics of how to use various aspects, but I didn’t (and in some respects still don’t) have a visceral understanding of when to migrate from one thing to another and which pain it alleviates.
  4. Novice to Expert

    John Allspaw, “On Being a Senior Engineer”
    http://www.kitchensoap.com/2012/10/25/on-being-a-senior-engineer/
    Dreyfus & Dreyfus, “A Five Stage Model of the Mental Activities Involved in Directed Skill Acquisition”
    http://www.dtic.mil/cgi-bin/GetTRDoc?AD=ADA084551&Location=U2&doc=GetTRDoc.pdf

    Novice
    • Rigid adherence to rules or plans
    • Little situational perception
    • No (or limited) discretionary judgment

    Advanced Beginner
    • Guidelines for action based on attributes and aspects, which are all equal and separate
    • Limited situational perception

    Competent
    • Conscious deliberate planning
    • Standardized and routine procedures

    Proficient
    • Sees situations holistically rather than as aspects
    • Perceives deviations from normal patterns
    • Uses maxims for guidance, whose meanings are contextual

    Expert
    • No longer relies on rules, guidelines or maxims
    • Intuitive grasp of situations
    • Analytic approach used only in novel situations

    In his excellent post “On Being a Senior Engineer,” John Allspaw references a knowledge acquisition paper that outlines this model. That visceral understanding I’m after shows up in the Expert level, but I’m maybe somewhere around Advanced Beginner. I needed a non-trivial project that, as it expanded, would necessitate redesign and refactoring, incorporating the things that I had learned, but not yet internalized. So I made a thing. I knew it would be bad, well, sub-optimal, at first. Almost no books or tutorials follow this path. Only after I started did I begin Sasa’s Elixir in Action, which has exactly this approach. He does it far more elegantly than I do. Side note: I have Erlang in Anger on my reading list.
  5. Why a chat bot?

    • Hubot fork: Kerminator
    • Buggy extensions bring down entire bot
    • Prodding from coworker

    As to why a chat bot, at work we use a fork of Hubot: Kerminator. Like Kermit the Frog and the Terminator. Because Puppet… Quick show of hands of people vaguely familiar with Hubot. Hubot is a way to do Chat Ops, particularly so that there’s a public record of how to do various things to quickly bring folks up to speed. That’s the idea at least, except that we tend to use it more for karma, image searches, and ticket references. Hubot’s Node execution model doesn’t provide for any isolation between plugins, so a catastrophic failure of, say, an image search plugin, can prevent deploying new code to production. Erlang’s concurrency model seemed like a perfect answer to this issue. Also, I had a coworker who wanted a quote bot for an IRC channel. I thought about it and whipped up something in Elixir in a couple of hours thanks to the exirc library. I’ll show some of this in a few minutes so…
  6. IRC server

    • Ad hoc network: jeffweiss
    • Host: 169.254.1.2:6667
    • Channel: #elixirconf
    • Sample commands:
      • elixir++
      • markov Joe Armstrong
      • crash

    The seed text was “Joe Armstrong went to the future in the TARDIS”. You could try “markov Joe Armstrong went to the future”. That will be unique enough for it to find only the seed text.
  7. Supervision Trees, Clustering, Live Code Update

    The 3 main portions of the Erlang runtime system that I’ll cover are: supervision trees, clustering, and live code updating. Again, none of these are specific to Elixir, but as a n00b to Elixir, I was also a n00b to Erlang.
  8. This is how I initially thought about supervision trees.

    However, OTP provides a much richer framework than just this.
  9. First Supervision Tree

    This was my first supervision tree for the app. It was adequate, for a while. If I had a problem with a bot and it crashed, the bot restarted. Where it breaks down: errors from the underlying IRC library. I’m not speaking ill of Paul Schoenfelder’s exirc library. I could not have done this project without it, but when I run on a flaky connection, like my laptop, opening/closing, moving from wireless AP to wired connection, etc., exirc doesn’t handle it very well, and I basically have to restart the client connection in the library to get it to reconnect to the IRC server. The exirc client has a reference to each one of the bots (exirc calls them handlers). Each bot or handler registers itself with the exirc client process. So a failure of any particular bot is fine: it’s restarted and it just registers itself with the client process. However, if the client process fails, none of the bots knows that it needs to reregister itself.
  10. A few options

    • Stop running an irc bot from laptop, dummy (done)
    • Restructure supervision trees so exirc death resolves itself (done)
    • Fix reconnect logic in exirc

    You also can’t have multiple restart strategies, which we’ll talk about momentarily, in a single supervisor, so my resulting supervision tree looks like…
  11. Current supervision tree

    I am also using something other than `one_for_one`, because whenever the client restarts, I need to restart all the bots so that they’ll automatically reregister themselves with the new client process. This is a hack. I know. But it alleviated the greatest pain I had at the time. When it comes time to have multiple communication transports like irc and xmpp, then I’ll likely revisit this section. So, I’m using `one_for_all` for this portion of the subtree, and then for the individual bots, I’m back to `one_for_one`. I could have been strict about the order of the worker processes and put them all after the client process and then used `rest_for_one`, where the remainder of the list is restarted if any of them fails, but 1) it seemed a little brittle, and 2) it seemed like I might restart bots if another bot failed, and I didn’t want that.
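The two-level tree described in these notes can be sketched roughly as follows. This is a minimal illustration, not ohaibot's actual code: every `Sketch.*` module name is hypothetical, an `Agent` stands in for the exirc client process, and it is written against the module-based `Supervisor` API of current Elixir rather than the `Supervisor.Spec` helpers that were current in 2015.

```elixir
defmodule Sketch.Client do
  # Hypothetical stand-in for the exirc client: it only tracks registered handler pids.
  use Agent

  def start_link(_opts), do: Agent.start_link(fn -> MapSet.new() end, name: __MODULE__)
  def register(pid), do: Agent.update(__MODULE__, &MapSet.put(&1, pid))
  def handlers, do: Agent.get(__MODULE__, & &1)
end

defmodule Sketch.Bot do
  # Hypothetical bot: registers itself with the client every time it (re)starts.
  use GenServer

  def start_link(name), do: GenServer.start_link(__MODULE__, name, name: name)

  @impl true
  def init(name) do
    Sketch.Client.register(self())
    {:ok, name}
  end
end

defmodule Sketch.BotSupervisor do
  # Inner level: a crashing bot is restarted on its own (:one_for_one).
  use Supervisor

  def start_link(bots), do: Supervisor.start_link(__MODULE__, bots, name: __MODULE__)

  @impl true
  def init(bots) do
    children = for bot <- bots, do: Supervisor.child_spec({Sketch.Bot, bot}, id: bot)
    Supervisor.init(children, strategy: :one_for_one)
  end
end

defmodule Sketch.ConnectionSupervisor do
  # Outer level: if the client dies, the whole bot subtree is restarted too
  # (:one_for_all), so every bot reregisters with the new client process.
  use Supervisor

  def start_link(bots), do: Supervisor.start_link(__MODULE__, bots, name: __MODULE__)

  @impl true
  def init(bots) do
    children = [Sketch.Client, {Sketch.BotSupervisor, bots}]
    Supervisor.init(children, strategy: :one_for_all)
  end
end
```

Starting it with `Sketch.ConnectionSupervisor.start_link([:karma_bot, :markov_bot])` and then killing `Sketch.Client` should bring back both bots with fresh pids, each registered with the new client. Killing a single bot leaves the client and the other bot untouched.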
  12. Clustering

    $ iex --name <yourname>@<yourip> --cookie yum
    iex(1)> Node.connect :"[email protected]"

    Here’s where this talk starts to get interesting / go wrong… I need your help.

    What the hell just happened? You connected to the cluster, I sent the source for a module to your node, compiled it, and then began executing code from that module. One of you was elected as a “spammer” and began interacting with bots and IRC connections from my node. As we can see, ohaibot is privately messaging me. I am also receiving Logger messages originating from your nodes on my iex session. This is because my iex session is the “group leader” for those processes. I think that’s a pretty cool default mode. While ohaibot reads me poetry, let’s talk about the clustering security model…
  13. iex(2)> node = :"[email protected]"
      iex(3)> Node.spawn(node, fn -> apply(Brain.Karma, :increment, ["elixir", 50]) end)
      iex(4)> Node.spawn(node, fn -> apply(Bot.Markov, :markov, ["I say", "#elixirconf"]) end)

    I ran some things on your machine; it’s only fair that you run some on mine. Here are some things that you can execute.
  14. Clustering Security Model

    You may often see people reference Erlang’s clustering security model with an empty space. I feel this does the lack of a security model an injustice. Any node in the cluster can execute any code, within Erlang or native to the host machine, on any node in the cluster, with the permissions of whichever user started the Erlang runtime on that system. Instead of simply piping some text to `say`, I could have harvested your ssh private keys, or AWS credentials, or whatever I wanted.
  15. Rudimentary clustering failover

    • Simple implementation
    • Thought about on the commute into work the day of a demo
    • Took about 45 minutes to implement

    Ok, spammer, I need you to double Ctrl-C out of your iex session… My NodeMonitor notices that you’ve left, that you were the spammer, and elects one of the other nodes to spam, if possible. Notice that we don’t restart at the beginning of all the poems. 45 minutes to implement… I don’t know of another language or runtime where I could have implemented multi-node failover (without an external library) in 45 minutes. Amazing!
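A node monitor of this kind can be sketched in a few lines. This is a guess at the shape, not the talk's actual NodeMonitor: the module name `Sketch.NodeMonitor`, the random election, and the state layout are all assumptions. The one real building block is `:net_kernel.monitor_nodes/1`, which makes the runtime deliver `{:nodeup, node}` and `{:nodedown, node}` messages to the calling process. Tracking progress through the poems is left out.

```elixir
defmodule Sketch.NodeMonitor do
  # Hypothetical failover monitor: remembers which node is the "spammer"
  # and hands the role to another connected node when that node disappears.
  use GenServer

  def start_link(spammer), do: GenServer.start_link(__MODULE__, spammer, name: __MODULE__)

  # Pick a replacement from the given candidates; nil when nobody is left.
  def elect([]), do: nil
  def elect(candidates), do: Enum.random(candidates)

  @impl true
  def init(spammer) do
    # Subscribe this process to {:nodeup, node} and {:nodedown, node} messages.
    :ok = :net_kernel.monitor_nodes(true)
    {:ok, %{spammer: spammer}}
  end

  @impl true
  def handle_info({:nodedown, node}, %{spammer: node} = state) do
    # The spammer left: elect one of the nodes that are still connected.
    {:noreply, %{state | spammer: elect(Node.list())}}
  end

  def handle_info({:nodedown, _other}, state), do: {:noreply, state}

  def handle_info({:nodeup, node}, %{spammer: nil} = state) do
    # Nobody was available before; the first node to join takes the role.
    {:noreply, %{state | spammer: node}}
  end

  def handle_info({:nodeup, _node}, state), do: {:noreply, state}
end
```

Because the monitor's state lives on the monitoring node rather than on the spammer, anything stored there, such as a position in a list of poems, would survive the spammer's departure.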
  16. Live Code Update

    • No long running process and/or no state impact
      • Just reload code
    • State that needs to be migrated
      • Pause process
      • Reload code
      • Run state migration
      • Resume process

    Ok, let’s keep the spamming going, because it will be useful for demonstrating the live code update… We have two kinds of code updates that we’ll have to consider.

    No state considerations: you can just reload the code in the shell.

    State: you’ll have to run a migration. First you suspend the process (I’ve made some helper functions so we don’t wait for me to type everything). Run the state migration. Finally, resume the process. Any messages that were sent while it was suspended and migrating will now be processed.
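The suspend, migrate, resume sequence maps onto three functions in Erlang's `sys` module: `:sys.suspend/1`, `:sys.change_code/4`, and `:sys.resume/1`. The sketch below is an illustration under assumptions, not the helper functions used in the demo: `Sketch.Karma`, `Sketch.Upgrade`, and the old and new state shapes (a list of tuples becoming a map under `:scores`) are invented for the example. The module-reload step is marked by a comment only, since here the "new" code is already loaded.

```elixir
defmodule Sketch.Karma do
  # Hypothetical "new version" of a server whose state used to be a list of
  # {subject, score} tuples and is now a map under the :scores key.
  use GenServer

  def start_link(state), do: GenServer.start_link(__MODULE__, state, name: __MODULE__)
  def get(subject), do: GenServer.call(__MODULE__, {:get, subject})

  @impl true
  def init(state), do: {:ok, state}

  @impl true
  def handle_call({:get, subject}, _from, %{scores: scores} = state) do
    {:reply, Map.get(scores, subject, 0), state}
  end

  # Invoked by :sys.change_code/4 while the process is suspended.
  @impl true
  def code_change(_old_vsn, old_state, _extra) when is_list(old_state) do
    {:ok, %{scores: Map.new(old_state)}}
  end

  def code_change(_old_vsn, state, _extra), do: {:ok, state}
end

defmodule Sketch.Upgrade do
  # Hypothetical helper wrapping the pause / migrate / resume steps.
  def migrate(server, module) do
    :ok = :sys.suspend(server)
    # In a real upgrade, the new version of `module` would be loaded here.
    :ok = :sys.change_code(server, module, :undefined, [])
    :ok = :sys.resume(server)
  end
end
```

A process started with old-shape state, for example `Sketch.Karma.start_link([{"elixir", 50}])`, would crash on `get/1` until `Sketch.Upgrade.migrate(Sketch.Karma, Sketch.Karma)` has run. Calls that arrive while the process is suspended stay in its mailbox and are handled after the resume.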
  17. What’s Next?

    • (D)ETS backend for Markov brain
    • XMPP transport
    • Simpler bot creation
    • Cluster-distributed bots
    • ???

    https://github.com/jeffweiss/ohaibot

    This is an unordered list of things that would be interesting for me to work on next.
  18. Questions?

    That’s a quick walkthrough that merely scratches the surface of how powerful those aspects of the Erlang runtime system are. What questions do you have?