

Elle Imhoff
February 11, 2017

Shattering your application into an umbrella project

With the announcement of Phoenix 1.3 supporting umbrella projects at ElixirConf 2016, I was inspired to start early on converting our Phoenix 1.2 application to umbrella. Since the umbrella project templates weren't publicly available yet, I had to find the seams in our single, monolithic application where I could start breaking it into OTP apps for the umbrella project.

I'll cover the thought process of how I broke up the app and what guidelines I use now when adding new code as to whether add it to a pre-existing application or to make a new OTP application. I'll share the common code I was able to share between our RPC server and API controllers, which I believe is a good behaviour for resource module in Phoenix 1.3.


Transcript

  1. Shattering your application into an umbrella project Luke Imhoff [email protected]

    Github: @KronicDeth Twitter: @KronicDeth ☂ 1 Hi, I'm Luke Imhoff. I'm known just about everywhere on the internet as Kronic Deth. I'm the maintainer of the IntelliJ Elixir plugin for JetBrains IDEs and the host of the Austin Elixir meetup, but my talk today is about my work at Communication Service for the Deaf.
  2. Interpreter Vineya 2 server server server worker Lighthouse worker RabbitMQ

    Postgres Postgres Postgres Partner In September 2015, CSD started a new Phoenix 1.0.2 project. It contained a single OTP application, :interpreter_server, with lib, test, and web directories. Between then and ElixirConf 2016, about a year later, I joined CSD, and the rest of the team and I slowly built up that Phoenix project: adding support for talking to other services, partner-server and lighthouse-server, using RabbitMQ, and talking to Ember front-ends using JSONAPI. Lighthouse is for authentication: storing users and issuing JSON Web Tokens for all the front-ends to authenticate to the other services. Partner server is for American Sign Language interpreting agencies to use, where they can schedule jobs between clients, businesses, and ASL interpreters. Interpreter server is for ASL interpreter use, where interpreters can accept job assignments from the agencies. Together the services form Vineya.
  3. • Started a normal Phoenix project • Heard about umbrella

    projects in Phoenix 1.3 • No idea how to go from normal to umbrella 3 At ElixirConf 2016, Chris McCord gave his keynote on Phoenix 1.3, where he mentioned that the web directory would disappear and that, optionally, umbrella projects could be used to separate the web interface from repository access or other business logic. I was excited, but the templates for creating a Phoenix 1.3 project weren't even in a PR yet, so I was stumped. I had never used umbrella projects before. Thankfully, I was in luck: Wojtek [pronounced Voy-tech] Mach was giving a presentation on umbrella projects. I recommend watching Wojtek's talk if you haven't already. Chris and Wojtek's talks left me with the question of how to get there from where I was: starting with a single application.
  4. ☂ root •mix.exs •apps/ 4 From Wojtek's (pronounced Voy-tech's) presentation

    I learned that umbrella projects still had a top-level mix.exs, but all the code lived in an `apps` directory. Each directory under `apps` held a separate OTP application, which is just what the root directory of a normal project was, so the gist is move the root down into `apps` and make more mix projects under `apps`.
  5. →☂ app

     Before → After:
     • config → apps/my_app_web/config
     • lib/my_app → apps/my_app_web/lib/my_app_web
     • lib/my_app.ex → apps/my_app_web/lib/my_app_web.ex
     • priv → apps/my_app_web/priv
     • test → apps/my_app_web/test
     • web → apps/my_app_web/web
     • mix.exs → apps/my_app_web/mix.exs

     5 The first step is to put your project's root project under `apps`. This will involve a bunch of git moves to preserve history. Phoenix 1.3 will use the "_web" suffix for this OTP app, so we'll do the same here. You need to do the move first, before creating the new root layout; otherwise you'd get file name collisions.
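The moves above can be scripted. Here is a minimal sketch that demos the conversion in a throwaway repository so it is safe to run as-is; in a real project you would run only the `git mv` lines from your project root, and the app name `my_app_web` is an assumption:

```shell
set -e
# Build a throwaway flat project to demo against.
cd "$(mktemp -d)"
git init -q
mkdir -p config lib priv test web
touch mix.exs config/config.exs lib/my_app.ex priv/.keep test/test_helper.exs web/router.ex
git add -A
git -c user.email=demo@example.com -c user.name=demo commit -qm "flat project"

# The actual conversion: move the root project under apps/ with `git mv`
# so that history is preserved across the move.
mkdir -p apps/my_app_web
git mv config lib priv test web mix.exs apps/my_app_web/
git -c user.email=demo@example.com -c user.name=demo commit -qm "move root under apps/my_app_web"
```

After the commit, `git log --follow apps/my_app_web/mix.exs` still shows the file's pre-move history.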
  6. ☂ app mix.exs

```elixir
defmodule MyAppWeb.Mixfile do
  use Mix.Project

  def project do
    [
      # ...
      build_path: "../../_build",
      config_path: "../../config/config.exs",
      deps_path: "../../deps",
      lockfile: "../../mix.lock",
      # ...
    ]
  end
end
```

     6 After moving your `mix.exs` from the root of the project to the `apps/my_app_web` directory, some of the paths need to be rewritten to point back to the root of the umbrella project. In umbrella projects, the build path, config path, deps path, and `mix.lock` file are shared. That sharing is one of the main things that separates OTP apps in an umbrella from plain path-based dependencies. If you run `mix new` inside the `apps` directory, these path rewrites happen automatically, but for your moved `mix.exs`, which wasn't set up in an umbrella project, you'll need to change the paths manually.
  7. ☂ project mix.exs

```elixir
defmodule MyApp.Mixfile do
  use Mix.Project

  def project do
    [
      apps_path: "apps",
      build_embedded: Mix.env != :dev && Mix.env != :test,
      deps: deps(),
      start_permanent: Mix.env != :dev && Mix.env != :test
    ]
  end

  # Dependencies listed here are available only for this project
  # and cannot be accessed from applications inside the apps folder
  defp deps do
    []
  end
end
```

     7 The new root needs to be a mix project, but unlike a full OTP app, the `mix.exs` `project` function will only have 4 options. 3 options (`build_embedded`, `deps`, and `start_permanent`) will be the same as a non-umbrella project, but `apps_path` will be new and set to "apps". Eventually, you may want top-level deps, such as credo or dialyze/dialyxir, which won't work from the root directory without being listed here.
  8. ☂ project config/config.exs

```elixir
use Mix.Config
import_config "../apps/*/config/config.exs"
```

     8 Although each OTP app has its own `config` directory with its own config files, the overall config for the project is unified: the top-level `config` directory contains files that include all the OTP apps' configs, while the individual umbrella apps' `mix.exs` files refer back to the top-level config directory to load their configs.
  9. ☂ project config/config.exs

```elixir
# Configures Elixir's Logger
config :logger,
  handle_otp_reports: true,
  handle_sasl_reports: true

config :logger, :console,
  format: "$time $metadata[$level] $message\n",
  metadata: ~w(correlation_id queue request_id user_id)a
```

     9 Since you're going to use the logger in all your OTP apps, you can eliminate divergent configurations by configuring it in the root `config/config.exs` file.
  10. Shattering: my_domain vs my_app_web

      Still in my_app_web: Channels ✓, Controllers ✓, Ecto Schemas ✓, Ecto Repo ✓, Migrations ✓, Router ✓, Templates ✓, Views ✓

      10 After all the moves, renames, and replacements, you're left with an umbrella project with a single OTP app in its `apps` directory. It's more complicated, not less, than what you started with, and you have to wonder what the point was. Well, umbrellas only start to pay off when you start to break up that single app, but the question is: where's the first crack to start shattering the web application? From Chris McCord's keynote on Phoenix 1.3, I knew that the `--umbrella` option for `phoenix.new` was going to place the `Ecto` module, both the Repo and Schema, in an OTP app separate from Phoenix.
  11. Domain app •Owns Repo •Owns Database •Owns Schemas 11 Putting

    the Ecto Repo and Schemas in their own OTP app isn't enough if it is just treated like a different namespace. We want to be able to test and use the domain logic without the need for Phoenix controllers. Why? Well, you may not think about it that often, but any Phoenix project already has two UIs: the Phoenix API presented to the web and the UI we as developers, devops, or maintainers need to interact with from `iex` when developing, debugging, and supporting our projects in production.
  12. Domain module callbacks

      • allow_sandbox_access(token)
      • changeset(params)
      • changeset(resource, params)
      • delete(struct)
      • get(id, query_options)
      • insert(params, query_options)
      • insert(changeset, query_options)
      • list(query_options)
      • sandboxed?()
      • update(changeset, query_options)
      • update(resource, params, query_options)

      12 After a lot of refactoring cycles, I eventually came up with a transport-neutral behaviour that can hide whether we're getting data from Ecto, RabbitMQ, or even local GenServers that back `port`s running SSH client tunnels: Calcinator.Resources. This is only my supposition of what domain modules could look like in Phoenix 1.3, based on Chris McCord's examples. The behaviour supports the controller-action-like callbacks, but also supports testing with sandboxing. Certain callbacks, like changeset, insert, and update, have two forms to allow for optimizations when called from the controller-like Calcinator module. `query_options` encodes common options such as pagination, sorting, filtering, and associations to include in the response. Calcinator is targeted at JSONAPI, but `query_options` is targeted at Ecto's encoding for params and associations instead of JSONAPI's include format, to better separate layers. You may find this list helpful, but it may be more or less than you need when creating your own domain modules.
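A sketch of what such a behaviour might look like as Elixir code. The module name and typespecs are my own illustration of the callback list above, not Calcinator.Resources' actual source:

```elixir
defmodule MyApp.Resources do
  @moduledoc """
  Hypothetical transport-neutral behaviour for resource access.
  Implementations may be backed by Ecto, RabbitMQ RPC clients,
  or in-memory GenServers.
  """

  # `query_options` carries pagination, sorting, filters, and associations.
  @type query_options :: %{optional(atom) => term}
  @type params :: %{optional(String.t()) => term}
  @type resource :: struct

  @callback allow_sandbox_access(token :: term) :: :ok | {:error, term}
  @callback changeset(params) :: struct
  @callback changeset(resource, params) :: struct
  @callback delete(resource) :: {:ok, resource} | {:error, term}
  @callback get(id :: term, query_options) ::
              {:ok, resource} | {:error, :not_found} | {:error, term}
  @callback insert(params | struct, query_options) :: {:ok, resource} | {:error, term}
  @callback list(query_options) ::
              {:ok, [resource], pagination :: term} | {:error, term}
  @callback sandboxed?() :: boolean
  @callback update(struct, query_options) :: {:ok, resource} | {:error, term}
  @callback update(resource, params, query_options) :: {:ok, resource} | {:error, term}
end
```

Any backing store then becomes a module that declares `@behaviour MyApp.Resources`, and the compiler warns when a callback is missing.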
  13. Domain module returns

      • {:error, :ownership}
      • {:error, :timeout}
      • {:error, :bad_gateway}
      • {:error, :not_found}

      13 Since I wanted Calcinator to work with Ecto, RabbitMQ, and really any backing datastore, the returns are more complicated than Ecto's, as I want to be able to handle more error conditions without having to know about each datastore's exceptions. {:error, :ownership} is good for any ownership errors during testing, as it allows the error to be surfaced as an API error instead of an exception, so `ConnCase`s can show the errors instead of them appearing in Logger output. Likewise, {:error, :timeout} allows GenServer timeouts to be shown in `assert` `response` failures instead of appearing in the SASL log. I found `{:ok, struct}` OR `{:error, :not_found}` easier to match in a `with` than `{:ok, struct}` OR `nil`, so I recommend that for get-like calls.
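The `with`-friendliness of `{:error, :not_found}` over `nil` can be sketched like this (the `fetch_user/1` stub and `Demo` module are hypothetical):

```elixir
defmodule Demo do
  # Hypothetical get-like call that returns a tagged tuple instead of nil.
  def fetch_user(1), do: {:ok, %{id: 1, name: "Alice"}}
  def fetch_user(_), do: {:error, :not_found}

  # The happy path stays flat; every sad path falls out of the `with`
  # as its original tagged tuple, ready for one error handler in the caller.
  def show(id) do
    with {:ok, user} <- fetch_user(id) do
      {:ok, %{data: user}}
    end
  end
end

Demo.show(1)  #=> {:ok, %{data: %{id: 1, name: "Alice"}}}
Demo.show(2)  #=> {:error, :not_found}
```

With a `nil` return you would instead need an explicit `case` or a wrapper to turn `nil` into something the `with` can fall through on.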
  14. interpreter_server

      • Multiple UIs
        • JSONAPI over Phoenix
        • JSONAPI over JSONRPC over RabbitMQ
        • iex
        • Observer
      • Multiple Backing Stores
        • In-memory
        • Ecto
        • RPC
        • Redis

      14 So, for a general, platonic Phoenix project, that's as far as I can advise how to break up your project: a domain OTP app and the Phoenix web OTP app. But let me cover the specifics of CSD's own project to give you some more ideas for how to break up your own project when converting to an umbrella project. InterpreterServer is CSD's project for allowing sign-language interpreters to find and track jobs from multiple agencies. On the front-end, it uses Ember to talk to Phoenix controllers that respond with JSONAPI. On the backend, it uses Ecto to talk to the database it owns and talks over RabbitMQ to background processes running RPC servers that can access databases that two Ruby on Rails servers own. For debugging purposes, we can access the backend using SSH tunnels held in memory that allow either iex or Observer remote connections. The Ember front-end uses ember-cli, so to keep the publishing consistent with the Ruby on Rails servers, the Ember front-end is published from a Redis cache. So, the cracks in interpreter_server_web, our Phoenix OTP app, became obvious when I list it out like this: each UI should get its own OTP app and each backing store should get its own OTP app. Let's see how that is actually implemented.
  15. UI → OTP app

      • Shared → interpreter_server_jsonapi
      • Observer → interpreter_server_observer
      • RabbitMQ → interpreter_server_rpc
      • Phoenix → interpreter_server_web

      15 When shattering your web OTP app into more pieces, you may end up with some forced separations, because some code is needed in two OTP apps that no longer share a common dependency. This was the case for us with interpreter_server_jsonapi. It includes view modules that are common to interpreter_server_rpc and interpreter_server_web. Observer already has its own UI, but the steps necessary to connect over SSH port forwards to containerized hosting also need a good user experience, so interpreter_server_observer contains an interactive walkthrough that guides any developer through the commands to copy and paste back and forth between a local iex session and a remote console on our hosting, to get Observer to connect to a production or QA container instead of the one set up by the remote console. This approach may work on Heroku, but we don't use Heroku, so if someone that uses Heroku wants to adapt my work, let me know. interpreter_server_rpc owns the connection to RabbitMQ and supervises all the RPC servers that expose the database owned by interpreter_server to the Ruby on Rails applications. The app also includes RPC clients for interacting with resources owned by those Ruby on Rails applications. interpreter_server_web has been pared down to just the controllers and views for the Ember frontend. The controllers are minimal because the Calcinator package makes them mostly declarative, with a few minor plug functions for authentication or authorization. interpreter_server_web has views not in interpreter_server_jsonapi because there are parts of authorization that are mixed into the view layer, such as role-based hiding of fields, that don't apply to inter-backend communication through interpreter_server_rpc.

      During the shattering of interpreter_server_web, authorization was one of the hardest aspects to disentangle into its own application, and I was never able to make it completely its own layer because there's no good way with Ecto.Schema structs to indicate that a field was censored except in the view layer. Authorization thus ends up mixed into both the Controller/RPC-server layer, for gross authorization of entire structs, and the view layer, for authorization of individual fields.
  16. Store → Access → OTP app

      • Redis → ExRedis → ember
      • Postgres → Ecto.Repo → interpreter
      • Postgres → Ecto.Repo → lighthouse
      • Postgres → Ecto.Repo → partner
      • Memory → GenServer → ssh_tunnel

      16 Our ember app handles fetching, setting, and invalidating the Redis cache for ember-cli. interpreter is for the data owned by interpreter_server, accessed using an Ecto.Repo connected to Postgres. lighthouse and partner are for the two Ruby on Rails applications. Their Elixir applications contain only their Ecto.Schemas in production, but as a way of speeding up integration tests we have a Repo in each that we use to truncate the Rails database out from under the Ruby RPC servers, as it is faster than using rake to have Rails do it for us. As you can see, there are multiple Postgres OTP apps. This is because I would not recommend having the store format decide how to group code into your OTP apps; instead, I'd shatter the OTP apps based on the owner of the data or its use case. So, if we were to have a new use case for Redis, I would make that a separate OTP app from ember. If some of the code in ember turned out to be useful to the new Redis-using OTP app, I'd move that code to a 3rd OTP app that both could share. The 3 Postgres-backed OTP apps reflect the fact that each one is using a distinct database. This can in theory allow you to deploy each OTP app at different rates, but we're not at that stage of eating the Ruby on Rails applications' lunches yet. ssh_tunnel is probably the most classic OTP app: it is a supervision tree of GenServers that track pids of ports to external OS processes. It also includes an interface for setting up SSH keys.

      If you're wondering why that's necessary when SSH is a library in the Erlang standard library: the ssh in the Erlang standard library is incomplete and really out of date. It doesn't handle port forwarding, and most modern SSH server implementations will reject Erlang ssh as a client because it only supports old and insecure authentication mechanisms, so I found it simpler (because our hosting container happened to have a working ssh client lying around) to use OS-provided ssh instead of Erlang ssh.
  17. Ecto

      • Repo
        • database access
      • Changeset
        • Type-casting
        • Validation
        • Change tracking

      17 I need to emphasize this, because it's important when searching for a validation or params-casting library: Ecto is both a way to talk to your database, using Ecto.Repo, and also, far more generally useful, a way of validating params (even when they don't come from the internet through Phoenix) and tracking changes to structs. At CSD we use Ecto for converting to and from params in Retort for doing RPC over RabbitMQ, in ssh_tunnel for making in-memory tracked ssh client processes, and for its more common usage of accessing interpreter's Postgres database.
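For example, Ecto.Changeset can type-cast and validate params with no database in sight, via schemaless changesets. A minimal sketch, assuming Ecto ~> 3.x pulled in with `Mix.install`; the host/port fields are illustrative, not from ssh_tunnel's real schema:

```elixir
Mix.install([{:ecto, "~> 3.11"}])

import Ecto.Changeset

# A bare {data, types} pair stands in for a schema: no Repo, no database.
types = %{host: :string, port: :integer}
params = %{"host" => "tunnel.example.com", "port" => "2222"}

changeset =
  {%{}, types}
  |> cast(params, Map.keys(types))        # type-casts "2222" -> 2222
  |> validate_required([:host, :port])
  |> validate_number(:port, greater_than: 0, less_than: 65_536)

changeset.valid?          #=> true
changeset.changes.port    #=> 2222
```

The same `cast`/`validate_*` pipeline works for RPC params or in-memory GenServer state just as well as for data headed to Postgres.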
  18. Stages of OTP apps

      1. Namespace
      2. Separate OTP app
      3. Separate Repository
      4. Hex Package

      18 When applying these considerations to your own project, understand that making a new OTP app in your umbrella project can be just an intermediate step on the way to making the code a separate, distinct Hex package. Not all code needs to or should become a hex package though. A namespace doesn't need much justification beyond getting tired of repeating a bunch of prefixes in your function names or needing some place to put a defstruct. To move to a separate OTP app in an umbrella project, the contents of that app need to be testable and usable on their own. If you move a namespace into its own OTP app, but you can't test or use it effectively from iex without bolting another OTP app on top, then it's probably not worth being a separate OTP app. Going from a separate OTP app to a separate repository has both pros and cons. The pro, from my personal experience, is that you can shed build time by only compiling, testing, and dialyzing that separate repo when it actually changes, as opposed to needing to compile, test, and dialyze all OTP apps within an umbrella project. The con is that you have to do the standardized-coordinated-release-update dance when it turns out you have a breaking change in that repository that requires updating all the downstream repositories. Jumping from a separate repository to a hex package first requires that the repository is publicly published. It also means increased duties, as hopefully you're publishing to hex because you want community usage of your package. This involves a dedication of time to support the package and any community that evolves around it. It is perfectly OK for any OTP app to stop at any of these stages. Sometimes, like for interpreter_server_jsonapi, which just contains views specific to interpreter_server, anything beyond a separate OTP app actually makes maintenance harder.
  19. interpreter_server

      [Platform diagram: RPC Servers, Controllers, and RPC Clients, layered over JSONAPI, JSONRPC, HTTP, RabbitMQ, and Ecto.Repo]

      19 From interpreter_server, CSD has open-sourced 3 packages: Alembic, Calcinator, and Retort. The first package, Alembic, deals with JSONAPI format validation. As you can see from the platform diagram, JSONAPI appears a lot in interpreter_server. We spotted this so early in the design of interpreter_server that Alembic jumped straight from a namespace to an independent hex package without going through the intermediate steps of being an OTP app in an umbrella project. So, this type of component is easy to pick out: find all the places where you interact with the same encoding format and make it a library.
  20. interpreter_server

      [Same platform diagram, with Alembic replacing the JSONAPI boxes]

      20 Sometimes, to find common code that can be extracted into another OTP application, you need to start ignoring the actual data and structs involved and instead look for common transformation pipelines.
  21. RPC Servers and Controllers: Ecto, SSH, Ecto, RPC Client

      [Diagram: four index pipelines compared side by side, from handle_method(:index, ...) or index, through Authorize, then Repo.all / tunnels() / Client.index, then Authorize again, down to result or render]

      21 Zooming in on the RPC Servers and Controllers, you can see there were two types of RPC servers and two types of Phoenix Controllers. The obvious place to unify the servers and the controllers is when they use Ecto, but that leaves the SSH and RPC Client ones out. In this view, we'll concentrate on listing resources with the index action and method. So, I could kinda make their stages line up, but the interfaces just didn't mesh well:

      • Controllers depend on Plug.Conn and plug pipelines, while RPC Servers use the pipe operator with their own request-response structs
      • Only controllers do authorization, because users' permissions are checked in the front-end-facing API
      • Some data is stored in Ecto.Repos, some in-memory for SSH tunnels, and some is remote and only accessible after spawning an RPC client pid
      • Finally, the output of the result function, maps, is incompatible with the render function rendering to Plug.Conn
  22. RPC Servers and Controllers, with unified row names

      [Same diagram, with the rows now labeled Action, Authorization, Resources, Authorization, View, and Rendering]

      22 So, there are a couple of techniques I combined to be able to extract Calcinator. First, combine nomenclatures: JSONRPC may call it a method, but those methods need to support the same operations as a normal JSONAPI controller, so just settle on the controller nomenclature of "action". Next is the issue that RPC servers don't do the authorization, only the controllers do, so borrow the NullObject pattern from OO and have a default Authorization module that does no checks. Third, the RPC-Client-backed controllers have an extra step of getting a client first and then using it to get the list of structs, but if we think about it, Ecto.Repo is really hiding the connection management from us, so we can group those two rows together under Resources. For the authorization of individual structs in the returned list, we'll use the same NullAuthorization for RPC servers. Finally, and this took a while to realize: the result and render rows couldn't be broken up, because the convenience of Phoenix.Controller.render is hiding the fact that it is both calling the view module and then encoding the view output. To make a transport-neutral system, these two steps need to remain separate, so that the common format of JSONAPI `map`s can be correctly injected into either an outer JSONRPC `map` in the RPC servers' case or `Poison`-encoded directly for the `Plug.Conn` response.
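The NullObject authorization trick might look like this. Module names here are hypothetical illustrations, not Calcinator's actual modules:

```elixir
defmodule MyApp.Authorization do
  # Hypothetical behaviour shared by controllers and RPC servers.
  @callback can?(subject :: term, action :: atom, target :: term) :: boolean
end

defmodule MyApp.NullAuthorization do
  # NullObject: RPC servers plug this in, so the shared pipeline can always
  # call `can?/3` without branching on whether this transport authorizes.
  @behaviour MyApp.Authorization

  @impl true
  def can?(_subject, _action, _target), do: true
end

defmodule MyApp.RoleAuthorization do
  # Front-end-facing controllers use a real implementation instead.
  @behaviour MyApp.Authorization

  @impl true
  def can?(%{role: :admin}, _action, _target), do: true
  def can?(_subject, :index, _target), do: true
  def can?(_subject, _action, _target), do: false
end
```

The shared action code then calls `authorization_module.can?(subject, action, target)` unconditionally, and which module is configured decides whether any checking happens.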
  23. Calcinator.index

```elixir
def index(state = %__MODULE__{
            ecto_schema_module: ecto_schema_module,
            subject: subject,
            view_module: view_module
          },
          params,
          %{base_uri: base_uri}) do
  with :ok <- can(state, :index, ecto_schema_module),
       :ok <- allow_sandbox_access(state, params),
       {:ok, list, pagination} <- list(state, params) do
    {authorized, authorized_pagination} = authorized(state, list, pagination)

    {
      :ok,
      view_module.index(
        authorized,
        %{base_uri: base_uri, pagination: authorized_pagination, params: params, subject: subject}
      )
    }
  end
end
```

      23 The steps of (1) calling the action, (2) authorizing the action, (3) getting the resources, (4) authorizing the resources, (5) rendering the view, and (6) returning for encoding are precisely represented in Calcinator.index. The only addition that I didn't mention was support for sandbox access for testing. The index action body only needs to include the happy path because of the :ok matching in the `with`. All sad-path errors are unmatched and so fall out of the `with` to be handled by the caller.
  24. Calcinator.Controller.index

```elixir
def index(conn, params, calcinator = %Calcinator{}) do
  case Calcinator.index(
         %Calcinator{calcinator | subject: get_subject(conn)},
         params,
         %{base_uri: base_uri(conn)}
       ) do
    {:ok, rendered} ->
      conn
      |> put_status(:ok)
      |> put_resp_content_type("application/vnd.api+json")
      |> send_resp(:ok, Poison.encode!(rendered))

    {:error, :unauthorized} ->
      forbidden(conn)

    {:error, document = %Document{}} ->
      render_json(conn, document, :unprocessable_entity)
  end
end
```

      24 Error handling is left to either the RPC server or the Phoenix controller, because JSONRPC has some predefined error handling where errors need to be lifted out of the JSONAPI and into the JSONRPC, while HTTP statuses in the response only work for controllers and HTTP. But it becomes a simple `case` on `:ok` and `:error` tuples, with Calcinator actions doing the heavy lifting.
  25. Pro / Con

      Pro:
      • Can release a subset of OTP apps
      • Only need to test the changed subset
      • Path to hex packages
      • Parallelize dialyze and test

      Con:
      • Docker requires `-w` to run in apps directories
      • Need to coordinate deps updates in multiple apps

      25 Converting to an umbrella project isn't all sunshine and roses:
      • If you use docker, it always assumes the root directory inside the container, so you'll need to pass `-w` to change the current working directory to work in the individual OTP apps.
      • `mix test` behaves differently from the root directory, and we still have a race where the repository is already connected when the `test` alias tries to drop the `Repo` when `mix test` is run from root.

      Those cons, though, have been far outweighed by the ability to run `mix test` and `mix dialyze` on each OTP app, and by eventually being able to open source pieces, which gets them into separate builds.
  26. →☂

      • Keep your domain data stores in separate OTP apps
      • Hide how your data is stored from other OTP apps
      • Extractable OTP apps/libraries can be obscured by conveniences in other libraries
      • Each UI should do the minimum work necessary, in separate OTP apps

      26 When shattering your own project, identify your independent data stores. This isn't the backing technology, such as Postgres, in-memory, or Redis, but data specific to a given domain or user-base that may have independent sourcing or scaling characteristics. You want to hide the backing technology because you may want to change it to optimize for search, caching, Command Query Responsibility Segregation, or any number of other changes in format. Some conveniences from libraries, such as Phoenix.Controller.render or a `use` statement that defines the majority of a module, can obscure commonality in your own apps. Dive into your dependencies' code and understand what they are generating and calling on your behalf, to see if you can stop repeating yourself and extract an OTP app more tuned to your project's needs by jumping down a layer and calling parts of the library directly, using more function calls and less declarative code. In general, assume that declarative code, such as `use` statements, is there as a convenience for new users that want the library as the final layer of their project; if you need to build upon a library, look for the functions that the macros are calling instead of using the macros directly. Finally, separate your UIs into different OTP apps. This allows you to potentially exclude entire OTP apps from releases that don't target a given UI, and it can also point out pieces that really should be in the domain-specific OTP apps, if you keep having to repeat code in the UI OTP apps to make the domains usable; this includes when it's too much work and you need to copy from a lab notebook whenever you want to do anything in iex.
  27. Questions? 27 Luke Imhoff [email protected] Github: @KronicDeth Twitter: @KronicDeth Elixir

    Slack: @KronicDeth Elixir Forum: KronicDeth IRC: KronicDeth I hope this guidance can help lead you into the bright shiny future of umbrella projects. If you need any help, I'm Kronic Deth on Elixir Slack, ElixirForum, IRC, and Twitter, so don't hesitate to ask for help.