
RackHD Workflow Engine implementation design

A refactor of the RackHD workflow engine to use ReactiveX models for processing workflow tasks. Included with a pull request in January 2016.

Joseph Heck

January 16, 2016

Transcript

  1. RackHD Workflow Engine redesign

    January 27, 2016
    © Copyright 2015 EMC Corporation. All rights reserved.
  2. Goals of the new workflow engine

    Performance goals
    •  Entirely database backed (no workflow state kept in memory)
    •  Horizontally scalable, which enables high availability under load
    •  Fault tolerant, which enables high-availability failover scenarios

    Development goals
    •  Stream-based architecture of core components (Reactive paradigm)
        –  Core components of the engine listen to generic event stream APIs (push rather than pull)
        –  Enables flexibility in infrastructure decisions: the underlying database and messaging infrastructure are easy to change based on deployment constraints
        –  Enables fast response and execution times for workflow tasks
    •  Modular and extensible
    •  Backwards compatible with current code
  3. Implementation details

    Architectural decisions to achieve goals
    •  HA/Fault tolerance
        –  Atomic checkout: All eligible Schedulers or Task Runners receive each request, but only one succeeds in checking out a lease to handle it, somewhat like a leased queue model. This leverages existing database technologies (currently MongoDB).
        –  Lease heartbeat: Workflow engine instances heartbeat the tasks they own, so that other instances can check those tasks out when a heartbeat times out.
        –  Backup mechanisms for dropped events: Optimized pollers queue up dropped events for re-evaluation (events are dropped under high-load throttling and in catastrophic failure conditions).
    •  Scalability
        –  Domains: Workflow instances can be configured to handle different domains of tasks, and can be machine independent as long as the database and messaging are shared.
        –  Stateless: Horizontal scalability is achieved by designing the processes to run in an essentially stateless mode. The database has the last word.
        –  Optimized data structures: The current data structures and mongo collections/indexes are updated for fast querying and improved indexing.
    •  Development
        –  Reactive: Utilize the ReactiveX libraries for Node.js and design loosely coupled components joined by stream APIs.
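The atomic checkout and lease heartbeat ideas above can be sketched with an in-memory stand-in for the database. This is a minimal sketch under stated assumptions, not the RackHD implementation: the `LeaseStore` class, its method names, and the timeout value are hypothetical, and a `Map` plays the role that a single atomic conditional update would play in MongoDB.

```javascript
// Hypothetical in-memory stand-in for the shared database.
// In a real deployment the check-and-set inside checkout() would have to be
// one atomic conditional update performed by the database.
class LeaseStore {
  constructor(leaseTimeoutMs) {
    this.leaseTimeoutMs = leaseTimeoutMs;
    this.tasks = new Map(); // taskId -> { lease, heartbeat }
  }

  addTask(taskId) {
    this.tasks.set(taskId, { lease: null, heartbeat: null });
  }

  // Succeeds only if the task is unowned or its owner's heartbeat timed out.
  checkout(taskId, runnerId, now) {
    const task = this.tasks.get(taskId);
    if (!task) return false;
    const expired = task.heartbeat !== null &&
      now - task.heartbeat > this.leaseTimeoutMs;
    if (task.lease !== null && !expired) return false;
    task.lease = runnerId;
    task.heartbeat = now;
    return true;
  }

  // Refreshes the heartbeat on every task the runner owns; returns the count.
  heartbeat(runnerId, now) {
    let count = 0;
    for (const task of this.tasks.values()) {
      if (task.lease === runnerId) {
        task.heartbeat = now;
        count += 1;
      }
    }
    return count;
  }
}

const store = new LeaseStore(10000);
store.addTask('task-1');

// Both runners receive the request, but only the first checkout wins.
const firstWins = store.checkout('task-1', 'runner-A', 0);
const secondLoses = store.checkout('task-1', 'runner-B', 1);

// runner-A heartbeats once at t=5000 and then stops.
store.heartbeat('runner-A', 5000);
const tooEarly = store.checkout('task-1', 'runner-B', 12000);
const takeover = store.checkout('task-1', 'runner-B', 20000);

console.log(firstWins, secondLoses, tooEarly, takeover);
```

Timestamps are passed in explicitly so the takeover behaviour can be exercised without waiting on a real clock.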
  4. Implementation details

    •  Database strategy
        –  mongo.pxe.taskdependencies: A dedicated collection for task running and graph evaluation. The taskdependency document structure allows most of the workflow evaluation logic to be performed with optimized database queries instead of code.

    Task Dependency document

        {
            "taskId" : "bd1ff046-a8e5-4587-b5e3-3ac7d5d8a974",
            "graphId" : "da5d101d-27f6-4b17-9e90-acad06dfa90c",
            "state" : "pending",
            "dependencies" : {
                "afba1db1-07aa-4ffe-814a-84dbcc02bf9b" : "finished"
            },
            "terminalOnStates" : [
                "timeout",
                "cancelled",
                "failed"
            ],
            "domain" : "default",
            "evaluated" : false,
            "reachable" : true,
            "taskRunnerLease" : null,
            "taskRunnerHeartbeat" : null,
            "createdAt" : ISODate("2016-01-27T22:12:42.571Z"),
            "updatedAt" : ISODate("2016-01-27T22:12:42.571Z"),
            "_id" : ObjectId("56a940da7eb4035519b2d6e7")
        }

    Explanation of fields
    •  terminalOnStates: Hints during graph evaluation about whether the graph could potentially be finished
    •  domain: Separation if multiple domains are in use
    •  evaluated: Used for two-phase commits/transactions
    •  reachable: Enables branching logic in workflow definitions
    •  dependencies: References to other tasks. When they finish, this object gets updated. When a dependencies object is empty, that task is ready to run.
    •  taskRunnerLease/taskRunnerHeartbeat: Enable atomic checkout if multiple task runners are running, and enable recovery from failed task runners.
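The way the `dependencies` and `reachable` fields drive evaluation can be illustrated on plain objects. This is a rough sketch, not the engine's code: the helper names are hypothetical, and two rules are assumptions made for illustration (the value "finished" is treated as matching any finished state, and a task whose expected state can no longer occur is marked unreachable). In the design described above, this logic would run as database queries rather than in application code.

```javascript
// Assumption: a dependency is satisfied when the finished task's state equals
// the expected state, or when the expected state is the catch-all "finished".
function updateDependencies(docs, finishedTaskId, finishedState) {
  docs.forEach(function (doc) {
    const expected = doc.dependencies[finishedTaskId];
    if (expected === undefined) return; // this task was not waiting on it
    if (expected === finishedState || expected === 'finished') {
      delete doc.dependencies[finishedTaskId];
    } else {
      doc.reachable = false; // assumed rule: this branch will never run
    }
  });
}

// A task is ready when it is pending, reachable, and waiting on nothing.
function findReadyTasks(docs) {
  return docs.filter(function (doc) {
    return doc.state === 'pending' && doc.reachable &&
      Object.keys(doc.dependencies).length === 0;
  });
}

const docs = [
  { taskId: 'a', state: 'succeeded', reachable: true, dependencies: {} },
  { taskId: 'b', state: 'pending', reachable: true, dependencies: { a: 'succeeded' } },
  { taskId: 'c', state: 'pending', reachable: true, dependencies: { a: 'failed' } }
];

updateDependencies(docs, 'a', 'succeeded');
const readyIds = findReadyTasks(docs).map(function (doc) { return doc.taskId; });
console.log(readyIds);
```

Task `b` becomes ready because its only dependency was satisfied, while task `c` sits on the failure branch and is excluded.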
  5. Implementation details

    Example workflow task dependencies

    Graph definition

        {
            "friendlyName": "noop-graph",
            "injectableName": "Graph.noop-test",
            "options": {},
            "tasks": [
                {
                    "label": "noop-1",
                    "taskName": "Task.noop"
                },
                {
                    "label": "noop-2",
                    "taskName": "Task.noop",
                    "waitOn": {
                        "noop-1": "succeeded"
                    }
                }
            ]
        }

    Task dependency documents

        /* 1 */
        {
            "taskId" : "9ba3a261-4eb7-40c3-919d-c4e07607bc5c",
            "graphId" : "8fbabec5-a62e-4e90-901c-5ff353388d8f",
            "state" : "pending",
            "dependencies" : {},
            "terminalOnStates" : [ "cancelled", "failed", "timeout" ],
            "domain" : "default",
            "evaluated" : false,
            "reachable" : true,
            "taskRunnerLease" : null,
            "taskRunnerHeartbeat" : null,
            "createdAt" : ISODate("2016-01-27T23:13:30.426Z"),
            "updatedAt" : ISODate("2016-01-27T23:13:30.426Z"),
            "_id" : ObjectId("56a94f1a929736c5b6e0cf6a")
        }

        /* 2 */
        {
            "taskId" : "fff3e311-19e3-40fb-bd49-24f5667cb51d",
            "graphId" : "8fbabec5-a62e-4e90-901c-5ff353388d8f",
            "state" : "pending",
            "dependencies" : {
                "9ba3a261-4eb7-40c3-919d-c4e07607bc5c" : "succeeded"
            },
            "terminalOnStates" : [ "cancelled", "failed", "timeout", "succeeded" ],
            "domain" : "default",
            "evaluated" : false,
            "reachable" : true,
            "taskRunnerLease" : null,
            "taskRunnerHeartbeat" : null,
            "createdAt" : ISODate("2016-01-27T23:13:30.430Z"),
            "updatedAt" : ISODate("2016-01-27T23:13:30.431Z"),
            "_id" : ObjectId("56a94f1a929736c5b6e0cf6b")
        }
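The mapping from the graph definition to its task dependency documents can be sketched as follows. The helper `buildTaskDependencies` is hypothetical and only shows how label-based `waitOn` entries could become id-based `dependencies`. It leaves out `terminalOnStates`, timestamps, and `_id`, since the slides do not spell out how those are produced.

```javascript
const { randomUUID } = require('crypto');

// Hypothetical helper: expand a graph definition into one task dependency
// document per task, translating `waitOn` labels into generated task ids.
function buildTaskDependencies(definition, graphId) {
  const idsByLabel = {};
  definition.tasks.forEach(function (task) {
    idsByLabel[task.label] = randomUUID();
  });

  return definition.tasks.map(function (task) {
    const dependencies = {};
    Object.keys(task.waitOn || {}).forEach(function (label) {
      dependencies[idsByLabel[label]] = task.waitOn[label];
    });
    return {
      taskId: idsByLabel[task.label],
      graphId: graphId,
      state: 'pending',
      dependencies: dependencies,
      domain: 'default',
      evaluated: false,
      reachable: true,
      taskRunnerLease: null,
      taskRunnerHeartbeat: null
    };
  });
}

const noopGraph = {
  friendlyName: 'noop-graph',
  injectableName: 'Graph.noop-test',
  options: {},
  tasks: [
    { label: 'noop-1', taskName: 'Task.noop' },
    { label: 'noop-2', taskName: 'Task.noop', waitOn: { 'noop-1': 'succeeded' } }
  ]
};

const taskDocs = buildTaskDependencies(noopGraph, randomUUID());
console.log(JSON.stringify(taskDocs, null, 2));
```

The first document comes out with an empty `dependencies` object, so it is immediately ready to run; the second waits on the first task's generated id.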
  6. Implementation details

    •  Lifecycle of a taskdependency document (some fields hidden)

    1.  A workflow is run. The task document is created.

        {
            "state" : "pending",
            "dependencies" : {
                "afba1db1-07aa-4ffe-814a-84dbcc02bf9b" : "succeeded"
            },
            "domain" : "default",
            "evaluated" : false,
            "reachable" : true,
            "taskRunnerLease" : null,
            "taskRunnerHeartbeat" : null
        }

    2.  The task with id "afba1db1-07aa-4ffe-814a-84dbcc02bf9b" finishes with state "succeeded" and the dependencies object is updated accordingly.

        {
            "state" : "pending",
            "dependencies" : { },
            "domain" : "default",
            "evaluated" : false,
            "reachable" : true,
            "taskRunnerLease" : null,
            "taskRunnerHeartbeat" : null
        }

    3.  The task is checked out, heartbeated, and run by a task runner.

        {
            "state" : "pending",
            "dependencies" : { },
            "domain" : "default",
            "evaluated" : false,
            "reachable" : true,
            "taskRunnerLease" : "3e80b1ef-d10f-41ca-95e5-13fa5920aaf5",
            "taskRunnerHeartbeat" : ISODate("2016-01-27T23:00:25.991Z")
        }

    4.  The task completes with state "succeeded" and is then evaluated by the task scheduler (updating the dependencies objects for other task documents, etc.). The combination of a finished state and evaluated set to true means the document will be picked up for background deletion.

        {
            "state" : "succeeded",
            "dependencies" : { },
            "domain" : "default",
            "evaluated" : true,
            "reachable" : true,
            "taskRunnerLease" : "3e80b1ef-d10f-41ca-95e5-13fa5920aaf5",
            "taskRunnerHeartbeat" : ISODate("2016-01-27T23:00:37.876Z")
        }
  7. Implementation details

    Code/project structure

    on-http
        lib/services/workflow-api-service.js
    on-core
        lib/workflow/stores/mongo.js
        lib/workflow/messengers/messenger-AMQP.js
        lib/workflow/task-graph.js (moved/refactored from on-taskgraph to expose to on-http)
    on-tasks
        lib/task.js
        index.js (exposes base task library, deprecating soon)
    on-taskgraph
        lib/task-scheduler.js
        lib/lease-expiration-poller.js
        lib/completed-task-poller.js
        lib/task-runner.js

    •  on-http/lib/services/workflow-api-service.js
        –  Handles creating and persisting task graph objects from workflow API requests
    •  on-taskgraph/lib/task-scheduler.js
        –  Evaluates graph state and schedules new tasks
    •  on-taskgraph/lib/lease-expiration-poller.js
        –  Expires task leases from failed task runners so that the tasks can be picked up by the scheduler
    •  on-taskgraph/lib/completed-task-poller.js
        –  Deletes finished task dependency documents from the database. Also queues evaluation for graphs in scheduler failure cases.
    •  on-taskgraph/lib/task-runner.js
        –  Receives task run events, loads the task, and runs its job code
    •  on-core/lib/workflow/stores/*.js
        –  Database interfaces for graph logic
    •  on-core/lib/workflow/messengers/*.js
        –  Messaging interfaces for graph events
    •  on-core/lib/workflow/task-graph.js
        –  TaskGraph creation, validation, and persistence code
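The two pollers described above can be pictured as small passes over the task dependency documents. This is an illustrative sketch only: the function names and the timeout are hypothetical, the list of finished states is assumed from the states that appear in the example documents, and an array stands in for the MongoDB collection.

```javascript
// Assumption: these are the finished states, taken from the example documents.
const FINISHED_STATES = ['succeeded', 'failed', 'cancelled', 'timeout'];

// Lease-expiration pass: release leases whose heartbeat is older than the
// timeout so another runner can check the task out. Returns how many expired.
function expireLeases(docs, now, timeoutMs) {
  let expired = 0;
  docs.forEach(function (doc) {
    if (doc.taskRunnerLease !== null &&
        now - doc.taskRunnerHeartbeat.getTime() > timeoutMs) {
      doc.taskRunnerLease = null;
      doc.taskRunnerHeartbeat = null;
      expired += 1;
    }
  });
  return expired;
}

// Completed-task pass: drop documents that are both finished and evaluated.
function deleteCompleted(docs) {
  return docs.filter(function (doc) {
    return !(doc.evaluated && FINISHED_STATES.indexOf(doc.state) !== -1);
  });
}

const pollDocs = [
  { taskId: 'done', state: 'succeeded', evaluated: true,
    taskRunnerLease: 'runner-A', taskRunnerHeartbeat: new Date(1000) },
  { taskId: 'stuck', state: 'pending', evaluated: false,
    taskRunnerLease: 'runner-B', taskRunnerHeartbeat: new Date(1000) },
  { taskId: 'live', state: 'pending', evaluated: false,
    taskRunnerLease: 'runner-C', taskRunnerHeartbeat: new Date(29000) }
];

const remaining = deleteCompleted(pollDocs);
const expiredCount = expireLeases(remaining, 30000, 10000);
console.log(remaining.map(function (doc) { return doc.taskId; }), expiredCount);
```

Here the finished and evaluated document is removed, the stale lease held by `runner-B` is released, and the recently heartbeated lease is left alone.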
  8. Implementation details

    on-taskgraph repository higher-level architecture

    [Diagram: components task-scheduler, lease-expiration-poller, completed-task-poller, and task-runner; modes SCHEDULER MODE and TASK RUNNER MODE; backing services MONGO and AMQP]
  9. Implementation details

    on-http repository higher-level architecture

    Run new workflow:

        POST /api/current/workflows/active
             /api/current/nodes/<id>/workflows

    [Diagram labels: MONGO, AMQP]

    Route logic:
    1.  Generate a uuid (TaskGraph identifier)
    2.  Create a new TaskGraph object with it
    3.  Validate the object (happens during creation)
    4.  Persist the graph object to mongo.pxe.graphobjects
    5.  Persist task objects to mongo.pxe.taskdependencies
    6.  Publish an event to the Task Scheduler to evaluate the graph
    7.  Return the TaskGraph uuid to the client
    8.  If the Task Scheduler is down or crashes, it will pick the graph up out of the database.
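The route logic above can be walked through in a few lines. This is a sketch under stated assumptions, not the on-http code: `runWorkflow`, `validate`, the `db` arrays, and `messenger.publishEvaluateGraph` are hypothetical stand-ins for the service, the two MongoDB collections, and the AMQP publisher.

```javascript
const { randomUUID } = require('crypto');

// Hypothetical in-memory stand-ins for the collections and the publisher.
const db = { graphobjects: [], taskdependencies: [] };
const published = [];
const messenger = {
  publishEvaluateGraph: function (graphId) {
    published.push({ graphId: graphId });
  }
};

// Hypothetical validation: only checks the fields this sketch relies on.
function validate(definition) {
  if (!definition.injectableName || !Array.isArray(definition.tasks)) {
    throw new Error('Invalid graph definition');
  }
}

function runWorkflow(definition) {
  const graphId = randomUUID();              // 1. generate the identifier
  validate(definition);                      // 3. validation during creation
  const graph = { instanceId: graphId, definition: definition }; // 2. create
  db.graphobjects.push(graph);               // 4. persist the graph object
  definition.tasks.forEach(function (task) { // 5. persist the task objects
    db.taskdependencies.push({
      taskId: randomUUID(),
      graphId: graphId,
      label: task.label,
      state: 'pending',
      evaluated: false
    });
  });
  messenger.publishEvaluateGraph(graphId);   // 6. ask the scheduler to evaluate
  return graphId;                            // 7. returned to the client
}

const newGraphId = runWorkflow({
  injectableName: 'Graph.noop-test',
  tasks: [{ label: 'noop-1' }, { label: 'noop-2' }]
});
console.log(newGraphId, db.taskdependencies.length, published.length);
```

Because the graph and its tasks are persisted before the event is published, a scheduler that misses the event can still find unevaluated work in the database, which is the point of step 8.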
  10. Implementation details

    task-scheduler stream architecture

    Event sources: AMQP events, unevaluatedTaskPoller, evaluatedTaskPoller

    •  evaluateTaskStream
        –  Update task dependencies (createUpdateTaskDependenciesSubscription())
        –  updateTaskDependencies() THEN handleEvaluatedTask():
            If (state is terminal): checkGraphFinishedStream
            If (state is NOT terminal): evaluateGraphStream
    •  evaluateGraphStream
        –  Find all pending tasks within graph that have an empty dependencies object (findReadyTasks())
        –  Schedule ready tasks (handleScheduleTaskEvent())
    •  checkGraphFinishedStream
        –  Check Graph finished (createCheckGraphFinishedSubscription())
            If (task has terminal failed state): failGraph()
            Else: Check if graph is succeeded, and complete it if so (checkGraphSucceeded())
  11. Implementation details

    task-scheduler stream architecture

    [Diagram: repeats the previous slide's evaluateTaskStream, evaluateGraphStream, and checkGraphFinishedStream flow, with the backing services AMQP and MONGO added]
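The branching in the scheduler slides above can be condensed into one decision function. This is a simplified sketch, not the scheduler's code: it returns a description of the action instead of pushing onto streams, the list of failed states is an assumption, and the rule that a graph has succeeded when no reachable task is still pending is an assumption made for illustration.

```javascript
// Assumption: which finished states count as failures.
const FAILED_STATES = ['failed', 'timeout', 'cancelled'];

// Mirrors findReadyTasks(): pending, reachable tasks with no dependencies left.
function findReadyTaskIds(graphTasks) {
  return graphTasks
    .filter(function (task) {
      return task.state === 'pending' && task.reachable &&
        Object.keys(task.dependencies).length === 0;
    })
    .map(function (task) { return task.taskId; });
}

// Decides what follows once a finished task's dependents have been updated.
function handleEvaluatedTask(task, graphTasks) {
  const isTerminal = task.terminalOnStates.indexOf(task.state) !== -1;
  if (!isTerminal) {
    // evaluateGraphStream branch: schedule whatever became ready.
    return { action: 'schedule', taskIds: findReadyTaskIds(graphTasks) };
  }
  // checkGraphFinishedStream branch.
  if (FAILED_STATES.indexOf(task.state) !== -1) {
    return { action: 'failGraph' };
  }
  const stillPending = graphTasks.some(function (other) {
    return other.reachable && other.state === 'pending';
  });
  return { action: stillPending ? 'none' : 'completeGraph' };
}

const first = { taskId: 'noop-1', state: 'succeeded', reachable: true,
  dependencies: {}, terminalOnStates: ['cancelled', 'failed', 'timeout'] };
const second = { taskId: 'noop-2', state: 'pending', reachable: true,
  dependencies: {}, terminalOnStates: ['cancelled', 'failed', 'timeout', 'succeeded'] };

const afterFirst = handleEvaluatedTask(first, [first, second]);
second.state = 'succeeded';
const afterSecond = handleEvaluatedTask(second, [first, second]);
first.state = 'failed';
const afterFailure = handleEvaluatedTask(first, [first, second]);

console.log(afterFirst, afterSecond, afterFailure);
```

The three calls cover the three outcomes in the diagram: scheduling the next ready task, completing the graph, and failing the graph.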
  12. Implementation details

    task-runner stream architecture

    •  runTaskStream
        –  checkoutTask() (atomic db)
        –  If (success):
            Get task definition (getTaskById())
            Run task (Task.run())
            Publish when task is finished
    •  cancelTaskStream
        –  Task.cancel()
    •  heartbeat (interval)
        –  Update every task document the runner owns

    [Diagram labels: AMQP events, AMQP, MONGO]
  13. Implementation details

    Very brief intro to Rx.js style

    Simple functional style

        var array = [1,2,3,4,5];
        var values = array.map(function(item) {
            var value = item * 10;
            return value;
        });
        console.log(values);

        // [10, 20, 30, 40, 50]

    Rx: functional style with streams

        var Rx = require('rx');

        var emitter = new Rx.Subject();

        emitter
        .map(function(item) {
            var value = item * 10;
            return value;
        })
        .subscribe(function(value) {
            console.log(value);
        });

        // like emit(value) in EventEmitter
        // vocabulary
        emitter.onNext(1);
        emitter.onNext(2);
        emitter.onNext(3);
        emitter.onNext(4);
        emitter.onNext(5);

        // 10
        // 20
        // 30
        // 40
        // 50

    Some pseudo-code for real world circumstances

        var Rx = require('rx');

        // Pseudo-code: amqpRunTaskSubscription, mongo, and self are not defined here.
        var taskEmitter = amqpRunTaskSubscription;

        taskEmitter
        // map is used as if each step returned its value directly; with
        // promise-returning calls, flatMap would be used instead.
        .map(function(task) {
            return mongo.getTaskDocument(task.id);
        })
        .map(function(task) {
            return self.scheduleTask(task);
        })
        .subscribe(function(task) {
            console.log('Scheduled task ' + task.id);
        });