RackHD Debugging and Isolation

Overview of RackHD for the purposes of assisting with debugging and isolation of unexpected results...

Joseph Heck

April 26, 2016

Transcript

  1. RackHD Debugging and Isolation

  2. Taking Data to Information
    •  Analysis is critical to resolving defects
    •  Software QA can go much deeper
      – All the code is available to investigate
      – Architecture is open
      – Responsibilities for components are well defined
    •  A roadmap to learn never hurts, and that is what this deck includes
  3. Isolation
    •  Break down the problem into smaller parts
    •  Look for the natural boundaries in the system
    •  Implies you need to know how it is built and how it interacts
      – Reasoning about components
    •  But what if you don't know?
      – Or can't remember?
  4. Scientific Method
    •  How to learn what it really does
      – The docs lie, programmers make mistakes, and there are always unintended consequences
    •  Scientific Method Process
      – Ask a question
      – Do the background research
      – Construct a hypothesis
      – Test your hypothesis
      – Analyze your data, draw a conclusion
      – Communicate your results
  5. Existing tests
    •  Integration tests
      – https://github.com/RackHD/RackHD/tree/master/test
    •  Unit tests for smaller software components
      – https://coveralls.io/github/RackHD/on-core
      – https://coveralls.io/github/RackHD/on-dhcp-proxy
      – https://coveralls.io/github/RackHD/on-http
      – https://coveralls.io/github/RackHD/on-syslog
      – https://coveralls.io/github/RackHD/on-taskgraph
      – https://coveralls.io/github/RackHD/on-tasks
      – https://coveralls.io/github/RackHD/on-tftp
    •  http://rackhd.readthedocs.org/en/latest/repositories.html#repositories-status
  6. The Background Research
    •  RackHD logical architecture
    •  Core concepts
      – PXE booting: how it works
      – Workflow engine interactions
      – PXE booting a microkernel to extend reach
      – Profiles and templates
    •  Configuration and logs
  7. Process/Communications Architecture
    http://rackhd.readthedocs.org/en/latest/software_architecture.html#major-components

  8. Logical Architecture
    [Diagram. Components: on-syslog, on-tftp, ISC dhcp, on-dhcp-proxy, on-http, rabbitmq, mongodb, on-taskgraph. Protocols: SNMP, IPMI, AMT, REDFISH, … Other labels: clients, Incoming Events, Workflow Engine, Outgoing Actions]
  9. Configuration and Logs
    on-* process
    https://github.com/RackHD/on-core/blob/master/lib/services/configuration.js
    https://github.com/RackHD/on-core/blob/fea46c/lib/common/messenger.js#L85
    Configuration by key
      – Order of precedence:
        •  Command-line argument
        •  Environment variable
        •  Configuration file
          •  /opt/monorail/config.json
          •  /opt/onrack/etc/monorail.json
        •  In-code defaults
          •  uri = configuration.get('amqp', 'amqp://localhost')
          •  Distributed in code where used/needed
    Logs (on-* processes write to stdout)
      •  upstart: /var/log/upstart/on-*
      •  docker: docker logs {docker_id}
      •  systemd: journalctl
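
The order of precedence above can be sketched as a small lookup. This is an illustration only, not RackHD's implementation (that lives in on-core's configuration.js and is written in NodeJS). The function names, the idea that the first existing file wins, and passing command-line overrides as a dictionary are all assumptions made for the sketch.

```python
import json
import os
from pathlib import Path

# File locations taken from the slide; the search order is an assumption.
CONFIG_FILES = ["/opt/monorail/config.json", "/opt/onrack/etc/monorail.json"]


def load_file_config(paths=CONFIG_FILES):
    """Return the parsed contents of the first config file found, else {}."""
    for path in paths:
        candidate = Path(path)
        if candidate.is_file():
            return json.loads(candidate.read_text())
    return {}


def get_config(key, default=None, argv_overrides=None, env=None, file_config=None):
    """Resolve key: command line, then environment, then file, then default.

    Hypothetical helper: the sources are injectable so the lookup can be
    exercised without touching the real environment or filesystem.
    """
    argv_overrides = argv_overrides or {}
    env = os.environ if env is None else env
    file_config = load_file_config() if file_config is None else file_config
    for source in (argv_overrides, env, file_config):
        if key in source:
            return source[key]
    return default  # the in-code default supplied at the call site
```

When debugging an unexpected setting, walking the same list from the top (command line first, in-code default last) shows which source actually supplied the value.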
  10. Basic DHCP, no Proxy (from http://download.intel.com/design/archives/wfm/downloads/pxespec.pdf)

  11. DHCP w/ local Proxy (from http://download.intel.com/design/archives/wfm/downloads/pxespec.pdf)

  12. DHCP w/ remote Proxy (from http://download.intel.com/design/archives/wfm/downloads/pxespec.pdf)

  13. PXE (additional reading)
    RackHD overview description
    •  http://rackhd.readthedocs.org/en/latest/how_it_works.html
    PXE: what it is, how it works
    •  https://en.m.wikipedia.org/wiki/Preboot_Execution_Environment
    The PXE Spec:
    •  http://download.intel.com/design/archives/wfm/downloads/pxespec.pdf
    DHCP
    •  https://en.wikipedia.org/wiki/Dynamic_Host_Configuration_Protocol
    DHCP Proxy
    •  http://www.juniper.net/documentation/en_US/junos13.3/topics/concept/dhcp-extended-dhcp-relay-proxy-overview.html
  14. iPXE follow on
    [Diagram labels: client system, RackHD, iPXE request for a script, "profiles" API]
    Response:
    •  Don't know the node: discover it
    •  Known node, no workflow: no-op or default response
    •  Known node, workflow: response from workflow
    http://rackhd.readthedocs.org/en/latest/devguide/index.html#rackhd-debugging-guide
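
The three response cases above reduce to a short decision function. A minimal sketch, assuming a set of known node ids and a mapping from node id to its active workflow; the names and return values are hypothetical and do not mirror the on-http code.

```python
def choose_profile_response(node_id, known_nodes, active_workflows):
    """Pick what to send back for an iPXE script request.

    known_nodes: set of node ids RackHD has already seen (assumed shape).
    active_workflows: dict of node id -> workflow dict with a "profile"
    entry (assumed shape).
    """
    if node_id not in known_nodes:
        return "discover"            # unknown node: start discovery
    workflow = active_workflows.get(node_id)
    if workflow is None:
        return "default"             # known node, nothing running: no-op/default
    return workflow["profile"]       # known node with a workflow: it decides
```

For isolation, this gives three questions to ask of a node that booted unexpectedly: was it known, did it have an active workflow, and which profile did that workflow supply?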
  15. What is a workflow
    [Diagram labels: Graph, Task, Job]
    Graph
    •  JSON document
    •  Describes flow of execution
    •  Wrapper for shared options and context values
    Task
    •  JSON data only
    •  1:1 ratio of tasks to jobs
    •  Can have 0-n tasks as run dependencies in a graph
    •  Target nodes or arbitrary code execution
    Job
    •  NodeJS code backing the Task declaration
    •  Simply a class with a run function
    •  Configuration comes from Task JSON
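
The Task/Job split can be illustrated with a toy example: the task is pure JSON data, and the job is a class with a run function that receives its configuration from that JSON. Real RackHD jobs are NodeJS; Python stands in here, and the field names (label, jobName, options) and the registry are assumptions for the sketch rather than RackHD's schema.

```python
import json

# Task: JSON data only. Field names are hypothetical.
TASK_JSON = """
{
  "label": "say-hello",
  "jobName": "Job.Example.Echo",
  "options": {"message": "hello from the task definition"}
}
"""


class EchoJob:
    """Job: code backing a task; simply a class with a run function."""

    def __init__(self, options):
        self.options = options  # configuration comes from the task JSON

    def run(self):
        return self.options["message"]


# 1:1 mapping from a task's job name to the class that implements it.
JOB_REGISTRY = {"Job.Example.Echo": EchoJob}


def run_task(task_json):
    """Parse a task document, build its job, and run it."""
    task = json.loads(task_json)
    job_class = JOB_REGISTRY[task["jobName"]]
    return job_class(task["options"]).run()
```

The practical consequence for debugging: a surprising result is either in the data (the task's options) or in the code (the job's run function), and the two can be inspected separately.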
  16. Task Flow Example w/ Failure Handling
    [Diagram. Nodes: Task-A, Task-B, Task-C, Task-D, Success. Edge labels: succeeded, finished, failed]
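
The diagram's edges did not survive in this transcript, so the wiring below is an assumption: Task-B waits for Task-A to succeed, Task-C waits for it to fail, and Task-D runs once Task-A has finished either way. Under that assumption, deciding which tasks may run next looks roughly like this sketch (the function names are hypothetical).

```python
# task -> {dependency: required state}; the wiring is assumed, see above.
GRAPH = {
    "Task-A": {},
    "Task-B": {"Task-A": "succeeded"},
    "Task-C": {"Task-A": "failed"},
    "Task-D": {"Task-A": "finished"},
}


def is_satisfied(required, actual):
    """'finished' matches either outcome; other states must match exactly."""
    if actual is None:
        return False  # dependency has not completed yet
    if required == "finished":
        return actual in ("succeeded", "failed")
    return required == actual


def runnable_tasks(graph, states):
    """Return tasks that have not run and whose dependencies are all met."""
    ready = []
    for task, wait_on in graph.items():
        if task in states:
            continue
        if all(is_satisfied(req, states.get(dep)) for dep, req in wait_on.items()):
            ready.append(task)
    return ready
```

With no task states recorded, only Task-A is runnable; after Task-A fails, the failure-handling branch (Task-C) and the always-run branch (Task-D) become runnable while Task-B never does.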
  17. Workflow Tasks
    •  Run commands, tooling
      – IPMI
      – SNMP
      – RACADM
    •  Interact with RackHD data
      – Read catalog data
      – Set catalog values, node values
    •  Provide responses for PXE
      – DHCP
      – TFTP
      – HTTP
  18. Profiles and Templates
    [Diagram labels: workflow, HTTP, iPXE bootloader, any "southward" initiated HTTP request]
    GET /api/1.1/profiles
    •  https://github.com/RackHD/on-http/blob/master/lib/api/1.1/southbound/profiles.js
    GET /api/1.1/templates/{id}
    •  https://github.com/RackHD/on-http/blob/master/lib/api/1.1/southbound/templates.js
    Profile == iPXE script
    •  https://github.com/RackHD/on-http/tree/master/data/profiles
    Template == any generalized template
    •  https://github.com/RackHD/on-http/tree/master/data/templates
    Rendered as EJS template with context from the active workflow related to the node
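
To show what "rendered with context from the active workflow" means, here is a deliberately tiny stand-in for EJS. It is not EJS: it handles only the `<%= name %>` output tag via a regular expression, while real EJS supports much more. The profile text and the context keys (server, port) are invented for the illustration.

```python
import re

# Matches an EJS-style output tag such as <%= server %>.
_OUTPUT_TAG = re.compile(r"<%=\s*(\w+)\s*%>")


def render(template, context):
    """Replace each <%= name %> tag with str(context[name])."""
    return _OUTPUT_TAG.sub(lambda match: str(context[match.group(1)]), template)


# Hypothetical profile; real ones live under on-http's data/profiles.
PROFILE = "#!ipxe\nchain http://<%= server %>:<%= port %>/api/1.1/profiles\n"
```

Calling render(PROFILE, {"server": "10.1.1.1", "port": 9080}) fills in both tags. If a node receives a script with the wrong values, the context supplied by the workflow is the first place to look, since the template itself is static.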
  19. Microkernel Tasks
    [Diagram labels: remote host, microkernel, task runner, workflow, HTTP, Job.Linux.Commands, Job.Linux.Bootstrap, Job.WinPE.Bootstrap; steps numbered (0) to (5)]
    Job.Linux.Bootstrap, Job.WinPE.Bootstrap (0)
    GET /api/1.1/tasks/bootstrap.js (1)
    (2) start task runner
    (3) GET /api/1.1/tasks/{id} (4)
    (5) POST /api/1.1/tasks/{id}
    https://github.com/RackHD/on-http/blob/master/lib/api/1.1/southbound/tasks.js
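
The GET/POST exchange between the microkernel's task runner and RackHD can be sketched as one pass of a loop. The HTTP calls are passed in as plain functions so the sketch runs without a network. The function names and the payload shape (a list of command strings in, a list of result dictionaries out) are assumptions, not the actual bootstrap.js protocol.

```python
def run_task_runner(get_tasks, post_results, run_command, task_id):
    """One pass: fetch commands for task_id, run them, report the results.

    get_tasks(task_id)        stands in for GET  /api/1.1/tasks/{id}
    post_results(task_id, r)  stands in for POST /api/1.1/tasks/{id}
    run_command(cmd)          executes one command on the remote host
    """
    commands = get_tasks(task_id)
    results = [{"cmd": cmd, "output": run_command(cmd)} for cmd in commands]
    post_results(task_id, results)
    return results


# Usage with in-memory fakes instead of real HTTP and shell calls.
posted = {}
results = run_task_runner(
    get_tasks=lambda task_id: ["uname -a", "lsblk"],
    post_results=lambda task_id, r: posted.update({task_id: r}),
    run_command=lambda cmd: "ran " + cmd,
    task_id="abc123",
)
```

Splitting the exchange this way mirrors the isolation approach of the deck: a failure is in fetching the commands, in running them on the host, or in posting the results back, and each step can be checked on its own.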