Lean Enterprise with Microservices and Big Data


Johann Romefort

December 10, 2014


  1. How to enable the Lean Enterprise Johann Romefort co-founder @

  2. My Background • Seesmic - Co-founder & CTO: video conversation platform, social media clients…lots of pivots :) • Rainbow - Co-founder & CTO: Enterprise App Store
  3. Goal of this presentation • Understand what the Lean Enterprise is and how it relates to big data and the software architecture you build • Have a basic understanding of the technologies and tools involved
  4. What is the Lean Enterprise? “Lean enterprise is a practice focused on value creation for the end customer with minimal waste and processes.” (http://en.wikipedia.org/wiki/Lean_enterprise)
  5. Enabling the OODA Loop: Observe, Orient, Decide, Act • “Get inside your adversaries' OODA loop to disorient them” (USAF Colonel John Boyd on combat)
  6. Enabling the OODA Loop

  7. The OODA Loop for software image credit: Adrian Cockcroft

  8. OODA Loop • Observe (Innovation) and Decide (Culture) are mainly human-based • Orient (Big Data) and Act (Cloud) can be automated

  10. What is Big Data? • It’s data at the intersection of the 3 V’s: • Velocity (batch / real time / streaming) • Volume (terabytes / petabytes) • Variety (structured / semi-structured / unstructured)
  11. Why is everybody talking about it? • The cost of generating data has gone down • By 2015, 3B people will be online, pushing the volume of data created to 8 zettabytes • More data = more insights = better decisions • Processing is getting easier and cheaper thanks to cloud platforms
  12. Data flow and constraints: Generate → Ingest / Store → Process → Visualize / Share • The 3 V’s introduce heterogeneity, which makes each of these steps hard to achieve
  13. What is AWS? • AWS is a cloud computing platform • On-demand delivery of IT resources • Pay-as-you-go pricing model
  14. Cloud Computing = Storage + Compute + Networking • Adapts dynamically to ever-changing needs to stick closely to user infrastructure and application requirements
  15. How does AWS help with Big Data? • Removes constraints on the ingesting, storing, and processing layers and adapts closely to demand • Provides a collection of integrated tools to adapt to the 3 V’s of Big Data • Virtually unlimited storage capacity and processing power fit changing data storage and analysis requirements
  16. Computing Solutions for Big Data on AWS: Kinesis, EC2, EMR

  17. Computing Solutions for Big Data on AWS: EC2 • All-purpose computing instances • Dynamic provisioning and resizing • Lets you scale your infrastructure at low cost • Use Case: well suited for running custom or proprietary applications (e.g. SAP HANA, Tableau…)
  18. Computing Solutions for Big Data on AWS: EMR • ‘Hadoop in the cloud’ • Adapts to the complexity of the analysis and the volume of data to process • Use Case: offline processing of very large volumes of data, possibly unstructured (Variety variable)
  19. Computing Solutions for Big Data on AWS: Kinesis • Stream processing • Real-time data • Scales to adapt to the flow of inbound data • Use Case: complex event processing, click streams, sensor data, computation over a window of time
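The "computation over a window of time" use case can be illustrated without any AWS dependency. The following is a minimal sketch in plain Python, assuming click events arrive as `(timestamp_seconds, page)` tuples; the function name `tumbling_window_counts` and the sample data are hypothetical and not part of the talk. It groups events into fixed, non-overlapping (tumbling) windows and counts clicks per page in each window.

```python
from collections import Counter


def tumbling_window_counts(events, window_seconds):
    """Count events per (window_start, key) using fixed, non-overlapping windows.

    `events` is an iterable of (timestamp_seconds, key) tuples (an assumed format).
    """
    counts = Counter()
    for timestamp, key in events:
        # Round the timestamp down to the start of its window.
        window_start = int(timestamp // window_seconds) * window_seconds
        counts[(window_start, key)] += 1
    return dict(counts)


# Hypothetical click stream: (seconds since start, page visited).
clicks = [(0, "/home"), (12, "/home"), (59, "/pricing"), (61, "/home"), (119, "/pricing")]
per_minute = tumbling_window_counts(clicks, window_seconds=60)
print(per_minute)
```

In a real stream processor the events would be read continuously from the stream rather than from a list, and windows would be emitted as they close.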
  20. Computing Solutions for Big Data on AWS: Redshift • Data warehouse in the cloud • Scales to petabytes • Supports SQL querying • Start small for just $0.25/h • Use Case: BI analysis, use of legacy ODBC/JDBC software to analyze or visualize data
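To make the pay-as-you-go figure concrete, here is a small worked calculation. It is only a sketch: it assumes a single node billed at the $0.25/h rate quoted on the slide and ignores storage, data transfer, and any pricing changes since the talk. The usage patterns (always on, or 8 hours over 22 working days) are hypothetical.

```python
HOURLY_RATE_USD = 0.25  # rate quoted on the slide; single node assumed


def on_demand_cost(hourly_rate, hours):
    """Cost of running one on-demand node for the given number of hours."""
    return hourly_rate * hours


daily_cost = on_demand_cost(HOURLY_RATE_USD, 24)            # always on, one day
monthly_cost = on_demand_cost(HOURLY_RATE_USD, 24 * 30)     # always on, 30 days
office_hours_cost = on_demand_cost(HOURLY_RATE_USD, 8 * 22) # 8 h/day, 22 working days

print(daily_cost, monthly_cost, office_hours_cost)
```

Under these assumptions an always-on node costs $6 a day or $180 over 30 days, while running it only during working hours brings the month down to $44.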
  21. Storage Solution for Big Data on AWS: DynamoDB, Redshift, S3

  22. Storage Solution for Big Data on AWS: DynamoDB • NoSQL database • Consistent low-latency access • Column-based flexible data model • Use Case: offline processing of very large volumes of data, possibly unstructured (Variety variable)
  23. Storage Solution for Big Data on AWS: S3 • Versatile storage system • Low-cost • Fast data retrieval • Use Case: backups and disaster recovery, media storage, storage for data analysis
  24. Storage Solution for Big Data on AWS: Glacier • Archive storage of cold data • Extremely low-cost, optimized for infrequently accessed data • Use Case: storing raw logs of data, storing media archives, magnetic tape replacement
  25. What makes AWS different when it comes to big data?

  26. Integrated Environment for Big Data • Given the 3 V’s, a collection of tools is needed most of the time for your data processing and storage • AWS Big Data solutions already come integrated with each other • AWS Big Data solutions also integrate with the whole AWS ecosystem (Security, Identity Management, Logging, Backups, Management Console…)
  27. Example of products interacting with each other.

  28. Tightly integrated, rich environment of tools + on-demand scaling that sticks to processing requirements = extremely cost-effective and easy-to-deploy solution for big data needs
  29. Use Case: Real-time IoT Analytics • Gathering data in real time from sensors deployed in a factory and sending it for immediate processing • Error detection: real-time detection of hardware problems • Optimization and energy management
  30. First version of the infrastructure (diagram labels) • Aggregate sensor data • Node.js stream processor • On customer site • Evaluate rules over time window • In-house Hadoop cluster • MongoDB • Feed algorithm • Write raw data for further processing • Backup
  31. Version of the infrastructure ported to AWS (diagram labels) • Aggregate sensor data • On customer site • Evaluate rules over time window • Write raw data for archiving • Kinesis • Redshift for BI analysis • Glacier
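The "evaluate rules over time window" step in both versions of the infrastructure can be sketched as follows. This is an illustrative example only, not the speaker's implementation: the class name `SlidingWindowRule`, the 10-second window, the 80.0 threshold, and the sample readings are all assumptions. The rule fires when the average of the sensor readings seen within the last `window_seconds` exceeds a threshold.

```python
from collections import deque


class SlidingWindowRule:
    """Fire when the average reading over the last `window_seconds` exceeds `threshold`."""

    def __init__(self, window_seconds, threshold):
        self.window_seconds = window_seconds
        self.threshold = threshold
        self._readings = deque()  # (timestamp, value) pairs, oldest first

    def observe(self, timestamp, value):
        """Record a reading and return True if the rule fires at this point."""
        self._readings.append((timestamp, value))
        # Evict readings that have fallen out of the time window.
        while self._readings[0][0] <= timestamp - self.window_seconds:
            self._readings.popleft()
        average = sum(v for _, v in self._readings) / len(self._readings)
        return average > self.threshold


# Hypothetical temperature readings: (seconds, degrees).
rule = SlidingWindowRule(window_seconds=10, threshold=80.0)
readings = [(0, 70.0), (5, 75.0), (9, 90.0), (12, 95.0), (15, 99.0)]
alerts = [rule.observe(ts, value) for ts, value in readings]
print(alerts)
```

The rule stays quiet while the early, cooler readings keep the average down, and starts firing once they leave the window.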
  32. ACT

  33. Cloud and Lean Enterprise

  34. Let’s start with a personal example

  36. First year @seesmic • Prototype becomes production • Monolithic architecture • No analytics/metrics • Little monitoring • Little automated testing
  37. I built a monolith

  38. or…at least I tried

  39. Early days at Seesmic First year @seesmic

  40. Everybody loves a good horror story

  41. We crashed TechCrunch

  43. What did we do?

  44. Add a QA Manager

  45. Add bearded SysAdmin

  46. We added tons of process so nothing could go wrong

  47. Impact on dev team • Frustration with the slow release process • Lots of back and forth due to bugs and the need to retest the whole app each time • Chain of command too long • Feeling powerless in the process • Low trust
  48. Impact on product team • Frustration at not executing fast enough • Frustration at having to ask for everything (like metrics) • Feeling that engineers always have the last word
  49. Impact on Management

  50. What can you do? • Break down software into smaller autonomous units • Break down teams into smaller autonomous units • Automation and tooling, CI / CD • Plan for the worst
  51. Break down software into smaller autonomous units

  52. Introduction to Microservices

  53. Monolith vs Microservices - 10000ft view -

  54. Monolith vs Microservices - databases -
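The database slide is image-only in this transcript, so the following is a sketch of the commonly described idea rather than a reconstruction of the diagram: in a microservices design each service owns its data store, and other services reach that data only through the owning service's interface. The class names `UserService` and `OrderService` are hypothetical, and plain in-process objects stand in for what would be separate deployments talking over the network.

```python
class UserService:
    """Owns the user data; nothing else touches its store directly."""

    def __init__(self):
        self._users = {}  # private store standing in for this service's database

    def create_user(self, user_id, name):
        self._users[user_id] = {"user_id": user_id, "name": name}

    def get_user(self, user_id):
        user = self._users.get(user_id)
        return dict(user) if user is not None else None  # hand out a copy, not the row


class OrderService:
    """Owns the order data and asks UserService for user information."""

    def __init__(self, user_client):
        self._orders = []  # private store standing in for this service's database
        self._user_client = user_client  # in production: an HTTP or RPC client

    def place_order(self, user_id, item):
        if self._user_client.get_user(user_id) is None:
            raise LookupError(f"unknown user {user_id!r}")
        order = {"order_id": len(self._orders) + 1, "user_id": user_id, "item": item}
        self._orders.append(order)
        return order


users = UserService()
users.create_user("u1", "Ada")
orders = OrderService(user_client=users)
first_order = orders.place_order("u1", "pizza")
print(first_order)
```

Because `OrderService` depends only on the `get_user` interface, the user store can change its schema or technology without touching the order code, which is the autonomy the talk argues for.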

  55. Monolith vs Microservices - servers -

  56. Microservices - example -

  57. Break down teams into smaller units

  58. Amazon’s “two-pizza teams” • 6 to 10 people; you can feed them with two pizzas. • It’s not about size, but about accountability and autonomy • Each team has its own fitness function
  59. Friction points • Full devops model: good tooling needed • Services still need to be designed for resiliency • Harder to test
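One common way to design for resiliency when services call each other over a network is to retry transient failures with exponential backoff. The talk does not name a specific technique, so this is an assumed example: the helper `call_with_retries`, its parameters, and the `flaky_service` stand-in are hypothetical. The `sleep` argument is injectable so the example runs instantly.

```python
import time


def call_with_retries(operation, max_attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call `operation`, retrying on ConnectionError with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the failure to the caller
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))


# Hypothetical downstream call that fails twice before succeeding.
attempts = {"count": 0}


def flaky_service():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("service temporarily unavailable")
    return "ok"


delays = []  # record the waits instead of actually sleeping
result = call_with_retries(flaky_service, sleep=delays.append)
print(result, delays)
```

Retries only help with transient faults; a production setup would typically also add timeouts and a cap on total waiting time.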
  60. Continuous Integration (CI) is the practice, in software engineering, of merging all developer working copies with a shared mainline several times a day
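The slide defines CI only as frequent merging. In practice each merge is usually verified by an automated build, and that verification step can be sketched as a gate that runs checks in order and stops at the first failure. This is an assumption about common practice, not something stated in the talk; `run_pipeline` and the step names are hypothetical, and the lambdas stand in for real lint, test, and packaging commands.

```python
def run_pipeline(steps):
    """Run (name, check) steps in order and stop at the first failing check.

    Returns (all_passed, results), where results lists the steps that actually ran.
    """
    results = []
    for name, check in steps:
        passed = bool(check())
        results.append((name, passed))
        if not passed:
            break  # later steps are skipped once the build is broken
    return all(passed for _, passed in results), results


# Hypothetical checks; real ones would invoke the project's tooling.
steps = [
    ("lint", lambda: True),
    ("unit tests", lambda: False),
    ("package", lambda: True),
]
build_ok, report = run_pipeline(steps)
print(build_ok, report)
```

Here the failing unit tests stop the pipeline before packaging, so a broken merge never produces a deployable artifact.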
  61. Continuous Deployment

  62. Continuous Deployment

  63. Tools for Continuous Integration • Jenkins (open source, lots of plugins, hard to configure) • Travis CI (looks better, fewer plugins)
  64. Tools for Continuous Deployment • Go.cd (open source) • shippable.com (SaaS, Docker support) • CodeDeploy (AWS) + Puppet, Chef, Ansible, Salt, Docker…
  65. Impact on dev • Autonomy • Not afraid to try new things • More confident in the codebase • No need to live with old bugs until the next release
  66. Impact on product team • Iterate faster on features • Can make, bake and break hypotheses faster • Product gets improved incrementally every day
  67. Impact on Management

  68. • Enabling a microservices architecture • Enabling better testing • Enabling the devops model • Come talk to the Docker team tomorrow!
  69. Thank You follow me: @romefort romefort@gmail.com