
Lean Enterprise with Microservices and Big Data

Johann Romefort

December 10, 2014

Transcript

  1. My Background • Seesmic - Co-founder & CTO: video conversation platform, social media clients… lots of pivots :) • Rainbow - Co-founder & CTO: Enterprise App Store
  2. Goal of this presentation • Understand what the Lean Enterprise is and how it relates to big data and to the software architecture you build • Get a basic understanding of the technologies and tools involved
  3. What is the Lean Enterprise? http://en.wikipedia.org/wiki/Lean_enterprise “Lean enterprise is a practice focused on value creation for the end customer with minimal waste and processes.”
  4. Enabling the OODA Loop: OBSERVE → ORIENT → DECIDE → ACT. USAF Colonel John Boyd on combat: “Get inside your adversaries' OODA loop to disorient them”
  5. OODA Loop • Observe (Innovation) and Decide (Culture) are mainly human-based • Orient (Big Data) and Act (Cloud) can be automated
  6. What is Big Data? It’s data at the intersection of the 3 V’s: • Velocity (batch / real-time / streaming) • Volume (terabytes/petabytes) • Variety (structured/semi-structured/unstructured)
  7. Why is everybody talking about it? • The cost of generating data has gone down • By 2015, 3B people will be online, pushing the volume of data created to 8 zettabytes • More data = more insights = better decisions • Processing keeps getting easier and cheaper thanks to cloud platforms
  8. Data flow and constraints: Generate → Ingest/Store → Process → Visualize/Share. The 3 V’s introduce heterogeneity and make each of these steps hard to achieve.
  9. What is AWS? • AWS is a cloud computing platform • On-demand delivery of IT resources • Pay-as-you-go pricing model
  10. Cloud Computing = Storage + Compute + Networking. Adapts dynamically to ever-changing needs, closely tracking users’ infrastructure and application requirements.
  11. How does AWS help with Big Data? • Removes constraints on the ingestion, storage, and processing layers and adapts closely to demand • Provides a collection of integrated tools that adapt to the 3 V’s of Big Data • Virtually unlimited storage capacity and processing power fit changing data storage and analysis requirements
  12. Computing Solutions for Big Data on AWS: EC2. All-purpose computing instances. Dynamic provisioning and resizing let you scale your infrastructure at low cost. Use Case: Well suited for running custom or proprietary applications (e.g., SAP HANA, Tableau…)
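
    To make this concrete, a minimal sketch of provisioning such an instance with boto3, the AWS SDK for Python; the AMI ID, instance type, and key pair name are placeholder assumptions rather than values from the talk:

      import boto3

      # Start one general-purpose instance; all values are placeholders.
      ec2 = boto3.client("ec2", region_name="us-east-1")
      response = ec2.run_instances(
          ImageId="ami-12345678",   # hypothetical stock Linux AMI
          InstanceType="m3.large",  # all-purpose instance class
          MinCount=1,
          MaxCount=1,
          KeyName="my-key-pair",    # assumed pre-existing SSH key pair
      )
      print(response["Instances"][0]["InstanceId"])
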
  13. Computing Solutions for Big Data on AWS: EMR. ‘Hadoop in the cloud’. Adapts to the complexity of the analysis and the volume of data to process. Use Case: Offline processing of very large volumes of data, possibly unstructured (the Variety variable)
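
    A hedged sketch of launching a transient EMR cluster with boto3; the job name, EMR release, and instance sizing below are assumptions for illustration:

      import boto3

      emr = boto3.client("emr", region_name="us-east-1")

      # Launch a transient Hadoop cluster that shuts down when its steps finish.
      cluster = emr.run_job_flow(
          Name="nightly-log-analysis",  # placeholder job name
          ReleaseLabel="emr-4.2.0",
          Applications=[{"Name": "Hadoop"}],
          Instances={
              "MasterInstanceType": "m3.xlarge",
              "SlaveInstanceType": "m3.xlarge",
              "InstanceCount": 4,
              "KeepJobFlowAliveWhenNoSteps": False,
          },
          JobFlowRole="EMR_EC2_DefaultRole",
          ServiceRole="EMR_DefaultRole",
      )
      print(cluster["JobFlowId"])
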
  14. Computing Solutions for Big Data on AWS: Kinesis. Stream processing of real-time data. Scales to adapt to the flow of inbound data. Use Case: Complex event processing, click streams, sensor data, computation over windows of time
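
    A minimal producer sketch with boto3, assuming a pre-created stream named sensor-events; the record shape and partition key choice are illustrative:

      import json
      import boto3

      kinesis = boto3.client("kinesis", region_name="us-east-1")

      # Push one event into the stream; the partition key decides the shard,
      # so keying by sensor keeps each sensor's readings ordered.
      record = {"sensor_id": "s-42", "temperature": 71.3}
      kinesis.put_record(
          StreamName="sensor-events",  # assumed existing stream
          Data=json.dumps(record),
          PartitionKey=record["sensor_id"],
      )
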
  15. Computing Solutions for Big Data on AWS: Redshift. Data warehouse in the cloud. Scales to petabytes. Supports SQL querying. Start small for just $0.25/h. Use Case: BI analysis; use of legacy ODBC/JDBC software to analyze or visualize data
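
    Since Redshift speaks the PostgreSQL wire protocol, ordinary drivers work against it; a sketch using psycopg2, where the cluster endpoint, credentials, and the readings table are all hypothetical:

      import psycopg2

      # Connect to the cluster endpoint on Redshift's default port 5439.
      conn = psycopg2.connect(
          host="my-cluster.abc123.us-east-1.redshift.amazonaws.com",  # placeholder
          port=5439,
          dbname="analytics",
          user="bi_user",
          password="REDACTED",
      )
      cur = conn.cursor()
      # Plain SQL: average temperature per sensor over the last hour.
      cur.execute("""
          SELECT sensor_id, AVG(temperature)
          FROM readings
          WHERE recorded_at > dateadd(hour, -1, getdate())
          GROUP BY sensor_id
      """)
      for sensor_id, avg_temp in cur.fetchall():
          print(sensor_id, avg_temp)
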
  16. Storage Solutions for Big Data on AWS: DynamoDB. NoSQL database. Consistent low-latency access. Column-based flexible data model. Use Case: Applications requiring consistent low-latency reads and writes at scale
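
    A small sketch of that access pattern with boto3, assuming a hypothetical SensorReadings table with sensor_id as hash key and timestamp as range key:

      from decimal import Decimal
      import boto3

      dynamodb = boto3.resource("dynamodb", region_name="us-east-1")
      table = dynamodb.Table("SensorReadings")  # hypothetical table

      # Write one reading; numeric attributes are passed as Decimal.
      table.put_item(Item={
          "sensor_id": "s-42",
          "timestamp": 1418169600,
          "temperature": Decimal("71.3"),
      })

      # Low-latency key lookup, optionally strongly consistent.
      item = table.get_item(
          Key={"sensor_id": "s-42", "timestamp": 1418169600},
          ConsistentRead=True,
      )
      print(item.get("Item"))
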
  17. Storage Solutions for Big Data on AWS: S3. Versatile storage system. Low-cost. Fast retrieval of data. Use Case: Backups and disaster recovery, media storage, storage for data analysis
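
    A minimal boto3 sketch of storing and retrieving an object; the bucket name and key layout are assumptions:

      import boto3

      s3 = boto3.client("s3")

      # Store raw data for later analysis (bucket is a placeholder).
      s3.put_object(
          Bucket="my-data-bucket",
          Key="raw/2014/12/10/sensors.json",
          Body=b'{"sensor_id": "s-42", "temperature": 71.3}',
      )

      # Retrieve it back.
      obj = s3.get_object(Bucket="my-data-bucket", Key="raw/2014/12/10/sensors.json")
      print(obj["Body"].read())
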
  18. Storage Solutions for Big Data on AWS: Glacier. Archive storage for cold data. Extremely low-cost, optimized for infrequently accessed data. Magnetic tape replacement. Use Case: Storing raw data logs, storing media archives
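
    A sketch of archiving a log bundle with boto3; the vault name and file are hypothetical, and "-" tells the API to use the caller's own account ID:

      import boto3

      glacier = boto3.client("glacier", region_name="us-east-1")

      # Upload a cold archive; retrieval is slow and cheap by design.
      with open("raw-logs-2014-12.tar.gz", "rb") as archive:
          result = glacier.upload_archive(
              vaultName="cold-logs",  # assumed existing vault
              accountId="-",          # "-" means the caller's account
              archiveDescription="Raw sensor logs, December 2014",
              body=archive,
          )
      print(result["archiveId"])
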
  19. Integrated Environment for Big Data. Given the 3 V’s, a collection of tools is most of the time needed for your data processing and storage. AWS Big Data solutions come already integrated with each other, and they also integrate with the whole AWS ecosystem (Security, Identity Management, Logging, Backups, Management Console…)
  20. Tightly integrated, rich environment of tools + on-demand scaling that sticks to processing requirements = an extremely cost-effective and easy-to-deploy solution for big data needs
  21. Use Case: Real-time IoT Analytics. Gathering data in real time from sensors deployed in a factory and sending it for immediate processing • Error detection: real-time detection of hardware problems • Optimization and energy management
  22. First version of the infrastructure: sensor data is aggregated and fed to a Node.js stream processor on the customer site, which evaluates rules over a time window; raw data is written to an in-house Hadoop cluster for further processing, and MongoDB feeds the algorithm and holds backups
  23. Version of the infrastructure ported to AWS: sensor data is aggregated on the customer site and pushed to Kinesis; rules are evaluated over a time window, raw data is written to Glacier for archiving, and Redshift serves BI analysis
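
    To illustrate how the AWS version of this pipeline might hang together, a hedged end-to-end sketch: read sensor records from a Kinesis shard, apply a simple rule over the incoming window, and flush raw batches to an archive bucket (an S3 lifecycle rule could then transition them to Glacier). The stream, bucket, and temperature rule are all placeholder assumptions:

      import json
      import time
      import boto3

      kinesis = boto3.client("kinesis", region_name="us-east-1")
      s3 = boto3.client("s3")

      # Read from the first shard of the stream, starting at the newest record.
      stream = "sensor-events"
      shard_id = kinesis.describe_stream(StreamName=stream)["StreamDescription"]["Shards"][0]["ShardId"]
      iterator = kinesis.get_shard_iterator(
          StreamName=stream,
          ShardId=shard_id,
          ShardIteratorType="LATEST",
      )["ShardIterator"]

      window = []
      while True:
          out = kinesis.get_records(ShardIterator=iterator, Limit=100)
          iterator = out["NextShardIterator"]
          for rec in out["Records"]:
              reading = json.loads(rec["Data"])
              window.append(reading)
              if reading["temperature"] > 90.0:  # placeholder alerting rule
                  print("ALERT:", reading)
          if window:
              # Flush raw data for archiving; a lifecycle rule can move it to Glacier.
              s3.put_object(
                  Bucket="sensor-archive",  # placeholder bucket
                  Key="raw/%d.json" % int(time.time()),
                  Body=json.dumps(window),
              )
              window = []
          time.sleep(1)
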
  24. ACT

  25. First year @seesmic • Prototype becomes production • Monolithic architecture • No analytics/metrics • Little monitoring • Little automated testing
  26. Impact on dev team • Frustration with the slow release process • Lots of back and forth due to bugs and the need to re-test the app in full each time • Chain of command too long • Feeling of having no power in the process • Low trust
  27. Impact on product team • Frustration at not executing fast enough • Frustration at having to ask for everything (like metrics) • Feeling that engineers always have the last word
  28. What can you do? • Break down software into smaller autonomous units • Break down teams into smaller autonomous units • Automate and build tooling: CI / CD • Plan for the worst
  29. Amazon’s “two-pizza teams” • 6 to 10 people: you can feed them with two pizzas • It’s not about size, but about accountability and autonomy • Each team has its own fitness function
  30. Friction points • Full devops model: good tooling needed • Services still need to be designed for resiliency • Harder to test
  31. Continuous Integration (CI) is the software engineering practice of merging all developers’ working copies into a shared mainline several times a day
  32. Tools for Continuous Integration • Jenkins (open source, lots of plugins, hard to configure) • Travis CI (looks better, fewer plugins)
  33. Tools for Continuous Deployment • GoCD (open source) • shippable.com (SaaS, Docker support) • CodeDeploy (AWS) + Puppet, Chef, Ansible, Salt, Docker…
  34. Impact on dev • Autonomy • Not afraid to try new things • More confident in the codebase • No need to live with old bugs until the next big release
  35. Impact on product team • Iterate faster on features • Can make, bake, and break hypotheses faster • The product improves incrementally every day
  36. Docker • Enables a microservices architecture • Enables better testing • Enables the devops model • Come talk to the Docker team tomorrow!