Thank you to Jonathan, Duncan, Jonathan,
Marije, Jennifer, Lessfuss, and Cape Town.
ROCKS!
Friday, January 27, 12
Slide 2
basho
@pharkmillups
themarkphillips.com
[email protected]
Mark Phillips
Slide 3
Building Healthy
Distributed Systems
ScaleConf
January 27, 2012
Slide 4
“A distributed system consists of multiple
autonomous computers that communicate
through a computer network.
The computers interact with each other in order to
achieve a common goal.” [1]
What is a distributed system?
Slide 5
Distributed, Scalable, Fault Tolerant
No central coordinator;
Easy to set up and operate
Slide 6
Distributed, Scalable, Fault Tolerant
Horizontally Scalable;
Add commodity hardware to get more
[throughput | processing | storage].
Slide 7
Distributed, Scalable, Fault Tolerant
Always Available
No Single Point of Failure
Self-healing
Slide 8
Slide 9
basho
• Founded in 2007
• Collapsed in 2008
• “Pivoted” in 2009
• Commercial Sponsors of Riak, an
Open Source, NoSQL Database
• Sells Closed Source Extensions to
Riak in the form of licenses
Slide 10
2009
2010
2011
14
60
25
Year on Year Growth
Slide 11
Office Locations
Slide 12
Actual Employee Distribution
Slide 13
“A distributed [company] consists of multiple
autonomous [team members] that communicate
[and collaborate] through various [channels].
The [team members] interact with each other in
order to achieve a common goal.”
What is a distributed [company]?
Slide 14
Hiring where the talent is means
we don’t sacrifice great hires for
location, but it also presents
various hurdles when attempting
to build culture and community.
Slide 15
1. Make Basho into a Powerhouse
2. Professional Development
3. Employee Happiness
4. Deliver Exceptional Product
Common Goals
for Basho
Slide 16
Internal Communication
and Collaboration
• Real-time Chat (Jabber, Campfire)
• Skype (or some form of video chat)
• Yammer
• GitHub
• AgileZen
• Email (sort of)
• Documentation
Slide 17
Good Meetings
• Quarterly In-person “Summits”
• Bi-Monthly, Non-Mandatory Company All Hands
• Stand-ups, Scrum
Slide 18
Make Documentation Part
of Your Culture
• Inside Jokes
• Internal Talks
• Design Documents
• Product Ideas
• Product Feedback
• New Hire Processes
• Everything Else
Slide 19
Open Source Your Code.
And Use GitHub.
• Contributes Directly to Developer Happiness
• Makes Your Company’s Product Better
• Great Marketing
• Use a Permissive License
(http://bit.ly/clJyDO)
(http://bit.ly/v3OMEf)
“Open Source Almost Everything”
“Why Your Company Should Have a Permissive Open Source Policy”
Slide 20
Slide 21
Hiring Should Not Happen
In A Vacuum
Slide 22
Poor Culture Rots a
Company from within and
Lessens its Resiliency
Slide 23
Company Fault Tolerance
• New CEO + Massive Growth = New Challenges
• Our System is Constantly Improving
Slide 24
2012
1**
Planned Growth
Slide 25
DS2:
The Riak Community
Slide 26
“A distributed [community] consists of
multiple autonomous [members] that communicate
[and collaborate] through various [channels].
The [members] interact with each other in order to
achieve a common goal.”
What is a distributed [community]?
Slide 27
Community
Slide 28
Why Build A
Community?
Slide 29
Grassroots Marketing,
Branding, Awareness:
Slide 30
Code Contributions
and Bug Fixes:
176
names in our
THANKS file
1600
hours contributed from
Oct 2010 - Sept 2011
Slide 31
Support:
Slide 32
Revenue:
75%
of new customers in
2011 came from the
Open Source Community
Slide 33
Importance of Community
for Community Members
• Working, Quality Code
• Recognition and Praise
• Desire to Contribute
• Jobs (whether they like it or not)
• Skills Acquisition
Slide 34
Communication and Collaboration
in a Distributed [Community]
• IRC
• Mailing List
• Twitter
• Riak Recap
• Meetups
• Q&A Sites
• Blogs
• Books
• Conferences
• Actual Meetings
• GitHub
• Drinking
Slide 35
Riak Recap
Slide 36
Books
http://riakhandbook.com/
Slide 37
Meetups and Drinking
Slide 38
GitHub
Slide 39
Give Things Away
Slide 40
Build Communities Regardless
Slide 41
Community Fault Tolerance
Slide 42
DS3:
Riak-based Distributed System
Slide 43
“A distributed system consists of multiple
autonomous computers that communicate
through a computer network.
The computers interact with each other in order to
achieve a common goal.”
What is a distributed system?
Slide 44
Riak is:
• a database
• a key/value store
• distributed
• fault-tolerant
• scalable
• Dynamo-inspired
• used by startups
• used by FORTUNE 100 companies
• written (primarily) in Erlang
• pronounced “REE-awk”
• not the right fit for every project and app
Slide 45
1000s of Deployments
Slide 46
Slide 47
Common Goals
for Voxer’s System
1. Serve and Receive App Traffic
2. Perform Queries When Needed
3. Don’t Go Down
4. Scale Out to Meet Demand
5. Low, Consistent Response Times
Slide 48
Voxer’s Initial Riak Cluster
Stats (Oct 2011)
• 11 Riak Nodes
• Modest Data Set Size (100s of GB)
• ~20,000 Peak Concurrent Users
• ~4,000,000 Daily Total Requests
Then something happened...
Slide 49
Slide 50
Slide 51
Voxer’s Current Riak
Cluster Stats
• 41 Node Cluster for User Data
• 37 Node Cluster to serve app traffic
• ~350 GB of user data added daily
• 100,000s of concurrent users at peak
• Went from 11 to about 80 nodes in a month
• At one point adding three nodes/day
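A back-of-the-envelope reading of the Voxer numbers, as a minimal Python sketch. The figures are the ones quoted on this slide and on the Oct 2011 slide. The 30-day month and the even spread of new data across the 41 user-data nodes (ignoring replication) are assumptions made only for illustration.

```python
# Figures quoted in the slides.
INITIAL_NODES = 11
CURRENT_NODES = 80                  # "about 80"
DAILY_REQUESTS_OCT_2011 = 4_000_000
USER_DATA_GB_PER_DAY = 350
USER_DATA_NODES = 41

# Assumptions for illustration only (not from the talk).
DAYS_IN_MONTH = 30
SECONDS_PER_DAY = 24 * 60 * 60

# Average request rate on the original 11-node cluster.
avg_requests_per_second = DAILY_REQUESTS_OCT_2011 / SECONDS_PER_DAY

# Average pace of the month-long expansion.
nodes_added_per_day = (CURRENT_NODES - INITIAL_NODES) / DAYS_IN_MONTH
growth_factor = CURRENT_NODES / INITIAL_NODES

# New user data landing on each user-data node per day,
# if it were spread evenly and stored only once.
gb_per_node_per_day = USER_DATA_GB_PER_DAY / USER_DATA_NODES

print(f"Average request rate (Oct 2011): {avg_requests_per_second:.1f} req/s")
print(f"Nodes added per day (average):   {nodes_added_per_day:.1f}")
print(f"Cluster growth factor:           {growth_factor:.1f}x")
print(f"New data per user-data node:     {gb_per_node_per_day:.1f} GB/day")
```

An average of roughly 2.3 nodes per day over the month sits close to the "three nodes/day" peak quoted above.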
Slide 52
Voxer’s Fault Tolerance
• Have lost a lot of nodes in production
• TCP Incast Problem [2]
• LevelDB merge issues
• Lots of other shit went wrong
but it’s still running :)
Slide 53
“Scalability is the ability of a system, network,
or process, to handle growing amount of work
in a capable manner or its ability to be enlarged to
accommodate that growth.”[3]
Slide 54
Present System Health Dictates
Future Ability to Scale
Slide 55
credit: http://blogs.ajc.com/jeff-schultz-blog/files/2009/06/closedsign.png
Distributed
[ Companies | Communities | Systems ]
are all susceptible to downtime.
Slide 56
Capacity Plan
or Perish
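One way to make "capacity plan" concrete is a simple linear storage projection, sketched below in Python. This is an illustrative model, not something shown in the talk: the function names are hypothetical, growth is assumed constant, and replication overhead is ignored. Only the 350 GB/day growth figure comes from the Voxer slide; every other number is made up for the example.

```python
import math


def days_until_full(total_capacity_gb, used_gb, growth_gb_per_day):
    """Days of runway left at a constant daily growth rate."""
    if growth_gb_per_day <= 0:
        return math.inf
    return max(total_capacity_gb - used_gb, 0) / growth_gb_per_day


def nodes_needed(used_gb, growth_gb_per_day, horizon_days, usable_gb_per_node):
    """Nodes required to hold the projected data set at the end of the horizon."""
    projected_gb = used_gb + growth_gb_per_day * horizon_days
    return math.ceil(projected_gb / usable_gb_per_node)


GROWTH_GB_PER_DAY = 350      # from the Voxer slide
USED_GB = 10_000             # hypothetical
USABLE_GB_PER_NODE = 500     # hypothetical
NODES = 41                   # hypothetical cluster size for this example
HORIZON_DAYS = 90            # hypothetical planning horizon

runway = days_until_full(NODES * USABLE_GB_PER_NODE, USED_GB, GROWTH_GB_PER_DAY)
required = nodes_needed(USED_GB, GROWTH_GB_PER_DAY, HORIZON_DAYS, USABLE_GB_PER_NODE)

print(f"Runway at current size: {runway:.0f} days")
print(f"Nodes needed in {HORIZON_DAYS} days: {required}")
```

With these made-up inputs the cluster has about a month of runway and would need to roughly double within the quarter, which is the kind of answer a capacity plan should produce before the disks fill up.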
Slide 57
Everything Is
Distributed Now
Slide 58
@pharkmillups
themarkphillips.com
[email protected]
Mark Phillips
Questions?