open source
Doing it by yourself
All alone, in the dark
Shlomi Noach
Slide 2
About me
● Engineer, DBA
● Blog at http://openark.org
● MySQL community member
● Author of:
○ openark-kit
○ mycheckpoint (discontinued)
○ common_schema
○ Outbrain Propagator
○ Outbrain Orchestrator
● Occasional code contributor to MySQL-related
projects. Bug reporter.
Slide 3
openark-kit
● Set of utilities for MySQL
● A collection of python scripts which
complement or enhance MySQL feature set
● Modeled after Maatkit (predecessor of
Percona Toolkit)
● Notable tools: oak-online-alter-table,
oak-chunk-update, oak-hook-general-log,
oak-security-audit
● Announced 2009
● To date: 2K-3K downloads per release
Slide 4
How you perceive your work
Slide 5
How others may perceive it
Slide 6
Case study: oak-online-alter-table
● Created and published early 2009
● Revolutionary, a real game changer
● Allows for online, non-blocking, throttleable
table refactoring.
● For some 18 months, it seemed like I was its
only production user
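To make "online, non-blocking, throttleable" concrete, here is a minimal Python sketch of one ingredient: copying a table in small primary-key chunks and pausing after each one. It is not taken from oak-online-alter-table. It runs against an in-memory SQLite database so it is self-contained, and the function name `chunked_copy`, the `sleep_ratio` parameter and the `id`/`name` columns are all assumptions made for illustration. A real online tool must also capture writes that hit the source table while the copy runs, which this sketch ignores.

```python
import sqlite3
import time


def chunked_copy(conn, src, dst, chunk_size=1000, sleep_ratio=0.5):
    """Copy rows from src to dst in primary-key order, one chunk at a time.

    After each chunk, sleep for sleep_ratio times the time the chunk took,
    so regular traffic gets room to run. Returns the number of chunks copied.
    Table names are interpolated directly and must come from trusted input.
    """
    last_id = 0
    chunks = 0
    while True:
        started = time.monotonic()
        # Find the highest id of the next chunk (NULL when nothing is left).
        upper = conn.execute(
            f"SELECT MAX(id) FROM "
            f"(SELECT id FROM {src} WHERE id > ? ORDER BY id LIMIT ?)",
            (last_id, chunk_size),
        ).fetchone()[0]
        if upper is None:
            return chunks
        with conn:  # one short transaction per chunk
            conn.execute(
                f"INSERT OR IGNORE INTO {dst} (id, name) "
                f"SELECT id, name FROM {src} WHERE id > ? AND id <= ?",
                (last_id, upper),
            )
        last_id = upper
        chunks += 1
        # Throttle: the slower the chunk was, the longer the pause.
        time.sleep((time.monotonic() - started) * sleep_ratio)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE film (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE film_new "
        "(id INTEGER PRIMARY KEY, name TEXT, rating TEXT DEFAULT 'G')"
    )
    conn.executemany(
        "INSERT INTO film (id, name) VALUES (?, ?)",
        [(i, f"film {i}") for i in range(1, 2501)],
    )
    conn.commit()
    print(chunked_copy(conn, "film", "film_new", chunk_size=1000))
```

Short transactions keep locks brief, and tying the pause to the chunk's own duration means the copy backs off automatically when the server is busy.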
Slide 7
I’m TELLING YOU, I’ve created a
BEAST! You should be USING IT!
Slide 8
Case study: oak-online-alter-table
● About a year later, Facebook engineers
called for comments on their idea for
online-schema-change.
● After I presented oak-online-alter-table and
we had some quick discussions, they adopted
openark-kit's general approach.
● FB eventually forked it and rewrote it in PHP.
● They say it revolutionized their deployment
cycles.
● Released as Facebook’s OSC
Slide 14
Case study: oak-online-alter-table
Your name matters. Your big company's name
really matters.
Slide 15
common_schema
● DBA’s framework for MySQL
● A set of routines, views & scripting
interpreter, completely residing within the
database server
● Distribution is a SQL file
● Introduces QueryScript: SQL oriented
scripting language
○ QueryScript has the language constructs to support
your common-yet-very-complex operations
○ It is MySQL-aware, and mitigates locks and pitfalls
commonly found in existing solutions
Slide 16
common_schema
● Introduced in 2011.
● Released under GPLv2
● > 2.5K downloads (some 0.1k are mine :P)
● Mature
● Reviewed in High Performance MySQL 3rd
Edition: "The common_schema is to MySQL
as jQuery is to javaScript"
Slide 17
Stuff that common_schema can do
for you
● Auto analyze and rotate your partitions
● Find and eliminate redundant indexes
● Audit your security setup
● Find and destroy locking/locked transactions
● Find and destroy blocking/idle transactions
● Much much more...
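As an illustration of the "redundant indexes" item, here is a minimal Python sketch of the underlying idea: an index is a candidate for removal when its column list is a left prefix of another index on the same table. This is not common_schema's implementation. The function name and the input shape (index name mapped to a tuple of columns) are assumptions, and the sketch deliberately ignores index type, so a UNIQUE index that enforces a constraint would need separate handling.

```python
def find_redundant_indexes(indexes):
    """Return (redundant_index, covering_index) pairs for one table.

    indexes maps an index name to its ordered tuple of column names.
    An index is reported when its columns are a left prefix of another
    index's columns. For identical definitions only one of the pair is
    reported (the one whose name sorts later).
    """
    redundant = []
    for name, cols in indexes.items():
        for other, other_cols in indexes.items():
            if name == other:
                continue
            is_prefix = other_cols[:len(cols)] == cols
            if is_prefix and (len(cols) < len(other_cols) or name > other):
                redundant.append((name, other))
                break
    return redundant


if __name__ == "__main__":
    table_indexes = {
        "idx_a": ("a",),
        "idx_a_b": ("a", "b"),
        "idx_c": ("c",),
        "idx_a_b_dup": ("a", "b"),
    }
    for redundant, covering in find_redundant_indexes(table_indexes):
        print(f"{redundant} is covered by {covering}")
```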
Slide 18
QueryScript code samples
foreach($table, $schema, $engine: table in sakila) {
  if ($engine = 'InnoDB')
    ALTER TABLE :$schema.:$table
      ENGINE=InnoDB ROW_FORMAT=Compressed;
}
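For readers who do not know QueryScript, the loop above can be read roughly as the following Python. It only builds the statements rather than running them, and it takes the table list as plain tuples, whereas the QueryScript version reads it from the server. The function name and the sample rows are made up for illustration.

```python
def compressed_alter_statements(tables):
    """Build one ALTER TABLE statement per InnoDB table.

    tables is an iterable of (table, schema, engine) tuples, mirroring
    the $table, $schema, $engine variables of the QueryScript loop.
    """
    statements = []
    for table, schema, engine in tables:
        if engine == "InnoDB":
            statements.append(
                f"ALTER TABLE {schema}.{table} "
                "ENGINE=InnoDB ROW_FORMAT=Compressed"
            )
    return statements


if __name__ == "__main__":
    sample = [
        ("actor", "sakila", "InnoDB"),
        ("film_text", "sakila", "MyISAM"),
    ]
    for statement in compressed_alter_statements(sample):
        print(statement)
```

The QueryScript version is shorter because the iteration source, variable binding and statement execution are all part of the language.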
Propagator & Orchestrator
Releasing Outbrain’s internal tools
● Designed as open source from day #1.
● Fine balance: comply with the company's needs
or generalize for the greater audience?
● It is good to be public about where you stand
on this. People like to know where the product
is headed. Mostly they want to know whether
feature X will be supported, and equally
whether it will not be.
Slide 21
Propagator
schema & data deployment tool that works on a
multi-everything topology:
● Multi-server: push your schema & data changes to
multiple instances in parallel
● Multi-role: different servers have different schemas
● Multi-environment: recognizes the differences between
development, QA, build & production servers
● Multi-technology: supports MySQL, Hive (Cassandra on
the TODO list)
● Multi-user: gives users authenticated and audited
access
● Multi-planetary: TODO
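The multi-server, multi-role and multi-environment items can be pictured with a small Python sketch: select the instances that match a role and an environment, then apply a change to all of them in parallel. This is not Propagator's code or API. `Instance`, `deploy` and the `apply_change` callback are hypothetical names, and the callback stands in for whatever actually runs the change on a server.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    host: str
    role: str          # e.g. "oltp" or "reports" (illustrative values)
    environment: str   # e.g. "qa" or "production" (illustrative values)


def deploy(instances, role, environment, apply_change):
    """Apply a change in parallel to every instance matching role and environment.

    apply_change is called once per matching instance; its return value is
    recorded per host so that each deployment can be audited afterwards.
    """
    targets = [
        i for i in instances
        if i.role == role and i.environment == environment
    ]
    # One worker per target; max(1, ...) avoids a zero-sized pool.
    with ThreadPoolExecutor(max_workers=max(1, len(targets))) as pool:
        results = list(pool.map(apply_change, targets))
    return {target.host: result for target, result in zip(targets, results)}


if __name__ == "__main__":
    fleet = [
        Instance("db1", "oltp", "production"),
        Instance("db2", "oltp", "production"),
        Instance("db3", "reports", "production"),
        Instance("qa1", "oltp", "qa"),
    ]
    print(deploy(fleet, "oltp", "production", lambda i: f"applied on {i.host}"))
```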
Orchestrator
MySQL replication topology management and
visualization tool
● First of its kind
● Visualize topologies, see what’s wrong
● Refactor topologies; move slaves around
safely & visually
● Supports command line, web API and web
interface actions
Releasing under Outbrain’s brand
● Sure helps.
● People like to know that a large company uses
a product in production.
● Orchestrator got a lot of attention and
positive feedback.
Slide 26
The 3rd most time consuming task
in OSS development
(Proportional to the size of the task)
Documentation!
● People actually do read it. Make it as
comprehensive as possible.
● common_schema boasts more than 150 "man
pages". Documenting took far more time than
coding.
● Proper documentation is the only thing that
grants you the right to say: “RTFM!”
Slide 27
The 2nd most time consuming task
in OSS development
(Proportional to the size of the task)
Pick a License!
● Consider Apache, GPL, BSD, MIT, WTFPL
among others.
● Use https://tldrlegal.com/ for human-friendly
license breakdowns