lpw-2012
Oleg Komarov
November 24, 2012
Reliable Cron Jobs in Distributed Environment
Transcript
Reliable Cron Jobs in Distributed Environment Oleg Komarov 2012-11-24 1/26
Presentation available at https://speakerdeck.com/komarov/lpw-2012 http://bit.ly/VLuT6g 2/26
Context
3 independent projects with shared infrastructure
• over 30 boxes
• over 200 scripts, 30K+ SLOC
• packaged in approx. 20 deb packages
3/26
TL;DR
Reliable Cron Jobs in Distributed Environment
... are HARD to get right
4/26
Cron Jobs in a Vacuum
• locks
• logging and output
• monitoring
• profiling
5/26
Logging and Output
• log START and FINISH
• log enough details
• use log + STDERR for important things
• use MAILTO to catch that output
6/26
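The logging advice above could be packaged as a small wrapper that every cron job runs through. This is a minimal sketch, not the speaker's code: `LOG_FILE`, `log_line`, `log_important`, `run_logged` and the log line format are all names and choices assumed for illustration.

```shell
#!/bin/sh
# Sketch of a cron wrapper that logs START and FINISH around a command.
# LOG_FILE and the line format are assumptions made for this example.
LOG_FILE=${LOG_FILE:-/tmp/cron-job.log}

# log_line LEVEL MESSAGE...: append one timestamped line to the log file.
log_line() {
    level=$1; shift
    printf '%s [%s] pid=%s %s\n' \
        "$(date '+%Y-%m-%d %H:%M:%S')" "$level" "$$" "$*" >> "$LOG_FILE"
}

# log_important MESSAGE...: write to the log AND to STDERR, so that cron
# mails the message to the MAILTO address.
log_important() {
    log_line ERROR "$@"
    printf '%s\n' "$*" >&2
}

# run_logged CMD ARGS...: log START, run the command, log FINISH with the
# exit status and duration, and report a failure on STDERR.
run_logged() {
    started=$(date +%s)
    log_line INFO "START $*"
    status=0
    "$@" || status=$?
    elapsed=$(( $(date +%s) - started ))
    log_line INFO "FINISH $* status=$status elapsed=${elapsed}s"
    if [ "$status" -ne 0 ]; then
        log_important "$1 exited with status $status"
    fi
    return "$status"
}
```

A job would then be started as `run_logged /path/to/script ARGS`. With `MAILTO` set in the crontab, a successful run produces no mail, while a failure puts one line on STDERR and therefore one message in the mailbox.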
Monitoring
• be confident that it actually works
• it must not fail when your system fails
• have a plan of action
7/26
What to monitor
• hardware errors
• free disk space
• load
• crond is alive
• age of generated file, queue size, etc.
8/26
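The "age of generated file" check from the list above can be sketched as a small probe. The function names and the exit codes (0 for OK, 2 for critical, 3 for unknown, borrowed from the Nagios plugin convention) are assumptions for this example. It relies on `date -r FILE`, the same GNU date feature used on the bonus slide.

```shell
#!/bin/sh
# file_age_seconds FILE: print the seconds since FILE was last modified.
# Uses `date -r FILE` (GNU date), as on the talk's bonus slide.
file_age_seconds() {
    now=$(date +%s) || return 1
    mtime=$(date +%s -r "$1") || return 1
    echo $(( now - mtime ))
}

# check_fresh FILE MAX_AGE: succeed only if FILE exists and was modified
# at most MAX_AGE seconds ago. Exit codes are an assumption of this sketch:
# 0 = OK, 2 = CRITICAL, 3 = UNKNOWN.
check_fresh() {
    if [ ! -e "$1" ]; then
        echo "CRITICAL: $1 is missing"
        return 2
    fi
    if ! age=$(file_age_seconds "$1"); then
        echo "UNKNOWN: cannot read the age of $1"
        return 3
    fi
    if [ "$age" -gt "$2" ]; then
        echo "CRITICAL: $1 is ${age}s old (limit ${2}s)"
        return 2
    fi
    echo "OK: $1 is ${age}s old"
}
```

For a job scheduled every two minutes, a limit a few times larger than the period (for example `check_fresh /path/to/file 600`) leaves room for a slow run without hiding a dead one.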
Profiling
• Does it need 1GB or 10GB?
• Why does it take so long to complete?
• How many db queries does it run?
Measure and improve
9/26
More to consider
• crash-safe
• documentation
• parallel execution
• resource limits (ulimit/cgroups)
10/26
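For the ulimit half of the last item, one possible shape is a helper that applies the limit in a subshell, so it affects only the job and not the shell that launched it. `run_limited` and the exit status 125 are assumptions of this sketch. Only a CPU-time limit is shown.

```shell
#!/bin/sh
# run_limited CPU_SECONDS CMD ARGS...
# Runs CMD in a subshell with a CPU-time limit, so the limit never leaks
# into the calling shell. Exit status 125 (an arbitrary choice for this
# sketch) means the limit itself could not be applied.
run_limited() {
    cpu_seconds=$1; shift
    (
        ulimit -t "$cpu_seconds" || exit 125
        exec "$@"
    )
}
```

A job that exceeds the limit receives SIGXCPU from the kernel, which shows up as a non-zero exit status that a logging wrapper or monitoring check can pick up.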
Deployment Packages 11/26
Deployment Boxes 12/26
Cron Package
Just populate my-project-scriptsN.cron.d file
Don’t write it by hand, do it automatically
Put some METADATA in your scripts
13/26
Metadata
=head1 METADATA
<crontab>
package: scriptsN
params: --mod 2 --rem 0
time: */2 * * * *
</crontab>
<crontab>
package: scriptsN
params: --mod 2 --rem 1
time: */2 * * * *
</crontab>
=cut
14/26
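Generating the cron.d file from this metadata could look roughly like the sketch below. It is not the speaker's tooling: the function name `emit_cron_lines`, the one-key-per-line layout, the cron user argument and the use of the script path as the command are all assumptions.

```shell
#!/bin/sh
# emit_cron_lines SCRIPT PACKAGE USER
# Prints one /etc/cron.d-style line ("schedule user command") for every
# <crontab> block in SCRIPT whose "package:" field equals PACKAGE.
# Assumes each key sits on its own line, as "key: value".
emit_cron_lines() {
    script=$1 want_pkg=$2 user=$3
    awk -v script="$script" -v want_pkg="$want_pkg" -v user="$user" '
        /^[ \t]*<crontab>/ {
            in_block = 1; pkg = ""; params = ""; sched = ""
            next
        }
        /^[ \t]*<\/crontab>/ {
            if (in_block && pkg == want_pkg && sched != "") {
                line = sched " " user " " script
                if (params != "") line = line " " params
                print line
            }
            in_block = 0
            next
        }
        in_block && /^[ \t]*package:/ { sub(/^[ \t]*package:[ \t]*/, ""); pkg = $0; next }
        in_block && /^[ \t]*params:/  { sub(/^[ \t]*params:[ \t]*/, "");  params = $0; next }
        in_block && /^[ \t]*time:/    { sub(/^[ \t]*time:[ \t]*/, "");    sched = $0; next }
    ' "$script"
}
```

Running it over every script in a package and redirecting the output into the package's cron.d file keeps the schedule next to the code it belongs to, which is the point of putting METADATA in the scripts.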
Simple Setup
As simple as possible: one box per package
apt-get purge && kill (or wait) && apt-get install
15/26
!%*#$ Back to Earth Network 16/26
With Extra Boxes
Now you have some problems to solve:
• locks
• logs
• load
17/26
Net::ZooKeeper::Lock
Apache ZooKeeper™ is an effort to develop and maintain an open-source server which enables highly reliable distributed coordination.
Net::ZooKeeper::Lock implements distributed locks via ZooKeeper.
18/26
Introducing Switchman https://github.com/komarov/switchman 19/26
Overview 20/26
Configuration
Crontabs are installed everywhere, switchman consults the config in ZooKeeper:
{
  "groups": {
    "scripts1": "box1",
    "scripts2": "box1",
    "scripts3": ["box1", "box2"]
  }
}
21/26
Description
switchman --config /how/to/connect/to/zk --group scriptsN -- CMD ARGS
• checks configuration
• acquires a lock
• watches configuration for changes
• stops execution when it is not allowed anymore
Easy to adopt with METADATA
22/26
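Adopting switchman through METADATA presumably means the generated cron line simply gains a switchman prefix. The sketch below shows that composition. Only the `--config`, `--group` and `--` parts come from the slide. The function name, the cron user and the job path used in the usage note are hypothetical.

```shell
#!/bin/sh
# switchman_cron_line SCHEDULE USER ZK_CONFIG GROUP CMD ARGS...
# Prints a cron.d line that runs CMD through switchman, using the
# invocation shown on the slide. The argument order is an assumption
# of this sketch.
switchman_cron_line() {
    schedule=$1 user=$2 zk_config=$3 group=$4
    shift 4
    printf '%s %s switchman --config %s --group %s -- %s\n' \
        "$schedule" "$user" "$zk_config" "$group" "$*"
}
```

For example, `switchman_cron_line '*/2 * * * *' root /how/to/connect/to/zk scriptsN /usr/bin/job --mod 2 --rem 0` prints one line that can be installed on every box, leaving the decision about where the job actually runs to the configuration in ZooKeeper.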
One Problem Solved
• locks
• logs
• load
23/26
Further Steps
See Facebook’s Scribe for collecting decentralized logs
Resource reservation and management
A good monitoring system
24/26
Thanks! Questions?
https://speakerdeck.com/komarov/lpw-2012
http://bit.ly/VLuT6g
http://about.me/komarov
25/26
Bonus Slide
Get file age:
# in days
perl -E 'say -M $ARGV[0]' /path/to/file
# in seconds
expr `date +%s` - `date +%s -r /path/to/file`
Simple local locks:
use Pid::File::Flock qw/:auto/;
26/26
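For jobs that are not written in Perl, a comparable local lock can be sketched with `flock(1)` from util-linux. The helper name `with_lock` and the lock file path in the usage note are assumptions. This covers only the single-box case, not the distributed locking discussed earlier.

```shell
#!/bin/sh
# with_lock LOCKFILE CMD ARGS...
# Runs CMD while holding an exclusive, non-blocking lock on LOCKFILE.
# If another run still holds the lock, flock exits non-zero without
# starting CMD. The lock disappears once the processes holding it have
# exited, even after a crash, so there is no stale lock to clean up.
with_lock() {
    lockfile=$1; shift
    flock -n "$lockfile" "$@"
}
```

In a crontab this amounts to prefixing the command, for example `flock -n /var/lock/my-job.lock /path/to/script`, so that a run which overlaps the previous one is skipped instead of piling up.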