The Final Crontab
Selena Deckelmann
Data Architect, Mozilla
@selenamarie
http://chesnok.com/
Slide 2
Slide 2 text
crontabber
Slide 3
Slide 3 text
No content
Slide 4
Slide 4 text
No content
Slide 5
Slide 5 text
[Diagram: Socorro Postgres architecture. Labels: socorro1, socorro2, socorro3, reporting1, backup4 (base_backup & pg_dump backup), Socorro1.dev and Socorro1.stage (base_backup copy, Sunday noon PT), Prod (streaming rep, WAL), socorro-db-zeus-rw, socorro-db-zeus-ro. Captions: "very architecture", "such replicas", "wow".]
Slide 6
Slide 6 text
No content
Slide 7
Slide 7 text
No content
Slide 8
Slide 8 text
No content
Slide 9
Slide 9 text
No content
Slide 10
Slide 10 text
No content
Slide 11
Slide 11 text
Tons more at:
http://lqbs.fr/suchcomments/
Slide 12
Slide 12 text
No content
Slide 13
Slide 13 text
http://github.com/mozilla/socorro
Slide 14
Slide 14 text
http://bit.ly/1fOgBSB
Slide 15
Slide 15 text
*/5 * * * * socorro crontabber.sh
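A minimal sketch of how to read the entry above, assuming it uses the system crontab layout (five schedule fields, then a user, then the command). The function names are hypothetical and only the common field syntax (`*`, lists, ranges, steps) is handled.

```python
def expand_cron_field(field, low=0, high=59):
    """Expand one cron field such as '*/5', '0,30' or '10-20/2' into sorted values."""
    values = set()
    for part in field.split(","):
        # Split off an optional step, e.g. '*/5' -> base '*', step 5.
        if "/" in part:
            base, step_text = part.split("/", 1)
            step = int(step_text)
        else:
            base, step = part, 1
        if base == "*":
            start, end = low, high
        elif "-" in base:
            start_text, end_text = base.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(base)
        values.update(range(start, end + 1, step))
    return sorted(values)


def split_system_crontab_line(line):
    """Split a system-crontab style line into named fields (user field assumed)."""
    minute, hour, day, month, weekday, user, command = line.split(None, 6)
    return {
        "minute": minute, "hour": hour, "day": day, "month": month,
        "weekday": weekday, "user": user, "command": command,
    }


entry = split_system_crontab_line("*/5 * * * * socorro crontabber.sh")
minutes = expand_cron_field(entry["minute"])
print(entry["user"], entry["command"], minutes)
```

Read this way, the single entry runs crontabber.sh as the socorro user every five minutes, and everything else about scheduling is left to the script it starts.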
Slide 16
Slide 16 text
image by @CoryLoftis
Slide 17
Slide 17 text
Motivating factors
Slide 18
Slide 18 text
#ThreeWordHorrorStories
Slide 19
Slide 19 text
No unit tests
Slide 20
Slide 20 text
No unit tests
Slide 21
Slide 21 text
Bespoke shell scripts
Slide 22
Slide 22 text
Postgres stored procedures
Slide 23
Slide 23 text
Email from cron
Slide 24
Slide 24 text
[Chart: Cron alert messages. x-axis: Dec 5, 2010 to Apr 5, 2014; y-axis: 0 to 25000.]
Slide 25
Slide 25 text
No content
Slide 26
Slide 26 text
Email from cron
that you need to read.
Slide 27
Slide 27 text
No content
Slide 28
Slide 28 text
Cron, what is it good for?
• birthday reminders
• status updates for a website
• doxygen output for manuals every 12 hours
• email nags about bugs filed wrong
• ETL
• Postgres -> Cloudwatch
• Batch processing
• Backups of RO DB
• Machine heartbeat
• “sweet fuck all”
• “auto” updates
• logging laptop IP
• check for abandoned Twitter accounts
Slide 29
Slide 29 text
Running jobs on a
predictable schedule
Slide 30
Slide 30 text
How Socorro uses cron
• Time-dependent reports or maintenance
• “Simple” event detection and triggers
• Status logging
Slide 31
Slide 31 text
Our use cases
• Stored procedures for materialized views in Postgres
• Daily map-reduces (largely deprecated)
• FTP scraping into Postgres
• Bulk email responses to crash submissions pulled from Elasticsearch
Slide 32
Slide 32 text
Jobs that don’t lend themselves to queue management because of time dependencies, fragility or complexity.
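A minimal sketch of why such jobs are awkward for a plain queue: each job may depend on another having succeeded first, and a failure should hold back everything downstream. This is an illustration only, not crontabber's implementation; the job names, the `run_jobs` helper, and the simulated failure are all hypothetical. It needs Python 3.9+ for `graphlib`.

```python
from graphlib import TopologicalSorter


def run_jobs(jobs, depends_on):
    """Run jobs in dependency order; skip a job if any dependency did not succeed."""
    graph = {name: depends_on.get(name, ()) for name in jobs}
    status = {}
    # static_order() yields every job after all of its dependencies.
    for name in TopologicalSorter(graph).static_order():
        blocked = [dep for dep in depends_on.get(name, ()) if status.get(dep) != "ok"]
        if blocked:
            status[name] = "skipped"
            continue
        try:
            jobs[name]()
            status[name] = "ok"
        except Exception:
            status[name] = "failed"
    return status


def scrape_ftp():
    pass  # stand-in for FTP scraping into Postgres


def refresh_matviews():
    raise RuntimeError("simulated stored-procedure failure")


def send_bulk_email():
    pass  # stand-in for bulk email responses


jobs = {
    "ftp-scrape": scrape_ftp,
    "matviews": refresh_matviews,
    "bulk-email": send_bulk_email,
}
depends_on = {"matviews": ["ftp-scrape"], "bulk-email": ["matviews"]}

status = run_jobs(jobs, depends_on)
print(status)
```

In this run the scrape succeeds, the materialized-view refresh fails, and the email job is skipped rather than run against stale data.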
Slide 33
Slide 33 text
crontabber
https://github.com/mozilla/crontabber
Slide 34
Slide 34 text
On GitHub:
Peter Bengtsson
@peterbe
&
Lars Lohn
@twobraids