
52 weeks after Digdag operation

amesho
February 21, 2018


About one year has passed since Digdag was introduced. How did batch operation change during that period? How did our awareness of operations change? What became visible? What problems do we still have? We also introduce a plugin we made.





  1. 52 weeks after Digdag operation

  2. Self-introduction
     - id: @amesho
     - role: Operation, Development
     - product: Quant

  3. Quant
     - constitution
     - rails
     - redshift
     - aurora
     - backend
     - Dependency complexity
  4. Agenda
     - How has batch operation changed?
     - How has the situation changed?
     - Problems with operation
  5. How has batch operation changed?

  6. Before

  7. Cron

  8. Problems of cron
     - It is hard to tell which server a job is running on
     - It is hard to tell when jobs start
     - Dependencies are difficult to understand
     - Elapsed time is difficult to see
  9. Problem solved by introducing digdag

  10. Which server is running

  11. Start-up time

  12. Dependency

  13. Elapsed time

  14. How has the situation changed?
     - It became easy to reach a shared understanding with your boss and colleagues
     - People who cannot write programs can join
     - It is reassuring to be able to see what is running
  15. Operation
     - pull requests
     - checkout with digdag server
     - processing in combination with embulk
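As a rough illustration of the "processing in combination with embulk" point, a workflow step can simply shell out to Embulk; the file names and the sh>-based invocation below are assumptions for this sketch, not the presenter's actual setup.

```yaml
# Illustrative .dig fragment: run an Embulk bulk load, then a follow-up task.
# config/load_events.yml is a made-up file name.
+load_events:
  sh>: embulk run config/load_events.yml

+after_load:
  echo>: "embulk load finished"
```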
  16. Note
     - If you look closely at the documentation you will find detailed notes
     - Some things are not written in the documentation
  17. API
     - It exists under digdag-server (original table headers: Class / Method / API / Overview)

     SessionResource
       GET  /api/sessions                                  List sessions from recent to old
       GET  /api/sessions/{id}                             Get a session by id
       GET  /api/sessions/{id}/attempts                    List attempts of a session
     AdminResource
       GET  /api/admin/attempts/{id}/userinfo
     AttemptResource
       GET  /api/attempts                                  List attempts from recent to old
       GET  /api/attempts?include_retried=1                List attempts including retried ones
       GET  /api/attempts?project=<name>                   List attempts that belong to a particular project
       GET  /api/attempts?project=<name>&workflow=<name>   List attempts that belong to a particular workflow
       GET  /api/attempts/{id}                             Show a session
       GET  /api/attempts/{id}/tasks                       List tasks of a session
       GET  /api/attempts/{id}/retries                     List retried attempts of this session
       PUT  /api/attempts                                  Start a new session
       POST /api/attempts/{id}/kill                        Kill a session
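As a companion to the attempt-listing endpoints above, here is a small Ruby sketch that filters a GET /api/attempts response down to failed attempts. The sample JSON body is made up for illustration; only the field names follow the API's response shape.

```ruby
require 'json'

# Hypothetical response body shaped like GET /api/attempts output.
# The ids, project, and workflow names are made up.
sample_body = <<~JSON
  {"attempts": [
    {"id": "101", "project": {"name": "quant"}, "workflow": {"name": "daily_load"},
     "done": true, "success": true},
    {"id": "102", "project": {"name": "quant"}, "workflow": {"name": "daily_load"},
     "done": true, "success": false}
  ]}
JSON

# Pick out finished-but-failed attempts, e.g. to alert on or retry.
attempts = JSON.parse(sample_body)['attempts']
failed = attempts.select { |a| a['done'] && !a['success'] }
failed.each do |a|
  puts "failed: #{a['id']} #{a['project']['name']}/#{a['workflow']['name']}"
end
```

In practice the body would come from `Net::HTTP` against the digdag server, as in the next slide's sample.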
  18. Sample API access

      #!/usr/bin/env ruby
      require 'net/http'
      require 'uri'
      require 'json'
      require 'time'

      url = URI.parse('http://localhost:65432/')
      res = Net::HTTP.start(url.host, url.port) { |http| http.get('/api/schedules') }
      schedules = JSON.parse(res.body)

      schedules['schedules'].sort_by { |s|
        Time.strptime(s['nextRunTime'], '%Y-%m-%dT%H:%M:%S%z').to_i
      }.each do |row|
        next if row['disabledAt']
        printf("%s\t%s/%s\n",
               Time.strptime(row['nextRunTime'], '%Y-%m-%dT%H:%M:%S%z').localtime,
               row['project']['name'], row['workflow']['name'])
      end
  19. store_last_results option of operators

  20. store_last_results: BOOLEAN
     - Whether to store the query results to the redshift.last_results parameter. Default: false.
     - Setting first stores the first row to the parameter as an object (e.g. ${redshift.last_results.count}).
     - Setting all stores all rows to the parameter as an array of objects (e.g. ${redshift.last_results[0].name}). If the number of rows exceeds the limit, the task fails.
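A hedged sketch of how the first setting described above would be used with the redshift> operator; the query file name and the count column are assumptions for this example.

```yaml
# Illustrative only: queries/count_rows.sql is assumed to return
# a single row with a column named "count".
+count_rows:
  redshift>: queries/count_rows.sql
  store_last_results: first

+report:
  echo>: "row count is ${redshift.last_results.count}"
```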
  21. td operator

  22. redshift operators

  23. The option-style store_last_results is not implemented; it is described in a code comment

      // TODO store_last_results should be io.digdag.standards.operator.jdbc.StoreLastResultsOption
      // instead of boolean to be consistent with pg> and redshift> operators
      // but not implemented yet.
      this.storeLastResults = params.get("store_last_results", boolean.class, false);
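For contrast with the TODO above, a hedged sketch of what the td> operator accepts today: a plain boolean, which stores only the first row of the result. The query file and column name are made up for this example.

```yaml
# Illustrative only: td> takes a boolean here, unlike pg>/redshift>,
# so only the first result row is stored in ${td.last_results}.
+summary:
  td>: queries/summary.sql
  store_last_results: true

+show:
  echo>: "total is ${td.last_results.total}"
```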
  24. Summary

  25. end