
EasyBatch DevoxxMA 2016

Mahmoud Ben Hassine

November 02, 2016

Transcript

  1. Simplify your batch processing in Java with Easy Batch

    Mahmoud Ben Hassine @b_e_n_a_s https://benas.github.io #EasyBatch
  2. Agenda

    • Introduction • State of the art • Motivation • Easy Batch • Overview • Examples • Wrap-up
  3. Introduction

    Batch processing: bounded data set, high latency, static data set
    Stream processing: unbounded data stream, low latency, dynamic data stream
  9. State of the art

    JSR 352

    Excellent solutions! But...
    • Steep learning curve
    • Complex configuration
    • Mandatory components that you may not need
  13. Motivation

    products.csv:

        #id,name,description,price,published,lastUpdate
        0001,product1,description1,2500,true,2014-01-01
        000x,product2,description2,2400,true,2014-01-01
        0003,,description3,2300,true,2014-01-01
        0004,product4,description4,-2200,true,2014-01-01
        0005,product5,description5,2100,true,2024-01-01
        0006,product6,description6,2000,true,2014-01-01,Blah!
        0007,product7,description7,2100,true,2024-01-01

    Product.java:

        public class Product {
            private long id;
            private String name;
            private String description;
            private double price;
            private boolean published;
            private Date lastUpdate;
        }

    Common requirements:
    - Read file line by line
    - Filter header record
    - Parse and map data to the Product bean
    - Validate product data
    - Do something with the product (business logic)
    - Log errors
    - Report statistics

    Most of this is boilerplate. The goal is to keep the focus on business logic!
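    To make the boilerplate concrete, here is a rough sketch of the requirements above written by hand in plain Java, without Easy Batch. The class name `ManualProductJob`, the `Stats` holder, the use of `LocalDate` instead of `Date`, and the validation rules (non-empty name, non-negative price, no `lastUpdate` in the future) are all assumptions made for illustration; the slide does not say which rules apply.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.util.Arrays;
import java.util.List;

// Hypothetical hand-written job: everything except process() is boilerplate.
class ManualProductJob {

    // Simplified stand-in for the Product bean (LocalDate instead of Date).
    static final class Product {
        long id; String name; String description;
        double price; boolean published; LocalDate lastUpdate;
    }

    // Hypothetical statistics holder for the final report.
    static final class Stats { int total, filtered, errors, processed; }

    static Stats run(List<String> lines, LocalDate today) {
        Stats stats = new Stats();
        for (String line : lines) {                 // read line by line
            stats.total++;
            if (line.startsWith("#")) {             // filter header record
                stats.filtered++;
                continue;
            }
            try {
                Product product = parse(line);      // parse and map to the bean
                validate(product, today);           // validate product data
                process(product);                   // business logic
                stats.processed++;
            } catch (IllegalArgumentException | DateTimeParseException e) {
                stats.errors++;                     // log errors
                System.err.println("Skipping '" + line + "': " + e.getMessage());
            }
        }
        return stats;                               // report statistics
    }

    static Product parse(String line) {
        String[] f = line.split(",", -1);           // -1 keeps empty fields
        if (f.length != 6) {
            throw new IllegalArgumentException("expected 6 fields, got " + f.length);
        }
        Product p = new Product();
        p.id = Long.parseLong(f[0]);
        p.name = f[1];
        p.description = f[2];
        p.price = Double.parseDouble(f[3]);
        p.published = Boolean.parseBoolean(f[4]);
        p.lastUpdate = LocalDate.parse(f[5]);       // ISO format: yyyy-MM-dd
        return p;
    }

    // Assumed validation rules, not taken from the slides.
    static void validate(Product p, LocalDate today) {
        if (p.name.trim().isEmpty()) throw new IllegalArgumentException("name is empty");
        if (p.price < 0) throw new IllegalArgumentException("price is negative");
        if (p.lastUpdate.isAfter(today)) throw new IllegalArgumentException("lastUpdate is in the future");
    }

    static void process(Product p) {
        System.out.println("Processing product " + p.id);
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "#id,name,description,price,published,lastUpdate",
            "0001,product1,description1,2500,true,2014-01-01",
            "000x,product2,description2,2400,true,2014-01-01",
            "0004,product4,description4,-2200,true,2014-01-01");
        Stats s = run(lines, LocalDate.of(2016, 11, 2));
        System.out.println("total=" + s.total + " filtered=" + s.filtered
            + " errors=" + s.errors + " processed=" + s.processed);
    }
}
```

    Only `process` carries business value here; the reading, filtering, mapping, validation, error logging and reporting around it is the part a batch framework aims to take over.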
  15. Easy Batch? What is it?

    • Name: Easy Batch
    • Date of birth: 13/08/2012
    • Weight: 124 Kb
    • DNA: https://github.com/j-easy/easy-batch
  20. The Job abstraction

        public interface Job extends Callable<JobReport> {
            String getName();
        }

        class BatchJob implements Job { }

    • Synchronous execution: JobReport report = jobExecutor.execute(job);
    • Asynchronous execution: Future<JobReport> report = jobExecutor.submit(job);
    • Parallel executions: jobExecutor.submitAll(job1, job2);
    • Job scheduling: scheduledExecutorService.schedule(job, 2, MINUTES);
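    Because a Job is a `Callable`, the four execution modes above can be illustrated with nothing but `java.util.concurrent`. The sketch below does not use Easy Batch itself: `JobReport` and `NamedJob` are hypothetical stand-ins, and `execute` and `runAll` are illustrative helpers, not the library's `JobExecutor` methods.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class JobSketch {

    // Hypothetical stand-in for the library's JobReport.
    static final class JobReport {
        final String jobName;
        JobReport(String jobName) { this.jobName = jobName; }
    }

    // Same shape as the interface shown on the slide.
    interface Job extends Callable<JobReport> {
        String getName();
    }

    // Hypothetical job that does no work and only reports its own name.
    static final class NamedJob implements Job {
        private final String name;
        NamedJob(String name) { this.name = name; }
        @Override public String getName() { return name; }
        @Override public JobReport call() { return new JobReport(name); }
    }

    // Synchronous execution: run the job in the calling thread.
    static JobReport execute(Job job) {
        try {
            return job.call();
        } catch (Exception e) {
            throw new IllegalStateException("Job " + job.getName() + " failed", e);
        }
    }

    // Parallel execution: run all jobs on a pool and wait for every report.
    static List<String> runAll(Job... jobs) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            List<String> names = new ArrayList<>();
            // invokeAll returns futures in the same order as the submitted jobs.
            for (Future<JobReport> future : pool.invokeAll(Arrays.asList(jobs))) {
                names.add(future.get().jobName);
            }
            return names;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(execute(new NamedJob("sync")).jobName);
        System.out.println(runAll(new NamedJob("job1"), new NamedJob("job2")));

        // Asynchronous execution: submit returns a Future immediately.
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Future<JobReport> async = pool.submit(new NamedJob("async"));
        System.out.println(async.get().jobName);
        pool.shutdown();

        // Scheduling: any Callable can be handed to a ScheduledExecutorService.
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        Future<JobReport> later = scheduler.schedule(new NamedJob("later"), 100, TimeUnit.MILLISECONDS);
        System.out.println(later.get().jobName);
        scheduler.shutdown();
    }
}
```

    The design point is that extending `Callable` lets a job plug into the standard JDK executors without any adapter code.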
  22. The Record abstraction

    A Record consists of a Header (number, source, etc.) and a Payload (raw data).

    Record.java:

        public interface Record<P> {
            /** Header of the record */
            Header getHeader();
            /** Payload of the record */
            P getPayload();
        }

    Multiple implementations: FlatFileRecord, XmlRecord, JsonRecord, JdbcRecord, JmsRecord, etc.
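    As a small illustration of the header/payload split, the sketch below implements the interface for one line of a flat file. `Header` with only a number and a source, and the `LineRecord` class, are hypothetical simplifications; the real Easy Batch classes may carry more fields and differ in detail.

```java
class RecordSketch {

    // Hypothetical header holding the metadata hinted at on the slide.
    static final class Header {
        private final long number;
        private final String source;
        Header(long number, String source) {
            this.number = number;
            this.source = source;
        }
        long getNumber() { return number; }
        String getSource() { return source; }
    }

    // Same shape as the interface shown on the slide.
    interface Record<P> {
        /** Header of the record */
        Header getHeader();
        /** Payload of the record */
        P getPayload();
    }

    // Hypothetical implementation whose payload is one raw line of text.
    static final class LineRecord implements Record<String> {
        private final Header header;
        private final String payload;
        LineRecord(Header header, String payload) {
            this.header = header;
            this.payload = payload;
        }
        @Override public Header getHeader() { return header; }
        @Override public String getPayload() { return payload; }
    }

    public static void main(String[] args) {
        Record<String> record = new LineRecord(
            new Header(1, "products.csv"),
            "0001,product1,description1,2500,true,2014-01-01");
        System.out.println("Record " + record.getHeader().getNumber()
            + " from " + record.getHeader().getSource()
            + ": " + record.getPayload());
    }
}
```

    The type parameter `P` is what allows the same interface to carry a line of text, an XML fragment or a database row.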
  23. The Batch abstraction

    A Batch is a group of records: { record 1, record 2, ... record n }

    Batch.java:

        public class Batch implements Iterable<Record> {
            private List<Record> records;
        }
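    A minimal sketch of what such a container could look like follows. To stay self-contained it uses a type parameter `R` in place of the `Record` type, and the constructor, `size` method and read-only iterator are assumptions rather than the library's actual API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

class BatchSketch {

    // Hypothetical batch: an iterable group of records of type R.
    static final class Batch<R> implements Iterable<R> {
        private final List<R> records;

        Batch(List<R> records) {
            // Defensive copy so later changes to the caller's list do not leak in.
            this.records = new ArrayList<>(records);
        }

        int size() { return records.size(); }

        @Override
        public Iterator<R> iterator() {
            // Read-only view: iterating a batch must not remove records.
            return Collections.unmodifiableList(records).iterator();
        }
    }

    public static void main(String[] args) {
        Batch<String> batch = new Batch<>(Arrays.asList("record 1", "record 2", "record 3"));
        for (String record : batch) {   // works because Batch is Iterable
            System.out.println(record);
        }
        System.out.println("size = " + batch.size());
    }
}
```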
  27. Batch Jobs

    • Read records in sequence
    • Process records in pipeline
    • Write records in batches
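    The three steps above can be sketched as a single loop in plain Java. This is not Easy Batch code: the `run` method, the use of `UnaryOperator<String>` as a processor, and the convention that a processor returns `null` to drop a record are assumptions chosen to keep the example short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.UnaryOperator;

class BatchJobSketch {

    // Reads records one by one, runs each through the pipeline,
    // and hands the survivors to the writer in batches of batchSize.
    // Returns the number of records written.
    static int run(Iterator<String> reader,
                   List<UnaryOperator<String>> pipeline,
                   int batchSize,
                   Consumer<List<String>> writer) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        List<String> batch = new ArrayList<>();
        int written = 0;
        while (reader.hasNext()) {
            String record = reader.next();                  // read in sequence
            for (UnaryOperator<String> processor : pipeline) {
                record = processor.apply(record);           // process in pipeline
                if (record == null) break;                  // assumed: null drops the record
            }
            if (record == null) continue;
            batch.add(record);
            if (batch.size() == batchSize) {                // write in batches
                writer.accept(new ArrayList<>(batch));
                written += batch.size();
                batch.clear();
            }
        }
        if (!batch.isEmpty()) {                             // flush the last partial batch
            writer.accept(new ArrayList<>(batch));
            written += batch.size();
        }
        return written;
    }

    public static void main(String[] args) {
        List<UnaryOperator<String>> pipeline = new ArrayList<>();
        pipeline.add(String::trim);
        pipeline.add(s -> s.isEmpty() ? null : s);          // drop empty records
        pipeline.add(String::toUpperCase);

        Iterator<String> reader = Arrays.asList(" a", "b ", "", "c").iterator();
        int written = run(reader, pipeline, 2, b -> System.out.println("Writing " + b));
        System.out.println("written = " + written);
    }
}
```

    Copying the batch before handing it to the writer keeps the writer's view stable after the loop clears its working list.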
  30. Wrap up

    The good ones:
    • Lightweight, free and open source
    • Easy to learn, configure and use
    • Flexible & extensible API
    • Modular architecture
    • Declarative data validation
    • Real-time monitoring

    The not so good ones:
    • No retry on failure
    • No flows
    • No data partitioning
    • No remote chunking
  31. Final word: Be pragmatic!

  32. Thank you!

    • Slides: http://speakerdeck.com/benas/easybatch-devoxxma-2016
    • Code: https://github.com/j-easy/easy-batch