EasyBatch DevoxxMA 2016

Mahmoud Ben Hassine

November 02, 2016

Transcript

  1. Simplify your batch processing in Java with Easy Batch

    Mahmoud Ben Hassine @b_e_n_a_s https://benas.github.io #EasyBatch
  2. Agenda

    • Introduction • State of the art • Motivation • Easy Batch • Overview • Examples • Wrap-up
  3. Introduction

    • Batch processing: bounded data set, high latency, static data set
    • Stream processing: unbounded data stream, low latency, dynamic data stream
  7. State of the art

    Excellent solutions, such as JSR 352! But:
    • Steep learning curve
    • Complex configuration
    • Mandatory components that you may not need
  8. Agenda

    • Introduction • State of the art • Motivation • Easy Batch • Overview • Examples • Wrap-up
  11. Motivation

    products.csv:

        #id,name,description,price,published,lastUpdate
        0001,product1,description1,2500,true,2014-01-01
        000x,product2,description2,2400,true,2014-01-01
        0003,,description3,2300,true,2014-01-01
        0004,product4,description4,-2200,true,2014-01-01
        0005,product5,description5,2100,true,2024-01-01
        0006,product6,description6,2000,true,2014-01-01,Blah!
        0007,product7,description7,2100,true,2024-01-01

    Product.java:

        public class Product {
            private long id;
            private String name;
            private String description;
            private double price;
            private boolean published;
            private Date lastUpdate;
        }

    Common requirements:
    - Read file line by line
    - Filter header record
    - Parse and map data to the Product bean
    - Validate product data
    - Do something with the product (business logic)
    - Log errors
    - Report statistics

    Most of these steps are boilerplate. The goal is to keep the focus on business logic!
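To make the boilerplate concrete, here is a rough plain-JDK sketch of the requirements listed on this slide, written by hand without Easy Batch. Everything in it is an assumption made for illustration: the class name `ManualProductImport`, the methods `parse` and `run`, the use of `LocalDate` instead of `Date`, and the validation rules (exactly six fields, non-empty name, non-negative price), which are only inferred from the odd rows in the sample file.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical hand-written import (plain JDK, no Easy Batch) illustrating the boilerplate.
class ManualProductImport {

    // Simplified stand-in for the slide's Product bean (LocalDate instead of Date).
    static class Product {
        final long id;
        final String name;
        final String description;
        final double price;
        final boolean published;
        final LocalDate lastUpdate;

        Product(long id, String name, String description,
                double price, boolean published, LocalDate lastUpdate) {
            this.id = id;
            this.name = name;
            this.description = description;
            this.price = price;
            this.published = published;
            this.lastUpdate = lastUpdate;
        }
    }

    final List<Product> products = new ArrayList<>();
    final List<String> errors = new ArrayList<>();

    // Parse one CSV line and map it to a Product; throws on malformed or invalid data.
    static Product parse(String line) {
        String[] f = line.split(",", -1); // -1 keeps empty fields
        if (f.length != 6) {
            throw new IllegalArgumentException("expected 6 fields but got " + f.length);
        }
        Product p = new Product(Long.parseLong(f[0]), f[1], f[2],
                Double.parseDouble(f[3]), Boolean.parseBoolean(f[4]), LocalDate.parse(f[5]));
        // Assumed validation rules: non-empty name, non-negative price.
        if (p.name.isEmpty()) {
            throw new IllegalArgumentException("name is empty");
        }
        if (p.price < 0) {
            throw new IllegalArgumentException("price is negative");
        }
        return p;
    }

    // Read line by line, filter the header, parse, validate and collect errors.
    static ManualProductImport run(List<String> lines) {
        ManualProductImport result = new ManualProductImport();
        int number = 0;
        for (String line : lines) {
            number++;
            if (line.startsWith("#")) {
                continue; // filter header record
            }
            try {
                result.products.add(parse(line)); // business logic would go here
            } catch (RuntimeException e) {
                result.errors.add("record " + number + ": " + e.getMessage());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        ManualProductImport r = run(Arrays.asList(
                "#id,name,description,price,published,lastUpdate",
                "0001,product1,description1,2500,true,2014-01-01",
                "000x,product2,description2,2400,true,2014-01-01",
                "0003,,description3,2300,true,2014-01-01",
                "0004,product4,description4,-2200,true,2014-01-01"));
        // Report statistics
        System.out.println(r.products.size() + " valid, " + r.errors.size() + " rejected");
    }
}
```

Only the single commented line inside `run` is business logic; the rest is the reading, parsing, validating and reporting code that the slide labels as boilerplate.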
  12. Agenda

    • Introduction • State of the art • Motivation • Easy Batch • Overview • Examples • Wrap-up
  13. Easy Batch? What is it?

    • Name: Easy Batch
    • Date of birth: 13/08/2012
    • Weight: 124 KB
    • DNA: https://github.com/j-easy/easy-batch
  18. The Job abstraction

        public interface Job extends Callable<JobReport> {
            String getName();
        }

        class BatchJob implements Job { }

    • Synchronous execution: JobReport report = jobExecutor.execute(job);
    • Asynchronous execution: Future<JobReport> report = jobExecutor.submit(job);
    • Parallel executions: jobExecutor.submitAll(job1, job2);
    • Job scheduling: scheduledExecutorService.schedule(job, 2, MINUTES);
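Because `Job` extends `Callable<JobReport>`, a job can in principle be handed to any standard `java.util.concurrent` executor. The sketch below illustrates that idea using JDK executors only; it does not use the `jobExecutor` shown on the slide. `JobReport` and `Job` here are simplified stand-ins, and `JobSketch`, `namedJob`, `runInParallel` and `runAfterDelay` are hypothetical names introduced for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Stand-in types: simplified placeholders, NOT the real Easy Batch classes.
class JobSketch {

    static class JobReport {
        final String jobName;
        JobReport(String jobName) { this.jobName = jobName; }
    }

    // Same shape as the interface on the slide.
    interface Job extends Callable<JobReport> {
        String getName();
    }

    // Hypothetical job that does no work and just reports its own name.
    static Job namedJob(final String name) {
        return new Job() {
            @Override public String getName() { return name; }
            @Override public JobReport call() { return new JobReport(name); }
        };
    }

    // Parallel execution on a plain JDK pool; results come back in submission order.
    static List<String> runInParallel(Job job1, Job job2) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            List<Future<JobReport>> futures = pool.invokeAll(Arrays.asList(job1, job2));
            return Arrays.asList(futures.get(0).get().jobName, futures.get(1).get().jobName);
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    // Scheduling: a Callable can be passed directly to ScheduledExecutorService.schedule.
    static JobReport runAfterDelay(Job job, long delayMillis) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        try {
            return scheduler.schedule(job, delayMillis, TimeUnit.MILLISECONDS).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            scheduler.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = namedJob("job1");
        System.out.println(job.call().jobName);                 // synchronous: call directly
        System.out.println(runInParallel(job, namedJob("job2")));
        System.out.println(runAfterDelay(job, 100).jobName);
    }
}
```

Modelling a job as a `Callable` means no framework-specific scheduler is required: whatever can run a `Callable` can run a job.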
  20. The Record abstraction

    Record = Header (No, Source, etc) + Payload (Raw Data)

    Record.java:

        public interface Record<P> {
            /** Header of the record */
            Header getHeader();
            /** Payload of the record */
            P getPayload();
        }

    Multiple implementations: FlatFileRecord, XmlRecord, JsonRecord, JdbcRecord, JmsRecord, etc.
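As a minimal sketch of how a header/payload pair fits together, the code below mirrors the `Record<P>` interface from the slide with simplified stand-ins. The slide lists the header contents as "No, Source, etc", so only those two fields are modelled here; `RecordSketch`, `LineRecord` and `describe` are hypothetical names, not Easy Batch classes.

```java
// Simplified stand-ins mirroring the slide; not the real Easy Batch types.
class RecordSketch {

    // Assumed header fields: record number and data source name.
    static class Header {
        final long number;
        final String source;

        Header(long number, String source) {
            this.number = number;
            this.source = source;
        }
    }

    // Same shape as the Record interface on the slide.
    interface Record<P> {
        Header getHeader();
        P getPayload();
    }

    // Hypothetical implementation whose payload is one raw line of text.
    static class LineRecord implements Record<String> {
        private final Header header;
        private final String payload;

        LineRecord(Header header, String payload) {
            this.header = header;
            this.payload = payload;
        }

        @Override public Header getHeader() { return header; }
        @Override public String getPayload() { return payload; }
    }

    // Works for any payload type because it only relies on the interface.
    static String describe(Record<?> record) {
        Header h = record.getHeader();
        return "#" + h.number + " from " + h.source + ": " + record.getPayload();
    }

    public static void main(String[] args) {
        Record<String> record = new LineRecord(new Header(1, "products.csv"),
                "0001,product1,description1,2500,true,2014-01-01");
        System.out.println(describe(record));
    }
}
```

Keeping metadata in the header and raw data in a typed payload is what lets one interface cover flat-file, XML, JSON, JDBC or JMS sources.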
  21. The Batch abstraction

    Batch = { record 1, record 2, ... record n }

    Batch.java:

        public class Batch implements Iterable<Record> {
            private List<Record> records;
        }
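The practical effect of implementing `Iterable` is that a batch can be used directly in a for-each loop. The sketch below shows this with a deliberately simplified stand-in: records are plain strings, and `BatchSketch` and its `size` method are hypothetical, not the Easy Batch `Batch` class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Simplified stand-in: records are plain Strings here rather than Record objects.
class BatchSketch implements Iterable<String> {

    private final List<String> records = new ArrayList<>();

    BatchSketch(String... records) {
        this.records.addAll(Arrays.asList(records));
    }

    // Delegating to the list's iterator is all that Iterable requires.
    @Override
    public Iterator<String> iterator() {
        return records.iterator();
    }

    int size() {
        return records.size();
    }

    public static void main(String[] args) {
        BatchSketch batch = new BatchSketch("record 1", "record 2", "record 3");
        for (String record : batch) { // Iterable enables the for-each loop
            System.out.println(record);
        }
    }
}
```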
  22. Batch Jobs

    • Read records in sequence
    • Process records in pipeline
    • Write records in batches
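The three steps above can be sketched as a single loop. This is an illustration of the read/process/write idea under stated assumptions, not the Easy Batch implementation: `PipelineSketch` and `run` are hypothetical, records are plain strings, the reader is a JDK `Iterator`, each processor is a `Function`, and the writer is a `Consumer` that receives one batch at a time.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical mini job loop; names and signatures are illustrative, not the Easy Batch API.
class PipelineSketch {

    // Returns the total number of records handed to the writer.
    static int run(Iterator<String> reader,
                   List<Function<String, String>> processors,
                   Consumer<List<String>> writer,
                   int batchSize) {
        List<String> batch = new ArrayList<>();
        int written = 0;
        while (reader.hasNext()) {                      // read records in sequence
            String record = reader.next();
            for (Function<String, String> processor : processors) {
                record = processor.apply(record);       // each stage feeds the next
            }
            batch.add(record);
            if (batch.size() == batchSize) {            // write records in batches
                writer.accept(new ArrayList<>(batch));
                written += batch.size();
                batch.clear();
            }
        }
        if (!batch.isEmpty()) {                         // flush the last, partial batch
            writer.accept(new ArrayList<>(batch));
            written += batch.size();
        }
        return written;
    }

    public static void main(String[] args) {
        List<Function<String, String>> processors = new ArrayList<>();
        processors.add(String::trim);
        processors.add(String::toUpperCase);
        run(Arrays.asList(" a ", " b ", " c ").iterator(), processors,
                batch -> System.out.println("writing " + batch), 2);
    }
}
```

The writer receives a copy of each batch so that clearing the working list afterwards cannot affect data the writer may still hold.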
  23. Agenda

    • Introduction • State of the art • Motivation • Easy Batch • Overview • Examples • Wrap-up
  25. Wrap up

    The good ones:
    • Lightweight, free and open source
    • Easy to learn, configure and use
    • Flexible & extensible API
    • Modular architecture
    • Declarative data validation
    • Real-time monitoring

    The not so good ones:
    • No retry on failure
    • No flows
    • No data partitioning
    • No remote chunking