S2GX 2013 - Integrating Spring Batch and Spring Integration

Gunnar Hillert
September 10, 2013

This talk is for everyone who wants to use Spring Batch and Spring Integration together efficiently. Users of Spring Batch often need to interact with other systems, to schedule the periodic execution of batch jobs, and to monitor their execution. Conversely, Spring Integration users periodically have Big Data processing requirements, such as handling large traditional batch files or executing Apache Hadoop jobs. For these scenarios, Spring Batch is the ideal solution. This session will introduce Spring Batch Integration, a project that makes it easy to tie Spring Batch and Spring Integration together. We will cover the following scenarios:

* Launch Batch Jobs through Spring Integration Messages
* Generate Informational Messages
* Externalize Batch Process Execution using Spring Integration
* Create Big Data Pipelines with Spring Batch and Spring Integration

Transcript

  1. © 2013 SpringOne 2GX. All rights reserved. Do not distribute without permission. Integrating Spring Batch and Spring Integration. By Gunnar Hillert - @ghillert, Michael Minella - @michaelminella. Tuesday, September 10, 13
  2. (no slide text)

  3. What we will cover... • Spring Batch • Spring Integration • Spring Batch Integration • Spring XD
  4. The Pivotal One Platform • Application Fabric – Languages, Frameworks, Services, Analytics • Data Fabric – High Capacity, Real-time, Ingest & Query, Scale-out, Storage • Cloud Fabric – Automation, Service Registry, Cloud Independence [Diagram label: GemFire]
  5. Spring Batch http://projects.spring.io/spring-batch/
  6. “Batch processing ... is defined as the processing of data without interaction or interruption.” Michael T. Minella, Pro Spring Batch
  7. Batch Jobs • Long-running – Often outside office hours • Non-interactive – Often include logic for handling errors or restarts • Process large volumes of data – More than fits in memory or a single transaction
  8. Batch and offline processing • Close of business processing – Order processing, Business reporting, Account reconciliation • Import/export handling – a.k.a. ETL jobs (Extract-Transform-Load) – Instrument/position import – Data warehouse synchronization • Large-scale output jobs – Loyalty scheme emails, Bank statements • Hadoop job orchestration
  9. Features • Transaction management • Chunk based processing • Declarative I/O • Start/Restart/Skip capabilities • Web administration interface • Based on the Spring framework • JSR 352: Batch Applications for the Java Platform
  10. Demo: Hello Spring Batch https://github.com/ghillert/spring-batch-sample-hello
  11. Concepts • Job • Step • Chunk • Item • Repeat | Retry | Skip | Restart
  12. Chunk-Oriented Processing • Input-output can be grouped together • Input collects Items before outputting: Chunk-Oriented Processing • Optional ItemProcessor
  13. Chunk-Oriented Processing
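
The chunk-oriented model from the two slides above can be sketched in plain Java. This is a minimal illustration, not Spring Batch's own step implementation: the names ChunkSketch and run are hypothetical, and Iterator, Function, and Consumer stand in for ItemReader, ItemProcessor, and ItemWriter. Transactions, skip, retry, and restart are deliberately left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical illustration of chunk-oriented processing; not a Spring Batch class.
final class ChunkSketch {

    /**
     * Reads up to chunkSize items, processes them, then hands the whole
     * chunk to the writer in one call. Returns the number of chunks written.
     */
    static <I, O> int run(Iterator<I> reader,
                          Function<I, O> processor,
                          Consumer<List<O>> writer,
                          int chunkSize) {
        if (chunkSize < 1) {
            throw new IllegalArgumentException("chunkSize must be >= 1");
        }
        int chunksWritten = 0;
        while (reader.hasNext()) {
            // Read phase: collect items until the chunk is full or input ends.
            List<I> inputs = new ArrayList<>(chunkSize);
            while (inputs.size() < chunkSize && reader.hasNext()) {
                inputs.add(reader.next());
            }
            // Process phase: a null result filters the item out (assumption of this sketch).
            List<O> outputs = new ArrayList<>(inputs.size());
            for (I item : inputs) {
                O processed = processor.apply(item);
                if (processed != null) {
                    outputs.add(processed);
                }
            }
            // Write phase: one writer call per chunk; a real step would commit here.
            if (!outputs.isEmpty()) {
                writer.accept(outputs);
                chunksWritten++;
            }
        }
        return chunksWritten;
    }

    public static void main(String[] args) {
        List<List<String>> written = new ArrayList<>();
        int chunks = ChunkSketch.<Integer, String>run(
                Arrays.asList(1, 2, 3, 4, 5).iterator(),
                i -> "item-" + i,
                written::add,
                2);
        // prints: 3 chunks: [[item-1, item-2], [item-3, item-4], [item-5]]
        System.out.println(chunks + " chunks: " + written);
    }
}
```

Grouping the write per chunk is what lets a batch framework wrap each chunk in a single transaction instead of committing item by item.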

  14. JobLauncher
  15. Simple File Load Job
  16. ItemReaders and ItemWriters • Flat File • XML (StAX) • Multi-File Input • Database – JDBC, JPA/Hibernate, Stored Procedures, Spring Data • JMS • Email • Implement your own...
  17. Job Repository
  18. Spring Batch Admin • Sub project of Spring Batch • Provides Web UI and REST interface to manage batch processes • Manager, Resources, Sample WAR – Deployed with batch job(s) as single app to be able to control & monitor jobs – Or monitors external jobs only via shared database
  19. Spring Integration http://projects.spring.io/spring-integration/
  20. Integration Styles • File Transfer • Shared Database • Remoting • Messaging
  21. Integration Styles • Business to Business Integration (B2B) • Inter Application Integration (EAI) • Intra Application Integration [Diagram labels: JVM, JVM, EAI, External Business Partner, B2B, Core Messaging]
  22. Common Patterns • Retrieve • Parse • Transform • Transmit
  23. Enterprise Integration Patterns • By Gregor Hohpe & Bobby Woolf • Published 2003 • Collection of well-known patterns • Icon library provided • http://www.eaipatterns.com/eaipatterns.html
  24. “Spring Integration provides an extension of the Spring programming model to support the well-known enterprise integration patterns.” Spring Integration Website
  25. Spring Integration Components • Claim Check (In/Out) • Content Enricher • Header Enricher • Payload Enricher • Control Bus • Delayer • JMX Support • Message Handler Chain • Messaging Bridge • Resequencer • Service Activator • Scripting support (JSR 223) • Ruby/JRuby, Javascript ... • Groovy • Message History • Message Store • JDBC, Redis, MongoDB, Gemfire • Wire Tap • ...
  26. Adapters • AMQP/RabbitMQ • AWS* • File/Resource • FTP/FTPS/SFTP • GemFire • HTTP (REST) • JDBC • JMS • JMX • JPA • MongoDB • POP3/IMAP/SMTP • Print • Redis • RMI • RSS/Atom • SMB • Splunk • Spring Application Events • Stored Procedures • TCP/UDP • Twitter • Web Services • XMPP • XPath • XQuery • ...
  27. Samples • https://github.com/SpringSource/spring-integration-samples • Contains 50 Samples and Applications • Several Categories: – Basic – Intermediate – Advanced – Applications
  28. Spring Batch Integration https://github.com/SpringSource/spring-batch-admin/
  29. Launching batch jobs through messages • Event-Driven execution of the JobLauncher • Spring Integration retrieves the data (e.g. file system, FTP, ...) • Easy to support separate input sources simultaneously [Diagram labels: FTP Inbound Channel Adapter, File, Transformer, JobLaunchRequest, JobLauncher]
  30. JobLaunchRequest

        public class FileMessageToJobRequest {
            private Job job;
            private String fileParameterName;
            ...

            @Transformer
            public JobLaunchRequest toRequest(Message<File> message) {
                JobParametersBuilder jobParametersBuilder = new JobParametersBuilder();
                jobParametersBuilder.addString(fileParameterName,
                        message.getPayload().getAbsolutePath());
                return new JobLaunchRequest(job, jobParametersBuilder.toJobParameters());
            }
        }

  31. DefaultJobParametersConverter • Convert (textual) Properties/Maps to JobParameters • Provide Typed Parameters – Date – String – Long – Double • Provide Date+Number Format • Define Identifying / Non-Identifying Parameters • Examples:

        myDateParam(date)=2013/09/10
        aStringParameter=Hello Spring
        -stringParamNOTIdentifying=Hello Spring
        aNumberParameter(Long)=123456
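
To make the textual convention on this slide concrete, here is a small plain-Java sketch that parses lines of the form name(type)=value with an optional leading "-" for non-identifying parameters. It is not the DefaultJobParametersConverter itself: ParamSketch, Param, and parse are hypothetical names, the yyyy/MM/dd date pattern is taken from the slide's example, and matching the type suffix case-insensitively is a choice made for this sketch rather than a statement about Spring Batch's behavior.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration of the "name(type)=value" convention; not a Spring Batch class.
final class ParamSketch {

    /** A typed value plus the identifying flag. */
    static final class Param {
        final Object value;
        final boolean identifying;

        Param(Object value, boolean identifying) {
            this.value = value;
            this.identifying = identifying;
        }
    }

    static Map<String, Param> parse(String... lines) {
        Map<String, Param> result = new LinkedHashMap<>();
        for (String line : lines) {
            int eq = line.indexOf('=');
            if (eq < 1) {
                throw new IllegalArgumentException("expected key=value: " + line);
            }
            String key = line.substring(0, eq).trim();
            String raw = line.substring(eq + 1).trim();

            // A leading '-' marks the parameter as non-identifying; '+' is explicit identifying.
            boolean identifying = !key.startsWith("-");
            if (key.startsWith("-") || key.startsWith("+")) {
                key = key.substring(1);
            }

            // An optional "(type)" suffix selects the value type; the default is string.
            String type = "string";
            int open = key.lastIndexOf('(');
            if (open > 0 && key.endsWith(")")) {
                type = key.substring(open + 1, key.length() - 1).toLowerCase(Locale.ROOT);
                key = key.substring(0, open);
            }
            result.put(key, new Param(convert(type, raw), identifying));
        }
        return result;
    }

    private static Object convert(String type, String raw) {
        switch (type) {
            case "string":
                return raw;
            case "long":
                return Long.valueOf(raw);
            case "double":
                return Double.valueOf(raw);
            case "date":
                try {
                    return new SimpleDateFormat("yyyy/MM/dd").parse(raw);
                } catch (ParseException e) {
                    throw new IllegalArgumentException("bad date: " + raw, e);
                }
            default:
                throw new IllegalArgumentException("unknown type: " + type);
        }
    }

    public static void main(String[] args) {
        Map<String, Param> params = parse(
                "myDateParam(date)=2013/09/10",
                "-stringParamNOTIdentifying=Hello Spring",
                "aNumberParameter(long)=123456");
        params.forEach((name, p) ->
                System.out.println(name + " identifying=" + p.identifying
                        + " type=" + p.value.getClass().getSimpleName()));
    }
}
```

The identifying flag matters because only identifying parameters contribute to distinguishing one job instance from another.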
  32. JobLaunchRequest

        <batch-int:job-launching-gateway request-channel="requestChannel"
            reply-channel="replyChannel"
            job-launcher="jobLauncher"/>

  33. Get feedback with informational messages • Spring Batch provides support for listeners: – StepExecutionListener – ChunkListener – JobExecutionListener
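
One way to turn listener callbacks like these into messages is a gateway-style proxy: an object that implements the listener interface and forwards every call to a channel. The following plain-Java sketch shows the idea with a dynamic proxy. Everything in it is an assumption for illustration: ExecutionListener is a hypothetical stand-in for a listener interface, a BlockingQueue of strings stands in for a message channel, and GatewaySketch is not part of Spring Integration.

```java
import java.lang.reflect.Proxy;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical illustration of a gateway proxy; not a Spring Integration class.
final class GatewaySketch {

    /** Hypothetical stand-in for a batch listener interface. */
    interface ExecutionListener {
        void beforeJob(String jobName);
        void afterJob(String jobName);
    }

    /** Returns a listener whose callbacks are published to the given channel. */
    static ExecutionListener gateway(BlockingQueue<String> channel) {
        return (ExecutionListener) Proxy.newProxyInstance(
                ExecutionListener.class.getClassLoader(),
                new Class<?>[] { ExecutionListener.class },
                (proxy, method, args) -> {
                    // Methods inherited from Object are answered locally, not published.
                    if (method.getDeclaringClass() == Object.class) {
                        switch (method.getName()) {
                            case "equals":
                                return proxy == args[0];
                            case "hashCode":
                                return System.identityHashCode(proxy);
                            default:
                                return "GatewaySketch proxy";
                        }
                    }
                    // Each listener callback becomes one "message" on the channel.
                    channel.add(method.getName() + ":" + args[0]);
                    return null;
                });
    }

    public static void main(String[] args) {
        BlockingQueue<String> jobExecutions = new LinkedBlockingQueue<>();
        ExecutionListener listener = gateway(jobExecutions);
        listener.beforeJob("importPayments");
        listener.afterJob("importPayments");
        // prints: [beforeJob:importPayments, afterJob:importPayments]
        System.out.println(jobExecutions);
    }
}
```

Because the job only sees an ordinary listener, the batch configuration stays free of messaging code; what happens to the published events is decided on the consuming side of the channel.
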
  34. Get feedback with informational messages

        <batch:job id="importPayments">
            ...
            <batch:listeners>
                <batch:listener ref="notificationExecutionsListener"/>
            </batch:listeners>
        </batch:job>

        <int:gateway id="notificationExecutionsListener"
            service-interface="o.s.batch.core.JobExecutionListener"
            default-request-channel="jobExecutions"/>

  35. Externalizing batch process execution • Offload complex processing • Use Spring Integration inside of Batch jobs – e.g. ItemProcessor + ItemWriter • Asynchronous Item processing support – AsyncItemProcessor – AsyncItemWriter • Externalize Chunk processing – ChunkMessageChannelItemWriter
  36. Asynchronous Processors • AsyncItemProcessor • AsyncItemWriter • externalize the processing of items [Diagram labels: Reader, Item, Gateway, Input, Processor, Output, Result, Writer]
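
The asynchronous processor/writer pairing can be sketched in plain Java: the processor side submits each item to a thread pool and returns a Future immediately, and the writer side unwraps the futures before writing. This is a simplified illustration under stated assumptions, not the AsyncItemProcessor or AsyncItemWriter code: AsyncSketch and processChunk are hypothetical names, and a java.util.concurrent ExecutorService stands in for whatever executor or gateway does the work.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical illustration of async item processing; not a Spring Batch class.
final class AsyncSketch {

    /** Processes one chunk concurrently and returns the results in input order. */
    static <I, O> List<O> processChunk(List<I> items,
                                       Function<I, O> delegate,
                                       ExecutorService pool) {
        // "Processor" side: hand each item to the pool and keep only the Future.
        List<Future<O>> futures = new ArrayList<>(items.size());
        for (I item : items) {
            Callable<O> task = () -> delegate.apply(item);
            futures.add(pool.submit(task));
        }
        // "Writer" side: unwrap the futures in submission order before writing.
        List<O> results = new ArrayList<>(futures.size());
        for (Future<O> future : futures) {
            try {
                results.add(future.get());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for item", e);
            } catch (ExecutionException e) {
                throw new IllegalStateException("item processing failed", e.getCause());
            }
        }
        return results;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(3);
        try {
            List<Integer> squares = AsyncSketch.<Integer, Integer>processChunk(
                    Arrays.asList(1, 2, 3, 4), i -> i * i, pool);
            // prints: [1, 4, 9, 16]
            System.out.println(squares);
        } finally {
            pool.shutdown();
        }
    }
}
```

In this sketch the items are processed in parallel, but the results are collected in the order the items were read, so the write still sees them in sequence.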
  37. Concurrent Step vs. Async Processors [Table: columns Concurrent Step, Async Processors; rows Parallel Read/Write, Restartability, Write in Sequence; one X per row]
  38. Remote Chunking [Diagram labels: Step1, Step2, Step2a, Step2b, Step2c, Step4; ItemReader, ItemProcessor, ItemWriter]
  39. Remote Partitioning [Diagram labels: Step1, Master, Partitioner, Slave 1, Slave 2, Slave 3, Step3; ItemReader, ItemProcessor, ItemWriter]
  40. Remote Partitioning • MessageChannelPartitionHandler [Diagram labels: Master, Slave, Partition Handler, Remote Step, request, gateway, staging, aggregator, reply, serviceActivator]
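
The partitioner's role in the diagrams above is to split the input into independent ranges that worker steps can process on their own. A minimal plain-Java sketch of that splitting step follows. It is an illustration only: PartitionSketch and partition are hypothetical names, the id-range strategy is one possible choice, and a real Spring Batch Partitioner would return execution contexts rather than arrays.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of range partitioning; not a Spring Batch class.
final class PartitionSketch {

    /**
     * Splits the inclusive id range [minId, maxId] into at most gridSize
     * contiguous ranges. Each value is a two-element array {start, end}.
     */
    static Map<String, long[]> partition(long minId, long maxId, int gridSize) {
        if (gridSize < 1 || maxId < minId) {
            throw new IllegalArgumentException("need gridSize >= 1 and maxId >= minId");
        }
        long total = maxId - minId + 1;
        long size = (total + gridSize - 1) / gridSize; // ceiling division
        Map<String, long[]> partitions = new LinkedHashMap<>();
        int index = 0;
        for (long start = minId; start <= maxId; start += size) {
            long end = Math.min(start + size - 1, maxId);
            // Each entry describes the work for one worker step.
            partitions.put("partition" + index++, new long[] { start, end });
        }
        return partitions;
    }

    public static void main(String[] args) {
        // prints: partition0 1..4, partition1 5..8, partition2 9..10 (one per line)
        partition(1, 10, 3).forEach((name, range) ->
                System.out.println(name + " " + range[0] + ".." + range[1]));
    }
}
```

Each range would then be sent to a worker as its own request, and the master aggregates the replies once all partitions have finished.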
  41. Demo: Payment Import https://github.com/ghillert/spring-batch-integration-sample
  42. Spring XD http://projects.spring.io/spring-xd/
  43. Tackling Big Data Complexity • Unified agile experience for – Data Ingestion – Real-time Analytics – Workflow Orchestration – Data Export
  44. Tackling Big Data Complexity cont. • Built on existing assets – Spring Integration – Spring Batch – Spring Data • Redis, GemFire, Hadoop • XD = 'eXtreme Data'
  45. Data Ingestion Streams • DSL based on Unix pipes and filters syntax • Modules are parameterizable • Simple logic can be added via expressions or scripts • Examples: (1) http | file (2) twittersearch --query=spring | file --dir=/spring (3) http | filter --expression=payload=='Spring' | hdfs
  46. Hadoop workflow managed by Spring Batch • Reuse Batch infrastructure and features to manage Hadoop workflows – Job state management, launching, monitoring, restart/retry policies, etc. • Step can be any Hadoop job type or HDFS script • Can mix and match with other Batch readers/writers – (e.g. JDBC for import/export use-cases)
  47. Demo: Spring XD Batch word-count sample https://github.com/SpringSource/spring-xd-samples
  48. Books
  49. Learn More. Stay Connected. • JSR-352, Spring Batch and You @14:30 (Tue) • Spring Integration Internals @12:45 (Wed) • Tackling Big Data Complexity with Spring @14:30 (Wed) • Real Time Analytics with Spring @16:30 (Wed) • Talk to us on Twitter: @springcentral • Find session replays on YouTube: spring.io/video