Slide 1

Slide 1 text

Data Integration and Real-time Data Processing with Spring Boot
Sabby Anandan (@sabbyanandan)

Slide 2

Slide 2 text

No content

Slide 3

Slide 3 text

Goal:
- Introduce Spring projects
- Discuss how they address data integration and data processing challenges

Slide 4

Slide 4 text

Spring Cloud Data Flow
A toolkit for building data integration, real-time streaming, and batch data processing pipelines.

Slide 5

Slide 5 text

Spring Cloud Data Flow
A toolkit for building data integration, real-time streaming, and batch data processing pipelines.

Slide 6

Slide 6 text

No content

Slide 7

Slide 7 text

Spring Integration
- Decentralization (no ESB)
- Lightweight applications that contain integration logic
- Loose coupling through message channels
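The loose-coupling point can be sketched in plain Java, without Spring Integration itself: the producer and the consumer share only a channel and never hold a reference to each other. `SimpleChannel` and `ChannelDemo` are hypothetical names for this illustration, built on `java.util.concurrent.BlockingQueue`; they are not Spring Integration's `MessageChannel` API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in for a message channel; not the Spring Integration API.
final class SimpleChannel<T> {
    private final BlockingQueue<T> queue = new LinkedBlockingQueue<>();

    // Producers only know the channel, never the consumer.
    void send(T message) {
        queue.add(message);
    }

    // Consumers drain whatever is currently queued, in arrival order.
    List<T> receiveAll() {
        List<T> drained = new ArrayList<>();
        queue.drainTo(drained);
        return drained;
    }
}

final class ChannelDemo {
    // Sends two messages and lets an independent consumer handle them.
    static List<String> run() {
        SimpleChannel<String> orders = new SimpleChannel<>();
        orders.send("order-1");
        orders.send("order-2");

        List<String> handled = new ArrayList<>();
        for (String message : orders.receiveAll()) {
            handled.add("handled " + message);
        }
        return handled;
    }
}
```

Either side can be replaced without touching the other, as long as both agree on the message type carried by the channel.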

Slide 8

Slide 8 text

Spring Cloud Data Flow
A toolkit for building data integration, real-time streaming, and batch data processing pipelines.

Slide 9

Slide 9 text

Stream processing is much more than just EIP

Slide 10

Slide 10 text

Spring Cloud Stream: an event-driven microservices framework
Programming Model: Message Channel Abstraction

    @EnableBinding(Processor.class)
    public class Application {

        // "foo" and "bar" follow the slide's diagram; with Processor.class the
        // bound channels are named "input" and "output".
        @StreamListener("foo")
        @SendTo("bar")
        public String replaceStringMsgHandler(String payload) {
            return StringUtils.replace(payload, "foo", "bar");
        }
    }

[Diagram: events -> binding -> "foo" channel -> application -> "bar" channel -> binding -> consumers]

Slide 11

Slide 11 text

Spring Cloud Stream: an event-driven microservices framework
Programming Model: Native Kafka Streams

    @EnableBinding(Processor.class)
    public class Application {

        @StreamListener("foo")
        @SendTo("bar")
        public KStream handler(KStream input) {
            return ...; // stream topology elided on the slide
        }
    }

[Diagram: events -> binding -> "foo" channel -> application -> "bar" channel -> binding -> consumers]

Slide 12

Slide 12 text

Spring Cloud Stream: an event-driven microservices framework
Programming Model: Native Reactor Flux

    @EnableBinding(Processor.class)
    public class Application {

        @StreamListener("foo")
        @SendTo("bar")
        public Flux sensorAverage(Flux data) {
            return ...; // reactive pipeline elided on the slide
        }
    }

[Diagram: events -> binding -> "foo" channel -> application -> "bar" channel -> binding -> consumers]

Slide 13

Slide 13 text

Spring Cloud Stream: an event-driven microservices framework
Programming Model: Plain Old Java Functions

    @EnableBinding(Processor.class)
    public class Application {

        // Type parameters restored; a raw Function would not compile with s.toUpperCase().
        @Bean
        public Function<String, String> toUpperCase() {
            return s -> s.toUpperCase();
        }
    }

[Diagram: events -> binding -> "foo" channel -> application -> "bar" channel -> binding -> consumers]
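Because the handler in this programming model is an ordinary `java.util.function.Function`, it can be exercised without any messaging infrastructure. The sketch below shows only that point; `UppercaseFunctions` is a hypothetical class name, and the Spring annotations are left out on purpose.

```java
import java.util.function.Function;

final class UppercaseFunctions {
    // Same shape as the slide's bean method, minus the Spring annotations.
    static Function<String, String> toUpperCase() {
        return s -> s.toUpperCase();
    }

    public static void main(String[] args) {
        // Calls the function directly, the way a plain unit test would.
        System.out.println(toUpperCase().apply("foo")); // prints FOO
    }
}
```

Keeping business logic in a plain function like this means the same code can be unit-tested directly and then bound to a messaging system by the framework.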

Slide 14

Slide 14 text

Spring Cloud Stream: an event-driven microservices framework
- Pluggable Binder Implementations
- Stream Partitions
- Consumer Groups
- Message Headers
- Testing Framework
- Content-type Negotiation
- Imperative + Functional Programming Model

    public class TransferServiceImpl implements TransferService {

        private final AccountRepository accountRepository;

        public TransferServiceImpl(AccountRepository ar) {
            this.accountRepository = ar;
        }
    }
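For the "Stream Partitions" item in the list above, the underlying idea is that messages with the same key should always land on the same partition. A hash-modulo scheme is one common way to get that property. The sketch below is an illustration under that assumption, with the hypothetical name `PartitionSketch`; it is not presented as the framework's exact selector.

```java
final class PartitionSketch {
    // Maps a key to one of partitionCount partitions. Math.floorMod keeps the
    // result non-negative even when hashCode() is negative.
    static int partitionFor(Object key, int partitionCount) {
        return Math.floorMod(key.hashCode(), partitionCount);
    }
}
```

With three partitions, `partitionFor(7, 3)` yields 1 and `partitionFor(-1, 3)` yields 2, and any given key maps to the same partition on every call.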

Slide 15

Slide 15 text

Spring Cloud Stream: an event-driven microservices framework
Pluggable Binder Implementations
- RabbitMQ
- Apache Kafka
- Google Pub/Sub
- Amazon Kinesis
- Azure Event Hubs
- Solace
Opportunities:
- Same code + same test harness
- Drop-in replacement for a variety of messaging systems

Slide 16

Slide 16 text

Spring Cloud Stream: an event-driven microservices framework
- Pluggable Binder Implementations
- Stream Partitions
- Consumer Groups
- Message Headers
- Testing Framework
- Content-type Negotiation
- Imperative + Functional Programming Model

    public class TransferServiceImpl implements TransferService {

        private final AccountRepository accountRepository;

        public TransferServiceImpl(AccountRepository ar) {
            this.accountRepository = ar;
        }
    }

Slide 17

Slide 17 text

Demo #1 Events and Data-Intensive Applications

Slide 18

Slide 18 text

:UserCreated :UserCreated :UserNameChanged :UserActivated

Slide 19

Slide 19 text

- User activity in the last 30s
- Users created in the last 2 mins
- User interaction by region in the last 1hr window
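Each of the views above is a count over a time window. As a minimal plain-Java sketch of that idea (not the Kafka Streams windowing API, and with the hypothetical name `SlidingWindowCounter`), the class below keeps event timestamps in a queue and evicts those that have fallen out of the window. It assumes timestamps arrive in non-decreasing order.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Counts events whose timestamps fall within the last windowMillis.
final class SlidingWindowCounter {
    private final long windowMillis;
    private final Deque<Long> timestamps = new ArrayDeque<>();

    SlidingWindowCounter(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    // Records one event; timestamps are assumed to be non-decreasing.
    void record(long eventTimeMillis) {
        timestamps.addLast(eventTimeMillis);
    }

    // Evicts events at or before (now - window), then returns how many remain.
    int countAt(long nowMillis) {
        while (!timestamps.isEmpty()
                && timestamps.peekFirst() <= nowMillis - windowMillis) {
            timestamps.removeFirst();
        }
        return timestamps.size();
    }
}
```

A 30-second view would use `new SlidingWindowCounter(30_000)`; a production stream processor would additionally handle out-of-order events and persist window state, which this sketch does not attempt.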

Slide 20

Slide 20 text

[Diagram: numbered flows (steps 1 to 3) showing one WebSocket connection and two REST request/response interactions]

Slide 21

Slide 21 text

Spring Cloud Data Flow
A toolkit for building data integration, real-time streaming, and batch data processing pipelines.

Slide 22

Slide 22 text

JSR 352

Slide 23

Slide 23 text

Spring Batch, JSR 352
- Closely related processing steps that perform a discrete business process
- Deployable unit composed of one or more job steps
- Lifecycle management for jobs/steps

Slide 24

Slide 24 text

Spring Cloud Task: a short-lived microservices framework
Programming Model: Spring Batch Job as Short-lived Application

    @EnableTask
    @EnableBatchProcessing
    public class BatchJobApplication {

        // Provided by @EnableBatchProcessing; the field is not shown on the slide.
        @Autowired
        private JobBuilderFactory jobBuilderFactory;

        @Bean
        public Step extractStep() {
            // extract business logic
        }

        @Bean
        public Step transformStep() {
            // transformation logic
        }

        @Bean
        public Step loadStep() {
            // persistence logic
        }

        @Bean
        public Job etlJob() {
            return this.jobBuilderFactory.get("etlJob")
                    .start(extractStep())
                    .next(transformStep())
                    .next(loadStep())
                    .build();
        }
    }

[Diagram: Database (repository)]

Slide 25

Slide 25 text

Spring Cloud Task: a short-lived microservices framework
Programming Model: Arbitrary Business Logic as Short-lived Application

    @EnableTask
    // Outer class renamed from the slide's TimestampTask: a nested class
    // cannot share the name of its enclosing class.
    public class TimestampTaskApplication {

        @Bean
        public TimestampTask timeStampTask() {
            return new TimestampTask();
        }

        public static class TimestampTask implements CommandLineRunner {

            @Override
            public void run(String... strings) throws Exception {
                DateFormat dateFormat = ...; // format elided on the slide
                logger.info(dateFormat.format(new Date()));
            }
        }
    }

[Diagram: Database (repository)]

Slide 26

Slide 26 text

Spring Cloud Task: a short-lived microservices framework
- Lifecycle Management
- Transactions
- Bookkeeping for Restarts/Replay
- Historical Representation
- Remote Partitions

Slide 27

Slide 27 text

Demo #2 24/7 ETL: Cloud-native File Ingest

Slide 28

Slide 28 text

Orchestrated by Spring Cloud Data Flow:
- SFTP Source polls the SFTP Server for new files
- SFTP Source publishes each file to the TaskLauncher
- TaskLauncher launches an ETL Job/Task for each file
- ETL Job/Task persists the parsed data to the Database
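The "poll for new files, launch a task for each file" part of this flow can be sketched in plain Java. This is only an illustration of the bookkeeping involved: `FileIngestSketch` and `poll` are hypothetical names, the remote listing is passed in as a list of file names, and no SFTP or Spring Cloud Data Flow API is used.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class FileIngestSketch {
    // File names already handed off on an earlier poll.
    private final Set<String> seen = new HashSet<>();

    // Returns only the files not seen before; in the pipeline above, each
    // returned name would trigger one task launch.
    List<String> poll(List<String> remoteListing) {
        List<String> newFiles = new ArrayList<>();
        for (String name : remoteListing) {
            if (seen.add(name)) { // add() returns false for known names
                newFiles.add(name);
            }
        }
        return newFiles;
    }
}
```

Polling `["a.csv", "b.csv"]` and then `["a.csv", "b.csv", "c.csv"]` returns both files on the first call and only `c.csv` on the second, so each file is processed once.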

Slide 29

Slide 29 text

Spring Cloud Data Flow
A toolkit for building data integration, real-time streaming, and batch data processing pipelines.

Slide 30

Slide 30 text

Spring Cloud Data Flow
A toolkit for building data integration, real-time streaming, and batch data processing pipelines.
But wait, there’s more!

Slide 31

Slide 31 text

Developer Focus: Ideation, Implementation, Production

Slide 32

Slide 32 text

Demo #3 CI/CD for Data Pipelines

Slide 33

Slide 33 text

Mask each Payload

Input payloads:
- 111-22-3333
- 444-55-6666
- 777-88-9999
- ...

Masked output:
- The Security Number = xxx-xx-3333
- The Security Number = xxx-xx-6666
- The Security Number = xxx-xx-9999
- ...

[Diagram labels: Don’t Disturb, Don’t Disturb, Fix This!]
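The masking step on this slide can be sketched as a small pure function. `SsnMasker`, `mask`, and `describe` are hypothetical names, and the regular expression assumes the NNN-NN-NNNN shape of the sample payloads; the demo's actual implementation is not shown in the slides.

```java
import java.util.regex.Pattern;

final class SsnMasker {
    // Three digits, two digits, four digits; only the last group is captured.
    private static final Pattern SSN =
            Pattern.compile("\\b\\d{3}-\\d{2}-(\\d{4})\\b");

    // Replaces the first five digits with 'x' and keeps the last four.
    static String mask(String payload) {
        return SSN.matcher(payload).replaceAll("xxx-xx-$1");
    }

    // Builds the output line format shown on the slide.
    static String describe(String payload) {
        return "The Security Number = " + mask(payload);
    }
}
```

For the first sample payload, `SsnMasker.describe("111-22-3333")` returns `The Security Number = xxx-xx-3333`; text that does not match the pattern passes through unchanged.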

Slide 34

Slide 34 text

No content

Slide 35

Slide 35 text

Let’s Recap

Slide 36

Slide 36 text

- Spring Cloud Stream: Build highly scalable event-driven microservices connected with shared messaging systems.
- Spring Cloud Task: Build short-lived microservices to perform data processing locally or in the cloud.
- Spring Cloud Skipper: Discover applications and manage their lifecycle on multiple cloud platforms.
- Spring Cloud Data Flow: Orchestrate data pipelines made of Spring Cloud Stream or Spring Cloud Task microservices.
Opportunities:
- Consolidate development and testing practices
- Standardize CI/CD tooling and automation

Slide 37

Slide 37 text

Next
- Spring Boot 2.1 Compatibility: Stream, Task, Skipper, and SCDF
- Function Composition / Function Chaining
- OAuth2 + OpenID Connect by Default
- Deeper Integration with Micrometer for Metrics/Monitoring
- New Data Integration Apps

Slide 38

Slide 38 text

Resources
- Spring Cloud Stream: Samples | Gitter | StackOverflow
- Spring Cloud Task: Samples | Gitter | StackOverflow
- Spring Cloud Skipper: Samples | Gitter | StackOverflow
- Spring Cloud Data Flow: Samples | Gitter | StackOverflow
- Demo #1: Events + Kafka Streams
- Demo #2: File-ingest
- Demo #3: CI/CD for Data Pipelines

Slide 39

Slide 39 text

Q+A