@purbon
Connecting the data
infrastructure with the
DataFlow
Pere Urbon-Bayes
Software Architect
pere.urbon@{gmail.com, acm.org}
Topics for Today
• Integration patterns for the enterprise startup.
• What is Apache NiFi?
• Examples
• NiFi in operation (best practices).
Integrate all the
things!
Enterprise integration is the task of making
separate applications work together to produce
a unified set of functionality.
The applications probably run on multiple
computers, which may be geographically
dispersed.
Some applications might need to be integrated
even though they were not designed for
integration and cannot be changed.
These issues, and others, are what make
application integration difficult.
Each integration faces different needs and
criteria. We can group them as:
• Application coupling
• Integration simplicity
• Data formats and timeliness
• Data or functionality
• Communication
There is only a limited set of integration
options
File transfer
Shared database
Remote procedure invocation (RPC)
Messaging
Enterprise Integration Patterns
What is Apache NiFi?
An easy to use, powerful, and reliable system to
process and distribute data.
• Web-based interface
• Highly configurable
• Data provenance
• Designed for extension
• Secure
NiFi was built to automate the flow of data
between systems.
But what is Dataflow?
An automated and managed flow of information
between systems.
What Apache NiFi looks like
Concepts behind Apache NiFi
A FlowFile
The FlowFile Processor
A Connection
A Process Group
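The four concepts above can be pictured with a toy model. This is only an illustrative sketch in Python, not NiFi's API: every class and function name here (FlowFile, Connection, Processor, ProcessGroup, to_upper, tag) is hypothetical and chosen to mirror the slide titles. A FlowFile carries content plus attributes, a Processor transforms FlowFiles, a Connection queues them between processors, and a Process Group bundles processors together.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class FlowFile:
    """A unit of data: a content payload plus key/value attributes."""
    content: bytes
    attributes: dict = field(default_factory=dict)


class Connection:
    """A FIFO queue linking the output of one processor to the next."""

    def __init__(self):
        self._queue = deque()

    def put(self, flow_file):
        self._queue.append(flow_file)

    def poll(self):
        # Return the oldest queued FlowFile, or None when the queue is empty.
        return self._queue.popleft() if self._queue else None


class Processor:
    """Takes one FlowFile from `source`, transforms it, puts it on `sink`."""

    def __init__(self, transform, source, sink):
        self.transform = transform
        self.source = source
        self.sink = sink

    def run_once(self):
        flow_file = self.source.poll()
        if flow_file is None:
            return False  # nothing to do
        self.sink.put(self.transform(flow_file))
        return True


class ProcessGroup:
    """A named bundle of processors that is run as a unit."""

    def __init__(self, processors):
        self.processors = list(processors)

    def run_until_idle(self):
        # Keep cycling through all processors until none of them made progress.
        while any([p.run_once() for p in self.processors]):
            pass


def to_upper(ff):
    """Hypothetical transform: upper-case the content, keep the attributes."""
    return FlowFile(ff.content.upper(), dict(ff.attributes))


def tag(ff):
    """Hypothetical transform: add an attribute, keep the content."""
    return FlowFile(ff.content, {**ff.attributes, "tagged": "true"})
```

Wiring two processors through three connections and calling run_until_idle() moves a FlowFile from the first queue to the last, picking up both transformations on the way.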
Apache NiFi Architecture
Distributed using Apache ZooKeeper
Let’s take a closer
look…
Apache NiFi
Operations
Maximum file handles
* hard nofile 50000
* soft nofile 50000
/etc/security/limits.conf
Maximum forked Procs
* hard nproc 10000
* soft nproc 10000
/etc/security/limits.conf
/etc/security/limits.d/90-nproc.conf
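Changes to limits.conf only apply to new login sessions, so it can be worth checking what a running process actually sees. The following is a minimal sketch, assuming Python 3 on Linux and run as the user that will start NiFi. The targets 50000 and 10000 come from the slides; the names TARGETS, meets_target and check_limits are hypothetical.

```python
import resource

# Targets taken from the slides; adjust them to your own sizing.
TARGETS = {
    "nofile": (resource.RLIMIT_NOFILE, 50000),
    "nproc": (resource.RLIMIT_NPROC, 10000),
}


def meets_target(limit, target):
    """True if a limit is unlimited or at least as large as the target."""
    return limit == resource.RLIM_INFINITY or limit >= target


def check_limits(targets=TARGETS):
    """Report the soft/hard limits of the current process against the targets."""
    report = {}
    for name, (rlimit, target) in targets.items():
        soft, hard = resource.getrlimit(rlimit)
        report[name] = {"soft": soft, "hard": hard, "ok": meets_target(soft, target)}
    return report


if __name__ == "__main__":
    for name, entry in check_limits().items():
        status = "ok" if entry["ok"] else "below target"
        print(f"{name}: soft={entry['soft']} hard={entry['hard']} ({status})")
```

The check looks at the soft limit because that is the value the process is actually held to; the hard limit is only the ceiling up to which the soft limit may be raised.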
Increase the number of TCP socket ports available
sudo sysctl -w net.ipv4.ip_local_port_range="10000 65000"
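To see what the setting means in practice, the range can be read back from /proc and turned into a port count. A minimal sketch, assuming Python 3 on Linux; the function names parse_port_range and port_count are hypothetical.

```python
from pathlib import Path

# The kernel exposes the current ephemeral port range in this file on Linux.
PORT_RANGE_FILE = Path("/proc/sys/net/ipv4/ip_local_port_range")


def parse_port_range(text):
    """Parse the two whitespace-separated integers the kernel reports."""
    low, high = (int(part) for part in text.split())
    return low, high


def port_count(low, high):
    """Number of ports in the inclusive range [low, high]."""
    return high - low + 1


if __name__ == "__main__":
    if PORT_RANGE_FILE.exists():
        low, high = parse_port_range(PORT_RANGE_FILE.read_text())
        print(f"local ports {low}-{high}: {port_count(low, high)} available")
    else:
        print("ip_local_port_range not found (not a Linux host?)")
```

With the range from the slide, 10000 to 65000, this gives 55001 local ports for outgoing connections.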
Time out sockets in TIME_WAIT state
sudo sysctl -w net.ipv4.netfilter.ip_conntrack_tcp_timeout_time_wait="1"
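Whether this tuning matters on a given host depends on how many sockets are actually lingering in TIME_WAIT. A minimal sketch for counting them, assuming Python 3 on Linux and IPv4 only (IPv6 sockets are listed separately in /proc/net/tcp6); the function name count_time_wait is hypothetical.

```python
from pathlib import Path

# In /proc/net/tcp the fourth column ("st") holds the socket state as hex;
# 06 is TIME_WAIT.
TIME_WAIT = "06"


def count_time_wait(proc_net_tcp_text):
    """Count the sockets in TIME_WAIT in the text of /proc/net/tcp."""
    count = 0
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header line
        fields = line.split()
        if len(fields) > 3 and fields[3] == TIME_WAIT:
            count += 1
    return count


if __name__ == "__main__":
    path = Path("/proc/net/tcp")
    if path.exists():
        print(f"sockets in TIME_WAIT: {count_time_wait(path.read_text())}")
    else:
        print("/proc/net/tcp not found (not a Linux host?)")
```

A count that stays high under load suggests many short-lived connections are being opened and closed, which is the situation this setting targets.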