Empowering Java Applications with NoSQL: A Hands-On Workshop

Exploring NoSQL Polyglot Persistence with Java

Otávio Santana @otaviosantana
Welcome: topics
Persistence introduction
◦ Persistence challenges
◦ Persistence universe
Jakarta EE
◦ JAX-RS
◦ CDI
◦ Bean Validation
NoSQL databases
◦ Key-value (Redis)
◦ Wide-column (Apache Cassandra)
◦ Document (MongoDB)
◦ Graph (Neo4J) [Bonus]

Setup

Setup: requirements
https://o-s-expert.github.io/polyglot-nosql/
✔ JDK 17+ installed
✔ Modern IDE (IntelliJ, VS Code, etc.)
✔ Git
✔ Docker
✔ Docker Compose

Welcome: introduce yourself
✔ What is your name?
✔ Where are you from?
✔ What is your Java and database experience?
✔ A fun fact about you

NoSQL Introduction

Why do we need databases?
Knowledge endures throughout humanity: temples and caves served as databases.

Database evolutions
Why do we need them?

Why do modern applications need data storage?
◦ The new opportunities
◦ The current opportunity
◦ It's where the state is
◦ Business rules (database?)

The data cost
The challenge has changed.

Time is the new cost
Nobody wants to wait.

Joins vs. data volume
Everything has trade-offs, including normalization.
https://medium.com/@benmorel/to-join-or-not-to-join-bba9c1377c10

Database solutions: polyglot persistence
◦ SQL database
◦ NoSQL database: key-value, wide-column, document, graph

Persistence landscape: state of affairs
◦ Maturity model
◦ Database flavors
◦ Paradigms
https://survey.stackoverflow.co/2023/

Persistence landscape: state of affairs
◦ Microservices vs. database
◦ Database agnostic
◦ Trade-offs
https://survey.stackoverflow.co/2023/

Persistence landscape: state of affairs (evolutionary data)
◦ First thing we do: the development lifecycle very often starts with immature data structures.
◦ Change: changing the database type of an existing application is complex and expensive.
◦ Maintenance: over time, data maintenance and schema evolution become challenging.

Challenges in a database land: different paradigms (apps vs. DBMS)
◦ Application (object-oriented language): objects
◦ Database (relational): tables
◦ Mismatch between the two

Challenges in a database land: different paradigms (apps vs. DBMS)
Mapping mismatch:
◦ Application: one-to-many (1 to *) associations such as addresses, inheritance, polymorphism, encapsulation, types
◦ Database: normalization, denormalization, structure (key-value, wide-column, document, graph)

Data mapping: handling data integration in Java
◦ Driver
◦ Data Mapper
◦ Active Record
◦ Repository

Data mapping: database integration
Integration patterns sit between the database (data-oriented programming) and the application (object-oriented programming):
◦ Driver
◦ Data Mapper
◦ DAO
◦ Active Record
◦ Repository

Data-oriented programming: 4 principles (database side)
1. Separate code (behavior) from data.
2. Represent data with generic data structures.
3. Treat data as immutable.
4. Separate data schema from data representation.

Object-oriented programming: principles (application side)
1. Expose behavior
2. Hide data
3. Explore abstraction
4. Use layers and modules
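
The two sets of principles above can be contrasted in a small sketch. It is only an illustration: the class names `PersonFunctions` and `PersonObject` and the `name` and `city` fields are hypothetical and do not come from the workshop material.

```java
import java.util.Map;

// Data-oriented style: behavior is a stateless function over a generic,
// immutable structure (here a plain Map); code and data are kept apart.
final class PersonFunctions {
    private PersonFunctions() {}

    static String label(Map<String, Object> person) {
        return person.get("name") + " (" + person.get("city") + ")";
    }
}

// Object-oriented style: fields are hidden and only behavior is exposed.
final class PersonObject {
    private final String name;
    private final String city;

    PersonObject(String name, String city) {
        this.name = name;
        this.city = city;
    }

    String label() {
        return name + " (" + city + ")";
    }
}
```

Both versions produce the same label; the difference is where the data lives and who is allowed to see it, which is the gap a persistence layer has to bridge.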

Data mapping: driver
    // Uses java.sql.Connection, DriverManager, Statement and ResultSet
    try (Connection conn = DriverManager.getConnection(DB_URL, USER, PASS);
         Statement stmt = conn.createStatement();
         ResultSet rs = stmt.executeQuery(QUERY)) {
        // Extract data from the result set
        while (rs.next()) {
            // Retrieve by column name
            System.out.print("ID: " + rs.getInt("id"));
            System.out.print(", name: " + rs.getString("name"));
            System.out.print(", birthday: " + rs.getString("birthday"));
            System.out.println(", city: " + rs.getString("city"));
            System.out.println(", street: " + rs.getString("street"));
            …
        }
    }
◦ Complex relation with business logic
◦ Code flexibility

Data mapping: Data Mapper
    @Entity
    public class Person {
        @Id
        @GeneratedValue(strategy = AUTO)
        Long id;
        String name;
        LocalDate birthday;
        // A collection of addresses is a one-to-many association
        @OneToMany
        List<Address> address;
        …
    }

    public class PersonRowMapper implements RowMapper<Person> {
        @Override
        public Person mapRow(ResultSet rs, int rowNum) throws SQLException {
            Person person = new Person();
            // The id field is a Long, so read the column as a long
            person.setId(rs.getLong("ID"));
            return person;
        }
    }
◦ Impedance mismatch
◦ Centralizes the mapper responsibility

Data mapping: Data Access Object
    public interface PersonDAO {
        List<Person> getAll();
        void update(Person person);
        void delete(Person person);
        void insert(Person person);
    }
◦ Centralizes data operations
◦ Impedance mismatch
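
To make the DAO contract concrete, here is a minimal in-memory sketch of the `PersonDAO` interface from the slide. The `InMemoryPersonDAO` class and the simplified `Person` record (only `id` and `name`) are assumptions made for illustration; a real DAO would delegate to a database driver instead of a map.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified stand-in for the slide's Person entity (hypothetical fields).
record Person(Long id, String name) {}

// Same four operations as the PersonDAO interface on the slide.
interface PersonDAO {
    List<Person> getAll();
    void update(Person person);
    void delete(Person person);
    void insert(Person person);
}

// Hypothetical in-memory implementation used only to show the contract.
final class InMemoryPersonDAO implements PersonDAO {
    private final Map<Long, Person> rows = new LinkedHashMap<>();

    @Override
    public List<Person> getAll() {
        // Return a copy so callers cannot mutate the internal state.
        return new ArrayList<>(rows.values());
    }

    @Override
    public void insert(Person person) {
        if (rows.putIfAbsent(person.id(), person) != null) {
            throw new IllegalStateException("Duplicate id: " + person.id());
        }
    }

    @Override
    public void update(Person person) {
        if (rows.replace(person.id(), person) == null) {
            throw new IllegalStateException("Unknown id: " + person.id());
        }
    }

    @Override
    public void delete(Person person) {
        rows.remove(person.id());
    }
}
```

All data operations go through one type, which is the "centralize data operations" point; the caller still thinks in rows and ids, which is where the impedance mismatch remains visible.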

Data mapping: Active Record
    @Entity
    public class Person extends PanacheEntity {
        public String name;
        public LocalDate birthday;
        public List<Address> addresses;
    }

    Person person = ...;
    // persist it
    person.persist();
    List<Person> people = Person.listAll();
    // finding a specific person by ID
    person = Person.findById(personId);
◦ Breaks SOLID
◦ Puts more responsibility on the domain

Data mapping: Repository
    @Entity
    public class Person {
        private @Id Long id;
        private @Column String name;
        private @Column LocalDate birthday;
        private @OneToMany List<Address> addresses;
    }

    // The slide omits the base interface name; CrudRepository is assumed here,
    // and the id type is aligned with the entity's Long id.
    public interface PersonRepository extends CrudRepository<Person, Long> {}

    Person person = ...;
    // persist it
    repository.save(person);
    List<Person> people = repository.findAll();
    // finding a specific person by ID
    person = repository.findById(personId);
◦ Far from the database
◦ Domain oriented

Data mapping: database integration
Data mapping and conversion happen in the layers between the client and the database: Repository, DAO, Mapper.

Layers: flexibility vs. complexity, use with caution

Layers: flexibility vs. complexity, use with caution
DTO, Resource, Entity, Database

Challenges of persistence in the Java landscape
Application, Java framework, NoSQL database
◦ Boilerplate
◦ Communications
◦ There is no standard
◦ Particular behavior matters

Persistence frameworks: types and trade-offs
There is no standard!
◦ Proximity: driver vs. mapper
◦ Usability: agnostic vs. specific
◦ Executability: reflectionless vs. reflection
◦ Legibility: declarative vs. imperative

Goals
◦ Paradigms
◦ Business
◦ Isolation
◦ Performance

The rules of software architecture
◦ Everything in software architecture is a trade-off.
◦ "Why" is more important than "how".

NoSQL types: defined by structure
◦ Key-value
◦ Wide-column
◦ Document
◦ Graph

Database metrics
Areas: persistence configuration, data storage, data manipulation, cache, connection pool, framework
◦ Business transactions
◦ Oversized / downsized
◦ Invalid / stale connections
◦ Apps in await state
◦ Fixed cache size
◦ No cache usage
◦ Consistency impacts with a distributed cache
◦ Complex mapping
◦ Auto-generated schemas
◦ On-prem vs. cloud
◦ Many NoSQL types
◦ SQL vs. NoSQL vs. NewSQL
◦ Eager vs. lazy loading
◦ N+1 problem
◦ Hard to change database types

Key-value structure
Key | Value
Apollo | Sun
Ares | War
Aphrodite | Love, Beauty

Key-value vs. relational database
Criteria | Key-value database | Relational database
Data structure | Key-value pairs | Tables with rows/columns
Query flexibility | Limited (lookup by key) | Complex (SQL queries)
Scalability | Excellent | Good
Schema flexibility | Limited | Highly flexible
Relationships | Minimal | Richly defined
ACID compliance | Varied (depends on DB) | Strong
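
The "lookup by key" limitation in the comparison can be sketched with a plain map. `KeyValueBucket` is a hypothetical in-memory class written for this illustration; it mimics the access pattern only and is not a Redis client.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory bucket: values are opaque to the store,
// so the key is the only way in.
final class KeyValueBucket {
    private final Map<String, List<String>> entries = new ConcurrentHashMap<>();

    void put(String key, List<String> value) {
        entries.put(key, List.copyOf(value));
    }

    // The only query available: an exact lookup by key.
    Optional<List<String>> get(String key) {
        return Optional.ofNullable(entries.get(key));
    }

    void delete(String key) {
        entries.remove(key);
    }
}
```

Storing `"Aphrodite"` with the values `Love` and `Beauty` makes `get("Aphrodite")` succeed, while `get("Love")` returns an empty result: there is no way to search by value without scanning every entry or maintaining a second key.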

Wide-column structure
Row keys: Apollo, Aphrodite, Ares, Kratos
Columns vary per row (for example Duty, Dead Gods, Color, weapon), with values such as Sun, "Love, happy", War, 13 and Sword.

Wide-column vs. relational database
Criteria | Wide-column database | Relational database
Data structure | Columns within rows | Tables with rows/columns
Query flexibility | Flexible | Complex (SQL queries)
Scalability | Excellent | Good
Schema flexibility | High | Highly flexible
Relationships | Limited | Richly defined
ACID compliance | Varies (depends on DB) | Strong

Document structure
    {
      "name": "Diana",
      "duty": ["Hunt", "Moon", "Nature"],
      "siblings": { "Apollo": "brother" }
    }

Document vs. relational database
Criteria | Document database | Relational database
Data structure | Structured documents | Tables with rows/columns
Query flexibility | High (document-level) | Complex (SQL queries)
Scalability | Excellent | Good
Schema flexibility | Highly flexible | Not flexible
Relationships | Limited (embedded) | Richly defined
ACID compliance | Strong | Strong

Graph structure
Nodes: Apollo, Ares, Kratos
Relationships:
◦ Kratos -[killed]-> Apollo
◦ Kratos -[killed]-> Ares
◦ Apollo -[was killed by]-> Kratos
◦ Ares -[was killed by]-> Kratos
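
A graph like the one on this slide can be sketched as a list of labeled edges that are traversed by relationship type. The `Edge` record and `PropertyGraph` class are hypothetical names for this illustration; a graph database such as Neo4J stores and indexes relationships natively rather than filtering a list.

```java
import java.util.List;

// A directed, labeled relationship between two named nodes.
record Edge(String from, String label, String to) {}

// Hypothetical minimal graph: relationships are first-class data.
final class PropertyGraph {
    private final List<Edge> edges;

    PropertyGraph(List<Edge> edges) {
        this.edges = List.copyOf(edges);
    }

    // Follows every outgoing relationship of the given type from a node.
    List<String> neighbors(String from, String label) {
        return edges.stream()
                .filter(edge -> edge.from().equals(from) && edge.label().equals(label))
                .map(Edge::to)
                .toList();
    }
}
```

With the edges `Kratos killed Apollo` and `Kratos killed Ares` loaded in that order, `neighbors("Kratos", "killed")` yields Apollo and Ares; the query is phrased in terms of relationships instead of joins.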

Graph vs. relational database
Criteria | Graph database | Relational database
Data structure | Nodes and relationships | Tables with rows/columns
Query flexibility | Excellent | Complex (SQL queries)
Scalability | Good | Good
Schema flexibility | Moderate | Highly flexible
Relationships | Core strength | Core strength
ACID compliance | Strong | Strong

Database architecture: developer perspective
◦ Flexibility vs. scalability
◦ Database replication
◦ Partitioning
◦ Schemaless vs. schema
◦ Normalization vs. denormalization

Scalability vs. flexibility: query and speed
Database types placed between scalability and flexibility: key-value, wide-column, document, graph, time-series

Database replication: masterless

Database replication: master-slave (leader-follower)

Partitioning: modeling impact
Partitioning type | Characteristics | Benefits | Considerations
Hash-based partitioning | Data distributed based on a hash function | Even data distribution | Limited range queries
Range-based partitioning | Data divided by predefined value ranges | Suitable for time-based data | Data skew in uneven ranges
Directory-based partitioning | Central directory maps data to partitions | Control over data distribution | Dependency on directory performance
Composite partitioning | Combination of multiple partitioning strategies | Adaptable to complex data models | Complexity in design and management
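
Two of the strategies in the table can be sketched as routing functions. This is a simplified illustration under stated assumptions: `PartitionRouting`, `byHash` and `byDirectory` are hypothetical names, and `String.hashCode()` stands in for whatever hash function a real database uses.

```java
import java.util.Map;

// Hypothetical routing helpers that decide which partition owns a key.
final class PartitionRouting {
    private PartitionRouting() {}

    // Hash-based: spreads keys evenly, but neighbouring keys end up on
    // unrelated partitions, which is why range queries are limited.
    static int byHash(String key, int partitionCount) {
        // floorMod keeps the result in [0, partitionCount) for negative hashes.
        return Math.floorMod(key.hashCode(), partitionCount);
    }

    // Directory-based: a central lookup table gives full control over
    // placement, and every request depends on that table being available.
    static String byDirectory(Map<String, String> directory, String key) {
        String partition = directory.get(key);
        if (partition == null) {
            throw new IllegalArgumentException("Unknown key: " + key);
        }
        return partition;
    }
}
```

The hash variant needs no shared state beyond the partition count, while the directory variant trades that simplicity for explicit control, matching the benefits and considerations columns above.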

Schema vs. schemaless: data structure
Aspect | Schema approach | Schemaless approach
Structure flexibility | Rigid structure prescribed for data consistency | Dynamic structure, adaptable to changing data needs
Data evolution | May require schema changes for evolving data | Easily accommodates evolving and diverse data
Read performance | Optimized for complex queries with joins | Maximized read efficiency due to denormalization
Write performance | Affected by complex joins and constraints | Improved write efficiency due to simplified joins
Use case examples | Financial systems, ERP applications | Social media platforms, content-sharing platforms

Normalization vs. denormalization: data structure
Aspect | Normalization | Denormalization
Goal | Reduce data redundancy and ensure data integrity | Enhance read performance and simplify data retrieval
Data structure | Split data into related tables | Combine data into fewer tables
Joins | Often requires complex joins for data retrieval | Minimizes or eliminates joins for faster reads
Storage efficiency | May lead to better storage optimization | Can result in higher storage consumption
Insert and update anomalies | Minimized due to distributed data | Insert and update anomalies may arise
Query performance | Might suffer due to frequent joins | Generally faster query performance due to denormalization
Write performance | Writes can be faster because there is less redundant data to update | Might be affected due to denormalization
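
The read-path difference between the two models can be shown with a few records. All type names here (`City`, `NormalizedPerson`, `DenormalizedPerson`, `CityLookup`) are hypothetical and exist only for this sketch.

```java
import java.util.Map;

record City(long id, String name) {}

// Normalized: the person holds a reference, and the city name is stored once.
record NormalizedPerson(String name, long cityId) {}

// Denormalized: the city name is copied into every person.
record DenormalizedPerson(String name, String cityName) {}

final class CityLookup {
    private CityLookup() {}

    // Normalized read: a second lookup (the "join") resolves the reference.
    static String cityOf(NormalizedPerson person, Map<Long, City> cities) {
        City city = cities.get(person.cityId());
        if (city == null) {
            throw new IllegalStateException("Dangling city id: " + person.cityId());
        }
        return city.name();
    }

    // Denormalized read: the value is already embedded, at the cost of
    // duplication and of updating every copy when the city is renamed.
    static String cityOf(DenormalizedPerson person) {
        return person.cityName();
    }
}
```

Renaming a city touches one `City` in the normalized model and every `DenormalizedPerson` in the other, which is the write-side cost the table refers to.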

Quiz Time

Jakarta EE Overview

Jakarta EE

MicroProfile

Bean Validation
The framework integrates Jakarta Validation into the converter to and from the entity. Is the entity valid? If so, it goes to the database; otherwise an exception is raised.

Lab Time: Bean Validation

CDI scopes
Many requests live inside conversations, conversations inside sessions, and sessions inside the application.

CDI features
Resource | Description and use in a NoSQL application
Injection | Automatically injects dependencies into managed beans.
Qualifier | Differentiates between different NoSQL implementations.
Produces | Controls the creation of NoSQL-related bean instances.
Disposes | Manages the disposal of NoSQL-related resources.
Event | Enables communication of NoSQL data changes across components.
Decorator | Extends or modifies the behavior of NoSQL-related methods.
Interceptor | Intercepts and modifies NoSQL interaction method calls.

Lab Time: CDI

JAX-RS
GET, POST, PUT and DELETE requests reach the application server, which works with the storage and returns a response.

JAX-RS: several communications

Lab Time: JAX-RS

Jakarta NoSQL: main motivation
    // ArangoDB
    BaseDocument baseDocument = new BaseDocument();
    baseDocument.addAttribute(name, value);

    // MongoDB
    Document document = new Document();
    document.append(name, value);

    // Couchbase
    JsonObject jsonObject = JsonObject.create();
    jsonObject.put(name, value);

    // OrientDB
    ODocument document = new ODocument("collection");
    document.field(name, value);

Jakarta NoSQL: main motivation
    @Entity
    record Book(@Id String id, @Column("name") String name) { }

Jakarta NoSQL: Template interface
    @Inject
    Template template;

    template.insert(book);

Specializations: particular behavior matters

Jakarta Data: motivation

Jakarta Data: repository
    @Repository
    public interface CarRepository extends CrudRepository<Car, Long> {
        List<Car> findByType(CarType type);
        Optional<Car> findByName(String name);
    }

Jakarta Data: sample code
    @Inject
    CarRepository repository;
    ...
    Car ferrari = Car.id(10L).name("Ferrari").type(CarType.SPORT);
    repository.save(ferrari);

PageableRepository: pagination
    @Repository
    public interface CarRepository extends PageableRepository<Car, Long> {
        // Entity and page types aligned on Car; the slide mixed Product and Car
        Page<Car> findByTypeOrderByName(CarType type, Pageable pageable);
    }
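
What a query method such as `findByTypeOrderByName` with a page request conceptually does can be sketched over an in-memory list. This is not the Jakarta Data implementation: `CarPages`, the simplified `Car` record with a `String` type, and the 1-based `page` parameter are assumptions for illustration.

```java
import java.util.Comparator;
import java.util.List;

// Hypothetical simplified Car; the slide's entity and CarType enum are not shown in full.
record Car(Long id, String name, String type) {}

final class CarPages {
    private CarPages() {}

    // Filters by type, orders by name, then returns one page.
    // page is 1-based; size is the maximum number of elements per page.
    static List<Car> findByTypeOrderByName(List<Car> cars, String type, int page, int size) {
        if (page < 1 || size < 1) {
            throw new IllegalArgumentException("page and size must be >= 1");
        }
        return cars.stream()
                .filter(car -> car.type().equals(type))
                .sorted(Comparator.comparing(Car::name))
                .skip((long) (page - 1) * size)
                .limit(size)
                .toList();
    }
}
```

A repository provider would push the filter, ordering and offset down to the database instead of loading everything into memory; the sketch only shows the semantics of the method name.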

Repository: built-in repository

NoSQL databases

Redis: use cases
Use case | Description
Caching | Speeds up data retrieval by storing frequently accessed data in memory, reducing the need to fetch it from the backend.
Session management | Stores and manages user sessions, enhancing application performance and scalability.
Real-time analytics | Powers real-time dashboards and analytics by swiftly processing and aggregating data.
Pub/sub messaging | Facilitates real-time communication between components through publish-subscribe messaging patterns.
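
The caching use case is often implemented with a cache-aside pattern, sketched below. The `CacheAside` class is hypothetical and uses an in-process map where a real setup would call Redis; expiry (TTL), serialization and network failures are deliberately left out.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical cache-aside wrapper: read from the cache first and fall back
// to the slower backend only on a miss.
final class CacheAside<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> backend;

    CacheAside(Function<K, V> backend) {
        this.backend = backend;
    }

    // Loads the value from the backend once and serves later reads from memory.
    V get(K key) {
        return cache.computeIfAbsent(key, backend);
    }

    // Must be called when the backend value changes, or reads turn stale.
    void invalidate(K key) {
        cache.remove(key);
    }
}
```

Repeated calls to `get` with the same key hit the backend only once until `invalidate` is called, which is the reduction in backend fetches the table describes.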

Redis: dos and don'ts
Do:
◦ Simplify the schema
◦ Identify key patterns
◦ Use contextual keys
◦ Batch operations
◦ Use a cache-friendly design
Don't:
◦ Avoid complex joins
◦ Beware of over-normalization
◦ Avoid bloated values
◦ Limit indexes

Redis: clusters

Lab Time

Apache Cassandra: use cases
Use case | Description
Online retail | Managing e-commerce platforms, handling product catalogs, user profiles, and transaction records.
Social media analytics | Monitoring and analyzing social media interactions, tracking trends, and user engagement.
Real-time analytics | Enabling quick analysis of data streams for instant insights, critical in financial and logistics sectors.
Logging and monitoring | Centralized storage and analysis of logs and monitoring data from applications and servers.

Apache Cassandra: modeling tips
◦ Denormalize when necessary
◦ Design for query patterns
◦ Choose optimal column families
◦ Utilize secondary indexes
◦ Use compression and Bloom filters
◦ Avoid over-denormalization
◦ Limit column count
◦ Avoid overusing secondary indexes

Apache Cassandra: cluster
Figure: a client and two rings of 12 nodes each, one per data center (DC1 and DC2), with replicas R1 to R3 and R4 to R6.

Apache Cassandra: partitioning
Rows:
◦ Jim: Car: Camaro, Age: 32
◦ Carol: Color: Pink, Work: Hobby
◦ Suzy: Team: Bahia, Country: USA
Tokens: Jim 1, Carol 13, Suzy 15
Token ranges per node: A [0-3], B [4-8], C [9-13], D [14-18]
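
The token ranges on this slide can be turned into a small lookup to show how a row finds its node. This is a sketch with hypothetical names (`TokenRanges`, `nodeFor`, `fromSlide`); a real Cassandra cluster derives the token by hashing the partition key, and the small integers here are the slide's illustrative values.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical range lookup: each entry maps the first token of a range
// to the node that owns that range.
final class TokenRanges {
    private final TreeMap<Integer, String> nodeByRangeStart;
    private final int maxToken;

    TokenRanges(Map<Integer, String> nodeByRangeStart, int maxToken) {
        if (nodeByRangeStart.isEmpty()) {
            throw new IllegalArgumentException("At least one range is required");
        }
        this.nodeByRangeStart = new TreeMap<>(nodeByRangeStart);
        this.maxToken = maxToken;
    }

    // Finds the range whose start is the greatest value not above the token.
    String nodeFor(int token) {
        if (token < nodeByRangeStart.firstKey() || token > maxToken) {
            throw new IllegalArgumentException("Token outside the ring: " + token);
        }
        return nodeByRangeStart.floorEntry(token).getValue();
    }

    // Ranges from the slide: A [0-3], B [4-8], C [9-13], D [14-18].
    static TokenRanges fromSlide() {
        return new TokenRanges(Map.of(0, "A", 4, "B", 9, "C", 14, "D"), 18);
    }
}
```

With the slide's tokens, Jim (1) lands on node A, Carol (13) on node C and Suzy (15) on node D, so the partition key alone decides where a row lives.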

MongoDB: use cases
Use case | Description
Content management | Store and manage diverse content types like articles, images, and videos with schema flexibility.
Catalog management | Efficiently organize and categorize products or items with changing attributes and metadata.
Mobile applications | Provide offline capabilities and sync data seamlessly once online, enhancing user experience.
Social media platforms | Facilitate rapid storage and retrieval of user-generated content, profiles, and social interactions.

MongoDB: modeling tips
◦ Leverage embedded documents
◦ Design around use cases
◦ Employ indexes judiciously
◦ Normalize when logical
◦ Stay flexible with arrays
◦ Avoid over-embedding
◦ Steer clear of monolithic documents
◦ Don't over-index
◦ Beware of the one-size-fits-all approach

MongoDB: cluster

Neo4J: use cases
Use case | Description
Social networks | Modeling user profiles, friendships, and interactions for effective social networking platforms.
Recommendation engines | Powering personalized recommendations by analyzing connections and preferences.
Knowledge graphs | Organizing and querying complex relationships in fields like healthcare, finance, and research.
Fraud detection | Uncovering hidden patterns and connections indicative of fraudulent activities.
Network analysis | Analyzing intricate relationships in data networks, such as transportation and communication.

Neo4J: modeling tips
◦ Avoid over-reliance on joins
◦ Steer clear of over-connecting nodes
◦ Don't overcomplicate graph design
◦ Beware of property index overuse
◦ Embrace relationship-driven design
◦ Craft efficient traversal paths
◦ Use labels and types wisely
◦ Leverage indexing for nodes
◦ Utilize property indexes

Neo4J: cluster roles
Leader, Follower, Follower, Read Replica, Read Replica

Conclusions: final considerations

Book recommendations: NoSQL introduction
https://bpbonline.com/products/java-persistence-with-nosql
Coupon code: Otavio
Discount: 20%

Book recommendations: NoSQL introduction

Who am I?
Otavio Santana, Software Engineer & Architect, @otaviojava
◦ Java Champion, Oracle ACE
◦ JCP-EC-EG-EGL
◦ Apache and Eclipse Committer
◦ Jakarta EE and MicroProfile
◦ Duke Choice Award
◦ JCP Award
◦ Book and blog writer

Who am I?
Elias Nogueira, Principal Software Engineer in Test, @eliasnogueira
◦ Java Champion, Oracle ACE
◦ JCP-EC-EG-EGL
◦ Apache and Eclipse Committer
◦ Jakarta EE and MicroProfile
◦ Duke Choice Award
◦ JCP Award
◦ Book and blog writer

Thank you!
Otávio Santana, Software Engineer & Architect, @otaviojava
