Empowering Java Applications with NoSQL: A Hands-On Workshop

Dive into the dynamic world of NoSQL databases and discover how they can revolutionize your Java applications in our free, interactive workshop, “Empowering Java Applications with NoSQL.”

Designed for senior engineers, architects, and Java developers with a keen interest in NoSQL databases, this workshop brings to life the concepts and techniques from the acclaimed book by Otavio Santana, “Java Persistence with NoSQL: Unleashing the Power of NoSQL: Integrating MongoDB, Cassandra, Neo4J, Redis, and more in Enterprise Java Architecture.”

Takeaways:
A deep understanding of NoSQL databases, including their types, advantages, and when to use them over traditional SQL databases.
Practical experience in integrating MongoDB, Cassandra, Neo4J, and Redis with Java applications, utilizing real-world scenarios.
Mastery of using Jakarta EE and MicroProfile to enhance Java applications, ensure compatibility, and maximize performance.
Skills in implementing polyglot persistence within enterprise Java architecture, enabling optimized data storage and retrieval strategies.
Expertise in data modeling, querying, and applying advanced NoSQL techniques, such as transaction management and optimization, to Java applications.
Confidence in architecting and developing scalable, flexible, and robust Java applications that leverage NoSQL databases’ unique capabilities.
The ability to navigate the complexities of modern application development, using a comprehensive toolkit to efficiently tackle data challenges.

Otavio Santana

April 20, 2024

Transcript

  1. Otávio Santana @otaviosantana Welcome: Topics

    Persistence introduction ◦ Persistence challenges ◦ Persistence universe
    Jakarta EE ◦ JAX-RS ◦ CDI ◦ Bean Validation
    NoSQL databases ◦ Key-value (Redis) ◦ Wide-column (Apache Cassandra) ◦ Document (MongoDB) ◦ Graph (Neo4J) [Bonus]
  2. Setup: Requirements https://o-s-expert.github.io/polyglot-nosql/

    ✔ JDK 17+ installed ✔ Modern IDE (IntelliJ, VSCode, etc.) ✔ Git ✔ Docker ✔ Docker Compose
  3. Welcome: Introduce yourself

    ✔ What is your name? ✔ Where are you from? ✔ What is your Java and database experience? ✔ Fun fact about you.
  4. Why Do Modern Applications Need Data Storage?

    The new opportunities ◦ The current opportunity ◦ It's where the state is ◦ Business rules (database?)
  5. Joins vs. data volume

    Everything has trade-offs, including normalization. https://medium.com/@benmorel/to-join-or-not-to-join-bba9c1377c10
  6. Polyglot Persistence: Database solutions

    SQL database ◦ NoSQL database (Key-value ◦ Wide-column ◦ Document ◦ Graph)
  7. Persistence landscape: State of affairs (Evolutionary Data)

    First thing we do: the development lifecycle very often starts with immature data structures.
    Change: changing a database type in existing applications is complex and expensive.
    Maintenance: over time, maintenance of data and schema evolution gets challenging.
  8. Challenges in a database land: Different paradigms: Apps x DBMS

    Application (object-oriented language): Object ◦ Mismatch ◦ Database (relational): Tables
  9. Challenges in a database land: Different paradigms: Apps x DBMS

    Mapping mismatch
    Application: Inheritance ◦ Polymorphism ◦ Encapsulation ◦ Types ◦ 1 to * Addresses
    Database: Normalization ◦ Denormalization ◦ Structure (Key-value ◦ Wide-column ◦ Document ◦ Graph)
  10. Database integration: Handling Data Integration in Java

    Data Mapping: Driver ◦ Data Mapper ◦ Active Record ◦ Repository
  11. Database integration: Data Mapping

    Database (Data-oriented Programming) ◦ Driver ◦ Data Mapper ◦ DAO ◦ Active Record ◦ Repository ◦ Application (Object-Oriented Programming)
  12. Database: Data-oriented Programming, 4 Principles

    1. Separating code (behavior) from data.
    2. Representing data with generic data structures.
    3. Treating data as immutable.
    4. Separating data schema from data representation.
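
The four principles above can be illustrated with a minimal sketch in plain Java. The class name `PersonFunctions` and the field names `name` and `city` are hypothetical, chosen only for this example; the slides do not prescribe any particular implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical example: data is a generic, immutable Map (principles 2 and 3),
// while behavior lives in a separate stateless class (principle 1).
final class PersonFunctions {
    private PersonFunctions() {}

    // A pure function over generic data; it knows the field names but owns no state.
    static String label(Map<String, Object> person) {
        return person.get("name") + " (" + person.get("city") + ")";
    }

    // "Changing" data produces a new immutable map; the input map is left untouched.
    static Map<String, Object> withCity(Map<String, Object> person, String city) {
        Map<String, Object> copy = new HashMap<>(person);
        copy.put("city", city);
        return Map.copyOf(copy);
    }
}
```

A caller would build the data with `Map.of("name", "Ada", "city", "London")` and pass it through these functions; no entity class is involved, which is the contrast with the object-oriented principles on the next slide.
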
  13. Application: Object-Oriented Programming Principles

    1. Expose behavior
    2. Hide data
    3. Explore abstraction
    4. Use of layers and modules
  14. Data Mapping: Driver

    // All three resources are declared in the try header so they are closed automatically
    try (Connection conn = DriverManager.getConnection(DB_URL, USER, PASS);
         Statement stmt = conn.createStatement();
         ResultSet rs = stmt.executeQuery(QUERY)) {
        // Extract data from result set
        while (rs.next()) {
            // Retrieve by column name
            System.out.print("ID: " + rs.getInt("id"));
            System.out.print(", name: " + rs.getString("name"));
            System.out.print(", birthday: " + rs.getString("birthday"));
            System.out.println(", city: " + rs.getString("city"));
            System.out.println(", street: " + rs.getString("street"));
            …
        }
    }

    Complex relation w/ business logic ◦ Code flexibility
  15. Data Mapping: Data Mapper

    @Entity
    public class Person {
        @Id @GeneratedValue(strategy = AUTO) Long id;
        String name;
        LocalDate birthday;
        // A collection-valued association is one-to-many, not many-to-one
        @OneToMany List<Address> address;
        …
    }

    public class PersonRowMapper implements RowMapper<Person> {
        @Override
        public Person mapRow(ResultSet rs, int rowNum) throws SQLException {
            Person person = new Person();
            // The id field is a Long, so read the column as a long
            person.setId(rs.getLong("ID"));
            return person;
        }
    }

    Impedance mismatch ◦ Centralize mapper responsibility
  16. Data Mapping: Data Access Object

    public interface PersonDAO {
        List<Person> getAll();
        void update(Person person);
        void delete(Person person);
        void insert(Person person);
    }

    Centralize data operations ◦ Impedance mismatch
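
To make the DAO contract concrete, here is a minimal sketch of one possible implementation. It assumes a simplified `Person` record and uses an in-memory `Map` in place of a real database, so `InMemoryPersonDAO` and the record's fields are hypothetical and only illustrate how the four operations are centralized behind the interface.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified entity used only for this sketch.
record Person(long id, String name) {}

// Same four operations as the PersonDAO shown on the slide.
interface PersonDAO {
    List<Person> getAll();
    void update(Person person);
    void delete(Person person);
    void insert(Person person);
}

// Assumption: a Map stands in for the database so the sketch runs without a driver.
final class InMemoryPersonDAO implements PersonDAO {
    private final Map<Long, Person> rows = new LinkedHashMap<>();

    @Override
    public List<Person> getAll() {
        return new ArrayList<>(rows.values());
    }

    @Override
    public void insert(Person person) {
        // Reject a second insert with the same id, as a primary key would.
        if (rows.putIfAbsent(person.id(), person) != null) {
            throw new IllegalStateException("Duplicate id: " + person.id());
        }
    }

    @Override
    public void update(Person person) {
        // replace() returns null when no row with that id exists.
        if (rows.replace(person.id(), person) == null) {
            throw new IllegalStateException("Unknown id: " + person.id());
        }
    }

    @Override
    public void delete(Person person) {
        rows.remove(person.id());
    }
}
```

Callers depend only on `PersonDAO`, so swapping the in-memory map for a JDBC or NoSQL driver would not change them.
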
  17. Data Mapping: Active Record

    @Entity
    public class Person extends PanacheEntity {
        public String name;
        public LocalDate birthday;
        public List<Address> addresses;
    }

    Person person = ...;
    // persist it
    person.persist();
    List<Person> people = Person.listAll();
    // finding a specific person by ID
    person = Person.findById(personId);

    SOLID breaking ◦ Higher domain's responsibility
  18. Data Mapping: Repository

    @Entity
    public class Person {
        private @Id Long id;
        private @Column String name;
        private @Column LocalDate birthday;
        private @OneToMany List<Address> addresses;
    }

    // The base interface is missing in the transcript; CrudRepository is assumed here,
    // and the key type matches the Long id declared above
    public interface PersonRepository extends CrudRepository<Person, Long> {}

    Person person = ...;
    // persist it
    repository.save(person);
    List<Person> people = repository.findAll();
    // finding a specific person by ID
    Optional<Person> found = repository.findById(personId);

    Far from database ◦ Domain oriented
  19. Challenges of Persistence in the Java Landscape

    Application ◦ Java framework ◦ NoSQL database
    Boilerplate ◦ Communications ◦ There is no standard ◦ Particular behavior matters
  20. Persistence Frameworks: Types & Trade-offs in Frameworks

    There is no standard!
    Proximity: Driver vs. Mapper ◦ Usability: Agnostic vs. Specific ◦ Legibility: Declarative vs. Imperative ◦ Executability: Reflectionless vs. Reflection
  21. The rules of Software Architecture

    Everything in software architecture is a trade-off.
    Why is more important than how.
  22. NoSQL types: Defined by structure

    Key-value ◦ Wide-column ◦ Document ◦ Graph
  23. Persistence Config.: Data storage ◦ Data manipulation ◦ Cache ◦ Conn. pool ◦ Framework

    Database metrics ◦ Biz. transactions ◦ Oversized / downsized ◦ Invalid/stale connections ◦ Apps on await state ◦ Fixed cache size ◦ No cache usage ◦ Consistency impacts w/ distributed cache ◦ Complex mapping ◦ Auto-generated schemas ◦ On-prem x Cloud ◦ Many NoSQL types ◦ SQL x NoSQL x NewSQL ◦ Eager x Lazy loading ◦ N+1 problem ◦ Hard to change db types
  24. Key-value: Key-value vs. Relational database

    Criteria | Key-value database | Relational database
    Data structure | Key-value pairs | Tables with rows/columns
    Query flexibility | Limited (lookup by key) | Complex (SQL queries)
    Scalability | Excellent | Good
    Schema flexibility | Limited | Highly flexible
    Relationships | Minimal | Richly defined
    ACID compliance | Varied (depends on DB) | Strong
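
The "Limited (lookup by key)" row can be illustrated with a small in-memory sketch. `KeyValueBucket` is a hypothetical stand-in, not a real database client; it only shows that reading by key is direct while searching by value falls back to scanning every entry.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical in-memory stand-in for a key-value bucket; not a real database client.
final class KeyValueBucket {
    private final Map<String, String> entries = new HashMap<>();

    void put(String key, String value) {
        entries.put(key, value);
    }

    // The natural access path: a direct lookup by key.
    Optional<String> get(String key) {
        return Optional.ofNullable(entries.get(key));
    }

    // Searching by value has no index to use here, so it scans all entries
    // and returns the first key whose value matches.
    Optional<String> findKeyByValue(String value) {
        return entries.entrySet().stream()
                .filter(entry -> entry.getValue().equals(value))
                .map(Map.Entry::getKey)
                .findFirst();
    }
}
```
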
  25. Wide-Column: Structure

    [Figure: rows identified by a row-key (Apollo, Aphrodite, Ares, Kratos), each with its own set of columns. Column names include Duty, weapon, Color, and Dead Gods; values include Sun, "Love, happy", War, Sword, and 13.]
  26. Wide-Column: Wide-Column vs. Relational database

    Criteria | Wide-column database | Relational database
    Data structure | Columns within rows | Tables with rows/columns
    Query flexibility | Flexible | Complex (SQL queries)
    Scalability | Excellent | Good
    Schema flexibility | High | Highly flexible
    Relationships | Limited | Richly defined
    ACID compliance | Varies (depends on DB) | Strong
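
As a rough sketch of "columns within rows", the structure can be modeled as a map from row-key to a sorted map of columns, where each row carries only the columns it needs. `WideColumnTable` and the sample values are hypothetical; real wide-column stores add partitioning, timestamps, and on-disk layout that this ignores.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical in-memory model: row-key -> (column name -> value), columns kept sorted.
final class WideColumnTable {
    private final Map<String, TreeMap<String, String>> rows = new HashMap<>();

    // Rows are created on first write; each row may hold a different set of columns.
    void put(String rowKey, String column, String value) {
        rows.computeIfAbsent(rowKey, key -> new TreeMap<>()).put(column, value);
    }

    // Empty when either the row or the column does not exist.
    Optional<String> get(String rowKey, String column) {
        return Optional.ofNullable(rows.get(rowKey)).map(row -> row.get(column));
    }

    // Column names present for one row, in sorted order.
    Set<String> columnsOf(String rowKey) {
        return rows.getOrDefault(rowKey, new TreeMap<>()).keySet();
    }
}
```
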
  27. Document: Structure

    {
      "name": "Diana",
      "duty": [ "Hunt", "Moon", "Nature" ],
      "siblings": { "Apollo": "brother" }
    }
  28. Document: Document vs. Relational database

    Criteria | Document database | Relational database
    Data structure | Structured documents | Tables with rows/columns
    Query flexibility | High (document-level) | Complex (SQL queries)
    Scalability | Excellent | Good
    Schema flexibility | Highly flexible | Not flexible
    Relationships | Limited (embedded) | Richly defined
    ACID compliance | Strong | Strong
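
A document can be pictured in Java as nested generic structures. The sketch below rebuilds the Diana document from the earlier slide and reads a nested field through a dotted path. `DocumentSketch` and the path syntax are assumptions made for illustration and do not correspond to any specific driver's API.

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch: a document as nested Map/List values, read with a dotted path.
final class DocumentSketch {
    private DocumentSketch() {}

    // The Diana document from the slide, expressed as nested immutable structures.
    static Map<String, Object> diana() {
        return Map.of(
                "name", "Diana",
                "duty", List.of("Hunt", "Moon", "Nature"),
                "siblings", Map.of("Apollo", "brother"));
    }

    // Walks a path such as "siblings.Apollo"; returns null when any step is missing
    // or when the path tries to descend into a non-document value.
    static Object read(Map<String, Object> document, String path) {
        Object current = document;
        for (String field : path.split("\\.")) {
            if (!(current instanceof Map)) {
                return null;
            }
            current = ((Map<?, ?>) current).get(field);
        }
        return current;
    }
}
```

The embedded `siblings` object is what the "Limited (embedded)" relationship row refers to: related data travels inside the document instead of being joined in.
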
  29. Graph: Graph vs. Relational database

    Criteria | Graph database | Relational database
    Data structure | Nodes and relationships | Tables with rows/columns
    Query flexibility | Excellent | Complex (SQL queries)
    Scalability | Good | Good
    Schema flexibility | Moderate | Highly flexible
    Relationships | Core strength | Core strength
    ACID compliance | Strong | Strong
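
To show why relationships are a natural fit for the graph model, here is a minimal sketch of nodes, undirected relationships, and a breadth-first traversal limited to a number of hops. `SocialGraph` is hypothetical and in-memory; a graph database would express the same traversal in its query language.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

// Hypothetical in-memory graph: node name -> names of directly related nodes.
final class SocialGraph {
    private final Map<String, Set<String>> edges = new HashMap<>();

    // Adds an undirected relationship between two nodes.
    void connect(String a, String b) {
        edges.computeIfAbsent(a, key -> new LinkedHashSet<>()).add(b);
        edges.computeIfAbsent(b, key -> new LinkedHashSet<>()).add(a);
    }

    // Breadth-first traversal: every node reachable from start in at most maxHops
    // relationships, excluding the start node itself.
    Set<String> within(String start, int maxHops) {
        Set<String> visited = new LinkedHashSet<>();
        visited.add(start);
        Queue<String> frontier = new ArrayDeque<>();
        frontier.add(start);
        for (int hop = 0; hop < maxHops && !frontier.isEmpty(); hop++) {
            Queue<String> next = new ArrayDeque<>();
            for (String node : frontier) {
                for (String neighbor : edges.getOrDefault(node, Set.of())) {
                    if (visited.add(neighbor)) {
                        next.add(neighbor);
                    }
                }
            }
            frontier = next;
        }
        visited.remove(start);
        return visited;
    }
}
```
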
  30. Database architecture: Developer perspective

    Flexibility vs. scalability ◦ Database replication ◦ Partitioning ◦ Schemaless vs. schema ◦ Normalization vs. denormalization
  31. Scalability vs Flexibility: Query and speed

    [Figure: database types placed along scalability and flexibility axes: Key-value ◦ Wide-column ◦ Document ◦ Graph ◦ Time-series]
  32. Partitioning: Modeling impact

    Type | Characteristics | Benefits | Considerations
    Hash-based partitioning | Data distributed based on hash function | Even data distribution | Limited range queries
    Range-based partitioning | Data divided by predefined value ranges | Suitable for time-based data | Data skew in uneven ranges
    Directory-based partitioning | Central directory maps data to partitions | Control over data distribution | Dependency on directory performance
    Composite partitioning | Combination of multiple partitioning strategies | Adaptable to complex data models | Complexity in design and management
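
The first two rows of the table can be sketched in a few lines of Java. `Partitioners`, its method names, and the partition labels are hypothetical. The hash variant uses `String.hashCode()` purely for illustration, whereas real databases use their own hash functions.

```java
import java.util.Map;
import java.util.NavigableMap;

// Hypothetical helpers contrasting hash-based and range-based partition selection.
final class Partitioners {
    private Partitioners() {}

    // Hash-based: the same key always maps to the same partition index in [0, partitionCount),
    // but adjacent keys land on unrelated partitions, which is why range queries are limited.
    static int hashPartition(String key, int partitionCount) {
        return Math.floorMod(key.hashCode(), partitionCount);
    }

    // Range-based: each map key is the inclusive upper bound of a partition, so a value
    // belongs to the partition with the smallest upper bound that is >= the value.
    static String rangePartition(int value, NavigableMap<Integer, String> upperBounds) {
        Map.Entry<Integer, String> entry = upperBounds.ceilingEntry(value);
        if (entry == null) {
            throw new IllegalArgumentException("No partition covers " + value);
        }
        return entry.getValue();
    }
}
```

With range bounds, consecutive values stay together, which suits time-based data but can skew load when the ranges fill unevenly.
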
  33. Data structure: Schema vs. Schemaless

    Aspect | Schema approach | Schemaless approach
    Structure flexibility | Rigid structure prescribed for data consistency | Dynamic structure, adaptable to changing data needs
    Data evolution | May require schema changes for evolving data | Easily accommodates evolving and diverse data
    Read performance | Optimized for complex queries with joins | Maximized read efficiency due to denormalization
    Write performance | Affected by complex joins and constraints | Improved write efficiency due to simplified joins
    Use case examples | Financial systems, ERP applications | Social media platforms, content-sharing platforms
  34. Data structure: Normalization vs. Denormalization

    Aspect | Normalization | Denormalization
    Goal | Reduce data redundancy and ensure data integrity | Enhance read performance and simplify data retrieval
    Data structure | Split data into related tables | Combine data into fewer tables
    Joins | Often requires complex joins for data retrieval | Minimizes or eliminates joins for faster reads
    Storage efficiency | May lead to better storage optimization | Can result in higher storage consumption
    Insert, update anomalies | Minimized due to distributed data | Insert and update anomalies may arise
    Query performance | Might suffer due to frequent joins | Generally faster query performance due to denormalization
    Write performance | Writes can be faster due to fewer tables | Might be affected due to denormalization
  35. Bean Validation: Framework integration

    Entity ◦ Jakarta Validation: Is valid? ◦ Converter to/from entity ◦ Database ◦ Exception
  36. CDI

    [Figure: CDI scopes: Request ◦ Conversation ◦ Session ◦ Application]
  37. CDI: CDI Features

    Resource | Description and use in a NoSQL application
    Injection | Automatically injects dependencies into managed beans.
    Qualifier | Differentiates between different NoSQL implementations.
    Produces | Controls the creation of NoSQL-related bean instances.
    Disposes | Manages the disposal of NoSQL-related resources.
    Event | Enables communication of NoSQL data changes across components.
    Decorator | Extends or modifies the behavior of NoSQL-related methods.
    Interceptor | Intercepts and modifies NoSQL interaction method calls.
  38. Jakarta NoSQL: Main motivation

    // ArangoDB
    BaseDocument baseDocument = new BaseDocument();
    baseDocument.addAttribute(name, value);
    // MongoDB
    Document document = new Document();
    document.append(name, value);
    // Couchbase
    JsonObject jsonObject = JsonObject.create();
    jsonObject.put(name, value);
    // OrientDB
    ODocument document = new ODocument("collection");
    document.field(name, value);
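
The snippets above do the same thing through four different driver APIs. The sketch below illustrates the motivation with an adapter over two made-up vendor classes. `VendorADocument`, `VendorBDocument`, and `DocumentWriter` are all hypothetical and are not the Jakarta NoSQL API; they only show how one application-facing contract can hide vendor-specific method names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical vendor-style classes standing in for driver-specific document types.
final class VendorADocument {
    final Map<String, Object> attributes = new LinkedHashMap<>();
    void addAttribute(String name, Object value) { attributes.put(name, value); }
}

final class VendorBDocument {
    final Map<String, Object> fields = new LinkedHashMap<>();
    VendorBDocument append(String name, Object value) { fields.put(name, value); return this; }
}

// One application-facing contract (hypothetical, not the Jakarta NoSQL API).
interface DocumentWriter {
    void set(String name, Object value);
    Map<String, Object> asMap();
}

// Each adapter translates the common call into the vendor's own method.
final class VendorAWriter implements DocumentWriter {
    private final VendorADocument document = new VendorADocument();
    @Override public void set(String name, Object value) { document.addAttribute(name, value); }
    @Override public Map<String, Object> asMap() { return Map.copyOf(document.attributes); }
}

final class VendorBWriter implements DocumentWriter {
    private final VendorBDocument document = new VendorBDocument();
    @Override public void set(String name, Object value) { document.append(name, value); }
    @Override public Map<String, Object> asMap() { return Map.copyOf(document.fields); }
}
```

Application code written against `DocumentWriter` stays the same when the vendor changes, which is the kind of portability a common specification aims for.
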
  39. Jakarta Data: Repository

    @Repository
    public interface CarRepository extends CrudRepository<Car, Long> {
        List<Car> findByType(CarType type);
        Optional<Car> findByName(String name);
    }
  40. Jakarta Data: Sample code

    @Inject
    CarRepository repository;
    ...
    Car ferrari = Car.id(10L).name("Ferrari").type(CarType.SPORT);
    repository.save(ferrari);
  41. PageableRepository: Pagination

    @Repository
    public interface CarRepository extends PageableRepository<Car, Long> {
        // The entity type is Car throughout, matching the CarType parameter
        Page<Car> findByTypeOrderByName(CarType type, Pageable pageable);
    }
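
What a paginated query returns can be sketched without any framework: a slice of an ordered result plus enough metadata to navigate. `PageSlice` and `Paginator` are hypothetical names for this illustration, with 1-based page numbers as an assumption. A repository would normally push the offset and limit down to the database instead of slicing a list in memory.

```java
import java.util.List;

// Hypothetical page holder: the slice content plus basic navigation metadata.
record PageSlice<T>(List<T> content, int pageNumber, int totalPages) {}

final class Paginator {
    private Paginator() {}

    // Returns the requested page of an already ordered list.
    // Assumption: pageNumber is 1-based; pages past the end are empty, not an error.
    static <T> PageSlice<T> page(List<T> ordered, int pageNumber, int size) {
        if (pageNumber < 1 || size < 1) {
            throw new IllegalArgumentException("pageNumber and size must be >= 1");
        }
        int totalPages = (ordered.size() + size - 1) / size; // ceiling division
        long offset = (long) (pageNumber - 1) * size;        // long avoids int overflow
        int from = (int) Math.min(offset, ordered.size());
        int to = Math.min(from + size, ordered.size());
        return new PageSlice<>(List.copyOf(ordered.subList(from, to)), pageNumber, totalPages);
    }
}
```
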
  42. Redis: Use cases

    Use case | Description
    Caching | Speeds up data retrieval by storing frequently accessed data in memory, reducing the need to fetch it from the backend.
    Session management | Stores and manages user sessions, enhancing application performance and scalability.
    Real-time analytics | Powers real-time dashboards and analytics by swiftly processing and aggregating data.
    Pub/Sub messaging | Facilitates real-time communication between components through publish-subscribe messaging patterns.
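
The caching row describes the cache-aside pattern: read from memory first and go to the backend only on a miss or after expiry. The sketch below models that idea in plain Java with a time-to-live. `TtlCache` is hypothetical and uses a local `HashMap` instead of a Redis client; the injected clock is an assumption made so the behavior can be checked deterministically.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical cache-aside helper with a time-to-live; a local map stands in for Redis.
final class TtlCache<K, V> {
    private record Entry<T>(T value, long expiresAtMillis) {}

    private final Map<K, Entry<V>> entries = new HashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // supplies the current time in milliseconds

    TtlCache(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    // Serves a fresh cached value when present; otherwise calls the backend loader,
    // stores the result with a new expiry, and returns it.
    V get(K key, Function<K, V> backendLoader) {
        long now = clock.getAsLong();
        Entry<V> cached = entries.get(key);
        if (cached != null && cached.expiresAtMillis() > now) {
            return cached.value();
        }
        V loaded = backendLoader.apply(key);
        entries.put(key, new Entry<>(loaded, now + ttlMillis));
        return loaded;
    }
}
```

In production code the clock would typically be `System::currentTimeMillis`, and the expiry would be delegated to the cache server's own key expiration.
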
  43. Redis: Dos and Don'ts

    Simplify schema ◦ Identify key patterns ◦ Use contextual keys ◦ Batching operations ◦ Cache-friendly design
    Avoid complex joins ◦ Beware of over-normalization ◦ Avoid bloated values ◦ Limit indexes
  44. Apache Cassandra: Use cases

    Use case | Description
    Online retail | Managing e-commerce platforms, handling product catalogs, user profiles, and transaction records.
    Social media analytics | Monitoring and analyzing social media interactions, tracking trends, and user engagement.
    Real-time analytics | Enabling quick analysis of data streams for instant insights, critical in financial and logistics sectors.
    Logging and monitoring | Centralized storage and analysis of logs and monitoring data from applications and servers.
  45. Apache Cassandra: Modeling tips

    Denormalize when necessary ◦ Design for query patterns ◦ Choose optimal column families ◦ Utilize secondary indexes ◦ Compression and Bloom filters
    Avoid over-denormalization ◦ Limit column count ◦ Avoid overusing secondary indexes
  46. Apache Cassandra: Cluster

    [Figure: a client connected to a cluster with two data centers (DC1 and DC2), each drawn as a ring of nodes 1 to 12, with replicas labeled R1 to R6.]
  47. Apache Cassandra: Partitioning

    Rows: Jim (Car: Camaro, Age: 32) ◦ Carol (Color: Pink, Work: Hobby) ◦ Suzy (Team: Bahia, Country: USA)
    Partition tokens: Jim 1 ◦ Carol 13 ◦ Suzy 15
    Node token ranges: A[0-3] ◦ B[4-8] ◦ C[9-13] ◦ D[14-18]
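
The partitioning slide can be read as a lookup from a row's token to the node that owns the matching range. The sketch below reproduces that lookup with the slide's ranges and, as an added assumption, picks replicas by walking the ring clockwise from the owner. `TokenRing` is hypothetical and heavily simplified; Cassandra itself derives tokens with a partitioner and places replicas according to the keyspace's replication strategy.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, simplified token ring: first token of each range -> owning node.
final class TokenRing {
    private final TreeMap<Integer, String> rangeStartToNode = new TreeMap<>();

    void addRange(int firstToken, String node) {
        rangeStartToNode.put(firstToken, node);
    }

    // The owner is the node whose range start is the greatest one <= token.
    // Simplification: the upper end of the last range is not checked.
    String ownerOf(int token) {
        Map.Entry<Integer, String> entry = rangeStartToNode.floorEntry(token);
        if (entry == null) {
            throw new IllegalArgumentException("No range covers token " + token);
        }
        return entry.getValue();
    }

    // Assumption for this sketch: replicas are the owner followed by the next
    // distinct nodes clockwise on the ring, wrapping around at the end.
    List<String> replicasFor(int token, int replicas) {
        List<String> nodes = new ArrayList<>(rangeStartToNode.values());
        int start = nodes.indexOf(ownerOf(token));
        List<String> result = new ArrayList<>();
        for (int i = 0; i < nodes.size() && result.size() < replicas; i++) {
            String candidate = nodes.get((start + i) % nodes.size());
            if (!result.contains(candidate)) {
                result.add(candidate);
            }
        }
        return result;
    }
}
```

Registering the ranges as `addRange(0, "A")`, `addRange(4, "B")`, `addRange(9, "C")`, and `addRange(14, "D")` places Jim (token 1) on A, Carol (token 13) on C, and Suzy (token 15) on D, matching the slide.
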
  48. MongoDB: Use cases

    Use case | Description
    Content management | Store and manage diverse content types like articles, images, and videos with schema flexibility.
    Catalog management | Efficiently organize and categorize products or items with changing attributes and metadata.
    Mobile applications | Provide offline capabilities and sync data seamlessly once online, enhancing user experience.
    Social media platforms | Facilitate rapid storage and retrieval of user-generated content, profiles, and social interactions.
  49. MongoDB: Modeling tips

    Leverage embedded documents ◦ Design around use cases ◦ Employ indexes judiciously ◦ Normalize when logical ◦ Stay flexible with arrays
    Avoid over-embedding ◦ Steer clear of monolithic documents ◦ Don’t over-index ◦ Beware of the one-size-fits-all approach
  50. Neo4J: Use cases

    Use case | Description
    Social networks | Modeling user profiles, friendships, and interactions for effective social networking platforms.
    Recommendation engines | Powering personalized recommendations by analyzing connections and preferences.
    Knowledge graphs | Organizing and querying complex relationships in fields like healthcare, finance, and research.
    Fraud detection | Uncovering hidden patterns and connections indicative of fraudulent activities.
    Network analysis | Analyzing intricate relationships in data networks, such as transportation and communication.
  51. Neo4J: Modeling tips

    Avoid over-reliance on joins ◦ Steer clear of over-connecting nodes ◦ Don’t overcomplicate graph design ◦ Beware of property index overuse
    Embrace relationship-driven design ◦ Craft efficient traversal paths ◦ Use labels and types wisely ◦ Leverage indexing for nodes ◦ Utilize property indexes
  52. Who am I? Otavio Santana, Software Engineer & Architect, @otaviojava

    Java Champion ◦ Oracle ACE ◦ JCP-EC-EG-EGL ◦ Apache and Eclipse Committer ◦ Jakarta EE and MicroProfile ◦ Duke Choice Award ◦ JCP Award ◦ Book and blog writer
  53. Who am I? Elias Nogueira, Principal Software Engineer in Test, @eliasnogueira

    Java Champion ◦ Oracle ACE ◦ JCP-EC-EG-EGL ◦ Apache and Eclipse Committer ◦ Jakarta EE and MicroProfile ◦ Duke Choice Award ◦ JCP Award ◦ Book and blog writer