Neo4j-Migrations: The lean way of applying transformations to a Graph Database @ JavaSummit 2023

Neo4j-Migrations gives you an easy way to apply schema changes to Neo4j. Unlike Liquibase, it is almost dependency free and runs right on the Neo4j Java Driver, with no need to work with JDBC. Neo4j-Migrations integrates Spring Data Neo4j with JHipster and has been in production since summer 2020.

Neo4j-Migrations:
* Runs Cypher scripts
* Runs Java-based migrations
* Records all migrations in a simple graph
* Works with multiple databases
* Has extensions for Spring Boot (included in Neo4j Ops Manager) and Quarkus

The most important feature? Neo4j-Migrations comes as a native binary CLI, perfect for use in CI/CD systems. The CLI has everything you need to set up safe CI scenarios.

Come to this session to learn how you can put Neo4j-Migrations to work for you.

Key takeaways:

* You can’t have no schema: even a schema-free (or schema-less) graph database like Neo4j has a schema; it just materialises later
* “Ancient” knowledge from decades ago (“Refactoring Databases” by Scott Ambler and others) still applies today
* Understand the challenges of offering a program written in Java as an OS-native binary powered by GraalVM

Michael Simons

May 25, 2023

Transcript

  1. © 2023 Neo4j, Inc. All rights reserved.

    Neo4j-Migrations: The lean way of applying transformations to a Graph Database
    Michael Simons, @rotnroll666(@mastodon.social)
    Java Champion & Staff Software Engineer at Neo4j
  2. You can’t have no schema.

    Michael’s Hot-Take #1
    Michael Simons at Java Summit 2023
  3. Entity relationship diagrams (ERD)

    They come in different forms and levels of abstraction
    • Conceptual diagrams
    • Logical diagrams
    • Physical diagrams
    While mostly used for relational DBMS, they also help with
    • General database design and debugging
    • Some aid in requirements engineering
    • Creating and patching database schema and content
  4. Conceptual ERD

    Definition of what exists in a system, not how it manifests itself
    https://www.visual-paradigm.com/guide/data-modeling/what-is-entity-relationship-diagram/
  5. Logical ERD

    Enrichment of the conceptual model
    https://www.visual-paradigm.com/guide/data-modeling/what-is-entity-relationship-diagram/
  6. Physical ERD

    Concrete implementation details
    https://www.visual-paradigm.com/guide/data-modeling/what-is-entity-relationship-diagram/
  7. The logical ERD, but as a property graph
  8. The logical ERD, but as a property graph
  9. The logical ERD, but as a property graph
  10. Forms of physical models in Neo4j

    CREATE CONSTRAINT customer_name FOR (c:Customer) REQUIRE c.name IS NOT NULL;
    CREATE CONSTRAINT bike_model FOR (b:Bike) REQUIRE b.model IS UNIQUE;
  11. External physical models (Data exchange)
  12. External physical models (ORMs / OGMs and friends)
  13. All those things define some kind of schema!
  14. The only constant is change

    Requirements, data itself, processes and actual improvements to software
  15. History: Refactoring Databases by Ambler & Sadalage

    • Published in 2006
    • Evolutionary database design to unblock continuous delivery
    • Categorizes refactorings
    • Core concepts:
      ◦ Versioning of migrations
      ◦ Checksumming
      ◦ Tracking migrations (dedicated config table)
  16. Why evolutionary database design?

    • “Change management” is often more a euphemism for change prevention
    • “The big schema design” seldom (never?) works out unchanged
    ➜ Avoid or minimize waste from upfront design and significant rework efforts
  17. Michael’s Hot-Take #2

    If at some point roughly 20 years ago, DBAs sitting on top of the databases hadn't been so stuck in the processes of the 1990s and earlier, #NoSQL wouldn't have been such a big movement at all.
  18. To be fair: Database refactorings are hard

    • Must maintain behavioral semantics
    • Must maintain informational semantics
    • Usually touches a lot more than just a schema
      ◦ Most often one or more applications
      ◦ Some reports in a team far, far away
  19. What is applicable to Neo4j? (Refactorings)

    • Relevant categories
      ◦ (Some) data quality refactorings
      ◦ (Some) referential integrity refactorings
      ◦ All possible transformations
    All of them can be expressed with Cypher!
  20. What is applicable to Neo4j? (Core concepts)

    • Versioning of migrations (refactorings / changesets)
      ◦ Not applying them multiple times
    • Put them into actual version control
    • Checksumming
      ◦ Making sure changesets are not changed after the fact
    • Make them validatable
    • Track them in a schema / configuration table
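
The core concepts on this slide (apply each version once, checksum it, track it) can be illustrated with a minimal Java sketch. Everything here is an assumption for illustration: `MigrationLedger` is a hypothetical in-memory stand-in for the tracking structure, CRC32 is just one possible checksum, and none of it is the Neo4j-Migrations API.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.CRC32;

/** Hypothetical in-memory migration history; not the Neo4j-Migrations API. */
final class MigrationLedger {

    // version -> checksum of the script as it looked when it was applied
    private final Map<String, Long> applied = new LinkedHashMap<>();

    /** CRC32 over the UTF-8 bytes of the script (the algorithm is an assumption). */
    static long checksum(String script) {
        CRC32 crc = new CRC32();
        crc.update(script.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    /**
     * Applies a migration at most once.
     * Returns true if it was newly recorded, false if it had been recorded before.
     * Throws if a recorded migration was changed after the fact.
     */
    boolean apply(String version, String script) {
        long current = checksum(script);
        Long recorded = applied.get(version);
        if (recorded == null) {
            // A real tool would execute the script against the database here.
            applied.put(version, current);
            return true;
        }
        if (recorded.longValue() != current) {
            throw new IllegalStateException("Checksum mismatch for version " + version);
        }
        return false;
    }
}
```

Calling `apply` twice with the same version and script records it only once; calling it again with edited script text fails, which is the "not changed after the fact" guarantee in miniature.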
  21. Our schema
  22. For what did we need a database refactoring tool?

    1. Spring Data Neo4j (SDN) and JHipster collaboration back in late 2019 (Needed a simple way to create a couple of Nodes and Relationships, basic inventory data)
    2. Deriving index and constraint definitions from Neo4j-OGM and applying them to at least 4 different versions of Neo4j
  23. Existing Java based tooling

    • Flyway (Relational)
    • Liquibase (Relational + Plugins)
    • Liquigraph (Graph)
      ◦ Now obsolete, became Liquibase-Neo4j-Plugin
    Conceptual main difference:
    • Free form scripts vs
    • Built-in set of refactorings
  24. We prefer the free-form approach; it aligns much better with what is necessary and possible within current Neo4j.
  25. Our requirements

    • Dependency free or as few as possible
      ◦ Target frameworks often have many things in place, don’t want to bring them more dependency hell
      ◦ Connection layer of course necessary and ok
    • Not wrapping Neo4j, the bolt model, its sessions and concepts in JDBC
    • As few abstractions as possible
    • As little overhead as possible
    • Liquigraph / Liquibase-Neo4j felt too alien for Neo4j
    => Let’s create our own thing!
    https://github.com/michael-simons/neo4j-migrations
  26. Why Java?!

    • I do know Java inside / out
      ◦ Less needed experiments
      ◦ Well known build systems
      ◦ Boring, in a good way
      ◦ Modern and versatile, even more so with Java 17 and higher
    • The driver / connector support for Java is IMHO the most mature Neo4j driver
    • Fantastic CLI support via PicoCLI
    • Fantastic native support via GraalVM
  27. History and feature overview

    • January 2020:
      ◦ 0.0.1 preview release for JHipster
      ◦ 0.0.8 Spring Boot starter (needed in JHipster)
    • February 2021: First external contributor
    • July 2021: Make validation optional, transactional functions everywhere, full AuraDB support
    • October / November 2021: Collaboration with JReleaser / Andres Almiray
    • November 2021: 1.0.0 release published with JReleaser
    • December 2021: Official Neo4j-Labs project
    • February 2022: Full support for Quarkus, including native mode
  28. History and feature overview

    • June 2022: Catalogs
      ◦ Primarily for indexes and constraints, more to come
    • August 2022: Built-in database refactorings
      ◦ Renaming labels / types
      ◦ Normalizing properties
      ◦ Merging nodes
    • September 2022: Annotation processor
    • November 2022: Repeatable migrations
    • Future:
      ◦ Undo for catalog items
      ◦ Undo with manual scripts
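
To make "renaming labels" concrete, here is a small Java sketch of how such a refactoring could be turned into a Cypher statement. `RenameLabelSketch` and its methods are hypothetical names for illustration only; this is not how the built-in refactorings of Neo4j-Migrations are implemented or invoked.

```java
/** Hypothetical helper that renders a label rename as a Cypher statement. */
final class RenameLabelSketch {

    /** Quotes an identifier with backticks, doubling any backtick inside it. */
    static String quote(String identifier) {
        return "`" + identifier.replace("`", "``") + "`";
    }

    /** Adds the new label to every node carrying the old one, then removes the old label. */
    static String renameLabel(String from, String to) {
        return "MATCH (n:" + quote(from) + ") SET n:" + quote(to) + " REMOVE n:" + quote(from);
    }
}
```

For example, `RenameLabelSketch.renameLabel("Bike", "Bicycle")` yields a single `MATCH ... SET ... REMOVE` statement. On a large graph such a statement would likely need to run in batches, which this sketch leaves out.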
  29. Distribution

    https://github.com/michael-simons/neo4j-migrations
    • Core API
    • Spring Boot Starter
    • Quarkus Extension
    • Maven Plugin
    • CLI (native binaries for Linux, macOS, Windows, noarch (JVM) for everything)
      ◦ No JVM or Jars required, just one native binary!
    Read more: Modules
  30. Installation (CLI only)

    • Via SDKMan! since 1.5.1
      sdk install neo4jmigrations
    • Via Homebrew for macOS
      brew install michael-simons/homebrew-neo4j-migrations/neo4j-migrations
    Read more: Installation
    Core-API, Quarkus Extension, Spring Boot Starter and Maven-Plugin are on Maven-Central!
  31. How to run this?

    • Inside applications
      ◦ Only inventory data, please
      ◦ Moving data for tests (resources will be aggregated)
      ◦ Modern applications are usually running more than one instance: Which one should trigger migrations?
    • CI/CD (No JVM necessary, self-contained, native binary)
      ◦ Safe passwords (files or environment)
    • Sidecar
    • Init container
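
The "safe passwords (files or environment)" point can be sketched in plain Java: look in the environment first, then fall back to a file, and never take the secret from the command line. `PasswordSource`, the variable name and the file path below are all hypothetical; this shows the general idea, not the option handling of the Neo4j-Migrations CLI.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;

/** Hypothetical helper that resolves a password from the environment or a file. */
final class PasswordSource {

    /**
     * Returns the value of the given environment variable if it is set and non-empty,
     * otherwise the content of the given file. The file may be null.
     */
    static String resolve(Map<String, String> env, String variable, Path file) {
        String fromEnv = env.get(variable);
        if (fromEnv != null && !fromEnv.isEmpty()) {
            return fromEnv;
        }
        if (file != null && Files.isReadable(file)) {
            try {
                // strip() drops the trailing newline that editors and echo usually add
                return Files.readString(file, StandardCharsets.UTF_8).strip();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        throw new IllegalStateException("No password found in environment or file");
    }
}
```

A caller in a CI job might use `PasswordSource.resolve(System.getenv(), "NEO4J_PASSWORD", Path.of("/run/secrets/neo4j"))`, where both the variable name and the path are placeholders.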
  32. Thank you!

    Email: [email protected]
    Social media: @rotnroll666(@mastodon.social)
    Profile: https://github.com/michael-simons
    Neo4j-Migrations: https://neo4j.com/labs/neo4j-migrations
    Demos:
    • https://github.com/michael-simons/javasummit2023 (Quarkus)
    • https://github.com/michael-simons/nodes2022 (Spring)
    Tools used:
    • Java 17, Quarkus 3 (REST service)
    • Neo4j via Docker
    • httpie for sending requests https://httpie.io
    • jq for parsing JSON https://stedolan.github.io/jq/