Delegate, Automate, Dominate: Putting Graph Tech to Work for You to Unlock Hidden Insights and Opportunities

Co-presenter: Mark Heckler
Different database technologies optimize for different uses. Graph databases excel at discovering relationships, known or unknown, within vast sets of data and can help unlock value from overlooked or underutilized sources. Join the presenters in this session to discover which considerations make a dataset a candidate for graph storage and analysis. You'll also learn tips and tricks for data ingestion and structuring while gaining insights on how to build APIs that optimize for meaningful analysis of data relationships. Likewise, you'll learn how to delegate tasks to tools, automate essential but non-critical path functions, and dominate your domain with actionable insights that unlock your data's full value.

Jennifer Reif

June 08, 2022

Transcript

  1. Jennifer Reif
    Email: [email protected]
    Twitter: @JMHReif
    LinkedIn: linkedin.com/in/jmhreif
    Github: github.com/JMHReif
    Website: jmhreif.com
    Delegate, Automate, Dominate
    Putting Graph Tech to Work for You to Unlock Hidden Insights
    and Opportunities
    Mark Heckler
    Email: [email protected]
    Twitter: @mkheck
    LinkedIn: linkedin.com/in/markheckler
    Github: github.com/mkheck
    Website: thehecklers.com

  2. Who Am I?
    • Developer + Advocate

    • Continuous learner

    • Technical content writer

    • Conference speaker

    • Other: geek

  3. Who Am I?
    • Author

    • Architect & Developer

    • Developer Advocate, Java/JVM Languages

    • Java Champion, Rockstar

    • Kotlin Developer Expert

    • Pilot
    bit.ly/springbootbook

  4. What makes a good graph?

  5. Connected data!
    • Mixed entity types, with queries spanning several of them

    • Analyzing connections between entities

    • Changing data models and needs

    • Impacts/Dependencies layers deep
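
To make the last bullet concrete, here is a minimal Cypher sketch of a dependency lookup that reaches several layers deep. The `Service` label, the `DEPENDS_ON` relationship type, and the sample names are hypothetical and are not part of the talk's data set.

```cypher
// Hypothetical sample data: a small set of service dependencies.
CREATE (api:Service {name: "api"})
CREATE (auth:Service {name: "auth"})
CREATE (db:Service {name: "db"})
CREATE (cache:Service {name: "cache"})
CREATE (api)-[:DEPENDS_ON]->(auth)
CREATE (auth)-[:DEPENDS_ON]->(db)
CREATE (api)-[:DEPENDS_ON]->(cache);

// Everything "api" depends on, directly or indirectly, up to 5 hops away,
// with the shortest hop count at which each dependency is reached.
MATCH path = (s:Service {name: "api"})-[:DEPENDS_ON*1..5]->(dep:Service)
RETURN dep.name AS dependency, min(length(path)) AS hops
ORDER BY hops, dependency;
```

The variable-length pattern `*1..5` is what lets one query follow the chain without knowing its depth in advance; the upper bound of 5 is an arbitrary choice for the sketch.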

  6. Data set?

  7. Kaggle Netflix
    • Relationship context is important

    • Multiple types of entities connected

    • Kaggle Netflix set + Wikipedia country names

  8. Data model

  9. Import - tips and tricks

  10. Load Productions
    LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/JMHReif/graph-demo-datasets/main/kaggle-netflix/titles.csv" as row
    CALL apoc.merge.node(["Production",apoc.text.capitalize(toLower(row.type))], {productionId: row.id}, {title: row.title, …}, {}) YIELD node as p
    WITH row, p
    CALL {

    MERGE (g:Genre {name: apoc.text.capitalize(genre)})
    MERGE (p)-[r:CATEGORIZED_BY]->(g)
    }
    WITH row, p
    CALL {

    MERGE (c:Country {iso2Code: country})
    MERGE (p)-[r2:PRODUCED_IN]->(c)
    }
    RETURN count(row);
    https://github.com/JMHReif/graph-demo-datasets/blob/main/kaggle-netflix/load-data.cypher
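
The two `CALL { ... }` subqueries on this slide use `genre` and `country` without showing the step that binds them. One way such a subquery could bind `genre` is sketched below. It assumes, without confirmation from the slide, that the CSV keeps genres in a single `genres` column as a bracketed, quoted list such as `['drama', 'crime']`. The inline `row` map is a stand-in for a LOAD CSV row, and `apoc.text.capitalize` requires the APOC library, as on the slide.

```cypher
// Stand-in for one LOAD CSV row; the genres format is an assumption.
WITH {id: "demo-1", genres: "['drama', 'crime']"} AS row
MERGE (p:Production {productionId: row.id})
WITH row, p
CALL {
    WITH row, p
    // Strip brackets, quotes, and spaces, then split the string on commas.
    WITH p, reduce(s = row.genres, ch IN ["[", "]", "'", " "] | replace(s, ch, "")) AS cleaned
    UNWIND split(cleaned, ",") AS genre
    // Skip the empty string produced by an empty list such as "[]".
    WITH p, genre WHERE genre <> ""
    MERGE (g:Genre {name: apoc.text.capitalize(genre)})
    MERGE (p)-[:CATEGORIZED_BY]->(g)
}
RETURN count(row);
```

Removing spaces would also collapse multi-word genre names, so a real load may need a gentler cleanup. The same shape would apply to the country subquery, with whichever column holds the country codes.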

  11. Load Production People
    :auto LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/JMHReif/graph-demo-datasets/main/kaggle-netflix/credits.csv" as row
    WITH row
    CALL {
    WITH row
    MERGE (p:Person {personId: row.person_id})
    SET p.name = row.name
    WITH row, p
    CALL apoc.create.addLabels(p,[apoc.text.capitalize(toLower(row.role))]) YIELD node as person
    RETURN person
    } IN TRANSACTIONS OF 20000 ROWS
    RETURN count(row);
    https://github.com/JMHReif/graph-demo-datasets/blob/main/kaggle-netflix/load-data.cypher
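
Before running a batched load like this one in full, it can help to preview a handful of rows and confirm the column names. The sketch below reads the same file and returns the three columns the slide's statement relies on (`person_id`, `name`, `role`) without writing anything. It assumes the CSV is still available at that URL.

```cypher
// Read-only preview: inspect the first 5 rows, create nothing.
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/JMHReif/graph-demo-datasets/main/kaggle-netflix/credits.csv" AS row
WITH row LIMIT 5
RETURN row.person_id, row.name, row.role;
```

Because the preview has no `CALL { ... } IN TRANSACTIONS` clause, it does not need the `:auto` prefix.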

  12. Top to Bottom
    • 2 CSV files

    • Add country name with Wikipedia

    • Use Cypher + APOC magic

    • Start in small pieces

    • :auto IN TRANSACTIONS to batch

    • Multiple statements to conserve memory

    • Can be scripted

    • Also can schedule (using APOC)
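
As a sketch of the scheduling bullet: APOC's `apoc.periodic.repeat` procedure re-runs a statement at a fixed interval given in seconds. The job name, the statement, and the hourly rate below are illustrative assumptions, not the presenters' setup.

```cypher
// Re-run a (hypothetical) cleanup statement every 3600 seconds.
CALL apoc.periodic.repeat(
  "trim-person-names",
  "MATCH (p:Person) WHERE p.name IS NOT NULL SET p.name = trim(p.name)",
  3600
);

// List the scheduled jobs, then cancel this one by name.
CALL apoc.periodic.list();
CALL apoc.periodic.cancel("trim-person-names");
```

Jobs scheduled this way live inside the database process, so a restart clears them. A script run by an external scheduler is an alternative when the schedule has to survive restarts.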

  13. Let’s build an API!
    Demo time!

  14. Automate
    • Platforming

    • Configuration

    • Deployment

    • Idempotent

    • Monitoring

    • Management

  15. Actionable insights
    Multiple levels, multiple perspectives
    • Data

    • System of systems

    • Platform

  16. Resources
    • Source code: github.com/HecklerReifCollab/person-service

    • Data set: github.com/JMHReif/graph-demo-datasets/tree/main/kaggle-netflix

    • Neo4j AuraDB: dev.neo4j.com/aura

    • Azure Spring Apps: aka.ms/azurespringapps
    Jennifer Reif
    Email: [email protected]
    Twitter: @JMHReif
    LinkedIn: linkedin.com/in/jmhreif
    Github: github.com/JMHReif
    Website: jmhreif.com
    Mark Heckler
    Email: [email protected]
    Twitter: @mkheck
    LinkedIn: linkedin.com/in/markheckler
    Github: github.com/mkheck
    Website: thehecklers.com
