Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Cypher Sleuthing: Taking your Skills to the Next Level

Cypher Sleuthing: Taking your Skills to the Next Level

We know the basics of Cypher and can navigate a graph, but how do we take our skills to the next level? This session will navigate more complex concepts for writing clean, simple, and performant queries with Cypher. Come join us to take the next step for Cypher expertise!

Github repository: https://github.com/JMHReif/cypher-sleuthing

10f2b035856eef2b59c02267e3ee9e63?s=128

Jennifer Reif

June 18, 2021
Tweet

Transcript

  1. Jennifer Reif Developer Relations Engineer jennifer.reif@neo4j.com @JMHReif linkedin.com/in/jmhreif/ Cypher Sleuthing:

    Taking Your Skills to the Next Level
  2. Tackling a solution… • Trial-and-error • At the mercy of

    the language • Whack-a-mole fixes • What if…. Credit: unsplash.com
  3. Understanding is the key • How to wield the tool

    brings mastery • Knowing when and how —> • Efficiency • Better results Credit: unsplash.com
  4. Data set • Venmo payments • 2018-07-26 -> 2018-08-07 (~2weeks

    of data) • Venmo retired API summer of 2018
  5. Venmo data model

  6. Cypher CASE

  7. Pitfalls (mine, at least) • Logic like other languages (e.g.

    Java) • Variable-setting • Complex conditionals Credit: unsplash.com
  8. Examples //Bad syntax LOAD CSV…AS row MERGE (c:Company {companyId: row.Id})

    WITH row, c, CASE row.type WHEN 'P' THEN row.type = 'Public' WHEN 'R' THEN row.type = 'Private' WHEN 'G' THEN row.type = 'Government' ELSE row.type = 'Other' END SET c.businessType = row.type RETURN * //Bad syntax 2 LOAD CSV…AS row WITH row, CASE row.BusinessType WHEN 'P' THEN type = 'Public' WHEN 'R' THEN type = 'Private' WHEN 'G' THEN type = 'Government' ELSE type = 'Other' END RETURN row.CompanyId, row.CompanyName, type
  9. Correct Logic • 1. Expression comparison against multiple values MATCH

    (p:Person)-[r:IS_MANAGED_BY]->(m:Manager)-[r2:OVERSEES]->(d:Department) RETURN p.name, CASE p.role WHEN 'management' THEN d.departmentPhone WHEN 'business' THEN p.businessPhone WHEN 'technical' THEN p.emailAddress ELSE d.departmentEmail END as personContact • 2. Multiple conditional statements expressed MATCH (p:Person) RETURN p.name, CASE WHEN dateHired is null THEN 'candidate' WHEN dateHired > date('2018–07–24') THEN 'newHire' ELSE 'employee' END as personStatus, CASE WHEN dateFired is null THEN dateHired WHEN dateHired is null THEN entryDate ELSE 'n/a' END as leadDate
  10. Example MATCH (u:User)-[:SENDS]->(:Payment)-[:PAID_USING]->(a:Application) WITH u, a, CASE a.name WHEN 'Venmo

    for iPhone' THEN 'iPhone' WHEN 'Venmo for Android' THEN 'Android' ELSE 'unknown' END as userPhoneType SET u.phoneType = userPhoneType RETURN u.displayName, userPhoneType LIMIT 10;
  11. Cypher Temporals

  12. Pitfalls • Date format • ISO 8601 • Pesky literal

    'T' • Durations • Precision • Groups Credit: unsplash.com
  13. Date format • ISO 8601 standard • Date: 2021-06-14 •

    Datetime: 2021-06-14T19:04:18Z • Literal 'T' between date and time values Credit: unsplash.com
  14. Translating to ISO 8601 • Cypher accepts: • ISO 8601

    • Strings in ISO 8601 format • What if you have ANYTHING else? • Error!
  15. APOC to the rescue!

  16. What is APOC? • Utility library for Neo4j • Widely

    used, broadly applicable • 550+ procedures and functions Credit: unsplash.com
  17. Epoch time • Unix system date/time • Processes, logs, etc.

    • Seconds since 1970-01-01T00:00:00Z • Example: 1623958200 (end time for this session) • Handled with apoc.date.format() or apoc.date.toISO8601() https://www.epochconverter.com/
  18. Examples WITH 1623958200 as sessionEnd RETURN apoc.date.format(sessionEnd, "s", "yyyy-MM-dd'T'HH:mm:ss") as

    formattedString; WITH 1623958200 as sessionEnd RETURN apoc.date.format(sessionEnd, "s", "yyyy-MM-dd") as formattedDateString; WITH 1623958200 as sessionEnd RETURN apoc.date.toISO8601(sessionEnd,"s") as converted;
  19. Date Strings • Process: • Temporal string -> ISO 8601

    string -> Neo4j temporal • Any temporal string with specified format • Handled with apoc.date.convertFormat() or apoc.temporal.toZonedTemporal()
  20. Examples WITH '2021-06-17 19:00:00' as sessionStart RETURN apoc.date.convertFormat(sessionStart,"yyyy-MM-dd HH:mm:ss", 'iso_date_time')

    as formattedString; WITH '2021/06/17' as sessionDay RETURN apoc.date.convertFormat(sessionDay,”yyyy/MM/dd”,'iso_date') as formattedDateString; WITH 'Thu Jun 17 19:30:00 -0600 2021' as sessionEnd RETURN apoc.temporal.toZonedTemporal(sessionEnd,'EEE LLL dd HH:mm:ss Z yyyy') as cypherDatetime;
  21. Cypher Temporal Durations

  22. Durations • Distance in time measurements • Literal ‘P’ and

    ’T’ in syntax • Precision - months, days, seconds • P1M3D60000S • Duration functions (between, inMonths, inDays, inSeconds) Credit: unsplash.com
  23. Examples MATCH (p:Payment) WHERE p.dateComplete IS NULL OR p.dateComplete >

    p.dateCreated + duration('P1D') RETURN p.paymentId, p.status, p.dateCreated, p.dateComplete LIMIT 10; MATCH (u:User)-[:SENDS]->(p:Payment) WITH u, p ORDER BY p.dateCreated WITH u, collect(p.dateCreated) as payDates RETURN u.displayName, u.dateJoined, payDates[0], duration.between(u.dateJoined, payDates[0]) as firstUseTime ORDER BY firstUseTime LIMIT 10; WITH datetime(‘2021-06-17T12:00:00’) as nodesStart RETURN duration.between(nodesStart, datetime()) as contentPresented;
  24. Duration conversions • Duration functions: inMonths, inDays, inSeconds • Most

    components -> whole values only, no remainders • Convert between units: functions + components • Component groups
  25. Examples MATCH (u:User)-[:SENDS]->(p:Payment) WITH u, p ORDER BY p.dateCreated WITH

    u, collect(p.dateCreated) as payDates RETURN u.displayName, u.dateJoined, payDates[0], duration.inMonths(u.dateJoined, payDates[0]).years as years, duration.inMonths(u.dateJoined, payDates[0]).monthsOfYear as months ORDER BY years DESC, months DESC LIMIT 10; WITH datetime(‘2021-06-17T12:00:00’) as nodesStart RETURN duration.inSeconds(nodesStart, datetime()).hours as hours, duration.inSeconds(nodesStart, datetime()).minutesOfHour as minutes;
  26. Eager operator

  27. What is Eager? • Eager loading • Consistency and conflicts

    • Operations occur to all rows before continuing • Avoid read/write conflicts Credit: unsplash.com
  28. What’s the difference? Non-Eager Eager Row-by-row Operation-by-operation

  29. Does it matter? • Low memory/heap • Large dataset •

    Query performance Credit: unsplash.com
  30. Example //Eager PROFILE WITH 'https://raw.githubusercontent.com/JMHReif/ nodes2021-aura-training/main/venmo_demo.csv' as file LOAD CSV

    WITH HEADERS FROM file AS line MERGE (from:User {userId: line.`payment.actor.id`}) WITH line, from MATCH (pay:Payment {paymentId: line.`payment.id`}) MERGE (from)-[r:SENDS]->(pay) RETURN count(*); //Non-eager PROFILE WITH 'https://raw.githubusercontent.com/JMHReif/ nodes2021-aura-training/main/venmo_demo.csv' as file LOAD CSV WITH HEADERS FROM file AS line MERGE (from:User {userId: line.`payment.actor.id`}) WITH line, from MATCH (pay:Payment {paymentId: line.`payment.id`}) MERGE (from)-[r:SENDS]->(pay);
  31. Example //Non-eager PROFILE WITH 'https://raw.githubusercontent.com/JMHReif/ nodes2021-aura-training/main/venmo_demo.csv' as file LOAD CSV

    WITH HEADERS FROM file AS line MERGE (app:Application {applicationId: line.`app.id`}) WITH line, app MERGE (pay:Payment {paymentId: line.`payment.id`}) WITH line, app, pay MERGE (pay)-[r2:PAID_USING]->(app); //Eager PROFILE WITH 'https://raw.githubusercontent.com/JMHReif/ nodes2021-aura-training/main/venmo_demo.csv' as file LOAD CSV WITH HEADERS FROM file AS line MERGE (app:Application {applicationId: line.`app.id`}) WITH line, app MERGE (pay:Payment {paymentId: line.`payment.id`}) WITH line, app, pay MATCH (p:Payment {paymentId: line.`payment.id`}) MERGE (p)-[r2:PAID_USING]->(app);
  32. How to avoid eager? • Separate operations • Avoids situations

    where conflicts might occur • Use PROFILE/EXPLAIN on queries • Having trouble? Ask for help! • dev.neo4j.com/forum -> Cypher channel
  33. Recap!

  34. CASE reminders • Uses: • Compare against values • Multiple

    conditionals • NOT: • Programming language logic • Variable-setting Credit: unsplash.com
  35. Temporals reminders • Formats: • Temporal/string in ISO 8601 •

    Other must be translated • Durations: • Groupings (months, days, seconds) • Component groups for translations Credit: unsplash.com
  36. Eager reminders • Data set: • Large set • Heavy

    processing (i.e. cleaning, operations, etc) • AND queries slow • To do: • Separate operations • PROFILE/EXPLAIN is your friend! Credit: unsplash.com
  37. Resources • CASE statement: • Blog post: jennifer-reif.medium.com/cypher-sleuthing-the-case-statement-3e47f5edd38b • Cypher

    temporal: • Blog post: neo4j.com/developer-blog/cypher-sleuthing-dealing-with-dates-part-1/ • Blog post: neo4j.com/developer-blog/cypher-sleuthing-dealing-with-dates-part-2/ • Eager operator: • Blog post: medium.com/neo4j/cypher-sleuthing-the-eager-operator-84a64d91a452 • Ask for help! • dev.neo4j.com/forum -> Cypher channel Jennifer Reif Email: jennifer.reif@neo4j.com Twitter: @JMHReif LinkedIn: linkedin.com/in/jmhreif Credit: unsplash.com