Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Cypher Sleuthing: Taking your Skills to the Nex...

Cypher Sleuthing: Taking your Skills to the Next Level

We know the basics of Cypher and can navigate a graph, but how do we take our skills to the next level? This session will navigate more complex concepts for writing clean, simple, and performant queries with Cypher. Come join us to take the next step for Cypher expertise!

Github repository: https://github.com/JMHReif/cypher-sleuthing

Jennifer Reif

June 18, 2021
Tweet

More Decks by Jennifer Reif

Other Decks in Technology

Transcript

  1. Tackling a solution… • Trial-and-error • At the mercy of

    the language • Whack-a-mole fixes • What if…. Credit: unsplash.com
  2. Understanding is the key • How to wield the tool

    brings mastery • Knowing when and how —> • Efficiency • Better results Credit: unsplash.com
  3. Data set • Venmo payments • 2018-07-26 -> 2018-08-07 (~2weeks

    of data) • Venmo retired API summer of 2018
  4. Pitfalls (mine, at least) • Logic like other languages (e.g.

    Java) • Variable-setting • Complex conditionals Credit: unsplash.com
  5. Examples //Bad syntax LOAD CSV…AS row MERGE (c:Company {companyId: row.Id})

    WITH row, c, CASE row.type WHEN 'P' THEN row.type = 'Public' WHEN 'R' THEN row.type = 'Private' WHEN 'G' THEN row.type = 'Government' ELSE row.type = 'Other' END SET c.businessType = row.type RETURN * //Bad syntax 2 LOAD CSV…AS row WITH row, CASE row.BusinessType WHEN 'P' THEN type = 'Public' WHEN 'R' THEN type = 'Private' WHEN 'G' THEN type = 'Government' ELSE type = 'Other' END RETURN row.CompanyId, row.CompanyName, type
  6. Correct Logic • 1. Expression comparison against multiple values MATCH

    (p:Person)-[r:IS_MANAGED_BY]->(m:Manager)-[r2:OVERSEES]->(d:Department) RETURN p.name, CASE p.role WHEN 'management' THEN d.departmentPhone WHEN 'business' THEN p.businessPhone WHEN 'technical' THEN p.emailAddress ELSE d.departmentEmail END as personContact • 2. Multiple conditional statements expressed MATCH (p:Person) RETURN p.name, CASE WHEN dateHired is null THEN 'candidate' WHEN dateHired > date('2018–07–24') THEN 'newHire' ELSE 'employee' END as personStatus, CASE WHEN dateFired is null THEN dateHired WHEN dateHired is null THEN entryDate ELSE 'n/a' END as leadDate
  7. Example MATCH (u:User)-[:SENDS]->(:Payment)-[:PAID_USING]->(a:Application) WITH u, a, CASE a.name WHEN 'Venmo

    for iPhone' THEN 'iPhone' WHEN 'Venmo for Android' THEN 'Android' ELSE 'unknown' END as userPhoneType SET u.phoneType = userPhoneType RETURN u.displayName, userPhoneType LIMIT 10;
  8. Pitfalls • Date format • ISO 8601 • Pesky literal

    'T' • Durations • Precision • Groups Credit: unsplash.com
  9. Date format • ISO 8601 standard • Date: 2021-06-14 •

    Datetime: 2021-06-14T19:04:18Z • Literal 'T' between date and time values Credit: unsplash.com
  10. Translating to ISO 8601 • Cypher accepts: • ISO 8601

    • Strings in ISO 8601 format • What if you have ANYTHING else? • Error!
  11. What is APOC? • Utility library for Neo4j • Widely

    used, broadly applicable • 550+ procedures and functions Credit: unsplash.com
  12. Epoch time • Unix system date/time • Processes, logs, etc.

    • Seconds since 1970-01-01T00:00:00Z • Example: 1623958200 (end time for this session) • Handled with apoc.date.format() or apoc.date.toISO8601() https://www.epochconverter.com/
  13. Examples WITH 1623958200 as sessionEnd RETURN apoc.date.format(sessionEnd, "s", "yyyy-MM-dd'T'HH:mm:ss") as

    formattedString; WITH 1623958200 as sessionEnd RETURN apoc.date.format(sessionEnd, "s", "yyyy-MM-dd") as formattedDateString; WITH 1623958200 as sessionEnd RETURN apoc.date.toISO8601(sessionEnd,"s") as converted;
  14. Date Strings • Process: • Temporal string -> ISO 8601

    string -> Neo4j temporal • Any temporal string with specified format • Handled with apoc.date.convertFormat() or apoc.temporal.toZonedTemporal()
  15. Examples WITH '2021-06-17 19:00:00' as sessionStart RETURN apoc.date.convertFormat(sessionStart,"yyyy-MM-dd HH:mm:ss", 'iso_date_time')

    as formattedString; WITH '2021/06/17' as sessionDay RETURN apoc.date.convertFormat(sessionDay,”yyyy/MM/dd”,'iso_date') as formattedDateString; WITH 'Thu Jun 17 19:30:00 -0600 2021' as sessionEnd RETURN apoc.temporal.toZonedTemporal(sessionEnd,'EEE LLL dd HH:mm:ss Z yyyy') as cypherDatetime;
  16. Durations • Distance in time measurements • Literal ‘P’ and

    ’T’ in syntax • Precision - months, days, seconds • P1M3D60000S • Duration functions (between, inMonths, inDays, inSeconds) Credit: unsplash.com
  17. Examples MATCH (p:Payment) WHERE p.dateComplete IS NULL OR p.dateComplete >

    p.dateCreated + duration('P1D') RETURN p.paymentId, p.status, p.dateCreated, p.dateComplete LIMIT 10; MATCH (u:User)-[:SENDS]->(p:Payment) WITH u, p ORDER BY p.dateCreated WITH u, collect(p.dateCreated) as payDates RETURN u.displayName, u.dateJoined, payDates[0], duration.between(u.dateJoined, payDates[0]) as firstUseTime ORDER BY firstUseTime LIMIT 10; WITH datetime(‘2021-06-17T12:00:00’) as nodesStart RETURN duration.between(nodesStart, datetime()) as contentPresented;
  18. Duration conversions • Duration functions: inMonths, inDays, inSeconds • Most

    components -> whole values only, no remainders • Convert between units: functions + components • Component groups
  19. Examples MATCH (u:User)-[:SENDS]->(p:Payment) WITH u, p ORDER BY p.dateCreated WITH

    u, collect(p.dateCreated) as payDates RETURN u.displayName, u.dateJoined, payDates[0], duration.inMonths(u.dateJoined, payDates[0]).years as years, duration.inMonths(u.dateJoined, payDates[0]).monthsOfYear as months ORDER BY years DESC, months DESC LIMIT 10; WITH datetime(‘2021-06-17T12:00:00’) as nodesStart RETURN duration.inSeconds(nodesStart, datetime()).hours as hours, duration.inSeconds(nodesStart, datetime()).minutesOfHour as minutes;
  20. What is Eager? • Eager loading • Consistency and conflicts

    • Operations occur to all rows before continuing • Avoid read/write conflicts Credit: unsplash.com
  21. Does it matter? • Low memory/heap • Large dataset •

    Query performance Credit: unsplash.com
  22. Example //Eager PROFILE WITH 'https://raw.githubusercontent.com/JMHReif/ nodes2021-aura-training/main/venmo_demo.csv' as file LOAD CSV

    WITH HEADERS FROM file AS line MERGE (from:User {userId: line.`payment.actor.id`}) WITH line, from MATCH (pay:Payment {paymentId: line.`payment.id`}) MERGE (from)-[r:SENDS]->(pay) RETURN count(*); //Non-eager PROFILE WITH 'https://raw.githubusercontent.com/JMHReif/ nodes2021-aura-training/main/venmo_demo.csv' as file LOAD CSV WITH HEADERS FROM file AS line MERGE (from:User {userId: line.`payment.actor.id`}) WITH line, from MATCH (pay:Payment {paymentId: line.`payment.id`}) MERGE (from)-[r:SENDS]->(pay);
  23. Example //Non-eager PROFILE WITH 'https://raw.githubusercontent.com/JMHReif/ nodes2021-aura-training/main/venmo_demo.csv' as file LOAD CSV

    WITH HEADERS FROM file AS line MERGE (app:Application {applicationId: line.`app.id`}) WITH line, app MERGE (pay:Payment {paymentId: line.`payment.id`}) WITH line, app, pay MERGE (pay)-[r2:PAID_USING]->(app); //Eager PROFILE WITH 'https://raw.githubusercontent.com/JMHReif/ nodes2021-aura-training/main/venmo_demo.csv' as file LOAD CSV WITH HEADERS FROM file AS line MERGE (app:Application {applicationId: line.`app.id`}) WITH line, app MERGE (pay:Payment {paymentId: line.`payment.id`}) WITH line, app, pay MATCH (p:Payment {paymentId: line.`payment.id`}) MERGE (p)-[r2:PAID_USING]->(app);
  24. How to avoid eager? • Separate operations • Avoids situations

    where conflicts might occur • Use PROFILE/EXPLAIN on queries • Having trouble? Ask for help! • dev.neo4j.com/forum -> Cypher channel
  25. CASE reminders • Uses: • Compare against values • Multiple

    conditionals • NOT: • Programming language logic • Variable-setting Credit: unsplash.com
  26. Temporals reminders • Formats: • Temporal/string in ISO 8601 •

    Other must be translated • Durations: • Groupings (months, days, seconds) • Component groups for translations Credit: unsplash.com
  27. Eager reminders • Data set: • Large set • Heavy

    processing (i.e. cleaning, operations, etc) • AND queries slow • To do: • Separate operations • PROFILE/EXPLAIN is your friend! Credit: unsplash.com
  28. Resources • CASE statement: • Blog post: jennifer-reif.medium.com/cypher-sleuthing-the-case-statement-3e47f5edd38b • Cypher

    temporal: • Blog post: neo4j.com/developer-blog/cypher-sleuthing-dealing-with-dates-part-1/ • Blog post: neo4j.com/developer-blog/cypher-sleuthing-dealing-with-dates-part-2/ • Eager operator: • Blog post: medium.com/neo4j/cypher-sleuthing-the-eager-operator-84a64d91a452 • Ask for help! • dev.neo4j.com/forum -> Cypher channel Jennifer Reif Email: [email protected] Twitter: @JMHReif LinkedIn: linkedin.com/in/jmhreif Credit: unsplash.com