The Hitchhiker's Guide to D&D 🐉

This talk is meant to be a hitchhiker's guide to Dungeons and Dragons (D&D) for programmers.
We will leverage our wit and intelligence to explore a very perilous dungeon 🧙, where a venomous dragon is hiding in the shadows 🐉.
Thanks to a magical potion in an ancient flask, our wizardly skills have been enhanced with Pythonic capabilities 🐍, making us the most powerful and geeky magician of the realm.
These newly acquired powers reveal unprecedented strategies (i.e. algorithms 🙃) that will guide us through the maze, avoiding all the traps and pitfalls ⚔️, and will help us maximise the power of our fire magic ☄️ to finally slay the dragon.
If you would like to know more about this new Pythonic spell and the secrets it unveiled, or if you're simply interested in new graph algorithms that run blazingly fast and make the most of your CPU, this is the talk for you!
Description
I have been playing D&D since I was 13, and it is a fantastic game for so many reasons. It gets even more fantastic if you can bring your geeky programming skills to it whenever you have to explore hidden dungeons.
In fact, graphs are among the most versatile and fascinating data abstractions, and they can help us handle these challenges programmatically.
In Python there are many options for working on graph problems, from scipy.sparse to networkx.
However, none of these solutions is generally known to be fast and efficient, and this can become a huge impediment as the size of the graph (in terms of nodes and edges) increases.
But what if graph algorithms could be expressed as linear algebraic operations? And what if this translation made graph algorithms efficient enough to be a viable and scalable alternative for high-performance graph analytics?
And what if we could leverage these new and blazingly fast algorithms while still using the same libraries (and abstractions) we would normally use with networkx?
In this talk, we will introduce python-graphblas, i.e. the official Python API to GraphBLAS: a powerful framework for creating graph algorithms expressed as sparse-matrix linear algebra operations. We will explore two practical examples (working on our D&D use case) showcasing performance, and how python-graphblas and graphblas-algorithms integrate with NetworkX.

Valerio Maggio

July 21, 2023

Transcript

  1. The Hitchhiker’s Guide
    To Dungeons & Dragons 🐉
    [email protected]
    @leriomaggio
    Valerio Maggio

  2. Still
    • Researcher and Data Scientist
    • ML/DL for BioMedicine
    • Data Scientists Advocate
    • SSI Fellow
    me
    Who
    “a short summary of myself in logos”
    I’m Valerio

  3. Still
    • Researcher and Data Scientist
    • ML/DL for BioMedicine
    • Data Scientists Advocate
    • SSI Fellow
    • Python & Hipster Geek
    me
    pun
    Who
    “a short summary of myself in logos”
    I’m Valerio

  4. We have a (D&D) problem

  5. We have a (D&D) problem

  6. We have a (D&D) problem

  7. We have a (D&D) problem in Prague

  8. We have a (D&D) problem

  9. We have a (D&D) problem
    1. Shortest Path (SSSP) to reach the Dragon
    2. Breadth-first Search (BFS) to maximise Fireball’s effect in the woods

  10. (no text on this slide)

  11. Graph Vertex

  12. Graph Edges (Arcs)

  13. Connected Graph

  14. Fully Connected Graph

  15. Undirected Graph

  16. Directed Graph

  17. Degrees

  18. (no text on this slide)

  19. (no text on this slide)

  20. We still have a (D&D) problem

  21. We still have a (D&D) problem
    1. Shortest Path (SSSP) to reach the Dragon
    2. Breadth-first Search (BFS) to maximise Fireball’s effect in the woods

  22. Graph Abstractions With Python

  23. Graphs as..

    References
    https://www.python.org/doc/essays/graphs/

  24. Graphs as Adjacency Lists

  25. Graphs as Adjacency Dicts

  26. Graphs as Adjacency Lists / Sets
    An even more flexible approach
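
The adjacency-based representations named on these slides can be sketched in plain Python. The example below is only an illustration under stated assumptions: the dungeon map and its room names are made up (they are not from the talk), the graph is stored as a dict mapping each room to the set of rooms it leads to, and `bfs_levels` is a hypothetical helper rather than a library function.

```python
from collections import deque

# Hypothetical dungeon: each room maps to the set of rooms it leads to.
dungeon = {
    "entrance": {"hall", "armory"},
    "hall": {"crypt", "armory"},
    "armory": {"crypt"},
    "crypt": {"lair"},
    "lair": set(),
}


def bfs_levels(graph, source):
    """Return {node: number of hops from source} for every reachable node."""
    levels = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in levels:  # first visit gives the fewest hops
                levels[neighbour] = levels[node] + 1
                queue.append(neighbour)
    return levels


levels = bfs_levels(dungeon, "entrance")
print(levels["lair"])  # hops from the entrance to the dragon's lair
```

Using sets for the neighbours makes membership tests (`"crypt" in dungeon["hall"]`) cheap and rules out duplicate edges, which is one reason this layout can be more flexible than plain lists.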

  27. Graphs as (Sparse) Adjacency Matrix
    Unweighted Graph
    Weighted Graph
    Note: Undirected Graph, Symmetric Matrix (Triangular)

  28. Graphs as (Sparse) Adjacency Matrix
    [Figure: a graph with nodes 1 to 7 and its 7×7 adjacency matrix. Rows and columns are indexed by node values; entries are edge weights (1 for every edge, since the graph is unweighted).]
    If the graph were undirected, the adjacency matrix would be symmetric.

  29. Graphs as (Sparse) Adjacency Matrix
    [Figure: a graph with nodes 1 to 7 and its 7×7 adjacency matrix. Rows and columns are indexed by node values; entries are edge weights (unweighted).]
    Rows represent outgoing edges

  30. Graphs as (Sparse) Adjacency Matrix
    [Figure: a graph with nodes 1 to 7 and its 7×7 adjacency matrix. Rows and columns are indexed by node values; entries are edge weights.]
    Columns represent incoming edges
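
The row and column reading of an adjacency matrix can be checked with a few lines of plain Python. This is a minimal sketch on a made-up 4-node directed graph (not the 7-node graph from the slides): row sums give out-degrees, column sums give in-degrees, and a symmetry check tells a directed graph from an undirected one.

```python
# Hypothetical 4-node directed graph (not the 7-node graph on the slides).
# A[i][j] == 1 means there is an edge from node i to node j.
A = [
    [0, 1, 1, 0],  # node 0 -> 1, 2
    [0, 0, 1, 0],  # node 1 -> 2
    [0, 0, 0, 1],  # node 2 -> 3
    [1, 0, 0, 0],  # node 3 -> 0
]

n = len(A)

# Rows represent outgoing edges: the sum of row i is the out-degree of node i.
out_degree = [sum(A[i]) for i in range(n)]

# Columns represent incoming edges: the sum of column j is the in-degree of node j.
in_degree = [sum(A[i][j] for i in range(n)) for j in range(n)]

# An undirected graph would have a symmetric matrix: A[i][j] == A[j][i].
is_symmetric = all(A[i][j] == A[j][i] for i in range(n) for j in range(n))

print(out_degree, in_degree, is_symmetric)
```

A dense list of lists is used here only for readability; most entries are zero, which is exactly why a sparse format is preferred for real graphs.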

  31. Graphs as ( Sparse ) Adjacency Matrix
    scipy.sparse

  32. scipy.sparse
    Graphs as ( Sparse ) Adjacency Matrix

  33. Graphs as Pythonic Data Abstractions

  34. Graphs as Pythonic Data Abstractions

  35. Pros
    • “Reference implementation in Python”
    • Well-known and popular
    • Many algorithms & Well documented
    • Nice to read
    • Great for small graphs
    Cons
    • SLOW
    [Chart: Compute Time vs Graph Size (generalization, not real data). Graph sizes: Tiny, Small, Big, Huge. Series: Scipy.sparse, NetworkX, Numpy.]

  36. What if?
    Still use NetworkX
    Have faster Sparse Graph algorithms

  37. Foundational Sparse Graphs Library
    Fast
    Flexible
    Scalable
    “Runs on any architecture”

  38. (no text on this slide)

  39. scipy.sparse is not that library

  40. scipy.sparse is not that library
    • (Still) too slow
    • Single-threaded
    • Not expressive enough
    • Masking operations to work efficiently
    • Change operator in matrix-multiply
    • Too low level
    • No integration with NetworkX
    • Format “gymnastic” (e.g. COO to CSR)
    • Not (yet) Hardware / Implementation Agnostic

  41. Graph Problems = Sparse Linear Algebra
    Introducing GraphBLAS

  42. Graph Problems = Sparse Linear Algebra

  43. Graph Problems = Sparse Linear Algebra
    • Graphs are represented as Sparse Matrix
    • Matrix-Multiplication is foundational to all graph operations
    • With Custom Operator
    SSSP
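
To make the "matrix multiplication with a custom operator" idea concrete, here is a minimal pure-Python sketch of breadth-first search written as a repeated vector-matrix product over a boolean (OR, AND) semiring, with a mask that drops already-visited nodes. It does not use python-graphblas; the graph, the dict-of-sets storage, and the helper names `vxm_lor_land` and `bfs_levels` are assumptions made for illustration.

```python
# Sparse boolean adjacency "matrix": row index -> set of column indices with an entry.
# Hypothetical 6-node directed graph, used only for illustration.
A = {
    0: {1, 2},
    1: {3},
    2: {3, 4},
    3: {5},
    4: {5},
    5: set(),
}


def vxm_lor_land(frontier, matrix):
    """Vector-matrix product over a boolean (OR, AND) semiring.

    Entry j of the result exists if some i in `frontier` has an entry matrix[i][j].
    """
    result = set()
    for i in frontier:
        result |= matrix.get(i, set())
    return result


def bfs_levels(matrix, source):
    """Return {node: BFS depth} by expanding the frontier one product at a time."""
    levels = {}
    frontier = {source}
    depth = 0
    while frontier:
        for node in frontier:
            levels[node] = depth
        reached = vxm_lor_land(frontier, matrix)
        # Mask: keep only entries for nodes that have not been visited yet.
        frontier = {j for j in reached if j not in levels}
        depth += 1
    return levels


levels = bfs_levels(A, 0)
print(levels)
```

Each loop iteration is one "multiply" that touches every frontier node independently, which is the property that lets a linear algebra formulation expose parallelism.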

  44. Graph Problems = Sparse Linear Algebra

  45. The GraphBLAS Standard

  46. GraphBLAS ←→ NetworkX
    NetworkX (dispatching!)
    graphblas-algorithms
    python-graphblas
    SuiteSparse:GraphBLAS
    GraphBLAS C API specification
    GraphBLAS Math specification

  47. GraphBLAS ←→ NetworkX
    NetworkX
    graphblas-algorithms cuGraph ...
    python-graphblas
    CPU GPU Dask
    Dask

  48. The Stack
    Math specification: C = C min (A.T min.plus v)
    GraphBLAS pseudo-code to express math often looks like this.
    Linear algebra formulation is concise and exposes parallelism.
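
The update `C = C min (A.T min.plus v)` can be read as one relaxation round of single-source shortest paths. The sketch below spells that out in plain Python under stated assumptions: the weighted graph is made up, matrices are stored as dict-of-dicts, edge weights are non-negative, and `min_plus_transpose_mxv` and `sssp` are hypothetical helper names, not part of any GraphBLAS library.

```python
import math

# Sparse weighted adjacency matrix: A[i][j] is the weight of edge i -> j.
# Hypothetical weights; a missing entry means "no edge", not weight 0.
A = {
    0: {1: 4.0, 2: 1.0},
    1: {3: 1.0},
    2: {1: 2.0, 3: 5.0},
    3: {},
}


def min_plus_transpose_mxv(matrix, v):
    """Compute A.T min.plus v: result[j] = min over i of (A[i][j] + v[i])."""
    result = {}
    for i, dist_i in v.items():
        for j, weight in matrix.get(i, {}).items():
            candidate = dist_i + weight  # "plus" plays the role of multiply
            if candidate < result.get(j, math.inf):  # "min" plays the role of add
                result[j] = candidate
    return result


def sssp(matrix, source):
    """Repeat the min.plus update until distances stop changing."""
    dist = {source: 0.0}
    for _ in range(len(matrix) - 1):  # at most n - 1 relaxation rounds
        step = min_plus_transpose_mxv(matrix, dist)
        # Accumulate with min: C = C min (A.T min.plus v)
        updated = dict(dist)
        for j, d in step.items():
            if d < updated.get(j, math.inf):
                updated[j] = d
        if updated == dist:  # converged early
            break
        dist = updated
    return dist


dist = sssp(A, 0)
print(dist)
```

The only change from an ordinary matrix-vector product is the pair of operators: addition is replaced by `min` and multiplication by `+`, which is what a min_plus semiring means.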

  49. The Stack
    Math specification: C = C min (A.T min.plus v)
    C specification: GrB_mxv(C, NULL, GrB_MIN_FP64, GrB_MIN_PLUS_SEMIRING_FP64, A, v, GrB_DESC_T0)
    GraphBLAS in C is very verbose and hard to read.
    A lot goes into a single call.
    GraphBLAS is a specification, not an implementation.
    Objects are opaque; data structures are not part of the spec.

  50. The Stack
    Math specification: C = C min (A.T min.plus v)
    C specification: GrB_mxv(C, NULL, GrB_MIN_FP64, GrB_MIN_PLUS_SEMIRING_FP64, A, v, GrB_DESC_T0)
    Implementations (SuiteSparse:GraphBLAS)
    SuiteSparse:GraphBLAS is the primary implementation.
    It is fast and has state-of-the-art OpenMP parallelism.
    MATLAB uses it for sparse matrix multiply.
    Much more: automatic data structures; JIT; zero-copy import/export; …
    GPU support is coming soon!
    Formats: CSR/CSC, DCSR/DCSC, Bitmap, Dense, Iso-valued

  51. The Stack
    Math specification: C = C min (A.T min.plus v)
    C specification: GrB_mxv(C, NULL, GrB_MIN_FP64, GrB_MIN_PLUS_SEMIRING_FP64, A, v, GrB_DESC_T0)
    Implementations (SuiteSparse:GraphBLAS)
    python-graphblas: C(min) << min_plus(A.T @ v)
    python-graphblas makes writing GraphBLAS easy and looks like math!
    $ pip install python-graphblas
    $ conda install -c conda-forge python-graphblas
    https://python-graphblas.readthedocs.io/

  52. The Stack
    Math specification: C = C min (A.T min.plus v)
    C specification: GrB_mxv(C, NULL, GrB_MIN_FP64, GrB_MIN_PLUS_SEMIRING_FP64, A, v, GrB_DESC_T0)
    Implementations (SuiteSparse:GraphBLAS)
    python-graphblas: C(min) << min_plus(A.T @ v)
    graphblas-algorithms: ga.single_source_shortest_path(G, s)
    graphblas-algorithms has algorithms written with python-graphblas.
    It currently implements 80+ NetworkX algorithms!

  53. The Stack
    Math specification: C = C min (A.T min.plus v)
    C specification: GrB_mxv(C, NULL, GrB_MIN_FP64, GrB_MIN_PLUS_SEMIRING_FP64, A, v, GrB_DESC_T0)
    Implementations (SuiteSparse:GraphBLAS)
    python-graphblas: C(min) << min_plus(A.T @ v)
    graphblas-algorithms: ga.single_source_shortest_path(G, s)
    networkx: nx.single_source_shortest_path(G, s)
    Dispatching in NetworkX!
    Vision: accelerate libraries that use NetworkX.

  54. (no text on this slide)

  55. Code Changes Required
    To speed up NetworkX, only a few changes are required.
    1. Import graphblas_algorithms (convention is to use ga as the abbreviation)
    2. Convert the nx.Graph to a ga.Graph using a helper function
    3. Pass the ga.Graph to the networkx API

    import networkx as nx
    G = nx.erdos_renyi_graph(10000, p=0.02, directed=True)
    K = list(nx.all_pairs_shortest_path_length(G))

    import networkx as nx
    import graphblas_algorithms as ga
    G = nx.erdos_renyi_graph(8000, 0.02)
    GBls = ga.Graph.from_networkx(G)
    K = list(nx.all_pairs_shortest_path_length(GBls))

    This takes 32 s
    This takes 3.4 s
    10_000 nodes, ~2_000_659 edges
    This takes ~840 ms (milliseconds)

  56. Hardware: NVIDIA DGX-1 CPU: Dual 20 Core Intel Xeon E5-2698 v4 2.2GHz RAM: 512 GB 2133 MHz DDR4 RDIMM
    MORE INFO: github.com/python-graphblas/graphblas-algorithms/pull/54

  57. Brief recap of GraphBLAS
    GraphBLAS is solving graph algorithms in the language of sparse linear algebra
    GraphBLAS API Spec
    SuiteSparse:GraphBLAS (C + OpenMP)
    python-suitesparse-graphblas (cffi + cython)
    python-graphblas (Python)
    graphblas-algorithms (Python)
    • Missing values ≠ 0
    • Semirings
    • Masks
    • Multiple internal formats (CSR/CSC, DCSR/DCSC, Masked Dense)
    • Highly tuned matrix multiply kernels (multi-core via OpenMP)
    • Zero copy import/export of dense numpy arrays
    • GPU support forthcoming
    plus_times semiring: $\sum_k a_{ik} \cdot b_{kj}$
    min_plus semiring: $\min_k (a_{ik} + b_{kj})$
    C(mask=M) << min_plus(A @ B)
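
The recap items "Missing values ≠ 0", "Semirings" and "Masks" can be illustrated together with a small pure-Python sparse matrix multiply. This is a sketch under stated assumptions, not how SuiteSparse:GraphBLAS or python-graphblas work internally: matrices are dict-of-dicts with made-up values, and the `mxm` helper and its `add`, `multiply` and `mask` parameters are hypothetical names chosen for the example.

```python
import operator

# Sparse matrices as dict-of-dicts; absent entries are "missing", not 0.
A = {0: {0: 1.0, 1: 2.0}, 1: {1: 3.0}}
B = {0: {0: 4.0}, 1: {0: 5.0, 1: 6.0}}


def mxm(A, B, add, multiply, mask=None):
    """C[i][j] = add-reduction over k of multiply(A[i][k], B[k][j]).

    Only indices k where both entries are present contribute. If `mask` is
    given (a set of (i, j) pairs), only those output positions are computed.
    """
    C = {}
    for i, row in A.items():
        for k, a_ik in row.items():
            for j, b_kj in B.get(k, {}).items():
                if mask is not None and (i, j) not in mask:
                    continue
                value = multiply(a_ik, b_kj)
                out = C.setdefault(i, {})
                out[j] = add(out[j], value) if j in out else value
    return C


# plus_times: the ordinary matrix product, sum over k of a_ik * b_kj.
plus_times = mxm(A, B, add=operator.add, multiply=operator.mul)

# min_plus: min over k of (a_ik + b_kj), the shortest-path semiring.
min_plus = mxm(A, B, add=min, multiply=operator.add)

# Masked min_plus: compute only the (0, 0) entry of the result.
masked = mxm(A, B, add=min, multiply=operator.add, mask={(0, 0)})

print(plus_times, min_plus, masked)
```

The missing entry `A[1][0]` shows why missing is not the same as zero: under min_plus, treating it as 0 would create a free edge and yield `min(0 + 4, 3 + 5) = 4` for `C[1][0]` instead of the correct 8.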

  58. Key Messages
    D&D is cool
    NetworkX is amazing! (Even if it is slow)
    GraphBLAS uses sparse linear algebra to solve graph problems (mathematically elegant and blazing fast)
    NetworkX can become The Graph API for Python (similar to numpy)

  59. Thank you very much for your kind attention
    Valerio Maggio
    [email protected]
    @leriomaggio