volume in recent years.
• Ninety percent of the world’s data has been generated in the last two years.
• Much of this data is unstructured and exhibits “relationships” between objects.
Source: IDC
Data Volume (Exabytes)
Are relational databases the best architecture to meet all future GIS data needs?
relational databases for storing connected network data.
Example transportation network
• Should GIS platforms begin to integrate database architectures?
model is a directed weighted graph.
• Physical properties are abstracted as edge costs.
• Network models are typically in tabular format.
Typical Connected Network Model
• A row represents a graph {road} edge {segment}.
• Each edge defines a “source” and a “target” node.
• Costs control traversal in the forward and reverse directions.
Implied direction
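The tabular network model described above can be sketched in a few lines of Python. This is an illustration only, not the implementation used in the study: the column names (`id`, `source`, `target`, `cost`, `reverse_cost`), the sample rows, and the helper names `build_graph` and `shortest_cost` are all assumptions. The sketch also assumes that a negative cost marks a direction that cannot be traversed, which is one common way to encode one-way segments.

```python
import heapq
from collections import defaultdict

# Hypothetical edge table: one row per road segment.
# Column names and values are assumed for illustration only.
EDGES = [
    {"id": 1, "source": "A", "target": "B", "cost": 4.0, "reverse_cost": 4.0},
    {"id": 2, "source": "B", "target": "C", "cost": 3.0, "reverse_cost": -1.0},  # one-way B -> C
    {"id": 3, "source": "A", "target": "C", "cost": 9.0, "reverse_cost": 9.0},
    {"id": 4, "source": "C", "target": "D", "cost": 2.0, "reverse_cost": 2.0},
]


def build_graph(rows):
    """Turn edge rows into a directed weighted adjacency list.

    Assumption: a negative cost means that direction is not traversable.
    """
    graph = defaultdict(list)
    for row in rows:
        if row["cost"] >= 0:  # forward direction: source -> target
            graph[row["source"]].append((row["target"], row["cost"]))
        if row["reverse_cost"] >= 0:  # reverse direction: target -> source
            graph[row["target"]].append((row["source"], row["reverse_cost"]))
    return graph


def shortest_cost(graph, start, goal):
    """Dijkstra's algorithm; returns the minimum total cost, or None if unreachable."""
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == goal:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, weight in graph.get(node, ()):
            candidate = dist + weight
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return None


graph = build_graph(EDGES)
print(shortest_cost(graph, "A", "D"))  # route A -> B -> C -> D
print(shortest_cost(graph, "D", "A"))  # must avoid the one-way segment B -> C
```

Because segment 2 is one-way, the trip from D to A cannot pass back through B and takes the longer direct segment instead, which is how the two cost columns imply direction.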
Storage Uses Native Graph Storage
• Issues to consider when selecting an alternative
  o Size of Dataset
  o Expected Depth of Traversing
  o Number of concurrent users
  o Degree of “connectivity”
  o Complexity of relationships
Advantage
  o pgRouting was an order of magnitude smaller.
Hard Drive Storage
• Database Build Time Advantage
  o Neo4j was much faster.
Database Build Time
• pgRouting does not share memory between threads.
• Runtime Memory Advantage
• pgRouting reloads the entire dataset for each traversal.
pgRouting Memory History
Neo4j Memory History
• Neo4j is constant with size … pgRouting degrades with size.
• Traversal Depth Advantage … depends on application.
• For deep traversals on small datasets
• For shallow traversals on large datasets
§ Each data object is uniquely identified with a URI.
Linked Data … the Next Web Frontier
§ Links describe relationships between data.
§ Relationships enable automated data discovery.
Links
Data
§ Traversal depths are typically shallow.
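As a rough illustration of the Linked Data points above, the sketch below stores links as (subject, predicate, object) triples keyed by URI and follows them to a small fixed depth. Everything here is hypothetical: the `example.org` URIs, the predicate names, and the `discover` helper are placeholders, not part of any dataset or library referenced in these slides.

```python
from collections import deque

# Hypothetical triples: (subject URI, predicate URI, object URI).
# The example.org URIs are placeholders for illustration only.
TRIPLES = [
    ("http://example.org/road/1", "http://example.org/rel/connectsTo", "http://example.org/road/2"),
    ("http://example.org/road/2", "http://example.org/rel/connectsTo", "http://example.org/road/3"),
    ("http://example.org/road/1", "http://example.org/rel/locatedIn", "http://example.org/city/1"),
    ("http://example.org/road/3", "http://example.org/rel/locatedIn", "http://example.org/city/2"),
]


def discover(triples, start, max_depth):
    """Breadth-first link following; returns {uri: depth} within max_depth links of start."""
    links = {}
    for subject, _predicate, obj in triples:
        links.setdefault(subject, []).append(obj)

    seen = {start: 0}
    queue = deque([start])
    while queue:
        uri = queue.popleft()
        depth = seen[uri]
        if depth == max_depth:
            continue  # do not follow links beyond the depth limit
        for linked_uri in links.get(uri, []):
            if linked_uri not in seen:
                seen[linked_uri] = depth + 1
                queue.append(linked_uri)
    return seen


# A shallow traversal: only objects one link away from road/1 are discovered.
for uri, depth in discover(TRIPLES, "http://example.org/road/1", 1).items():
    print(depth, uri)
```

Raising `max_depth` widens the discovery one link at a time, which is why a shallow limit keeps the amount of data touched per query small.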
for ..
• Storing physical model.
• Data visualization.
§ Neo4j used for …
• Storing logical model
• Graph traversals
Open Source System Architecture
Implemented in the IBM Cloud!!!