Slide 1

Slide 1 text

SCALING THE DATA LANDSCAPE DATA CARTOGRAPHY: PREETI VAIDYA

Slide 2

Slide 2 text

DATA IS DYNAMIC BECAUSE HUMAN ACTIVITIES ARE DYNAMIC

Slide 3

Slide 3 text

No content

Slide 4

Slide 4 text

SPOKE – HUB MODEL
The spoke-hub distribution paradigm is a form of transport topology optimization in which traffic routes are organized as a series of 'spokes' that connect outlying points to a central 'hub.'
Wikipedia

Slide 5

Slide 5 text

SPOKE – HUB MODEL: APPLICATIONS
Aviation
• Centralizing operations at the hub leads to economies of scale.
• Significantly fewer routes are needed to serve the network.
• The number of pairings in a point-to-point (P2P) network grows faster than the number of nodes: O(n^2) pairings for n nodes.
Networks, Course blog for INFO 2040/CS 2850/Econ 2040/SOC 2090, Cornell University
Content Marketing’s Critical Role in Social Media, Ruben Sanchez
Healthcare
The “New Hub and Spoke” model resembles, in its base structure, the primary, secondary and tertiary care settings within a network, with a focus on patient ‘well-being’.
The New Hub and Spoke Model: Redirecting the Flow of Patient Care, Haskell
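The route-count claim in the aviation bullets can be checked with a few lines of arithmetic. The sketch below assumes the simplest possible setup: n airports, undirected routes, and exactly one hub. The function names are illustrative and not taken from the talk.

```python
def point_to_point_routes(n: int) -> int:
    """Direct routes needed to connect every pair of n airports: n(n-1)/2."""
    return n * (n - 1) // 2


def hub_and_spoke_routes(n: int) -> int:
    """Routes needed when one of the n airports is the hub for all the others."""
    return max(n - 1, 0)


if __name__ == "__main__":
    for n in (5, 10, 50, 100):
        print(n, point_to_point_routes(n), hub_and_spoke_routes(n))
```

The point-to-point count grows quadratically while the hub-and-spoke count grows linearly, which is the gap the slide refers to.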

Slide 6

Slide 6 text

Harkness K, Heckman G, McKelvie R, Forsey A and Kingsbury K. Outpatient Management of Patients with Heart Failure. Austin J Clin Cardiolog. 2014;1(4): 1028. ISSN 2381-9111

Slide 7

Slide 7 text

DATA MODELS ARE DYNAMIC BECAUSE HUMAN EVENTS ARE DYNAMIC

Slide 8

Slide 8 text

MAPPING THE DATA LANDSCAPE
Website for storing flight information, rendering paths on a zoomable world map and calculating statistics, with plenty of free airline, airport and route data, https://github.com/jpatokal/openflights
QUESTION: What should be the underlying Data Store for this Data Model?

Slide 9

Slide 9 text

DATA MODEL
DataModel is an abstraction around arbitrary data binding technologies that can be used to adapt a variety of data sources for use by JavaServer Faces components that support per-row processing for their child components (such as UIData).
Java Docs, https://docs.oracle.com/javaee/6/api/javax/faces/model/DataModel.html
Website for storing flight information, rendering paths on a zoomable world map and calculating statistics, with plenty of free airline, airport and route data, https://github.com/jpatokal/openflights
Tabular Data: It looks like a Relational Data Store can solve our problem!

Slide 10

Slide 10 text

DATA MODEL
Data management in cloud environments: NoSQL and NewSQL data stores, Grolinger et al., Journal of Cloud Computing: Advances, Systems and Applications, 2013

Slide 11

Slide 11 text

DO BE. DB WHAT DB?

Slide 12

Slide 12 text

DATA STORE
Cities in Worldwide Air and Sea Flows: A multiple networks analysis, César Ducruet, Daniele Ietri and Céline Rozenblat, European Journal of Geography
- We care about the hubs and the spokes, as well as the number of spokes.
- Leveled analysis requires a lot of joins.
- Carrying out the computations (z-scores) would require us to create views, and temp tables or nested queries.
- Performance optimizations would be another area of effort.
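To make the bullets above concrete, here is a minimal sketch of counting spokes per airport in a relational store and turning the counts into z-scores. It uses Python's built-in sqlite3 module with an in-memory database. The two-column `routes` table and the sample rows are hypothetical and much simpler than the OpenFlights data; the z-score step is done in Python because SQLite has no built-in standard deviation function.

```python
import sqlite3
from statistics import mean, pstdev

# Hypothetical sample routes (source, destination); not real OpenFlights rows.
SAMPLE_ROUTES = [
    ("ATL", "JFK"), ("ATL", "ORD"), ("ATL", "LAX"), ("ATL", "MIA"),
    ("ORD", "JFK"), ("LAX", "SFO"),
]


def spoke_counts(routes):
    """Count routes touching each airport, using a nested query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE routes (src TEXT, dst TEXT)")
    conn.executemany("INSERT INTO routes VALUES (?, ?)", routes)
    rows = conn.execute(
        """
        SELECT airport, COUNT(*) AS spokes
        FROM (SELECT src AS airport FROM routes
              UNION ALL
              SELECT dst AS airport FROM routes)
        GROUP BY airport
        ORDER BY spokes DESC, airport
        """
    ).fetchall()
    conn.close()
    return rows


def z_scores(rows):
    """Standardize spoke counts: (count - mean) / population std deviation."""
    counts = [spokes for _, spokes in rows]
    mu, sigma = mean(counts), pstdev(counts)
    if sigma == 0:
        return {airport: 0.0 for airport, _ in rows}
    return {airport: (spokes - mu) / sigma for airport, spokes in rows}


if __name__ == "__main__":
    rows = spoke_counts(SAMPLE_ROUTES)
    for airport, z in z_scores(rows).items():
        print(airport, round(z, 2))
```

Even this one-level analysis needs a nested query; each further level of neighbors would add another self-join on `routes`, which is the cost the slide points at.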

Slide 13

Slide 13 text

GRAPH DATABASES
Debian Packages as Graph Database, Norbert Preining, Neo4j

Slide 14

Slide 14 text

GRAPH DATABASE
A graph database is a graph-oriented database, which is a type of NoSQL database that uses graph theory to store, map and query relationships. It is basically a collection of nodes and edges.
Gouri Ginde, Visualisation of massive data from scholarly Article and Journal Database: A Novel Scheme
A graph database is a database that uses graph structures for semantic queries with nodes, edges and properties to represent and store data. A key concept of the system is the graph (or edge or relationship), which directly relates data items in the store. The relationships allow data in the store to be linked together directly, and in many cases retrieved with one operation.
Wikipedia
Graph Data Model, docs.neo4j.org, 3.1 Nodes
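The "nodes, edges and properties" description can be illustrated with a tiny in-memory property graph. This is only a sketch of the data model under simple assumptions (directed edges, string identifiers); the class and method names are invented for illustration and do not reflect how Neo4j or any other graph database is implemented.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    labels: set = field(default_factory=set)
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    source: str
    target: str
    rel_type: str
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    """Nodes keyed by id; each node keeps a list of its outgoing edges."""

    def __init__(self):
        self.nodes = {}
        self.out_edges = {}

    def add_node(self, node_id, labels=(), **properties):
        self.nodes[node_id] = Node(node_id, set(labels), dict(properties))
        self.out_edges.setdefault(node_id, [])

    def add_edge(self, source, target, rel_type, **properties):
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before adding an edge")
        self.out_edges[source].append(Edge(source, target, rel_type, dict(properties)))

    def neighbors(self, node_id, rel_type=None):
        """Follow outgoing edges directly, with no join over a routes table."""
        return [
            edge.target
            for edge in self.out_edges[node_id]
            if rel_type is None or edge.rel_type == rel_type
        ]


if __name__ == "__main__":
    g = PropertyGraph()
    g.add_node("ATL", labels=["Airport"], city="Atlanta")
    g.add_node("JFK", labels=["Airport"], city="New York")
    g.add_edge("ATL", "JFK", "ROUTE", stops=0)
    print(g.neighbors("ATL", "ROUTE"))
```

Because each node holds its own edges, finding the spokes of a hub is a direct lookup rather than a query over a separate table.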

Slide 15

Slide 15 text

GRAPH DATABASE IMPLEMENTATION
Native Graph Database, docs.neo4j.org

Slide 16

Slide 16 text

DISTRIBUTED GRAPH

Slide 17

Slide 17 text

DISTRIBUTED GRAPH PERFORMANCE

Slide 18

Slide 18 text

DATA MODELS ARE DYNAMIC AND DATA APPLICATIONS ARE DYNAMIC

Slide 19

Slide 19 text

QUESTION: How do we all decide to go somewhere after the conference?

Slide 20

Slide 20 text

BYZANTINE GENERALS PROBLEM
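As a rough illustration of the problem named on this slide, the sketch below simulates one round of the classic oral-messages scheme, OM(1), from Lamport, Shostak and Pease: one commander, three lieutenants, at most one traitor. The traitor behavior here is a simplified, deterministic assumption chosen for the example (a traitorous commander alternates orders, a traitorous lieutenant relays the opposite of what it received); real Byzantine faults can be arbitrary. All names are illustrative.

```python
from collections import Counter

ATTACK, RETREAT = "attack", "retreat"


def flip(order):
    return RETREAT if order == ATTACK else ATTACK


def majority(values, default=RETREAT):
    """Most common value; fall back to a default on a tie."""
    (top, top_n), *rest = Counter(values).most_common()
    if rest and rest[0][1] == top_n:
        return default
    return top


def om1(order, lieutenants, traitor=None, commander="C"):
    """Return the decision of every loyal lieutenant after OM(1)."""
    # Round 1: the commander sends an order to each lieutenant.
    received = {}
    for i, lt in enumerate(lieutenants):
        if traitor == commander:
            # Assumed traitor behavior: alternate conflicting orders.
            received[lt] = ATTACK if i % 2 == 0 else RETREAT
        else:
            received[lt] = order

    # Round 2: lieutenants relay what they received; each loyal one votes.
    decisions = {}
    for lt in lieutenants:
        if lt == traitor:
            continue
        heard = [received[lt]]
        for other in lieutenants:
            if other == lt:
                continue
            relayed = received[other]
            if other == traitor:
                # Assumed traitor behavior: relay the opposite order.
                relayed = flip(relayed)
            heard.append(relayed)
        decisions[lt] = majority(heard)
    return decisions


if __name__ == "__main__":
    print(om1(ATTACK, ["L1", "L2", "L3"], traitor="L3"))
    print(om1(ATTACK, ["L1", "L2", "L3"], traitor="C"))
```

In both runs the loyal lieutenants reach the same decision, and when the commander is loyal they follow the commander's order; with only two lieutenants and one traitor this guarantee no longer holds.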

Slide 21

Slide 21 text

APPLICATION TO OPERATION

Slide 22

Slide 22 text

APPLICATION TO OPERATION

Slide 23

Slide 23 text

DATA MODELS ARE DYNAMIC
DATA APPLICATIONS ARE DYNAMIC
DATA OPERATIONS ARE DYNAMIC

Slide 24

Slide 24 text

TYING IT ALL TOGETHER: THE STATIC
Operation
Application
Data Model
Data Store

Slide 25

Slide 25 text

DO IT RIGHT! DO IT ONCE! HOPEFULLY!

Slide 26

Slide 26 text

Thank You!