
Data Cartography

Preeti
June 01, 2018

Technology has gone through many iterations: from Mainframes, to Client-Server Models, and now Isomorphic Applications. The world of Data Management has evolved too, from Databases to Warehouses, Sharehouses, and Data Lakes, and consequently to Data Catalogues. Each of these iterations has taught us something and paved the way for better solutions!

The constantly changing data landscape has given us new ways of solving problems, but it has also introduced new challenges. The talk addresses the problem of picking the right Data Store, and elaborates on a four-step Design Paradigm to help you pick it!

The presentation video can be found at https://www.dotconferences.com/conference/dotscale.

Transcript

  1. SCALING THE DATA LANDSCAPE DATA CARTOGRAPHY: PREETI VAIDYA

  2. DATA IS DYNAMIC BECAUSE HUMAN ACTIVITIES ARE DYNAMIC

  3. None
  4. SPOKE – HUB MODEL

    The spoke-hub distribution paradigm is a form of transport topology optimization in which traffic routes are organized as a series of 'spokes' that connect outlying points to a central 'hub'. (Wikipedia)
  5. SPOKE – HUB MODEL: APPLICATIONS

    Aviation
    • Centralizing operations at the hub leads to economies of scale.
    • Significantly fewer routes are needed to serve the network.
    • The number of pairings in a P2P network grows faster than the number of nodes (O(n^2)).
    Source: Networks, Course blog for INFO 2040/CS 2850/Econ 2040/SOC 2090, Cornell University

    Content Marketing’s Critical Role in Social Media, Ruben Sanchez

    Healthcare
    The “New Hub and Spoke” model is similar in base structure to primary, secondary and tertiary care settings within a network, with a focus on patient ‘well-being’.
    Source: The New Hub and Spoke Model: Redirecting the Flow of Patient Care, Haskell
  6. Harkness K, Heckman G, McKelvie R, Forsey A and Kingsbury K. Outpatient Management of Patients with Heart Failure. Austin J Clin Cardiolog. 2014;1(4): 1028. ISSN 2381-9111
  7. DATA MODELS ARE DYNAMIC BECAUSE HUMAN EVENTS ARE DYNAMIC

  8. MAPPING THE DATA LANDSCAPE

    Website for storing flight information, rendering paths on a zoomable world map and calculating statistics, with plenty of free airline, airport and route data, https://github.com/jpatokal/openflights

    QUESTION: What should be the underlying Data Store for this Data Model?
  9. DATA MODEL

    DataModel is an abstraction around arbitrary data binding technologies that can be used to adapt a variety of data sources for use by JavaServer Faces components that support per-row processing for their child components (such as UIData). Java Docs, https://docs.oracle.com/javaee/6/api/javax/faces/model/DataModel.html

    Website for storing flight information, rendering paths on a zoomable world map and calculating statistics, with plenty of free airline, airport and route data, https://github.com/jpatokal/openflights

    Tabular Data: It looks like a Relational Data Store can solve our problem!
  10. DATA MODEL

    Data management in cloud environments: NoSQL and NewSQL data stores, Grolinger et al., Journal of Cloud Computing: Advances, Systems and Applications, 2013
  11. DO BE. DB WHAT DB?

  12. DATA STORE

    Cities in Worldwide Air and Sea Flows: A multiple networks analysis, César Ducruet, Daniele Ietri and Céline Rozenblat, European Journal of Geography
    - We care about the hubs and the spokes, as well as the number of spokes.
    - Leveled analysis requires a lot of joins.
    - Carrying out the computations (z-scores) would require us to create views, and temp tables or nested queries.
    - Performance optimizations would be another area of effort.
  13. GRAPH DATABASES Debian Packages as Graph Database, NORBERT PREINING, Neo4j

  14. GRAPH DATABASE

    A graph database is a graph-oriented database, which is a type of NoSQL database that uses graph theory to store, map and query relationships. It is basically a collection of nodes and edges. (Gouri Ginde, Visualisation of massive data from scholarly Article and Journal Database: A Novel Scheme)

    A graph database is a database that uses graph structures for semantic queries with nodes, edges and properties to represent and store data. A key concept of the system is the graph (or edge or relationship), which directly relates data items in the store. The relationships allow data in the store to be linked together directly, and in many cases retrieved with one operation. (Wikipedia)

    Graph Data Model: docs.neo4j.org, 3.1 Nodes
  15. GRAPH DATABASE IMPLEMENTATION

    Native Graph Database (docs.neo4j.org)
  16. DISTRIBUTED GRAPH

  17. DISTRIBUTED GRAPH PERFORMANCE

  18. DATA MODELS ARE DYNAMIC AND DATA APPLICATIONS ARE DYNAMIC

  19. QUESTION: How do we all decide to go somewhere after the conference?
  20. BYZANTINE GENERALS PROBLEM

  21. APPLICATION TO OPERATION

  22. APPLICATION TO OPERATION

  23. DATA MODELS ARE DYNAMIC, DATA APPLICATIONS ARE DYNAMIC, DATA OPERATIONS ARE DYNAMIC
  24. TYING IT ALL TOGETHER: THE STATIC

    Operation, Application, Data Model, Data Store
  25. DO IT RIGHT! DO IT ONCE! HOPEFULLY!

  26. Thank You!
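
The route-count argument on slide 5 and the hub and spoke-count analysis on slide 12 can be illustrated with a small sketch. This is not code from the talk: the function names and the five toy routes are hypothetical, and the z-score here is simply the standardized number of routes per airport, which is only one plausible reading of the "z-scores" the slide mentions.

```python
from collections import Counter
from statistics import mean, pstdev


def p2p_route_count(n):
    """Routes needed to link every pair of n airports directly: n*(n-1)/2."""
    return n * (n - 1) // 2


def hub_route_count(n):
    """Routes needed when n-1 outlying airports each connect to a single hub."""
    return max(n - 1, 0)


def degree_z_scores(routes):
    """Count routes (spokes) per airport and standardize the counts."""
    degree = Counter()
    for origin, destination in routes:
        degree[origin] += 1
        degree[destination] += 1
    values = list(degree.values())
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        # Every airport has the same number of routes, so none stands out.
        return {airport: 0.0 for airport in degree}
    return {airport: (d - mu) / sigma for airport, d in degree.items()}


# Hypothetical toy data, not taken from the OpenFlights dataset.
routes = [
    ("CDG", "JFK"),
    ("CDG", "BOM"),
    ("CDG", "SIN"),
    ("CDG", "LHR"),
    ("LHR", "JFK"),
]

scores = degree_z_scores(routes)
hub = max(scores, key=scores.get)  # airport with the highest standardized degree

print(p2p_route_count(10), hub_route_count(10))
print(hub)
```

With routes held as an edge list, the spoke count per airport is a single pass over the edges. In a relational store the same question would typically need a join or a union over the origin and destination columns, which is the kind of effort slide 12 points at.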