Data Cartography

Preeti
June 01, 2018

Technology has gone through many iterations: from Mainframes, to Client-Server Models, and now Isomorphic Applications. The world of Data Management has evolved too, from Databases, to Warehouses, Sharehouses, and Data Lakes, and consequently to Data Catalogues. Each of these iterations has given us learnings and paved the way for better solutions!

The constantly changing data landscape has helped us look at new ways of solving problems, but it has also introduced new challenges. The talk addresses the problem of picking the right Data Store, and elaborates on a four-step Design Paradigm to help you make that choice!

The presentation video can be found at https://www.dotconferences.com/conference/dotscale.

Transcript

  1. SPOKE – HUB MODEL

    The spoke-hub distribution paradigm is a form of transport topology optimization in which traffic routes are organized as a series of 'spokes' that connect outlying points to a central 'hub.' Wikipedia
  2. SPOKE – HUB MODEL: APPLICATIONS

    Aviation
    • Centralizing operations at the hub leads to economies of scale.
    • Significantly fewer routes are needed to serve the network.
    • The number of pairings in a P2P network grows at a greater rate than the number of nodes (O(n^2)).
    Networks, Course blog for INFO 2040/CS 2850/Econ 2040/SOC 2090, Cornell University

    Content Marketing’s Critical Role in Social Media, Ruben Sanchez

    Healthcare
    The “New Hub and Spoke” model is similar in base structure to primary, secondary and tertiary care settings within a network, with a focus on patient ‘well-being’.
    The New Hub and Spoke Model: Redirecting the Flow of Patient Care, Haskell
  3. Harkness K, Heckman G, McKelvie R, Forsey A and Kingsbury K. Outpatient Management of Patients with Heart Failure. Austin J Clin Cardiolog. 2014;1(4): 1028. ISSN 2381-9111
  4. MAPPING THE DATA LANDSCAPE

    Website for storing flight information, rendering paths on a zoomable world map and calculating statistics, with plenty of free airline, airport and route data, https://github.com/jpatokal/openflights

    QUESTION: What should be the underlying Data Store for this Data Model?
  5. DATA MODEL

    DataModel is an abstraction around arbitrary data binding technologies that can be used to adapt a variety of data sources for use by JavaServer Faces components that support per-row processing for their child components (such as UIData). Java Docs, https://docs.oracle.com/javaee/6/api/javax/faces/model/DataModel.html

    Website for storing flight information, rendering paths on a zoomable world map and calculating statistics, with plenty of free airline, airport and route data, https://github.com/jpatokal/openflights

    Tabular Data: It looks like a Relational Data Store can solve our problem!
  6. DATA MODEL

    Data management in cloud environments: NoSQL and NewSQL data stores, Grolinger et al. Journal of Cloud Computing: Advances, Systems and Applications, 2013
  7. DATA STORE

    Cities in Worldwide Air and Sea Flows: A multiple networks analysis, César Ducruet, Daniele Ietri et Céline Rozenblat, European Journal of Geography

    - We care about the hubs and the spokes, as well as the number of spokes.
    - Leveled analysis requires a lot of joins.
    - Carrying out the computations (z-scores) would require us to create views, and temp tables or nested queries.
    - Performance optimizations would be another area of effort.
  8. GRAPH DATABASE

    A graph database is a graph-oriented database, which is a type of NoSQL database that uses graph theory to store, map and query relationships. It is basically a collection of nodes and edges. Gouri Ginde, Visualisation of massive data from scholarly Article and Journal Database: A Novel Scheme

    A graph database is a database that uses graph structures for semantic queries with nodes, edges and properties to represent and store data. A key concept of the system is the graph (or edge or relationship), which directly relates data items in the store. The relationships allow data in the store to be linked together directly, and in many cases retrieved with one operation. Wikipedia

    Graph Data Model, docs.neo4j.org, 3.1 Nodes
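
The slides above describe airports as nodes, routes as edges, and hubs as the nodes with unusually many spokes. As a rough illustration only, the following Python sketch stores a handful of routes as an in-memory adjacency structure and computes a z-score of each airport's spoke count. The route list, the airport names and the helper functions `build_graph` and `degree_z_scores` are all hypothetical; they are not taken from the talk or from the OpenFlights data, and a real graph database would offer its own query language for this.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical routes (airport, airport). Illustrative only, not OpenFlights data.
ROUTES = [
    ("HUB", "A"), ("HUB", "B"), ("HUB", "C"), ("HUB", "D"), ("HUB", "E"),
    ("A", "B"),
]


def build_graph(routes):
    """Store undirected routes as a mapping: airport -> set of neighbours."""
    graph = defaultdict(set)
    for src, dst in routes:
        graph[src].add(dst)
        graph[dst].add(src)
    return graph


def degree_z_scores(graph):
    """Return the z-score of each airport's degree (its number of spokes)."""
    degrees = {node: len(neighbours) for node, neighbours in graph.items()}
    values = list(degrees.values())
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        # Every airport has the same degree, so no airport stands out.
        return {node: 0.0 for node in degrees}
    return {node: (degree - mu) / sigma for node, degree in degrees.items()}


graph = build_graph(ROUTES)
scores = degree_z_scores(graph)

# The airport with the highest z-score is the strongest hub candidate.
hub = max(scores, key=scores.get)
print(hub, len(graph[hub]))
```

Because each airport keeps direct references to its neighbours, counting spokes is a single lookup per node rather than a join across a routes table, which is the contrast the DATA STORE slide draws with the relational approach.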