Do they teach Data Modeling anymore? (Builders Stage, WebSummit 2014. Dublin)

We are well into multiple years of NoSQL databases, whether it's MongoDB or Cassandra or CouchDB, while still holding to patterns and practices that we learnt from Oracle and MySQL and PostgreSQL. Whatever your buzzword fancy, whether it's “cloud” or “BigData”, we as Builders are building towers and mansions atop increasing layers of abstraction, regardless of our favorite programming language or framework of choice, and becoming both producers and consumers of exponential quantities of data.

Anyone stop to think about what the data model actually is? Do we still need one? Should we just throw things into MongoDB collections, and of course, there will be some piece of data analytics technology that will give you the best visualization of what’s actually in there? Do we even need to worry about many-to-many relationships, and integrity constraints, and domain models any more? What about the transactional vs. the conceptual vs. the logical use cases for the data? And of course, you need to have indexes on your database otherwise it won’t run fast – do people even understand why any more?
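To make the questions about many-to-many relationships, integrity constraints and indexes concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `users`, `skills` and `user_skills` tables, the index name and the helper functions are hypothetical, invented purely for illustration; nothing here comes from the talk itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys unless asked

# Hypothetical schema: users and skills are linked many-to-many.
conn.executescript("""
CREATE TABLE users  (id INTEGER PRIMARY KEY, name  TEXT NOT NULL);
CREATE TABLE skills (id INTEGER PRIMARY KEY, label TEXT NOT NULL UNIQUE);
-- Junction table: one row per (user, skill) pair carries the many-to-many link.
CREATE TABLE user_skills (
    user_id  INTEGER NOT NULL REFERENCES users(id),
    skill_id INTEGER NOT NULL REFERENCES skills(id),
    PRIMARY KEY (user_id, skill_id)
);
""")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "ada"), (2, "grace")])
conn.executemany("INSERT INTO skills VALUES (?, ?)", [(1, "sql"), (2, "graphs")])
conn.executemany("INSERT INTO user_skills VALUES (?, ?)", [(1, 1), (1, 2), (2, 1)])


def orphan_link_accepted(conn):
    """Try to link a user that does not exist; report whether the database allowed it."""
    try:
        conn.execute("INSERT INTO user_skills VALUES (?, ?)", (99, 1))
    except sqlite3.IntegrityError:
        return False
    return True


def query_plan(conn, sql, params=()):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column holds the detail text


QUERY = "SELECT user_id FROM user_skills WHERE skill_id = ?"
before = query_plan(conn, QUERY, (1,))
conn.execute("CREATE INDEX idx_user_skills_skill ON user_skills (skill_id)")
after = query_plan(conn, QUERY, (1,))

print("orphan link accepted:", orphan_link_accepted(conn))
print("before index:", before)
print("after index: ", after)
```

The exact wording of the plan varies between SQLite versions, but the shape of the answer should be the same: without the extra index the lookup by `skill_id` has to walk every row, and with it the planner can jump straight to the matching entries. That difference is the "why" behind the index.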

Unstructured, voluminous data does not obviate the need for a data structure. Whether you’d like to call it a schema or a model or just a napkin sketch of what’s actually behind the code – the art of data modeling is eroding away. Perhaps it is no longer needed, but one would beg to differ.

This talk is about provoking a return to when required reading from the likes of Aho or Horowitz was necessary before jumping into HeadFirst Java – which is the recommended CS college textbook at UC for a course on Data Structures.

From MySQL and cursors and overloading Redis queues as a persistent data store (!) to MongoDB with a return to principles drawn from VSAM and CODASYL, this is less of a talk and more of a plea:

To think about the Model in the Data even as we rush to Build it.

subbuthepeaceful

November 05, 2014

Transcript

  1. Do they teach “Data Modeling” anymore? Abstract Thinkers wanted!! Subbu Balakrishnan, CTO/Co-Founder, Good.Co. The views and comments expressed in this presentation are solely the perspective of the author and do not represent any official communication from Good.Co, Inc.
  2. ... or in reality, were we just Running to Stand Still? User Interface API Background Processes Import Network Signup/Signin Spread Chunks - DeDuplication - User? - Automatic Profile? - Public Strengths? - Notifications? - Relevancy? ... Post Sync Network Ready As of August 15, 2014 5pm # of users : 3,401 # of automatic profiles : 2,114,693 MongoDB 1 (Users) Psychometrics as Linked Lists in MongoDB 2 Redis (Jobs) ElasticSearch (Network)
  3. For a startup, is it not logical to Blow Your House Down? MySQL (users) Psychometrics as Normalized Relations in MySQL Redis (social network)
  4. Is That All? • Write some user stories (hopefully driven from a nice user experience) • Pick a programming framework (preferably in a language that you know) • Make technical choices based on some criteria (hopefully ones that fit within skill and budget and applicability) • Launch and test frequently (so no single choice prevents learning?) • Live-Die-Repeat? (With enough correlation and learning and optimization, we should be able to ... slay the beast?)
  5. Was I merely Stuck in a Moment That I Couldn’t Get Out Of .... Given Statistics + Machine Learning + Optimization + Signal Processing + Text Mining + Natural Language Processing Do we even need to • Draw a Logical Model of the Data before reaching for attribute definitions and relationships? • Draw a Data Flow Diagram before committing to github? • Ask about the lifecycle of a piece of Data beyond returning it to the User Interface? • Think about a Conceptual Model for what the Data represents?
  6. ... to simply, Another Time, Another Place. • Document-oriented - Key-Sequenced Data Sets with complex data structures • Graph-oriented - Navigational Networks with Nodes, Properties and (Verb) edges • Key-value oriented - Entry/Key-sequenced Data Sets with clustering and distribution • Wide-column oriented - Column-based serialization of relational tuples
  7. A Moment of Surrender brings much reflection. A crisis of relevance and responsibility as a Builder
  8. Ain’t no Miracle Drug when it comes to Data • A Database is only as good as your understanding of the Model you’re trying to build and support • BigData is a strategy and a movement, but not a replacement for understanding your Data and its Model • Volumes of unstructured data don’t obviate the need for a Structural understanding of your Data • Data Science is as much a collection of techniques as a mindset. It needs as much computing power as it needs thoughtful questions to be asked.
  9. But Who’s Going to Ride (That) Wild Horse? Abstract Thinking is generally optional, (or can’t have story points allocated to it, clearly doesn’t belong in the headlong rush of a startup, or the efficiency goals of a large IT organization, but .... should it?)
  10. More than (Ultraviolet) to Light My Way? (Please!) Technology is (not yet) a replacement for • a napkin sketch of your Data Structure • or a discussion about its flow • or a walkthrough of the relationships and verbs and patterns in it • or a workshop on questions that you might like the Data to answer • or a listing of the behavior you’d like to influence with the Data • or a diagram of how the transactional, and operational and analysis and decision-making and testing data sets work together
  11. Data modeling for the Speed of Life ... Data That is Independent of a User - Catalogs, Content, Inventory, Questions ... Data That is All About the User - Account, Profile, History, Workspace ... Data that is the Intersection - Orders, Reviews, Views, Subscriptions ... Your Platform Capabilities to deliver value and promote a business outcome User Interfaces
  12. ... to create compelling Things to Make and Do. Data That is Independent of a User - Catalogs, Content, Inventory, Questions ... Data That is All About the User - Account, Profile, History, Workspace ... Data that is the Intersection - Orders, Reviews, Views, Subscriptions ... Your Platform Capabilities to deliver value and promote a business outcome User Interfaces Patterns that inform • Merchandising • Relevance • Segmentation • Preference • ... Patterns that inform • Context • Social Application • Personality • Lifecycle • Relationship • ... Data Model that promotes Engagement Data Model that promotes Engagement
  13. Can we afford to live With or Without It (You)? What does your Data Model do for you?
  14. Thank You (Yes, on is it lunch time already) (No, on would I have made different choices) (Yes, on code and data are craft, not commodities) (Yes, on the point is for every Builder to ask why) (Yes, on the words in green)
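
The three-way split on slides 11 and 12 (data independent of a user, data all about the user, and data at the intersection) can be sketched as a napkin-level model in Python. The class names, fields and the `dangling_reviews` helper are hypothetical, chosen only to illustrate the idea; they are not taken from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogItem:
    """Data that is independent of a user (catalogs, content, inventory)."""
    item_id: int
    title: str


@dataclass(frozen=True)
class Account:
    """Data that is all about the user (account, profile, history)."""
    user_id: int
    email: str


@dataclass(frozen=True)
class Review:
    """Data at the intersection: it only makes sense with both a user and an item."""
    user_id: int
    item_id: int
    stars: int


def dangling_reviews(reviews, accounts, items):
    """Return the reviews that point at a user or an item that does not exist."""
    user_ids = {a.user_id for a in accounts}
    item_ids = {i.item_id for i in items}
    return [
        r for r in reviews
        if r.user_id not in user_ids or r.item_id not in item_ids
    ]


items = [CatalogItem(10, "Data Structures")]
accounts = [Account(1, "builder@example.com")]
reviews = [Review(1, 10, 5), Review(2, 10, 3)]  # the second reviewer has no account

print(dangling_reviews(reviews, accounts, items))
```

Even at this size the sketch surfaces a modeling question that a schemaless store will not ask for you: what should happen to intersection data when one of the two sides it refers to goes away?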