Using APIs for reporting, data science and systems integration by Victor Olex

More Decks by API Strategy & Practice Conference

Transcript

  1. APIs in Enterprise: Using APIs for reporting, data science and systems integration

    Victor Olex, Founder & CEO, SlashDB (@agilevic)
    © 2015 by Victor Olex
  2. 2002 at Amazon

    • All teams will henceforth expose their data and functionality through service interfaces.
    • Teams must communicate with each other through these interfaces.
    • There will be no other form of inter-process communication allowed: no direct linking, no direct reads of another team’s data store, no shared-memory model, no back-doors whatsoever. The only communication allowed is via service interface calls over the network.
    • It doesn’t matter what technology they use.
    • All service interfaces, without exception, must be designed from the ground up to be externalizable. That is to say, the team must plan and design to be able to expose the interface to developers in the outside world. No exceptions.
    • Anyone who doesn’t do this will be fired.
    • Thank you; have a nice day!
    - Jeff Bezos
  3. 2015 Global IT Spend: $2.3T

    Source: Forrester Research, Global Tech Market Outlook for 2015-2016 (after ZDNet)
  4. ETL/Data Warehousing

    Analytical Systems
    • Data duplication
    • Stale data
    • Brittle overnight feeds
    • Central bottleneck
    • Does not scale out
    • Not easily accessible nor searchable
  5. Store It All in One Place?

    • It was hard with just on-premises systems
    • Illusory idea with today’s Cloud apps
    • Try it with your contact list for starters…
  6. (image-only slide, no text)

  7. What is Resource Oriented Architecture (“ROA”)

    • “Style of software architecture and programming paradigm for designing and developing software in the form of resources with RESTful interfaces.” – Wikipedia
    • Uniform data access layer to all data assets in their unobstructed form for reading and writing in various representations. – my take
  8. What is Resource Oriented Architecture

    Service Oriented
    • Represents Action
    • Transaction, Unit of Work
    • Message
    • API controlled by functional design
    • Harder to adapt and scale beyond “enterprise”
    • Harder to deprecate functionality

    Resource Oriented
    • Represents State
    • Addressable Resource
    • Update to Resource
    • API automatically evolves with data
    • Harder to model into complex transactions
    • Clients must be resilient to change
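The contrast above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the in-memory `store`, the path layout, and the function names `change_customer_country`, `get` and `put` are all hypothetical and are not taken from the talk or from any product. The service-oriented style exposes one function per action, while the resource-oriented style exposes the same two verbs on every addressable piece of state.

```python
# Hypothetical in-memory data, used only for illustration.
store = {"Customer": {"1": {"Country": "Brazil"}}}


# Service-oriented style: the API is a named action designed up front.
def change_customer_country(data, customer_id, country):
    data["Customer"][customer_id]["Country"] = country


# Resource-oriented style: uniform read/write on any addressable path.
def _walk(data, path):
    """Return the parent node and the final key for a path like /Customer/1/Country."""
    parts = [p for p in path.strip("/").split("/") if p]
    node = data
    for part in parts[:-1]:
        node = node[part]
    return node, parts[-1]


def get(data, path):
    node, leaf = _walk(data, path)
    return node[leaf]


def put(data, path, value):
    node, leaf = _walk(data, path)
    node[leaf] = value


change_customer_country(store, "1", "Canada")    # one function per action
put(store, "/Customer/1/Country", "Germany")     # same verb for any resource
print(get(store, "/Customer/1/Country"))
```

If a new field is added to the data, `get` and `put` reach it without any new code, which is one way to read "API automatically evolves with data". The trade-off named on the slide also shows: a multi-step transaction has no obvious home in the resource-oriented half.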
  9. API Shell Over Data

    • Single access point, but without copying data
    • Self-service reporting, data feeds or integrate with NoSQL
  10. Database Content as HTTP Resources

    http://demo.slashdb.com/db/Chinook/Customer/CustomerId/1.html

    URL parts, left to right:
    • Service location: on the intranet, or in the cloud
    • Database name. Supported RDBMS: MS-SQL, Oracle, MySQL, PostgreSQL, and more
    • Table to query
    • Field to filter and value to look up: text, number, date
    • Data format: XML, JSON, HTML, CSV
    Combine several

    • /db automatically makes hyperlinks directly to data
    • Related records are hyperlinked thus search engine ready
    • Filtering, drill-down, slices are natural, URLs stay nice
    • Custom queries also possible (SQL Pass-thru)
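The URL anatomy on this slide can be sketched as a small Python helper that assembles the address from its parts. This is a minimal sketch, not SlashDB client code: the function name `build_resource_url` and its parameters are hypothetical, and chaining more than one field/value pair in the path is an assumption. Only the example URL itself comes from the slide.

```python
from urllib.parse import quote


def build_resource_url(base, database, table, filters=(), fmt="json"):
    """Compose base/db/<database>/<table>/<field>/<value>.<fmt> (hypothetical helper)."""
    segments = ["db", quote(database, safe=""), quote(table, safe="")]
    for field, value in filters:
        # Each filter contributes two path segments: field name, then value.
        segments.append(quote(str(field), safe=""))
        segments.append(quote(str(value), safe=""))
    return f"{base.rstrip('/')}/{'/'.join(segments)}.{fmt}"


url = build_resource_url(
    "http://demo.slashdb.com",      # service location
    "Chinook",                      # database name
    "Customer",                     # table to query
    [("CustomerId", 1)],            # field to filter and value to look up
    fmt="html",                     # data format
)
print(url)  # http://demo.slashdb.com/db/Chinook/Customer/CustomerId/1.html
```

Because every part lives in the path, swapping `fmt="html"` for `"json"` or `"csv"` changes only the extension and leaves the resource address otherwise intact.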
  11. Best Practices

    • Don’t forget about “R” in REST: JSON isn’t the only data format
    • URL should be easy to understand: avoid inventing a mini query language
    • Resources should be easy to discover
    • Ideally every resource address should allow reading and writing
    • Avoid query string to address data
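The first practice above, that one resource can have several representations, can be illustrated with the Python standard library. This is a sketch under stated assumptions: the `render` function, the `record` wrapper element in the XML output, and the sample customer record are hypothetical and chosen for illustration only.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET


def render(record, fmt):
    """Render one flat record (a dict) as JSON, CSV or XML text."""
    if fmt == "json":
        return json.dumps(record, sort_keys=True)
    if fmt == "csv":
        buf = io.StringIO()
        writer = csv.DictWriter(buf, fieldnames=sorted(record), lineterminator="\n")
        writer.writeheader()
        writer.writerow(record)
        return buf.getvalue()
    if fmt == "xml":
        root = ET.Element("record")  # hypothetical wrapper element name
        for key in sorted(record):
            ET.SubElement(root, key).text = str(record[key])
        return ET.tostring(root, encoding="unicode")
    raise ValueError(f"unsupported format: {fmt}")


# Illustrative data only.
customer = {"CustomerId": 1, "Country": "Brazil"}
for fmt in ("json", "csv", "xml"):
    print(render(customer, fmt))
```

The state is the same in all three outputs. Only the representation changes, so a spreadsheet user, an R script and a web page can all read from the same address.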
  12. Use Case: Bank - Regulatory Risk Management

    • Federal Reserve CCAR
    • Basel Independent Review
    • Supervisory Formula Approach (SFA)
    • Dodd-Frank Annual Stress Test
  13. 2015, Global Bank

    “Upwards 50% of my time goes into data reconciliation efforts.”
    “The biggest pain is sharing data between Python, R, etc. The problem is - there should be one specified entry point for data. Consistency of column names and possible values between different versions of the data. There are a lot of holes in the data process. I think the #1 priority would be creating a good schema.”
    “Finding what you need in this zoo. (…) Currently this is done by talking to people!”
  14. Data Science Process

    • Data acquisition, storage, discovery and mining, statistical learning, machine learning, predictive analytics, risk modeling
    • Competency chasms at every step
  15. Implementation: SlashDB API

    • Model Research & Dev.: use any programming language
    • Reports & Visualization: deliver now, anticipate future
    • Unobstructed Data Sharing: standard formats, HTTP delivery
    • Disparate Data Sources: loan portfolios, macroeconomic data, risk metrics, market data

    Automatic, multi-representational, resource-oriented, hypermedia and search engine friendly data API & cache.
  16. Resource Oriented API Solves Many of the Issues

    • Single access point that’s easy to work with
    • Combines the best features of plain files (simplicity) and databases (data integrity)
    • Has authentication, authorization and encryption
    • Pragmatic data access for people and programs
    • Search engine ready
  17. Searchable API

    • Users know what they need, but may not know where to find it
    • True hypermedia API should contain hyperlinks to related resources
    • Search engine crawl/index is trivial when all resources are hyperlinked
    • Try it yourself at: http://demo.slashdb.com/search.html (e.g. search for: “customers from Brazil”)
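The claim that crawling and indexing become trivial once every resource is hyperlinked can be sketched as a breadth-first crawl that builds a word index. This is an illustration only: the `RESOURCES` dictionary stands in for HTTP responses, its paths and text are made up, and nothing here describes how the demo search page is actually implemented.

```python
from collections import deque

# Hypothetical stand-in for an API: each resource has text and links to related resources.
RESOURCES = {
    "/db": {"text": "databases", "links": ["/db/Chinook"]},
    "/db/Chinook": {"text": "Chinook tables", "links": ["/db/Chinook/Customer"]},
    "/db/Chinook/Customer": {
        "text": "Customer table",
        "links": ["/db/Chinook/Customer/CustomerId/1", "/db/Chinook/Customer/CustomerId/2"],
    },
    "/db/Chinook/Customer/CustomerId/1": {"text": "customer Brazil", "links": ["/db/Chinook/Customer"]},
    "/db/Chinook/Customer/CustomerId/2": {"text": "customer Germany", "links": []},
}


def crawl(start, fetch):
    """Follow links breadth-first from start and map each word to the URLs containing it."""
    seen, queue, index = set(), deque([start]), {}
    while queue:
        url = queue.popleft()
        if url in seen:
            continue  # links may form cycles, so visit each resource once
        seen.add(url)
        resource = fetch(url)
        for word in resource["text"].lower().split():
            index.setdefault(word, set()).add(url)
        queue.extend(resource["links"])
    return index


def search(index, query):
    """Return the URLs that contain every word of the query."""
    hits = [index.get(word, set()) for word in query.lower().split()]
    return sorted(set.intersection(*hits)) if hits else []


index = crawl("/db", RESOURCES.__getitem__)
print(search(index, "customer brazil"))
```

The crawler needs only one starting address and no knowledge of the schema. Everything else is discovered through links, which is the practical payoff of the hypermedia bullet above.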
  18. Resource Oriented API is a Sensible Investment

    • Multiply returns on investments already made in databases (the other ROA)
    • Avoid pitfalls of file-based data sharing
    • Avoid dangers of direct database access
    • Avoid opaqueness of ESB, RMI, SOAP, CORBA, etc.
    • Attract top developers (they want to work on cool stuff, and they don’t know databases)
  19. Credits & References

    • S&P Churn 2002-2012: “Creative Destruction Whips through Corporate America” by Richard Foster, Innosight. http://www.innosight.com/innovation-resources/strategy-innovation/upload/creative-destruction-whips-through-corporate-america_final2015.pdf
    • 2002 at Amazon: “The Secret to Amazon’s Success Internal APIs” by Kin Lane, API Evangelist. http://apievangelist.com/2012/01/12/the-secret-to-amazons-success-internal-apis/
    • Flattening the Competition: Google Finance, chart prepared by V. Olex. https://www.google.com/finance?q=amzn
    • 2015 Global IT Spending: “Want money for that new project? Then it's time to go on a moose hunt” by Steve Ranger, ZDNet. http://www.zdnet.com/article/want-money-for-that-new-project-then-its-time-to-go-on-a-moose-hunt/
    • SaaS Revenue Projections: “Enterprise software spend to reach $620 billion in 2015: Forrester” by Natalie Gagliordi, ZDNet. http://www.zdnet.com/article/enterprise-software-spend-to-reach-620-billion-in-2015-forrester/
    • What is Resource Oriented Architecture: Wikipedia. http://en.wikipedia.org/wiki/Resource_oriented_architecture
    • Data & Analytics: Benefits & Challenges: “5 Insights & Predictions On Disruptive Tech From KPMG's 2015 Global Innovation Survey” by Louis Columbus. http://www.forbes.com/sites/louiscolumbus/2015/11/08/5-insights-predictions-on-disruptive-tech-from-kpmgs-2015-global-innovation-survey/
    • Data Science Process: Wikipedia. https://en.wikipedia.org/wiki/Data_science
    • Other graphics: photographs of D. Trump, Flickr and public domain sources
    • Logos and other trademarks are the property of their respective owners; used here for illustration purposes only, no association or endorsement implied.