
Datameer - IT Press Tour Jan 2021

The IT Press Tour

January 26, 2021

Transcript

  1. © 2021 Datameer, Inc. All rights reserved.

    Headquartered in San Francisco, with offices in NYC, London, Berlin, and Halle. 10+ years helping customers with their ‘ocean of data’, a.k.a. ‘data-meer’. Other slide labels: Employees; Investors; Technology Partners; Datameer (~50% R&D); Founded in; Total funding.
  2. The Team

    VP Marketing, Head of People, CEO, VP Product, VP Engineering, SVP Sales
  3. Complex & Fragmented Data Architectures

    Large mature enterprises have data landscapes that have become exponentially more complex and fragmented over the years with each introduction of new enterprise data management technology.

    Chart (complexity over time): Data marts, Data warehouses, On-Prem Data Lakes, Cloud Object-Stores, Cloud Data Warehouses
  4. Hard to find, access, trust & leverage ALL data

    • Searching & Prepping: 73% of time wasted on searching and prepping data. (IDC)
    • Lack of Trust: 60% of executives are not very confident in their Data & Analytics insights. (Forrester)
    • Unused: More than 60% of data in an enterprise goes unused for analysis. (Forrester)
    • Effort Duplication: 20% of time wasted in reinventing the wheel/recreating analyses that had already been done. (IDC)

    Complex architectures of mature enterprises make finding, accessing, trusting, and leveraging all data in a timely fashion for business decisions quite challenging.
  5. Datameer Cloud Platform

    Datameer is a Cloud Data Platform powering the full data life cycle from discovery, access, and transformation of data from disparate data sources to cataloging analytic artifacts and providing a framework for sharing results and context among analytic teams. It unlocks and extends the value of Cloud Data Warehouses by lowering the friction between controlling a modern data environment and accelerating analytical outcomes.
  6. Datameer Cloud Platform

    Cloud ELT+++: • Operationalized • Complex Transform • PII, Governance • Security

    Analytics Platform: • Virtualized Access • Discovery & Semantic Layer • Collaboration • Full Catalog

    Diagram labels: Cloud & On-prem Data Sources; Transform & Move Data; Cloud DWH; Virtualized & Push-down Queries; Unified asset catalog; Search / discover across environments; Exchange / synchronize metadata; Analytics: BI & Data Science
  7. The Platform Opportunity

    Spectrum Spotlight / Best of two worlds: One platform with first class features and customized UX for a variety of personas in individual, but integrated modules.

    • Transformations: First-class
    • Access Layer: Query interface and external tool integration
    • Data Discovery: Powerful catalog & search
    • Data Exploration: Metadata, custom metadata, data profiles
    • Connectivity: Rich
    • Semantic Layer: Custom attributes, Business glossary, Annotations, Tags
    • APIs & SDKs: Powerful automation and integration APIs, customisation SDK
    • Scalability: Yes
    • Productionalization: Fully operationalizable
    • Collaboration: Re-usable assets, comments, notifications, business workflows
    • Governance: Rich: Lineage, Audit, Custom Roles and many more
    • Virtualization: Yes, including query push down compute
    • ETL: Powerful pipeline design and management
    • Security: Strong
  8. Integration, Exploration, Access

    Architecture diagram labels: Spectrum; Spotlight; Cloud Data Warehouse; Elastic Cloud Compute; User Management & Access Control; 200 Connectors; On Prem & Cloud Data sources; Monitoring; Sampling, Scheduling & Runtime; Pipeline Builder; Transformations; Data Management; Enterprise Governance, PII Handling & Security; APIs / Webhooks; Resilience; Semantic Layer, Metadata, Business Glossary; Catalog, Search & Discovery; Unified Virtual Access; Cache & Policies; Collaboration; View Modeling; BI; Data Science; Queries; Virtualized & Push-down; Metadata RDBMS; Credential Store
  9. Empower Teams. Save Costs. Govern Data

    Empower Teams
    • Business users, regardless of technical skills, can collaborate with data engineers to build data sets tailored to their exact business needs.
    • Reduce the analytics lifecycle from months or weeks down to hours.

    Govern Data
    • Trust the data sets your team is working with, thanks to full data lineage, business glossary, annotations, and data quality tags.
    • Reduce data governance chaos by reducing data replication.
    • Search, find, and access any data for GDPR and CCPA “right to forget” compliance.

    Save Costs
    • Data engineering labor costs
    • ETL/ELT software consumption-based costs
    • Cloud compute costs
  10. Datameer Spotlight disrupts the traditional centralized data warehouse model, giving organizations highly secure data access at a fraction of the cost.

    The new flagship product from Datameer upends a three-decade-old approach to data analytics

    SAN FRANCISCO, California, December 1st, 2020 - Datameer today announces the introduction of a breakthrough platform, Datameer Spotlight, that flips the traditional central data warehouse paradigm on its head and enables organizations to run analytics at scale in any environment and across data silos at a fraction of the cost.

    Today’s approach to leveraging data for analytics has remained largely unchanged for almost three decades: Organizations pipe all enterprise data into a centralized data warehouse or data lake in what is an expensive, time-consuming process.

    Despite company after company failing at this elusive data centralization quest, leaving employees unable to easily find and access the data that’s relevant for their needs, companies such as Oracle, Teradata, and Informatica (then, later on, AWS, Google Cloud, Azure, Snowflake, Talend, and Fivetran) have thrived under this three-decade-old model. Promoting the vision of a single source of truth that delivers a 360-degree view of the customer, vendors have been competing to store a copy of your data in their data centers using their tools.

    While the advent of the cloud data warehouse brought incremental improvements to organizations by saving them from needing to plan for excess storage and compute on-premises, it hasn’t changed the fundamental “duplicate and centrally store data” paradigm, despite the fact that this approach is unwieldy, costly and leaves enterprises struggling to leverage the full value of their data.

    Costly & Wasteful

    Data replication is not free. Whether on-premises or in the cloud, data replication requires storage, tools, and highly-skilled, highly-specialized data engineers to code and maintain complex ETL scripts. Unfortunately, demand for data engineers has grown 50%, and salaries have increased by 10% year over year, according to Dice.
  11. It also has a non-negligible impact on the environment: nearly 10 million data centers have been built over the last decade, according to IDC. Now, data centers have the same carbon footprint as the entire aviation industry pre-pandemic.

    Lengthy & Unwieldy

    Business users need instant access to data to make real-time business decisions. Current batch ETL processes for moving data don’t give users the instant access they need. Making matters worse, it takes days, weeks, and sometimes months to initially set up a data pipeline. Data pipelines’ specifications can also get lost in translation between the business domain experts and the data engineers who build them, complicating things further. What’s more, business users don’t always know what transformations, cleansing, and manipulation they’ll want to apply to the data, and having to go back and forth with data engineers makes the discovery process very cumbersome. Hadoop was designed to solve this issue with schema on read. But the complexity of the technology combined with the still monolithic data lake model doomed this ecosystem.

    Governance & Security Risks

    Replicating data via data pipelines comes with its own regulatory, compliance, and security risks. The centralized data approach gave IT teams the illusion of tighter control and data governance. However, this approach backfired. With data sets never exactly meeting business needs, different teams began to set up their own data marts, and the proliferation of these only exacerbated data governance issues.

    Sunk Costs & Throwing Good Money after Bad

    Over the years, organizations have made significant investments to build their version of the enterprise data warehouse. And despite these projects falling short of their promises, organizations have been committing what economists call the sunk cost fallacy by throwing more money at them in an attempt to fix them, e.g., recruiting more specialized engineers and buying more tools vs. looking for alternative approaches and starting anew.

    Enterprises will, for example, move some of their data to the cloud on AWS, Azure, Google, or Snowflake on the promise of faster, cheaper, more user-friendly analytics. Migration projects are rarely 100% successful and often result in more fragmented data architectures that make it harder to perform analytics in hybrid or multi-cloud environments. Businesses might purchase Alteryx, for example, to enable domain experts to transform data locally on their laptop, contributing to more data chaos and the proliferation of ungoverned data sets. After that, they’ll purchase a data catalog to index that data and help business users find it. On top of that, the IT team will want to
  12. invest in tools to add a layer of governance for peace of mind. Data stacks often end up thrown together like the Winchester Mystery House, becoming a money pit for enterprises.

    And yet despite these massive investments:
    • 60% of executives aren’t very confident in their data and analytics insights (Forrester)
    • 73% of business users’ analytical time is still spent searching, accessing, and prepping data (IDC)
    • More than 60% of data in an enterprise is not used for analytics (Forrester)

    Datameer solves these data challenges with its latest product, Datameer Spotlight, a virtual semantic layer that embraces a distributed data model, also known as data mesh. With over 200 connectors and counting, Datameer Spotlight provides business end-users virtual access to any type of on-prem or cloud data sources (including data warehouses, data lakes, and any applications) and lets them combine and create new virtual data sets specific to their needs via a visual interface (or a SQL code editor for more advanced users), with no need for data replication. The data is left in place at the source.

    This new approach solves for:
    • Data governance: data remains at the source, and no data replication is needed.
    • Cost: with no need for ETL tools, a central data warehouse, data cataloging, a data prep tool, data engineering, or a middleman between the data and the end-user, the solution ends up at a fraction of the cost of traditional approaches.
    • Speed and agility: connecting Datameer Spotlight to a new data source takes as much time as entering your credentials for that data source. Once connected, business users can create new datasets across data sources in a few clicks.
    • Data discoverability: it virtualizes your data landscape by indexing the metadata of every data source and creating a searchable inventory of assets that can easily be mined by analysts and data scientists, all without moving any data.

    Ready to give Datameer Spotlight a test drive? Try it for free here.
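
    The two ideas described above, a searchable metadata inventory and virtual data sets defined across sources without replication, can be illustrated with a minimal sketch. This is not Datameer Spotlight's implementation or API. It assumes two SQLite in-memory databases as stand-ins for separate source systems, and the names `crm`, `sales`, `build_catalog`, `search_catalog`, and `revenue_by_customer` are hypothetical.

```python
import sqlite3

# Assumption: two attached in-memory SQLite databases stand in for
# separate source systems. A real deployment would use connectors instead.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS crm")
conn.execute("ATTACH DATABASE ':memory:' AS sales")

conn.execute("CREATE TABLE crm.customers (customer_id INTEGER, name TEXT)")
conn.execute(
    "CREATE TABLE sales.orders (order_id INTEGER, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO crm.customers VALUES (?, ?)", [(1, "Acme"), (2, "Globex")]
)
conn.executemany(
    "INSERT INTO sales.orders VALUES (?, ?, ?)",
    [(10, 1, 250.0), (11, 1, 100.0), (12, 2, 75.0)],
)


def build_catalog(conn, sources):
    """Index metadata only: (source, table, column). No data rows are read."""
    catalog = []
    for source in sources:
        # Source and table names are trusted here; this is a sketch only.
        tables = conn.execute(
            f"SELECT name FROM {source}.sqlite_master WHERE type = 'table'"
        ).fetchall()
        for (table,) in tables:
            for column in conn.execute(f"PRAGMA {source}.table_info({table})"):
                catalog.append((source, table, column[1]))  # column[1] is the name
    return catalog


def search_catalog(catalog, term):
    """Return catalog entries whose column name contains the search term."""
    return [entry for entry in catalog if term.lower() in entry[2].lower()]


catalog = build_catalog(conn, ["crm", "sales"])

# A virtual data set: a stored query spanning both sources.
# The view holds no rows of its own; data stays in the source tables.
conn.execute(
    """
    CREATE TEMP VIEW revenue_by_customer AS
    SELECT c.name AS name, SUM(o.amount) AS revenue
    FROM crm.customers AS c
    JOIN sales.orders AS o ON o.customer_id = c.customer_id
    GROUP BY c.name
    """
)

rows = conn.execute(
    "SELECT name, revenue FROM revenue_by_customer ORDER BY name"
).fetchall()
print(search_catalog(catalog, "customer_id"))
print(rows)
```

    Because the view is only a stored query, a row added to a source table appears in the virtual data set on the next read, with no pipeline run in between.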