ZeroPoint Technologies - IT Press Tour #57 Sep. 2024

The IT Press Tour

September 04, 2024

Transcript

  1. This document is Proprietary and Confidential. No part of this document may be disclosed in any manner to a third party without the prior written consent of ZeroPoint Technologies AB.

    • 2-4x Data Compression
    • 50% More Performance per Watt
    • 1,000x Faster than alternatives

    Klas Moreau, CEO
  2. Team | Technically strong management with industrial semiconductor experience, leading a team of 30

    Klas Moreau, CEO
    • Progressive leader, experienced in building global companies
    • 25-year broad background in semiconductors, computer games, and AI

    Nilesh Shah, VP of BD (US)
    • 20 years at Intel in various technical and business development positions
    • Substantial go-to-market (GtM) experience in memory technology

    Prof. Per Stenström, CSO & Co-founder
    • Internationally renowned memory expert
    • Senior industry experience at Sun Microsystems with a wide industry network

    Dr. Angelos Arelakis, CTO & Co-founder
    • Expert in memory architecture and ultra-fast data compression
    • Winner of the Swedish King’s 50th anniversary foundation award

    Team
    • Spinout from Chalmers University of Technology
    • Swedish company with HQ in Gothenburg and presence in California
    • Team of 30 FTEs with expertise in memory architecture, ASIC/FPGA development, and SW
  3. Problem | 3 reasons why memory performance must be addressed

    1. Memory is a bottleneck
    • It has developed much more slowly than processors
    • The latest TSMC node yielded no improvement in memory
    • Processors now idle up to 50% of the time, waiting for memory access

    2. Memory is inefficient
    • 95% of data stored is unnecessary
    • 30% of memory transactions can be avoided

    3. Memory is expensive
    • Memory is a large and increasing part of both CAPEX and OPEX in data centres
    • The carbon footprint of servers largely depends on the embodied carbon of the hardware and the power to run and cool it
  4. Problem | The memory gap has become a bottleneck that is further exposed by the rise of AI

    Figure (left): Memory has developed much slower than processors. Performance index (2010 = 1, log scale from 1 to 100), 2010 to 2022, CPU performance vs. DRAM bandwidth performance. Gap: ~15x.

    Figure (right): The AI explosion fuels demand for computing power and memory. Training compute (FLOP, 10^17 to 10^24), years '13 to '23, with GPT2, GPT3, and GPT4 marked. Growth: 5-6x/year.
  5. Solution | Hardware accelerated memory compression across the memory hierarchy

    1. Improves performance by increasing memory utilization
    • 2-4x data compression, leading to increased capacity and bandwidth
    • 50% higher performance/watt
    • Up to 80% reduction in processor idle time

    2. Efficient usage of memory and CPU cycles
    • Lossless compression, keeps all relevant data and discards the rest
    • No need to waste CPU cycles on compression

    3. Reduces TCO and TCCO significantly
    • 20-25% reduction in total cost of ownership (TCO) of servers
    • 25-30% reduction in total cost of carbon ownership (TCCO) of servers, leading to up to 20 megatons of CO2 savings
  6. Solution | ZeroPoint helps reduce server TCO by 20-25%, with potential to create tens of billions USD of value

    Figure: Total Cost of Ownership (TCO) for a 40-server rack over a 3-year lifetime [kUSD]
    • Today: ~850
    • With CXL only: ~780
    • With our compression only: ~700
    • With CXL & our compression: ~650 (-20-25%)

    Cost components shown: CAPEX (CPU, Memory, Other), OPEX (Power, Cooling, Other), Waste (Compression tax)

    With >10m servers shipped per year, the potential yearly value created with ZeroPoint’s IP is in the order of $10-100B for the Server segment alone
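The headline "-20-25%" can be sanity-checked against the rack totals. The following is a minimal sketch, assuming the four totals on the slide map in order to the four scenarios; the dictionary keys and the `reduction_pct` helper are illustrative names, not part of the deck.

```python
# Rack TCO totals in kUSD (40-server rack, 3-year lifetime) as read off the slide.
# Assumption: the four totals map, in order, to the four scenarios.
tco_kusd = {
    "Today": 850,
    "With CXL only": 780,
    "With compression only": 700,
    "With CXL & compression": 650,
}


def reduction_pct(baseline: float, value: float) -> float:
    """Percentage saved relative to the baseline."""
    return 100.0 * (baseline - value) / baseline


baseline = tco_kusd["Today"]
savings = {name: round(reduction_pct(baseline, v), 1) for name, v in tco_kusd.items()}

for name, pct in savings.items():
    print(f"{name:<24} {tco_kusd[name]:>4} kUSD  saves {pct:.1f}%")
```

Under that mapping the combined scenario saves about 23.5% against today's rack, which sits inside the quoted 20-25% band.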
  7. Technology | 2-4x general purpose compression that is lossless and works on cache-line granularity

    Data Compression (ZeroPoint proprietary algorithms)
    • Ultra-fast, deterministic low latency, suitable for inline compression
    • 2-4x general purpose and lossless compression

    Data Compaction (ZeroPoint proprietary algorithms)
    • Real-time, high performance and low latency
    • Cache-line granularity

    Memory Management (ZeroPoint developed driver)
    • Transparent to operating system and application
    • Hardware accelerated
  8. Figure: "Before" state. Memory blocks Data A to Data I, each a mix of redundant data and useful data.
  9. Figure: "Before" state. Memory blocks Data A to Data I, each a mix of redundant data and useful data.
  10. Figure: "Before" state, followed by step "1. Compress" applied to Data A to Data I.
  11. Figure: "Before" state, step "1. Compress", then step "2. Compact", with the blocks packed in the order Data A, B, C, E, F, G, I, D, H.
  12. Figure: "Before" state, step "1. Compress", step "2. Compact", then the "After" state, where the freed space is labelled "More memory". Result: 2-4x more memory.
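The two steps in these figures, compress then compact, can be illustrated in a few lines. This is a sketch under stated assumptions, not ZeroPoint's method: the deck describes proprietary hardware algorithms, and `zlib` is swapped in here purely as a stand-in compressor. The 64-byte line size comes from the deck's "@64B" claim; the function names, the index layout, and the sample data are hypothetical.

```python
import zlib

LINE = 64  # cache-line size in bytes (the deck cites compression "@64B")


def compress_lines(memory: bytes) -> list[tuple[bool, bytes]]:
    """Step 1 (Compress): compress each line independently.

    zlib is only a stand-in. A line is kept raw when compressing it
    would not make it smaller.
    """
    blocks = []
    for i in range(0, len(memory), LINE):
        line = memory[i:i + LINE]
        packed = zlib.compress(line, 9)
        if len(packed) < len(line):
            blocks.append((True, packed))
        else:
            blocks.append((False, line))
    return blocks


def compact(blocks: list[tuple[bool, bytes]]) -> tuple[bytes, list[tuple[int, int, bool]]]:
    """Step 2 (Compact): pack the variable-size blocks back to back.

    Returns the packed store plus an index of (offset, length, is_compressed),
    so any original line can still be located.
    """
    index, offset, parts = [], 0, []
    for is_compressed, data in blocks:
        index.append((offset, len(data), is_compressed))
        parts.append(data)
        offset += len(data)
    return b"".join(parts), index


def read_line(store: bytes, index: list[tuple[int, int, bool]], n: int) -> bytes:
    """Recover original line n; the round trip is lossless."""
    offset, length, is_compressed = index[n]
    blob = store[offset:offset + length]
    return zlib.decompress(blob) if is_compressed else blob


# Hypothetical sample: six highly redundant lines and two incompressible ones.
memory = bytes(LINE) * 6 + bytes(range(LINE)) * 2
store, index = compact(compress_lines(memory))
print(f"{len(memory)} bytes stored in {len(store)} bytes "
      f"({len(memory) / len(store):.2f}x)")
```

The index is what makes compaction usable: once blocks have different sizes, a line's position can no longer be computed from its number alone, so some lookup structure has to record where each one landed. The ratio printed here depends entirely on the sample data and says nothing about the 2-4x figure in the deck.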
  13. Figure: latency scale (10 ns, 100 ns, 1,000 ns, 10,000 ns) positioning Cache (L3), Main memory, CXL memory, and Storage.
  14. Figure: the same latency scale, with market-leading SW compression added.
  15. Figure: the same latency scale, with market-leading SW compression and the HW implementation added.
  16. Figure: the same latency scale, with market-leading SW compression and the HW implementation added.
  17. Figure: the same latency scale extended down to 1 ns, with market-leading SW compression and the HW implementation marked and the gap labelled 1,000x.
  18. Only solution with 2-4x compression that works across the memory hierarchy
  19. Solution | We address the entire memory hierarchy

    Figure: memory hierarchy (On-chip memory, Off-chip memory, Storage) mapped to ZeroPoint applications.
  20. Products | Addressing the memory bottleneck in Server xPUs – On-chip SRAM

    Figure labels: DDR, Accelerator, CXL; CPU cores sharing an L3$ with Cache-MX; NPU cores sharing a scratch pad with Scratch-MX.

    • Cache-MX: Need more on-chip cache for your CPU? Cache-MX doubles the cache capacity with only 10% additional area.
    • Scratch-MX: Need more on-chip SRAM capacity for your NPU? Scratch-MX doubles the scratch pad capacity with only 10% additional area.
  21. Products | Addressing the memory bottleneck in Server xPUs – Off-chip DRAM

    Figure labels: DDR, Accelerator, CXL, SphinX, DRAM ctrl, Ziptilion™, SuperRAM.

    • SuperRAM: Need a faster, energy-efficient swap? Offline SW-based compression steals precious CPU cycles. SuperRAM provides a high-performance and energy-efficient hardware accelerated compression service.
    • Ziptilion-BW: Need more DDR bandwidth? System memory bandwidth is insufficient for the requirements of high-speed, data-heavy applications. Ziptilion-BW provides 25% more DDR bandwidth at nominal speed and power, enabling a significantly more performant and energy-efficient xPU.
    • SphinX: Secure DRAM memory. AES-XTS industry standard main memory encryption with dynamic keys.
    • DenseMem: Need more memory capacity? DenseMem doubles the CXL connected memory.
  22. Products | Addressing the memory bottleneck in Server xPUs – Storage

    Figure labels: DDR, Accelerator, CXL, SphinX.

    • SphinX: Secure DRAM memory. AES-XTS industry standard main memory encryption with dynamic keys.
    • Look aside accelerator: zstd or LZ4 hardware accelerated look aside compression @host. Offline ultra-fast decompression of the zstd open standard, or symmetrical compression/decompression with LZ4.
  23. Products | Addressing the memory bottleneck in Server xPUs

    Product overview. Categories: Flash, Expansion, Main Memory, Storage, ZeroPoint technology, Security.

    • DenseMem: 2-4x main memory expansion for CXL devices. Contract announced. To be taped out beginning 2025.
    • Cache-MX / Scratch-MX: 2-4x last-level cache or scratchpad capacity expansion.
    • SphinX AES-XTS Security IP: An easy-to-integrate, off-the-shelf AES-128/256-XTS high-throughput, low-latency server main memory security solution. Released – Product available. Verified on silicon in a commercial server chip (TSMC 7nm).
    • SuperRAM: High-performance and low-latency hardware accelerated zram/zswap at unmatched power efficiency. Released – Product available. Verified on silicon in a commercial smart device chip (TSMC 5nm).
    • Ziptilion Bandwidth: Up to 50% main memory bandwidth acceleration.
    • Look aside de/compression acceleration: Ultra-fast de/compression of LZ4 and decompression of the ZSTD open standards.
  24. Addressing the memory bottleneck in Smart device SoCs

    Figure: smart device SoC block diagram with CPU(1) to CPU(n), each with L1I$, L1D$, and L2$, plus GPU, AI, ISP, Media, Modem, Other, L3$ / SLC, DRAM ctrl, and LPDDR, with Cache-MX, Ziptilion™, and SuperRAM marked.

    • Cache-MX: Need more on-chip cache? Cache-MX doubles the cache capacity with only 10% additional area, addressing the L2$, L3$, SLC scaling challenge.
    • SuperRAM: Need a faster, energy-efficient swap? Offline SW-based compression steals precious CPU cycles. SuperRAM provides a high-performance and energy-efficient hardware accelerated compression service.
    • Ziptilion-BW: Need more LPDDR bandwidth? System memory bandwidth is insufficient for the requirements of high-speed, data-heavy applications. Ziptilion-BW provides up to 50% more LPDDR bandwidth at nominal speed and power, enabling a significantly more performant and energy-efficient SoC.
  25. Addressing the memory bottleneck in Smart Device SoCs

    Product overview. Categories: Super Flash, Main Memory, Storage, ZeroPoint technology, Security.

    • SuperRAM: High-performance and low-latency hardware accelerated zram/zswap at unmatched power efficiency. Released – Product available. Verified on silicon in a commercial smart device chip (TSMC 5nm).
    • Ziptilion Bandwidth: Up to 50% main memory bandwidth acceleration.
    • Cache-MX: 2-3x last-level cache capacity expansion.
  26. What does the industry say about compression?
  28. Addressing the big data challenge!

    Big data analytics helps organizations unlock insights and make better-informed decisions. Organizations continuously generate data at scale and rely on various compression techniques to alleviate bottlenecks and save on storage costs. To process these datasets efficiently on GPUs, the Blackwell architecture introduces a hardware decompression engine that can natively decompress compressed data at scale and speed up analytics pipelines end-to-end. The decompression engine natively supports decompressing data compressed using LZ4, Deflate, and Snappy compression formats.

    The decompression engine speeds up memory-bound kernel operations. It provides performance up to 800 GB/s and enables Grace Blackwell to perform 18x faster than CPUs (Sapphire Rapids) and 6x faster than NVIDIA H100 Tensor Core GPUs for query benchmarks.

    https://developer.nvidia.com/blog/nvidia-gb200-nvl72-delivers-trillion-parameter-llm-training-and-real-time-inference/
  29. Market trends | Hyperscalers now require HW-based compression that only ZeroPoint can provide

    Data centers are spending capacity on software-based compression
    • CPU cycles used for compression: 4.6% * and 3% **
    • Meta & Google have stated that a hardware compressed memory tier is a must-have

    * https://ieeexplore.ieee.org/document/10158161
    ** https://dl.acm.org/doi/abs/10.1145/3579371.3589074
  30. Meta and Google were pushing the hardware compressed CXL memory tier at OCP/CMS

    Yesterday we learned that Meta and Google were pushing the hardware compressed CXL memory tier at OCP/CMS: https://www.opencompute.org/wiki/Server/CMS

    "Meta/Google Contrib- Draft under review - Hyperscale CXL Tiered Memory Expander for OCP - Base Specification"

    They have provided a lengthy background of the benefits in the specification, but in short: this will help them to address the memory wall at a reasonable investment with great energy efficiency.
  31. MemVerge and Micron Boost NVIDIA GPU Utilization with CXL® Memory

    The joint solution increases GPU utilization by 77% and more than doubles the speed of OPT-66B batch inference.

    The results of the demonstration were impressive. The FlexGen benchmark, utilizing tiered memory, completed tasks in less than half the time compared to conventional NVMe storage methods. Simultaneously, GPU utilization soared from 51.8% to 91.8%, thanks to the transparent management of data tiering across GPU, CPU and CXL memory facilitated by MemVerge Memory Machine X software.

    https://memverge.com/gtc2024/
  32. What to do when SRAM stopped scaling?

    When it comes to keeping pace with CMOS scaling, SRAM has fallen flat, with consequences for power and performance. The inability of SRAM to scale has challenged power and performance goals, forcing the design ecosystem to come up with strategies that range from hardware innovations to re-thinking design layouts. At the same time, despite the age of its initial design and its current scaling limitations, SRAM has become the workhorse memory for AI.

    • The smaller the bits, the less area you need, and the more bits you can fit on a chip/wafer/through your fab.
    • Bit sizes are measured in F² -- the smallest feature you can create.
    • F² is a function of the memory technology, not the manufacturing technology.

    A way to mitigate the impact of poor SRAM scaling is to introduce compression of on-chip SRAM.

    https://semiengineering.com/sram-scaling-issues-and-what-comes-next
  33. Technology | There are multiple differentiators to other solutions that give ZeroPoint an unfair advantage

    Quantitative differentiators
    • 2-4x memory expansion: more capacity in the same amount of HW
    • 50% higher performance/watt: more bang for your buck
    • 1,000x faster than alternatives: only solution fast enough for Cache, Main, and CXL

    Qualitative differentiators
    • Cache-line granularity: high-performance compression @64B
    • General purpose and lossless: efficient and accurate compression
    • Agnostic to architectures, memory technologies, and processing node (TSMC, Samsung, Intel, etc.)
  34. Technology | We have a strong portfolio of 14 patent families protecting our technology, and processes to protect new IP

    • 14 patent families
    • 38 patents granted
    • 70 patents filed

    Figure: Patents Granted and Filed per year, 2015 to 2023 (y-axis 0 to 80).
  35. Business | Sub-segments of priority customer groups

    Servers
    • CXL (priority 1)
    • xPU (priority 2)

    Devices
    • Smartphone (priority 3)
    • Tablet (priority 3)
    • Laptop (priority 3)
    • Desktop (priority 3)
  36. Business | Example of value chain within server segment

    Figure: value chain with Memory vendors, CXL controller developers, Processor (xPU) developers, System integrators, Hyperscalers, and Enterprises, marked as either potential customers or main beneficiaries.
  37. GtM | Business model based on royalty fees by licensing IP, with 24-36 month sales processes

    • We license our HW IP core and SW driver to SoC customers worldwide
    • The main income is from royalty fees we charge after tape-out (24-36 months from first contact)
    • This business model is industry standard, with ARM as the most successful example

    Customer steps: Access Licensed IP, Select IP, Develop SoC, Tape out & Manufacture, Ship

    Fees: Development License and Design Support Fee, Production License Fee, Royalty / produced component

    Timeline
    • Intro: 1-3 months
    • Technical and commercial evaluation: 6-12 months
    • Commercial development: 12-24 months
    • Volume ramp-up: 6-12 months
  38. Market | ZeroPoint is a high-margin business active on a $1.0-1.5B SAM

    Large market potential for ZeroPoint (2030)
    • TAM: $6-12B. 3-4B xPUs/y 2022, $2-3 per unit, ~15% CAGR
    • SAM: $1.0-1.5B
      Servers – $0.5-0.7B, 26M CPUs/y 2022, 2:1 CXL:CPU, 15-20% CAGR
      Devices – $0.5-0.8B, ~180M S.P./y 2022, ~150M Tab./y 2022, ~220M PC/y 2022, 10-15% CAGR
    • SOM: $350-450M
      Servers – $150-200M, 30% market penetration
      Devices – $200-250M, 30% market penetration

    Comments
    • Market estimate is bottom-up, where volumes and growth rates are based on Gartner data, and pricing is based on discussions with potential customers
    • Excluding long-game markets and value from future products due to lack of reliable data
    • IP licensing is a high-margin business, with low operating costs and no COGS
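The headline ranges on this slide follow from simple range arithmetic on its own inputs. The following is a minimal sketch that recomputes them; the variable names and the `span_sum` helper are illustrative, and only figures printed on the slide are used.

```python
def span_sum(*ranges: tuple[int, int]) -> tuple[int, int]:
    """Add (low, high) ranges component-wise."""
    return (sum(r[0] for r in ranges), sum(r[1] for r in ranges))


# TAM: 3-4B xPUs per year at $2-3 per unit, expressed in billions of USD.
tam_busd = (3 * 2, 4 * 3)

# SAM and SOM: servers plus devices, in millions of USD, as listed on the slide.
sam_musd = span_sum((500, 700), (500, 800))
som_musd = span_sum((150, 200), (200, 250))

print(f"TAM ${tam_busd[0]}-{tam_busd[1]}B")
print(f"SAM ${sam_musd[0]}-{sam_musd[1]}M")
print(f"SOM ${som_musd[0]}-{som_musd[1]}M")
```

The TAM is units times price per unit, while the SAM and SOM totals are the sums of their server and device components.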
  39. Market | We are engaged on multiple fronts to drive standardization, integration, and build awareness
  40. Vision

    To release the power of computer systems by establishing a new standard of non-waste memory.
  41. Proof points | There are 5 reasons to trust us

    • Acknowledged industry experts: Co-founders Prof. Per Stenström & Dr. Angelos Arelakis are recognized thought leaders within the field
    • Proven on TSMC 5nm: Technical performance has been proven to hold on silicon
    • First CXL customer: Undisclosed customer now integrating ZeroPoint’s compression IP in their upcoming CXL products
    • Joined ARM's Partner Catalogue due to our improvement of performance/watt
    • Backed by European Deep Tech investors
  42. Financials | Long in-design processes with substantial financial impact once volume production is reached

    Notable revenue from 2026 onwards

    Rationale and key assumptions
    • Strong pipeline of potential customers across all segments, although revenue in the next 2 years is mainly driven by PoCs
    • From 2026 onwards, the first PoCs will start turning into license fees with royalty-based income, which drives a major uptick in projected revenue
    • Each customer contract at volume production is worth $5-10M/y in the CXL segment and xPU segment (except AMD and Intel, which are each worth in the order of $25-50M/y)

    Revenue forecast ($M)
    • 2022: ~0
    • 2023: ~0
    • 2024: ~1
    • 2025: ~3
    • 2026: ~6
    • 2027: ~8
    • 2028: ~40
    • 2029: ~110
  43. What do others say about ZeroPoint? Moor Insights & Strategy

    In July, influential industry analyst Matt Kimball of Moor Insights & Strategy published a research note detailing the critical need for increasingly efficient and high-performance memory technologies, writing:

    “For enterprises grappling with the twin imperatives of driving performance and reducing costs, ZeroPoint provides a compelling solution. If your memory vendor, CPU provider, or cloud service is not leveraging ZeroPoint’s technology, it’s worth asking why.”
  44. What do others say about ZeroPoint? TechCrunch

    “ZeroPoint’s debut is certainly timely, with companies around the globe in quest of faster and cheaper compute with which to train yet another generation of AI models. Most hyperscalers (if we must call them that) are keen on any technology that can give them more power per watt or let them lower the power bill a little.”

    TechCrunch article
  45. Q&A
  46. Klas Moreau, CEO