Paradigm4 flexFS - IT Press Tour #68 June 2026

1 flexFS from Paradigm4: Object-native Parallel Filesystem

2 THE PROBLEM: The typical AI stack has cracks in its foundation
• Most workloads speak POSIX: AI training, HPC pipelines, analytics engines, and even AI Agents expect files, directories, and low-latency I/O. That's not changing.
• Low object-storage costs are compelling: Object storage is the right economic choice at scale (elastic, cheap, durable). But it brings high latency and a foreign API. That's not changing either.
• The gap taxes every workload: GPU idle time. Slow pipelines. Over-provisioned file systems. Data Scientists doing plumbing instead of science.
Data infrastructure is inadvertently making data expensive and hard to use.
Our Customers Faced This Challenge

3 Of course, we wanted it all but budgets and offerings were limited

4 Previous Industry Attempts to Get It All
"Lift and Shift" Datacenter Storage into Public Cloud
• Physical parallel-filesystem storage designs copied onto cloud servers with connected disks
• Often managed the same way as on-premises versions
• Integrate with cloud object stores via a "bolt-on" tier, at best
• Throughput directly dependent on deployed capacity, e.g., Lustre, DDN, WEKA, etc.
• Don't satisfy all four needs, especially cost (incl. operational)

5 flexFS: Object-native Parallel Filesystem ...an entirely different approach
Hyperscale object storage brings:
• Massively scalable, elastic capacity
• Separately scalable, elastic throughput
• Low costs
But also comes with:
• High-latency file and metadata I/O
• Its own APIs (most software knows about files, not objects)
Leverage hyperscale object store for its strengths and add:
• POSIX file system semantics
• Low-latency metadata I/O
• Tunable, low-latency file I/O
• Minimal operational overhead
• Outstanding price-performance
You can have it all!

6 flexFS Architecture
[Diagram: flexFS deployed in a public cloud (AWS, Azure, GCP, OCI, etc.). Components shown: compute instances (w/ flexFS client), flexFS metadata server(s), flexFS proxy group (WB cache), and object storage (S3, Azure Blob, etc.). Data paths are labeled Metadata and File Data.]

7 Supported flexFS Configurations
• Single-Region Cloud: Entire environment within a single cloud region managed by a single cloud vendor
• Multi-Region, Multi-Cloud: Environment spans multiple cloud regions and/or multiple cloud vendors
• On-Premises: All flexFS services and file data are in a private data center, lab, or office
• Hybrid: Data is primarily stored in the cloud but also needs to be accessible on premises, or vice versa
• Converged*: Storage services co-resident on compute nodes, except object store backend
* Converged configs enable near-local-NVMe performance using networked object storage, as demonstrated on OCI by Oracle working jointly with Paradigm4. For details, see https://blogs.oracle.com/cloud-infrastructure/accelerate-ai-workloads-on-oci-with-flexfs-cache.

8 Enterprise Class flexFS Customers

9 Current Use Cases in Production
• Common storage fabric and scratch space
• R&D Data Commons
• Data staging/munging area
• Data Lakehouse Extended Storage
• Elastic scratch space for cloud bioinformatics workflows
• Direct file access for cloud bioinformatics workflows

10 Top-5 Global Biopharmaceutical Company
• Use Case: Research Data Commons serves as a global central repo for clinical and other research data
• Problem/Challenge: Existing FS had high admin overhead, high downtime to reprovision, budget overruns
• Competing Solution: Combination of AWS S3, EFS, EBS and FSx for Lustre
• Current Data Storage: 1.14 PB, >160M files and folders
• Cost Savings in One Year: $1.44 million, 59% less than competing AWS solution

11 flexFS ROI Summary: Top-5 Pharma · Sep 2022 – Mar 2026 (43 months)
• flexFS + S3 (43 months): $2.53M (actual billing + S3 backend)
• AWS provisioned (43 months): $5.65M (enterprise tiers, realistic ops)
• Cumulative savings: $3.13M (55% of AWS cost avoided)
• 2025 full-year savings: $1.44M (59% of AWS avoided)
• Mar 2026 monthly saving: $164K/mo (flexFS $110K vs AWS $274K)
• Lustre over-provisioning waste: $332K (~24% of FSx spend, 43 months)
At current scale (1.14 PB), the entire flexFS+S3 bill ($110K/mo) is less than competing EFS storage alone ($141K/mo).
AWS-alternative parameters:
• Distribution: FSx Lustre 25% · EFS 40% · EBS 10% · S3 25%
• FSx: Persistent_2 SSD 500 MB/s/TiB @ $0.170/GB-mo
• EFS: Standard Regional @ $0.300/GB-mo
• EBS: gp3+provisioned @ $0.125/GB-mo
• S3: Standard @ $0.023/GB-mo
• AWS includes: EFS Elastic Throughput I/O ($0.03/GB reads, $0.06/GB writes) · cross-AZ transfer · AWS Backup (FSx 80%, EFS 80%, EBS 60%)
• FSx provisioned at 30% headroom in 2.4 TiB increments, no shrink after scale-up*
• flexFS+S3: actual billing + $0.023/GB-mo S3 backend
• Both include i4i.16xlarge @ $4,008/mo
* Before AWS launched "Intelligent Tiering" in May 2025. However, scaling Lustre throughput still adds charges (over $0.50 for every Mbps), as does SSD read-cache capacity (write caching not supported).
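The headline percentages on this slide can be re-derived from its rounded totals. The sketch below is only an arithmetic check; the variable and function names are illustrative and not part of any flexFS tooling. Because the inputs are the rounded figures printed on the slide, the cumulative saving comes out at about $3.12M rather than the $3.13M shown, which is presumably a rounding difference in the underlying billing data.

```python
# Figures copied from the ROI slide (USD); names are illustrative assumptions.
FLEXFS_TOTAL_M = 2.53    # flexFS + S3, 43 months, in $M
AWS_TOTAL_M = 5.65       # AWS provisioned alternative, 43 months, in $M
FLEXFS_MONTHLY_K = 110   # Mar 2026 flexFS + S3 bill, in $K/month
AWS_MONTHLY_K = 274      # Mar 2026 AWS alternative, in $K/month


def savings(baseline: float, actual: float) -> tuple[float, float]:
    """Return (absolute savings, savings as a fraction of the baseline)."""
    saved = baseline - actual
    return saved, saved / baseline


cumulative_m, cumulative_frac = savings(AWS_TOTAL_M, FLEXFS_TOTAL_M)
monthly_k, monthly_frac = savings(AWS_MONTHLY_K, FLEXFS_MONTHLY_K)

print(f"Cumulative: ${cumulative_m:.2f}M ({cumulative_frac:.0%} of AWS cost)")
print(f"Mar 2026:   ${monthly_k:.0f}K/mo ({monthly_frac:.0%} of AWS cost)")
```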

12 Advantages Beyond Cost Savings
• 🕐 Built-in time travel: Point-in-time recovery at no added cost. AWS Backup for FSx = $0.047/GB-mo; EFS = $0.050/GB-mo. At 1.14 PB scale that's ~$32K/mo if done separately.
• 🌐 Native multi-AZ, zero transfer cost: flexFS serves data across AZs transparently. EFS charges $0.01/GB cross-AZ. FSx Lustre is single-AZ by default. flexFS incurs no cross-AZ transfer charges.
• ↕ True elasticity, no over-provisioning: flexFS grows and shrinks automatically; you pay for bytes used. FSx requires capacity in 2.4 TiB increments and cannot shrink, wasting $332K over 43 months.
• 🔐 Extended POSIX ACLs for clinical data: setfacl/getfacl works uniformly across HPC, AWS Batch, Databricks, and REVEAL. EFS uses NFSv4 ACLs (complex). S3 has IAM/bucket policies only, with no POSIX ACLs.
• ⚡ Proxy group shared caching: Reference genomes (30+ GB) cached once across all compute nodes. FSx: each client warms its own cache independently. EFS: no client-side caching layer.
• 💻 All Linux flavors + macOS: Single mount on Amazon Linux, Ubuntu, RHEL, and macOS via NFS re-export. FSx requires the Lustre kernel module, which is not available on macOS or non-Amazon Linux.
• 🔗 Single namespace across all services: Same /flexfs path from HPC/SLURM, AWS Batch, Databricks, REVEAL, and HTTP. AWS stack requires separate NFS mounts, S3 URIs, and Lustre mounts per service type.
• 📉 Cost efficiency improves at scale: flexFS effective rate fell from ~$90/TB-mo at 25 TB (2022) to ~$66/TB-mo at 1.14 PB (2026). EFS Standard stays flat at $307/TB-mo. FSx stays flat at $174/TB-mo.

13 flexFS Operations Features
Deduplication
• Identifies duplicate files within a flexFS volume and optionally replaces them with hard links to reclaim storage. Duplicates are verified through checksum comparison and byte-for-byte validation before making any changes.
Optimized find utility
• Filesystem search tool that queries the metadata server to locate files and directories matching specified criteria
• Similar to the Unix find command but operates directly on the metadata store instead of traversing the mounted volume.
Non-Disruptive Updates
• Mount clients auto-update in place with a seamless FUSE session handoff: no unmount, no interruption, no data loss.
• flexFS server updates pause I/O for less than 1 second, with no impact on data.
Kubernetes Native
• CSI volume driver with Helm chart for dynamic and static provisioning. Mount flexFS volumes directly into pods.
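The deduplication flow described on this slide (checksum comparison, then byte-for-byte validation, then optional hard-link replacement) can be illustrated with a generic sketch using only the Python standard library. This is not the flexFS tool or its interface; the function names `sha256_of`, `find_duplicates`, and `replace_with_hard_link` are hypothetical, and the sketch simply walks a directory tree on any POSIX filesystem that supports hard links.

```python
import filecmp
import hashlib
import os
from collections import defaultdict
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Checksum a file in fixed-size chunks so large files are not loaded whole."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root: Path) -> list[tuple[Path, Path]]:
    """Return (original, duplicate) pairs confirmed by checksum and byte comparison."""
    by_digest: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file() and not path.is_symlink():
            by_digest[sha256_of(path)].append(path)

    pairs = []
    for candidates in by_digest.values():
        original = candidates[0]
        for other in candidates[1:]:
            if os.path.samefile(original, other):
                continue  # already the same inode (hard-linked earlier)
            if filecmp.cmp(original, other, shallow=False):  # byte-for-byte check
                pairs.append((original, other))
    return pairs


def replace_with_hard_link(original: Path, duplicate: Path) -> None:
    """Swap a verified duplicate for a hard link to the original."""
    temp = duplicate.with_name(duplicate.name + ".dedup-tmp")
    os.link(original, temp)      # create the new link next to the duplicate
    os.replace(temp, duplicate)  # atomically replace the duplicate with it
```

Separating detection from replacement mirrors the "optionally replaces" wording: a caller can report the pairs first and link them only after review.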

14 flexFS cost relief increases as data volume grows
[Chart: Cost of flexFS vs Lustre and EFS for 100-800 TB]

15 flexFS Saves Time and Money Four Ways
Save on file storage costs
• 2-5x cheaper than EFS and FSx for Lustre
• Pay only for what you use
Save on distributed computing costs
• Big savings on large HPC jobs from reduced file I/O time
Avoid end user downtime
• Avoid downtime with high elasticity and no storage-cluster resizing
Lower operational overhead
• Minimal infrastructure to monitor and maintain
• No re-provisioning to increase capacity

16 Newer Use Cases
• Data Lakehouse Acceleration
• Coupled-Architecture DBMS Modernization
• AI/ML Training and Execution Acceleration
• Agentic AI/ML Workspace with Persistence

17 Data Lakehouse Acceleration
Problem: High-latency "Metadata Tax" and "Small-File Congestion" in Spark, Presto, and other OTF Data Lakehouses.
The flexFS Advantages
• Sub-Second Planning: Metadata service eliminates "planning hangs."
• Throughput Saturation: Increased parallelism and more-efficient byte-range requests enable 2X - 7X performance gains on Spark workloads.
• Elasticity: Unlike HDFS, flexFS enables compute clusters to scale instantly (e.g., from 500 to 1,000 nodes) without data rebalancing.
See "Accelerating Data Lakehouses with flexFS" whitepaper for details.
TPC-H Results
Engine           S3        flexFS direct   flexFS proxied
Spark            1,191s    796s            532s
Spark + Comet    788s      1,257s          301s
Spark + Gluten   566s      275s            176s
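To read the TPC-H table as speedups, each flexFS runtime can be divided into the S3 runtime for the same engine. The sketch below does only that arithmetic on the numbers printed in the table; the dictionary layout and the `speedups_vs_s3` name are illustrative assumptions. One thing the ratios make visible is that, for Spark + Comet, the flexFS direct run in the table (1,257s) is slower than the S3 run (788s), while all three proxied runs are faster than S3.

```python
# Runtimes in seconds, copied from the TPC-H Results table on the slide.
TPCH_SECONDS = {
    "Spark": {"S3": 1191, "flexFS direct": 796, "flexFS proxied": 532},
    "Spark + Comet": {"S3": 788, "flexFS direct": 1257, "flexFS proxied": 301},
    "Spark + Gluten": {"S3": 566, "flexFS direct": 275, "flexFS proxied": 176},
}


def speedups_vs_s3(results: dict[str, dict[str, int]]) -> dict[str, dict[str, float]]:
    """For each engine, return S3 runtime divided by each flexFS runtime."""
    return {
        engine: {
            backend: times["S3"] / seconds
            for backend, seconds in times.items()
            if backend != "S3"
        }
        for engine, times in results.items()
    }


speedups = speedups_vs_s3(TPCH_SECONDS)
for engine, ratios in speedups.items():
    for backend, ratio in ratios.items():
        # A ratio above 1.0 means faster than S3; below 1.0 means slower.
        print(f"{engine:15s} {backend:15s} {ratio:.2f}x")
```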

18 Coupled-Architecture DBMS Modernization
Objective: Upgrade Coupled-Architecture databases (MPP DW, Graph and Vector) to use elastic, high-throughput Object Storage via flexFS.
Problem: DBMS clusters currently rely on strict POSIX (atomic renames, locking, etc.) and low-latency, high-throughput I/O on direct-attached disks.
flexFS Advantages:
• Independent Resource Scaling: De-couple file-storage growth from compute clusters, right-sizing infrastructure and reducing TCO by up to 60%, with no code changes.
• Intelligent RAM + NVMe Cache: Proxy Group handles random and burst I/O, speeding results and reducing I/O pressure on the object store.
• Local-NVMe-Level Performance: Converged Compute & Proxy config provides the throughput and IOPS that high-speed engines expect.*
• DBMS Snapshots Enabled: Leverages built-in, zero-copy "Time-Travel" metadata mapping that also provides instant, no-cost backups.
* As demonstrated on OCI by Oracle working jointly with Paradigm4. For details, see https://blogs.oracle.com/cloud-infrastructure/accelerate-ai-workloads-on-oci-with-flexfs-cache.
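As an illustration of the "strict POSIX" dependency named in the problem statement, the sketch below shows the common write-then-rename pattern that many database engines use to publish a file atomically. It is a generic POSIX example using the Python standard library, not flexFS-specific code; `atomic_write` is a hypothetical helper name, and the pattern assumes the temporary file and the target live on the same filesystem.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Write data to a temp file in the same directory, then rename it into place.

    Readers see either the old file or the complete new file, never a partial
    write, because rename within one filesystem is atomic under POSIX.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, temp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())  # push the bytes to storage before publishing
        os.replace(temp_path, path)    # atomic rename over any existing file
    except BaseException:
        if os.path.exists(temp_path):
            os.unlink(temp_path)       # do not leave a stray temp file behind
        raise
```

An object store without rename semantics cannot offer this guarantee directly, which is why engines built this way need a filesystem layer that does.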

19 AI/ML Training and Acceleration
Objective: Eliminate GPU Starvation during large-scale model training (PyTorch, TensorFlow, JAX).
Problem: High-end GPUs sit idle while standard S3 drivers struggle to feed data fast enough, particularly during random shuffles or massive model checkpoints.
flexFS Advantages
• 2x Speedup (Non-Proxied): Even without a cache, flexFS can access data in half the time of S3 direct by utilizing Object-Native Parallelism to saturate the network pipe.
• Small-I/O Optimization: Byte-Range reads are optimized for GPU utilization and High Bandwidth Memory (HBM) efficiency.
• Instant Checkpointing: flexFS absorbs massive model saves at near local-NVMe performance, allowing the GPU cluster to resume training in seconds rather than minutes.

20 Agentic Workspace with Persistence
Objective: High-performance AI/ML memory substrate for autonomous agents to rapidly access context, store reasoning, and manage multi-modal artifacts.
Problem: Traditional RAG over S3 is too slow for multi-step agents (10+ "hops"). They require low-latency storage for intermediate data, and POSIX provides a more agent-native environment.
flexFS Advantages:
• Pointers, Not Payloads: Agents share file paths instead of massive data copies. Data cache is "warmed" once for all compute nodes.
• Efficient Byte-Range I/O: Agents access only relevant sections of large files (e.g., 500MB PDF) to reduce latency and token costs.
• POSIX Scratchpad: Native file-access API environment for agents to save, execute, and log Python scripts, offering a local-disk feel while persisting results to a shared object store or Data Lake.
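The byte-range access described above is, from the application's point of view, an ordinary positional read on a mounted file. The sketch below shows one way an agent tool might fetch a slice of a large file without reading the rest; `read_range` is a hypothetical helper, and it uses `os.pread`, which is available on Unix-like systems. Nothing here is specific to flexFS: it would work on any POSIX mount, and how the underlying filesystem translates the read into object-store requests is outside the sketch.

```python
import os


def read_range(path: str, offset: int, length: int) -> bytes:
    """Read up to `length` bytes starting at `offset`, without touching the rest of the file."""
    fd = os.open(path, os.O_RDONLY)
    try:
        chunks = []
        position = offset
        remaining = length
        while remaining > 0:
            chunk = os.pread(fd, remaining, position)  # positional read, no seek state
            if not chunk:
                break  # reached end of file before `length` bytes were available
            chunks.append(chunk)
            position += len(chunk)
            remaining -= len(chunk)
        return b"".join(chunks)
    finally:
        os.close(fd)
```

Because the agent passes a path, an offset, and a length, it can share the path with other agents rather than copying the payload, which matches the "Pointers, Not Payloads" point.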

21 flexFS Fast Time-to-Value, Low Maintenance, Trusted Reliability
• Out-of-the-box ready to support Agentic AI use cases
• Installation typically under an hour
• Most customers need only one server
• Drop-in replacement for EFS, FSx for Lustre, OCI File Storage, GC Filestore, Azure Files, etc.
• Effectively unlimited storage
• Very low maintenance: "set it and forget it"
• 11 nines data durability on hyperscale cloud
• Continuous snapshots
• End-to-end data encryption

22 TRY IT FOR FREE
https://docs.flexfs.io/getting-started/community/install/

23 Widening the Aperture: Questions for you

24 What do you think of the idea of a "File Lakehouse" category?
We're exploring the idea of defining a category in today's AI/ML/Analytics landscape: the "File Lakehouse."
We'd like your thoughts:
• Is this concept useful?
• Would it resonate with your audiences?
• Does it create more clarity?

26 Thank you
www.flexFS.io

25 Modern AI/ML/Analytics Architecture: with File Lakehouse Data