Arcitecta - IT Press Tour #64 Oct 2025

October 6th, New York City IT Press Tour 2025

Copyright © Arcitecta Pty. Ltd.

Agenda
• Jason opening
  o State of the company
  o Future direction
• Eric follows
  o What we have been up to since we last met
  o Big wins we have had
  o Deep dive into customer examples (Princeton, DFCI, TU Dresden, IWM)
  o Look at customer challenges
  o Datakamer Event
• Q&A and Discussion

Mediaflux Data Platform For Anyone With Data

Core Capabilities: Converged Data Management
• Data orchestration & management: Mediaflux provides a rich policy engine to automate data workflows (ingest, transform, distribute) across all sites.

Mediaflux Data Management Platform
Converges data management, orchestration, multi-protocol access, and storage in a single system. This isn’t just a file system – it’s an end-to-end data fabric.
• Global data lifecycle management: Mediaflux manages the entire data lifecycle, from ingestion through to long-term archive, both on-premises and in the cloud.
• Multi-protocol, scalable access: Purpose-built XODB supports common access methods (NFS, SMB, S3, SFTP, etc.), so any application or AI tool can read and write data seamlessly.
• One vendor-agnostic fabric: Mediaflux works over any storage hardware: NetApp, Dell ECS, cloud blob stores, tape, you name it.

Advanced Features: Metadata, Compute, and Transfer
• Compute-to-data: Mediaflux can send analytics and processing jobs to where the data lives.
• Rich metadata engine: The XODB database drives powerful search. This world-class metadata catalog ensures you instantly know what data you have and how it’s connected.
• Ultra-fast WAN transfers: Mediaflux’s Livewire module provides integrated WAN acceleration. It can move data globally at up to 95% of link speed.
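To put the "up to 95% of link speed" figure in context, here is a minimal arithmetic sketch of how long a bulk transfer would take at that utilization. The dataset size and link speed are hypothetical values chosen for illustration, and `transfer_time_seconds` is an illustrative helper, not part of Mediaflux or Livewire.

```python
def transfer_time_seconds(size_bytes: float,
                          link_bits_per_second: float,
                          efficiency: float = 0.95) -> float:
    """Seconds needed to move size_bytes over a link used at the given efficiency.

    efficiency=0.95 mirrors the "up to 95% of link speed" claim; it is a
    best-case figure, so treat the result as a lower bound on transfer time.
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    if link_bits_per_second <= 0:
        raise ValueError("link speed must be positive")
    effective_bits_per_second = link_bits_per_second * efficiency
    return size_bytes * 8 / effective_bits_per_second


# Hypothetical example: 100 TB (decimal) over a 10 Gb/s WAN link.
seconds = transfer_time_seconds(100e12, 10e9)
hours = seconds / 3600
```

With these assumed inputs the transfer takes about 23.4 hours, compared with about 22.2 hours at a theoretical 100% of link speed.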

2024 Leader for Unstructured Data Management

Core Capabilities: Byte-Sized Pieces of Mediaflux
• Global namespace: Offers a single unified file system, federated namespaces, or both for geographically dispersed teams.
• Edge caching: To meet low-latency needs, Edge nodes cache hot data close to users while keeping a copy centrally.
• Burst compute support: For peak AI/compute spikes, Mediaflux Burst lets you extend compute to cloud or other data centers on demand, decoupling storage from compute resources.
• Real-Time Collaboration: Mediaflux Real-Time ensures that no moment is lost, giving teams near-instant access to growing files, seamless collaboration, and ultra-fast data transfers.

Mediaflux Accelerates AI with a Unified Data Fabric
A Unified, AI-Ready Data Infrastructure that Supports All Forms of Data and AI Models

Built-in Intelligence: XODB Vector & AI Integration
• XODB vector database: Mediaflux XODB not only stores file metadata, but also manages vectors between data objects. By understanding spatial/temporal relationships, Mediaflux can place or replicate data to facilitate AI models.
• RAG and generative AI: Mediaflux is architected for the next-generation AI stack. XODB supports vector embedding representations, so it can serve as a backend for AI agents.
• Seamless search and hierarchy: Mediaflux can present virtual hierarchies of data based on search results or metadata filters. The net result is that your AI applications gain an “all-knowing” index of the enterprise data.

Challenges
• Library function moving from an analogue to a digital domain.
• Old library functions were about preserving books, films, tape recordings, curation, sorting, etc.
• Princeton is working on a 100-year data management plan.
• Think about how many technology refreshes there will be over 100 years.
• How will a Nobel Prize-winning researcher find their 2024 data in 2040?

• Expanding Arcitecta’s Partnership Ecosystem
  o IBM
  o Dell Technologies
• Mediaflux is the core technology for TigerData
• 200 PB of research data managed by Mediaflux

Architecture diagram labels: User Interface; Middleware; Heterogeneous Storage; IBM Diamondback Tape Library; NFS/SMB Mount Points; Web Portal

Current Day: 497,147,731 total assets managed by TigerData
Diagram labels: Departmental Storage; Redundant 2nd Copy (Tape)
Scratch
• High performance computing need
• Only available within the clusters
• Code and data in active use
• Single copy, no backups
Working
• Performant for non-HPC analysis
• Required data management
• Ability to mount to HPC head nodes
• Code and data tied to active research
• Redundant/Resilient
Persistent
• Moderate performance for review
• Required data management
• Independent of clusters
• Code and data tied to research
• Redundant/Resilient
Low-Use
• Minimal performance needs
• Required data management
• Independent of clusters
• Code and data identified for long-term preservation, possibly publication
• Redundant/Resilient

1 Year (December 2025): same storage tiers as the Current Day slide (Scratch, Working, Persistent, Low-Use) with Redundant 2nd Copy (Tape)

3 Year (December 2027): same storage tiers as the Current Day slide (Scratch, Working, Persistent, Low-Use) with Redundant 2nd Copy (Tape)
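The tier attributes listed on the Departmental Storage slides can be captured as a small lookup table. The sketch below is illustrative only: the `Tier` class, its field names, and `tiers_matching` are hypothetical and not part of TigerData or Mediaflux. Two values are assumptions, since the slides do not state them: Scratch is treated as having no required data management, and Working is treated as cluster-dependent because it mounts to HPC head nodes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Tier:
    name: str
    performance: str                 # wording taken from the slide
    redundant: bool                  # "Redundant/Resilient" vs "Single copy, no backups"
    cluster_independent: bool        # "Independent of clusters"
    data_management_required: bool   # "Required data management"


TIERS = [
    # Assumption: Scratch has no required data management (not listed on the slide).
    Tier("Scratch", "High performance computing need", False, False, False),
    # Assumption: mounting to HPC head nodes means Working is not cluster-independent.
    Tier("Working", "Performant for non-HPC analysis", True, False, True),
    Tier("Persistent", "Moderate performance for review", True, True, True),
    Tier("Low-Use", "Minimal performance needs", True, True, True),
]


def tiers_matching(redundant: Optional[bool] = None,
                   cluster_independent: Optional[bool] = None) -> List[str]:
    """Return tier names matching every filter that is not None."""
    names = []
    for tier in TIERS:
        if redundant is not None and tier.redundant != redundant:
            continue
        if cluster_independent is not None and tier.cluster_independent != cluster_independent:
            continue
        names.append(tier.name)
    return names
```

For example, `tiers_matching(redundant=True, cluster_independent=True)` narrows the choice to the Persistent and Low-Use tiers.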

DFCI
• Expanding Arcitecta’s Partnership Ecosystem
  o Spectra Logic
  o Wasabi
• Mediaflux is the tool used to manage and share data within DFCI and with outside organizations
• Spectra Cube with BlackPearl
• Wasabi cloud storage

Storage Current State – Current Data Center
Architecture
• Primary Disk storage at Overland Street
• Secondary Disk (copy) at Overland Street
• Mediaflux Management Layer
Process
• Data written to primary and secondary at the same time
• Data manually designated for archive is sent to Amazon S3 Glacier Deep Archive
Diagram labels: Researcher; Data; Mediaflux; Overland Street; Primary Disk; Secondary Disk; Marked for Archive; Glacier Deep Archive

Storage Migration Phase 1 – Markley Boston
Architecture
• Primary Disk storage at Overland Street
• Secondary Disk (copy) at Overland Street
• NEW: High availability tape being deployed at Markley Boston
Process
• Data written to primary and secondary at the same time
• High availability tape converted to secondary storage
• Archive available upon request in Amazon S3 Glacier Deep Archive
Diagram labels: Researcher; Data; Mediaflux; Overland Street; Primary Disk; Secondary Disk; Boston; High Availability Tape; Glacier Deep Archive

Storage Migration Phase 2 – Markley Lowell
Architecture
• Primary Disk storage at Markley Lowell
• NEW: Secondary (copy) stored in High Availability tape at Markley Boston
Process
• Data written to primary and secondary at the same time
• Files automatically sent to Amazon S3 Glacier Deep Archive upon request
Diagram labels: Researcher; Data; Mediaflux; Markley Lowell; Primary Disk; Boston; Second copy in High Availability Tape; Glacier Deep Archive
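The write process described on the storage slides (every write lands on primary and secondary at the same time, and archiving happens only on request) can be sketched as a small in-memory model. This is a toy illustration, not Mediaflux's API: `StorageModel`, `write`, and `send_to_archive` are hypothetical names, and it assumes that archiving copies a file rather than removing it from disk, which the slides do not specify.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class StorageModel:
    """Toy model of the primary / secondary / archive layout on the slides."""
    primary: Dict[str, bytes] = field(default_factory=dict)
    secondary: Dict[str, bytes] = field(default_factory=dict)
    archive: Dict[str, bytes] = field(default_factory=dict)

    def write(self, path: str, data: bytes) -> None:
        # "Data written to primary and secondary at the same time."
        self.primary[path] = data
        self.secondary[path] = data

    def send_to_archive(self, path: str) -> None:
        # Archiving is explicit (on request), never a side effect of write().
        # Assumption: the file is copied, so primary and secondary keep it.
        if path not in self.primary:
            raise KeyError(f"{path} has not been written")
        self.archive[path] = self.primary[path]


model = StorageModel()
model.write("lab/run-001.dat", b"results")
model.send_to_archive("lab/run-001.dat")
```

In this model, moving from Phase 1 to Phase 2 changes only where the primary and secondary copies live; the write and archive steps stay the same.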

Next Steps
• Working with research groups to:
  o Determine future storage (size) needs
  o Determine what other storage devices/services they are utilizing, with the goal of consolidating most/all data into RCSM
  o Work with them to create more efficient workflows for moving data
• Work with vendors to evaluate best options for future storage hardware, balancing price and support
• Work with peer institutions to collaborate on future solutions for all aspects of research computing

Challenges Before Mediaflux
• Explosive data growth into petabytes
• Fragmented, siloed storage systems
• Physical and digital assets
• GLAM sector is a neglected area for technology innovation

The Solution: Mediaflux DAMS with Wasabi
• Centralized visibility across all storage tiers
• Easy-to-use interface with Mediaflux DAMS
• Automated, seamless tiered archiving (disk → cloud)
• Integration with Wasabi AIR for vector embedding

Impact & Benefits
Mediaflux provided NFSA with a modern way to manage and ingest content:
• Simplifying complexity
• Reducing costs
• Easy to use and intuitive
• Providing AI-ready data and vector embeddings that can be searched along with metadata

Challenges Before Mediaflux
• Explosive data growth into petabytes
• Fragmented, siloed storage systems
• Hard-to-find archived data; slow retrieval
• Inefficient, manual archiving workflows
• Limited collaboration across research teams

The Solution: Unified Data Management with Mediaflux + GRAU DATA XtreemStore
• Centralized visibility across all storage tiers
• Metadata-driven indexing for easy discovery
• Automated, seamless tiered archiving (disk → tape)
• High scalability and performance for research workloads

Results & Benefits
• Optimized workflows: Automated archiving saves time
• Seamless access: Even tape-stored data stays searchable
• Better collaboration: Central repository boosts data sharing
• Lower costs: Cold data moved to low-cost storage tiers
• Future-ready: Scalable, adaptable infrastructure

Impact
Mediaflux transformed TU Dresden’s data management from a bottleneck into a competitive advantage by:
• Simplifying complexity
• Reducing costs
• Boosting collaboration and performance
• Enabling research at scale

Challenges Faced by IWM
• Diverse asset types: images, video, documents
• Need for scalability: petabytes of storage
• Distributed operations across sites
• Security and compliance requirements
• Ease of use for both archivists and casual users

The Solution
• Custom workflows for ingestion, cataloging, distribution
• Automated metadata extraction for discovery
• Multi-site deployment with redundancy
• Military-grade security and compliance
• User-centric design with intuitive interfaces
• Future-proof scalability and modular architecture

Results & Benefits
• 40% reduction in manual tasks via automation
• Researchers find assets 50% faster
• Centralized governance improves consistency
• Enhanced preservation with metadata-driven workflows
• Reduced costs by consolidating legacy systems
• Future-ready digital asset management platform

Impact
Mediaflux revolutionized IWM’s digital asset management by:
• Streamlining operations
• Securing and preserving digital collections
• Reducing costs and improving access
• Empowering staff and researchers with intuitive tools
• Future-proofing IWM’s digital preservation efforts

Mediaflux
A platform that helps relieve some of the burden of managing data.
The last data management platform you will ever need.

Challenge 1: Sharing / transferring data is difficult and takes a long time to complete, making it a major bottleneck in my workflow.
Customer Pain Point
• It is difficult and time-consuming to transfer the data I have to where it is needed in my workflow
• I must collaborate with teammates located all over the world and outside my organization
• I need to migrate storage to a new system
• I need to create / adhere to a DR plan
Rx for Pain
• High-speed, low-latency parallel data transfer solution for data at scale
• Securely share data with people inside and outside your organization
• Migration ability to move to or implement new storage
• Replication of data to a second site for protection

Challenge 2: Need for a single view (and access point) into what data I have and what is generating it / I need insights into my data
Customer Pain Point
• I don’t know or understand what data I have and who or what is generating data
• I have orphaned (siloed) storage and files that need to be centralized
• I need a solution that fits into my existing workflow and can integrate with the technology I currently have
Rx for Pain
• Find, mine, understand, utilize your data
• Centralize and consolidate all your data into a single view. One pane of glass into all your data
• Ability to utilize existing hardware and software already being used

Challenge 3: Data is being generated at an increasing pace and I need to store it and leave it accessible to users and applications.
Customer Pain Point
• Data is being generated too fast for the existing storage solution to manage = large amounts of unstructured data that need management
• I need my users and applications to maintain access to data regardless of where or what it’s stored on
• Don’t want to be tied to a single vendor
Rx for Pain
• Ability to grow and keep pace with new equipment generating vast amounts of data
• Being in the data path
• Remove vendor lock-in

What was Datakamer
• Community building
• Panel discussions
• Establishing best practices
• Learning from each other
• Low pressure, BOF type event
• Thought leadership focused

Datakamer Future
• Looking to host events around the world
  o Europe
  o Australia
  o Princeton University
• Moderator for future events
• Momentum building

Mediaflux Next 6 Months
• Vector database expansion
• Tools to streamline deployment
• Bug fixes and product stability

Where to See Arcitecta Next
• Supercomputing 2025
• Mediaflux Users Group meeting
• SC Asia
• Data Week – with Wasabi
• eResearch – with IBM
• NAB 2026 – with Dell

Thank You
