Training African language foundation models with Lelapa AI and AWS Machine Learning Services

© 2025, Amazon Web Services, Inc. or its affiliates. All
rights reserved.

Training African language foundation models with Lelapa AI and AWS Machine Learning Services
AIM301
Nicolas David (He/Him), Senior Startups Solutions Architect, MEA, Amazon Web Services
Jade Abbott (She/Her), Co-Founder and Chief Technical Officer, Lelapa AI

Nicolas David (He/Him), Senior Startups Solutions Architect, META, Amazon Web Services
Jade Abbott (She/Her), Co-Founder and Chief Technical Officer, Lelapa AI

Agenda
The Grand Challenge
Taming the Data Beast
Training Foundational Models at Scale & Cost
Serving AI to a Continent
The Business Outcome
Q&A

The Grand Challenge

The Grand Challenge
Low-resource, code-switching, and complex languages break today’s AI, blocking cross-language communication and cutting millions off from the digital economy.
+90% of Africans don’t speak English at home, yet most digital services require it.
$18.4b: estimated untapped revenue due to the language gap.
~1% of data on the internet is in low-resource languages.

The Lelapa Solution
Data: Quality data creation instead of data extraction, using 75% less data
Models: Custom-built for multilingual, multicultural performance
Real World AI: Code-switched, representing how we really speak
Sustainable AI: 75% lower development requirements than leading benchmarks

An API enabling cross-language communication between people and systems that speak different languages
[Diagram: an agent and a customer exchange greetings across languages: "Sanibonani friend", "Goeie More, hello", "Jambo"]

Taming the Data Beast

Sustainable Framework
The Esethu Framework: Reimagining Sustainable Dataset Governance and Curation for Low-Resource Language
Data Creation > Data Extraction
Step 1: Dataset Curation (local linguists and data curation teams)
Step 2: Dataset Creation (native speakers contributing data)
Step 3: Dataset License (community representatives ensuring fair benefit distribution)
Step 4: Dataset Release with Sustainable License
Step 5: Research & Commercial Use (academic researchers, African AI and tech companies)
Step 6: Fees Received (non-African commercial entities paying licensing fees)
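
The fee rule implied by Steps 5 and 6 can be sketched in a few lines. This is only an illustration of the slide's wording, not the actual terms of the Esethu license; the `Licensee` class, its two flags, and the `owes_license_fee` function are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Licensee:
    """Hypothetical description of an organisation requesting the dataset."""
    name: str
    is_african: bool      # assumption: a simple yes/no classification
    is_commercial: bool   # assumption: commercial use vs. academic research


def owes_license_fee(licensee: Licensee) -> bool:
    """Per Step 6 on the slide, only non-African commercial entities pay fees."""
    return licensee.is_commercial and not licensee.is_african


university = Licensee("Example University", is_african=False, is_commercial=False)
local_startup = Licensee("Example African Startup", is_african=True, is_commercial=True)
global_vendor = Licensee("Example Global Vendor", is_african=False, is_commercial=True)

for org in (university, local_startup, global_vendor):
    print(org.name, "pays fee:", owes_license_fee(org))
```

A real license would need far more nuance (definitions of "African" and "commercial", fee schedules, benefit distribution), which the slide does not spell out.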

Many Smaller Specific Models
Task Specific, Domain Specific, Use Case Specific
Adaptors
Focused Data
Resource-Efficient Train + Serve
Quantization Pipelines
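
To make "Quantization Pipelines" concrete, here is a minimal sketch of symmetric int8 quantization in plain Python. The slide does not say which quantization scheme Lelapa AI uses, so this generic per-tensor scheme is an assumption chosen for illustration; the function names are hypothetical.

```python
def quantize_int8(weights):
    """Map floats to integer codes in [-127, 127] plus one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Round to the nearest code and clamp to the int8 range.
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize_int8(codes, scale):
    """Recover approximate floats from the integer codes."""
    return [c * scale for c in codes]


weights = [0.5, -1.27, 0.0, 1.0]
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)

# Each restored weight differs from the original by at most half a step.
max_error = max(abs(r - w) for r, w in zip(restored, weights))
print(codes, scale, max_error)
```

Storing one byte per weight instead of four (fp32) or two (fp16) is what makes small models cheaper to serve, at the cost of the rounding error shown above.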

Training Foundational Models at Scale & Cost

Overcoming the barriers to AI and ML adoption
Disparate data science tools
Accelerate model delivery with open-source FMs available in a hub
Ability to create high-quality labelled datasets
Managing underlying infrastructure
Challenging to govern gen AI and ML projects efficiently
Tedious, manual ML operations

Overcoming the barriers to AI and ML adoption
Disparate data science tools: Integrated ML tools in a single interface. Build, train, and deploy models using IDEs.
Accelerate model delivery with publicly available FMs in a hub: Choice of FMs. Access 250+ FMs that can be customized easily.
Access to high-quality labelled datasets: Purpose-built data labeling workflows. Create high-quality training datasets to improve model accuracy.
Managing underlying infrastructure: Fully managed ML infrastructure. Purpose-built accelerators for deep learning training and inference.
Tedious, manual AI/ML operations: Built-in MLOps. Automate and standardize MLOps practices.
Challenging to govern ML projects efficiently: Out-of-box ML governance tools. Simplify access control and enhance transparency across the ML lifecycle.

Amazon SageMaker AI and Amazon Bedrock
Use a model in SageMaker AI together with another model in Bedrock
Fine-tune a model in SageMaker AI and import to Bedrock for further customization and other use cases
Experiment using Bedrock, and move to production using SageMaker AI for control over cost, throughput, and latency
SageMaker AI (model development): Build and customize Foundational Models using advanced techniques; Configurable model deployment and inference; Code-based IDE; MLOps and FMOps
Bedrock (application development): Built-in tooling for customization with RAG; Built-in tooling for agentic workflow; Access to Claude, Amazon FMs, and 3P providers via API calls; Responsible AI

What customers are asking…
How do I choose between an existing model and building one?
How do I optimize training and inference cost?
How to improve accuracy while scaling model size?
How can I deploy ML and Foundation Models at scale?

Amazon SageMaker AI
Build, train, and deploy ML models at scale, including FMs
Build FMs from scratch
Customize FMs
Access the latest and publicly available FMs
Manage and deploy models for inference
Implement FMOps and governance

Amazon SageMaker AI
Build, train, and deploy ML models at scale, including FMs
Build FMs from scratch: Create your own ML models, including FMs, with integrated purpose-built tools and high-performance, cost-effective infrastructure
Customize foundation models: Access and evaluate 250+ FMs that can be customized easily for your use case
Implement MLOps and governance: Create reliable and repeatable workflows incorporating MLOps practices with purpose-built tooling
Manage and deploy models for inference: Easiest way to deploy AI & ML models including foundation models (FMs) to make inference requests at the best price performance for any use case
Improve ML governance: Enhance model governance and compliance with built-in governance tools

SageMaker AI supports ML, DL and GenAI
Machine Learning (Tabular Inputs): Predictive maintenance, Financial risk prediction, Demand forecasting, Fraud detection, Churn prediction, Personalized recommendations
Deep Learning (Unstructured Inputs): Computer vision, Meta data enrichment, Sentiment analysis, Topic modelling, Intelligent data processing, Autonomous driving
Gen AI (Unstructured outputs): Summarization, Information extraction, Visual content generation, Code generation, Audio/music generation, Synthetic data generation

Classic ML and GenAI with Amazon SageMaker AI
Classic ML
Data Prep: Prepare data assets and pipelines, manage data quality and bias
Build: Experiment and automate the execution of build pipelines
Train: Train ML models at scale with automation
Deploy: Automate the execution of deploy pipelines into production
GenAI
Data Prep
Pre-train or Select: Select from FM hub or BYOFM, or pre-train FM
Evaluate: Automated or human evaluation
Fine-tune: Fine-tune and/or optimize models for deployment
Deploy

Hundreds of thousands of customers

Accelerate and scale data prep for AI and ML
Use the most comprehensive set of tools for both structured and unstructured datasets
Access data: Easily access and query data from a wide variety of data sources
Cleanse, label, enrich: Create high-quality labeled datasets for training models using your tool of choice or through human feedback
Analyze and visualize: Explore data through purpose-built analysis and visualization tools, or visualize geospatial data on interactive 3D accelerated maps
Store and share: Securely store, manage, and share features to be used across the ML lifecycle
Scale: Efficiently process large amounts of data while reducing cost

Amazon SageMaker Unified Studio (preview)
Access all your data and tools for analytics and AI model development in a single environment
Data processing, SQL Analytics, Model development, Gen AI App development, Streaming, Search, Business Intelligence
Lakehouse, Data & AI Governance
Three items are marked COMING SOON on the slide.

Amazon SageMaker Unified Studio (Public Preview)
Access all your data and tools for analytics and AI model development in a single environment
Integrated experience for data preparation, model building, and generative AI application development
Unifying your tools such as notebooks and query editors across services
Seamless integration with AWS data processing, analytics and ML services like EMR, Glue, Athena, Redshift, and Bedrock

Amazon SageMaker Unified Studio
Tools for every step of the ML lifecycle under one unified visual user interface
SageMaker Studio: JupyterLab | Code Editor (OSS VS Code) | RStudio
SageMaker Canvas: low-code/no-code IDE
Store features, Build with preferred IDE, Manage and monitor, Train/fine-tune models, Deploy in production, Tune parameters, Prepare data, Evaluate and select pre-built model
Integrated with AI powered developer tools, including Amazon Q Developer
Security | Governance | Admin controls

Is Bigger always Better? No.

While models and data were growing exponentially, Lelapa were seeing what we could do with less.
Introducing InkubaLM-0.4B: a Little LM trained from scratch using only 1.9 billion tokens of data for 5 African languages.
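
The figures on this slide allow a quick back-of-the-envelope check. The sketch below assumes the "0.4B" in the model name means 0.4 billion parameters, and it counts weight storage only (no activations or optimizer state); both are assumptions, not statements from the slide.

```python
PARAMS = 0.4e9   # assumed from the "0.4B" in the model name
TOKENS = 1.9e9   # training tokens quoted on the slide

# Ratio of training tokens to parameters: about 4.75.
tokens_per_param = TOKENS / PARAMS


def weight_memory_gb(params: float, bits_per_weight: int) -> float:
    """Memory for the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


# Weight footprint at common precisions (fp32, fp16, int8, int4).
footprint = {bits: weight_memory_gb(PARAMS, bits) for bits in (32, 16, 8, 4)}

print(f"tokens per parameter: {tokens_per_param:.2f}")
for bits, gb in footprint.items():
    print(f"{bits:>2}-bit weights: {gb:.1f} GB")
```

Under these assumptions the weights take roughly 0.8 GB at 16-bit precision and 0.2 GB at 4-bit, which is why a model of this size is a plausible fit for modest serving hardware.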

Serving AI to a Continent

Deploy AI and ML models
Fully managed deployment for inference at scale
Built-in integration for MLOps: ML workflows, CI/CD, feature management, lineage tracking, and model management
Deploy models in production for inference for any use case: From low latency and high throughput to long-running inference
Shadow testing & automatic deployment guardrails: Validate the performance of new ML models against production models. Minimize risk when deploying new model versions on SageMaker AI using linear, canary, or blue/green traffic switching
Cost-effective deployment: Reduce inference cost by at least 50% with multi-model/multi-container endpoints, serverless inference, and elastic scaling
Wide selection of infrastructure: 70+ instance types with varying levels of compute and memory to meet the needs of every use case. Deliver up to 40% better inference price performance with Inf2 instances
Large model inference container: Achieve best price performance with the latest inference optimization tools, model servers, and libraries packaged into a single container
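
The difference between linear and canary traffic switching can be shown with a small schedule generator. This is a conceptual sketch only: it does not call SageMaker AI, and the function names and percentages are hypothetical values chosen for illustration.

```python
def linear_schedule(steps: int) -> list[int]:
    """Percent of traffic on the new fleet after each of `steps` equal increments."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    return [round(100 * i / steps) for i in range(1, steps + 1)]


def canary_schedule(canary_percent: int) -> list[int]:
    """Shift a small canary slice first, then all remaining traffic in one step."""
    if not 0 < canary_percent < 100:
        raise ValueError("canary_percent must be between 1 and 99")
    return [canary_percent, 100]


print("linear, 4 steps:", linear_schedule(4))
print("canary, 10%:   ", canary_schedule(10))
print("all at once:   ", linear_schedule(1))
```

In both cases the old fleet keeps serving the remaining traffic until the final step, so a problem detected on the new version affects only the share already shifted.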

Model deployment on Amazon SageMaker AI
Model: Single model deployment, Multi-adapter hosting, Multi-model deployment
Container: Single container, Multi-container
Infrastructure: Serverless, GPUs, CPUs
Real-time synchronous response (Invoke / Response)
Near real-time asynchronous response (Invoke / Response)
Offline batch inference (Submit / Complete)

Deploy models to serve inference
Real-time inference, Async inference, Serverless inference, Batch inference, Multi-model endpoints, Multi-container endpoints, Inference pipelines
MLOps: Manage and version models, Model monitoring, Shadow Testing, Metrics and logging in CloudWatch
SageMaker JumpStart, SageMaker Studio
ML Compute Instances and Accelerators: CPUs, GPUs, Inferentia, Graviton (ARM)
Deep learning compilers and runtimes: SageMaker Neo, NVIDIA TensorRT/cuDNN, Intel oneDNN, ARM Compute Library
Containers: LMI, BYOC

Broad and deep accelerated computing portfolio
GPU, AWS ML accelerators, and FPGA-based EC2 instances
GPUs: G4, G5, G6, G6e, P4d, P4de, P5 (H200, H100, A100, L4, L40S, A10G, T4)
AWS ML chips: Inf1, Inf2 (Inferentia accelerator); Trn1, Trn2, Trn3 (Trainium accelerator)
One entry is marked PREVIEW on the slide.

Amazon SageMaker Deployment Modes
Example use cases and technical considerations

Real-time inference (real-time HTTP requests; client application sends an inference request to a persistent SageMaker Endpoint on managed instances and receives the inference result)
Use cases: Ad serving, search, personalized recommendations, Generative AI
Low latency, high throughput
Supports multi-model endpoints
Responses within milliseconds
Max request payload: 6 MB
Timeout: 60 sec

Serverless inference (real-time HTTP requests; client application sends an inference request to a SageMaker Endpoint in a serverless environment and receives the inference result)
Use cases: Extract data from documents, form processing, chatbots, model dev/test
Automatically scale to accommodate unpredictable traffic (scales to zero)
Response times vary (warm/cold start)
Max request payload: 4 MB
Timeout: 60 sec

Asynchronous inference (HTTP requests queued, handled asynchronously; the SageMaker Endpoint with a managed queue system acknowledges the request and sends a result notification to a notification listener)
Use cases: Video processing, large image processing, decoupled applications and systems
Ability to scale resources to zero
Responses can be near real-time
Processing times vary: queue size, worker status
Inference input: Pointers to S3 objects (1 GB max)
Ideal for models with long processing times (15 min max)

Batch inference (ad hoc or scheduled job; the job request specifies dataset, model, output path, and instance types; SageMaker Batch runs a fully managed job, acknowledges the request, and sends a result notification as an EventBridge event)
Use cases: Business forecasting, propensity modeling, churn prediction, predictive maintenance
Suitable for periodic arrival of large datasets
Jobs can be long running
Ideal for large datasets (Batch Transform allows for splitting of datasets across multiple instances)
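
The limits on this slide can be turned into a rough decision helper. The sketch below is a simplification under stated assumptions: the thresholds are the ones quoted on the slide, MB and GB are treated as binary units, and the function name and its `steady_traffic` and `offline_dataset` parameters are hypothetical. It is not official AWS selection guidance.

```python
MB = 1024 * 1024   # assumption: binary megabytes
GB = 1024 * MB


def choose_inference_mode(
    payload_bytes: int,
    processing_seconds: float,
    steady_traffic: bool = True,
    offline_dataset: bool = False,
) -> str:
    """Pick a deployment mode from the payload and timing limits on the slide."""
    if offline_dataset:
        # Periodic large datasets with no caller waiting: batch job.
        return "batch"
    if processing_seconds <= 60:
        if steady_traffic and payload_bytes <= 6 * MB:
            return "real-time"      # 6 MB payload, 60 s timeout
        if not steady_traffic and payload_bytes <= 4 * MB:
            return "serverless"     # 4 MB payload, 60 s timeout, scales to zero
    if payload_bytes <= 1 * GB and processing_seconds <= 15 * 60:
        return "async"              # S3 pointer up to 1 GB, 15 min max
    return "batch"


print(choose_inference_mode(1 * MB, 0.2))                        # small, fast, steady
print(choose_inference_mode(1 * MB, 5, steady_traffic=False))    # spiky traffic
print(choose_inference_mode(200 * MB, 300))                      # large, slow
print(choose_inference_mode(2 * GB, 3600))                       # beyond async limits
```

Real selection also depends on cost, cold-start tolerance, and model size, none of which this helper models.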

Ops challenges managing the model lifecycle
Purpose Built Tools for MLOps and governance
Manual iterative processes slow down ML innovation
Difficult to scale and manage the number of models in production
CI/CD for ML requires writing custom code
Compliance requirements are difficult to meet

Amazon SageMaker MLOps
Streamline the ML lifecycle
Automate ML workflows to scale model development
Build CI/CD pipelines for gen AI and ML to improve reliability, quality, and accelerate model deployment
Catalog model versions, metadata, metrics, and approvals for traceability and reusability
Maintain accuracy of predictions after models are deployed
Track lineage for traceability and compliance

New: Amazon SageMaker AI now supports fully-managed MLflow 3.0
End-to-end observability and experiment tracking: Streamline AI development with unified monitoring from experimentation to production
Advanced tracing capabilities for GenAI applications: Record inputs, outputs, and metadata at every step to quickly identify bugs and unexpected behaviors
Comprehensive version control and traceability: Connect AI responses to source components for efficient troubleshooting and rapid issue resolution

The Business Outcome

Before Vulavula:
❌ Manual, human-led call review. Only 2–3% of calls are reviewed due to time and cost constraints: slow, expensive, inconsistent
❌ 30–40% of calls are untranscribable using off-the-shelf solutions, especially in African languages
❌ Locked into third-party cloud vendors with limited flexibility
Missed revenue opportunities + major compliance risk, poor customer satisfaction

After Vulavula:
✅ Scalable call reviews: 50× more calls processed than human teams, at the same cost
✅ Automated transcription and translation: reliable even in low-resource, local languages. 100% of calls processed
✅ Flexible deployment: cloud-independent and on-prem options (ideal for regulated sectors)
Unlocks new revenue: telcos, banks, and analytics providers can sell or act on compliance, sales at scale, happy customers
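
The "50×" and "100% of calls processed" figures are consistent with the 2–3% baseline, as a quick calculation shows. The sketch assumes call volume stays the same before and after; `coverage_after` is a hypothetical helper name.

```python
def coverage_after(percent_reviewed_before: int, multiplier: int = 50) -> int:
    """Percent of calls covered after scaling review capacity, capped at 100."""
    return min(100, percent_reviewed_before * multiplier)


# Baseline on the slide: 2-3% of calls reviewed by human teams.
for before in (2, 3):
    print(f"{before}% reviewed before -> {coverage_after(before)}% covered after")
```

At the low end of the baseline, 50 × 2% already reaches full coverage, so any higher baseline leaves headroom at the same cost.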

The Business Outcome

Builders Center
Go-to site for builders to connect with the AWS Community.
Get started for free
Claim your Builder ID

skillbuilder.aws
Create a free account on AWS Skill Builder to gain in-demand skills

Thank you!
Please complete the session survey in the mobile app
Nicolas David, /in/nicolasdavid/, @nuage_ninja
Jade Abbott
