Slide 1

Slide 1 text

Building and Enlightening Data Professionals in Africa. An annual conference for all data practitioners in Africa. #DataFestAfrica22 #DFA22

Slide 2

Slide 2 text

Who am I?
Gift Ojeabulu
Twitter: @GiftOjeabulu_
- Co-founder and community lead at Data Community Africa/DatafestAfrica.
- Organizer of the MLOps Community Lagos meetup.
- AWS ML Community Builder, Global AI Hub ML thought leader, technical writer and public speaker.
- Podcast host at Datapodchat.
- Founder & facilitator of the African Data Community Newsletter with over 1.61k subscribers.
- Technical documentation and content lead for slik-wrangler.

Slide 3

Slide 3 text

An introductory guide to simplifying the MLOps process with DVC, CML, and MLEM

Slide 4

Slide 4 text

What is MLOps?
MLOps (or ML Ops) is a set of practices that aims to deploy and maintain machine learning models in production reliably and efficiently. The word is a compound of "machine learning" and DevOps, the continuous development practice from the software field. Machine learning models are developed and tested in isolated experimental systems. When a model is ready to be launched, data scientists, DevOps engineers, and machine learning engineers practice MLOps together to transition it to production systems.

Slide 5

Slide 5 text

Why MLOps?
MLOps is a set of practices for collaboration and communication between data scientists and operations professionals. Applying these practices improves quality, simplifies management, and automates the deployment of machine learning and deep learning models in large-scale production environments.

Slide 6

Slide 6 text

DVC

Slide 7

Slide 7 text

DVC is to machine learning engineers what Git is to software engineers.

Slide 8

Slide 8 text

About DVC
Data Version Control (DVC) is a data versioning, ML workflow automation, and experiment management tool that takes advantage of the software engineering toolset you're already familiar with (Git, your IDE, CI/CD, etc.). DVC helps data science and machine learning teams manage large datasets, make projects reproducible, and collaborate better.

Slide 9

Slide 9 text

Why DVC?
Even with all the success we've seen today in machine learning, especially with deep learning and its applications in business, data scientists still lack best practices for organizing their projects and collaborating effectively. This is a critical challenge: while ML algorithms and methods are no longer tribal knowledge, they are still difficult to implement, reuse, and manage.

Slide 10

Slide 10 text

Use case
If you store and process data files or datasets to produce other data or machine learning models, and you want to
● track and save data and machine learning models the same way you capture code;
● create and switch between versions of data and ML models easily;
● understand how datasets and ML artifacts were built in the first place;
● compare model metrics among experiments;
● adopt engineering tools and best practices in data science projects;
DVC is for you!
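To make the idea of tracking data "the same way you capture code" and switching between versions more concrete, the sketch below imitates the general principle behind data versioning: each version of a data file is stored under its content hash, and only a small pointer file needs to live next to the code. This is a simplified illustration using the Python standard library, not DVC's implementation or API. The helper names `track` and `checkout`, the `cache` directory, and the `.ptr.json` pointer format are all hypothetical.

```python
from __future__ import annotations

import hashlib
import json
import shutil
import tempfile
from pathlib import Path


def track(data_file: Path, cache_dir: Path) -> Path:
    """Copy data_file into a content-addressed cache and write a pointer file.

    Hypothetical helper: only the small pointer would be committed to Git.
    """
    digest = hashlib.sha256(data_file.read_bytes()).hexdigest()
    cache_dir.mkdir(parents=True, exist_ok=True)
    cached = cache_dir / digest
    if not cached.exists():  # identical content is stored only once
        shutil.copy2(data_file, cached)
    pointer = data_file.with_name(data_file.name + ".ptr.json")
    pointer.write_text(json.dumps({"sha256": digest, "path": data_file.name}))
    return pointer


def checkout(pointer: Path, cache_dir: Path) -> Path:
    """Restore the data version recorded in a pointer file."""
    meta = json.loads(pointer.read_text())
    target = pointer.with_name(meta["path"])
    shutil.copy2(cache_dir / meta["sha256"], target)
    return target


def demo() -> tuple[str, str]:
    """Track two versions of a dataset, then switch back to the first."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        cache = root / "cache"
        data = root / "data.csv"

        data.write_text("x,y\n1,2\n")
        pointer_v1 = track(data, cache).read_text()  # what Git would keep for v1

        data.write_text("x,y\n1,2\n3,4\n")
        track(data, cache)  # v2 overwrites the pointer file
        latest = data.read_text()

        # Switching versions: restore the old pointer, then check out the data.
        pointer = root / "data.csv.ptr.json"
        pointer.write_text(pointer_v1)
        restored = checkout(pointer, cache).read_text()
        return latest, restored


if __name__ == "__main__":
    latest, restored = demo()
    print("latest version:", repr(latest))
    print("restored version:", repr(restored))
```

With DVC itself, the corresponding workflow is roughly `dvc add data.csv`, committing the generated `.dvc` file with Git, and running `dvc checkout` after switching Git revisions.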

Slide 11

Slide 11 text

CML

Slide 12

Slide 12 text

CML is to machine learning engineers what GitHub Actions is to software engineers.

Slide 13

Slide 13 text

About CML
Continuous Machine Learning (CML) is an open-source library for implementing continuous integration & delivery (CI/CD) in machine learning projects. Use it to automate parts of your development workflow, including model training and evaluation, comparing ML experiments across your project history, and monitoring changing datasets.
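The evaluation and comparison step can be pictured as a script that a CI job runs after training: it compares the metrics of the new run against a baseline and writes a Markdown report. The sketch below is a hypothetical illustration using only the Python standard library and is not part of CML. The `build_report` function and the metric names and values are invented for the example.

```python
from __future__ import annotations


def build_report(baseline: dict[str, float], candidate: dict[str, float]) -> str:
    """Return a Markdown table comparing candidate metrics to a baseline."""
    lines = [
        "## Model metrics",
        "",
        "| Metric | Baseline | Candidate | Change |",
        "|---|---|---|---|",
    ]
    # Only metrics present in both runs can be compared.
    for name in sorted(baseline.keys() & candidate.keys()):
        old, new = baseline[name], candidate[name]
        lines.append(f"| {name} | {old:.3f} | {new:.3f} | {new - old:+.3f} |")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Invented numbers, for illustration only.
    main_branch = {"accuracy": 0.910, "f1": 0.880}
    pull_request = {"accuracy": 0.925, "f1": 0.870}
    print(build_report(main_branch, pull_request))
```

In a CML workflow, a report file like this is the kind of artifact that gets posted back to the pull request as a comment, so reviewers see the metric changes next to the code changes.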

Slide 14

Slide 14 text

MLEM

Slide 15

Slide 15 text

About MLEM
MLEM is a tool to easily package, deploy and serve machine learning models. It seamlessly supports a variety of scenarios like real-time serving and batch processing.

Slide 16

Slide 16 text

Use case for MLEM
If you train machine learning models and you want to
● save machine learning models along with all meta-information that is required to run them;
● build your models into a ready-to-use format like Python packages or Docker images;
● deploy your models, easily switching between different providers when you need to;
● adopt engineering tools and best practices in data science projects;
MLEM is for you!
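Saving a model together with the meta-information needed to run it can be illustrated with a small standard-library sketch. This is not MLEM's API: the `save_model` and `load_model` helpers, the JSON layout, and the toy `ThresholdModel` class are hypothetical, and the recorded fields (model class, Python version, requirements) are only an assumption about what a serving environment might need.

```python
from __future__ import annotations

import json
import platform
import tempfile
from pathlib import Path


class ThresholdModel:
    """Toy stand-in for a trained model: predicts 1 when x exceeds a threshold."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold

    def predict(self, xs: list[float]) -> list[int]:
        return [int(x > self.threshold) for x in xs]


def save_model(model: ThresholdModel, path: Path, requirements: list[str]) -> None:
    """Write the model parameters plus the metadata needed to run it."""
    payload = {
        "model_class": type(model).__name__,
        "python_version": platform.python_version(),
        "requirements": requirements,  # e.g. pinned package names
        "params": {"threshold": model.threshold},
    }
    path.write_text(json.dumps(payload, indent=2))


def load_model(path: Path) -> tuple[ThresholdModel, dict]:
    """Rebuild the model from disk and return it with its metadata."""
    payload = json.loads(path.read_text())
    return ThresholdModel(**payload["params"]), payload


def demo() -> tuple[list[int], dict]:
    """Round-trip a model through disk and use it for prediction."""
    with tempfile.TemporaryDirectory() as tmp:
        model_path = Path(tmp) / "model.json"
        save_model(ThresholdModel(0.5), model_path, requirements=[])
        model, meta = load_model(model_path)
        return model.predict([0.2, 0.9]), meta


if __name__ == "__main__":
    predictions, meta = demo()
    print(predictions)
    print(meta["model_class"], meta["python_version"])
```

Keeping the metadata next to the model is what makes the later steps in the list possible: a packaging or deployment tool can read it to decide which runtime and dependencies to install.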

Slide 17

Slide 17 text

Iterative Game

Slide 18

Slide 18 text

Play DeeVee’s Ramen Run!
1st Prize: 75,000 Chimoney
2nd Prize: 50,000 Chimoney
3rd Prize: 25,000 Chimoney
SCAN ME!

Slide 19

Slide 19 text

MLOps Community Lagos meetup

Slide 20

Slide 20 text

Thank you Sponsors

Slide 21

Slide 21 text

THANK YOU!
Gift Ojeabulu
Co-founder & community lead at DatafestAfrica
Twitter: @GiftOjeabulu_