Slide 1

Slide 1 text

Introduction to Git & GitHub for Data Practitioners
Gift Ojeabulu

Slide 2

Slide 2 text

About me
- Sports Data Scientist & Software Developer at CBB Analytics.
- Technical Writer at Towards AI.

Slide 3

Slide 3 text

CONTENT
- README
- Overview of key ideas
- What is Git, GitHub & open source
- Why Git & GitHub for data practitioners
- Thank you

Slide 4

Slide 4 text

What are Git, GitHub & Open Source?
Presenter: Gift Ojeabulu

Slide 5

Slide 5 text

Git is software for tracking changes in any set of files. It is usually used to coordinate work among programmers who are collaboratively developing source code. GitHub is a hosting service for Git repositories and is commonly used to host open-source projects.
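As a minimal sketch of what "tracking changes" looks like in practice, the script below creates a throwaway repository, commits a small file, edits it, and asks Git what changed. The file name results.csv, its contents, and the demo user identity are hypothetical and chosen only for illustration.

```shell
#!/bin/sh
set -eu

# Work in a temporary directory so no existing project is touched.
repo=$(mktemp -d)
cd "$repo"

git init -q
# Hypothetical identity, stored only in this demo repository.
git config user.name "Demo User"
git config user.email "demo@example.com"

# First version of a (hypothetical) results file.
printf 'id,score\n1,0.5\n' > results.csv
git add results.csv
git commit -q -m "Add first results"

# Change the file, then ask Git what is different from the last commit.
printf 'id,score\n1,0.7\n' > results.csv
git --no-pager diff

# Record the change and list the history, newest first.
git commit -q -a -m "Update score"
git --no-pager log --oneline
```

Every commit is a snapshot that can be inspected or restored later, which is what makes the history useful for collaboration.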

Slide 6

Slide 6 text

Why Git & GitHub for Data Practitioners?
Presenter: Gift Ojeabulu

Slide 7

Slide 7 text

Data scientists often simply use Jupyter notebooks. Notebooks are probably good enough for research and exploration, but when models get into production, tools like Git, DVC, and DAGsHub are better options for reproducibility and experiment tracking.

Slide 8

Slide 8 text

Over 70M Developers
GITHUB NOVEMBER 2021 DEVELOPER SURVEY
As of November 2021, GitHub had over 73 million developers and more than 200 million repositories, including at least 28 million public repositories. It was the largest source code host as of November 2021.

Slide 9

Slide 9 text

Modern GitHub Profile README Customization
Presenter: Gift Ojeabulu

Slide 10

Slide 10 text

What are DVC & DAGsHub?
Presenter: Gift Ojeabulu

Slide 11

Slide 11 text

DVC is to data practitioners & machine learning engineers what Git is to software developers. DAGsHub is to machine learning engineers what GitHub is to software developers.

Slide 12

Slide 12 text

"Great discoveries & improvements invariably involve the cooperation of many minds."
- Alexander G. Bell

Slide 13

Slide 13 text

How can beginners contribute to open source?
1. Contributing to beginner-friendly ("good first issue") issues on GitHub
2. Writing use cases & articles
3. Giving talks & webinars on how to use a data science library

Slide 14

Slide 14 text

Quick check on some core Git keywords:
+ A repository is a collection of source code, together with the history of changes made to it.
+ git commit is a command used to record all staged files as a new snapshot in the local repository.
+ git add is a command used to add a file that is in the working directory to the staging area.
+ git push is a command used to send all committed changes in the local repository to the remote repository, so that all files and changes are visible to anyone with access to the remote repository.
+ git pull is a command used to fetch changes from the remote repository and merge them into the current branch, which updates both the local repository and the working directory.
+ git merge is a command used to combine the changes from another branch into the current branch.
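The keywords above can be tried end to end with the sketch below. It is an illustration under stated assumptions, not a recommended project layout: a local bare repository stands in for a remote such as one hosted on GitHub, and the directory names (remote.git, local, colleague), the file analysis.py, the branch name feature, and the demo identity are all hypothetical.

```shell
#!/bin/sh
set -eu

# Throwaway workspace so nothing on your machine is modified.
workdir=$(mktemp -d)
cd "$workdir"

# A bare repository plays the role of the remote (e.g. a GitHub repo).
git init -q --bare remote.git
git --git-dir=remote.git symbolic-ref HEAD refs/heads/main

# Local repository connected to that remote.
git init -q local
cd local
git symbolic-ref HEAD refs/heads/main   # name the first branch "main"
git config user.name "Demo User"        # hypothetical identity
git config user.email "demo@example.com"
git remote add origin "$workdir/remote.git"

echo "print('hello')" > analysis.py
git add analysis.py                     # working directory -> staging area
git commit -q -m "Add analysis script"  # staging area -> local repository
git push -q -u origin main              # local repository -> remote repository

# A colleague clones the remote and gets the first commit.
cd "$workdir"
git clone -q remote.git colleague

# Back in the local repository: work on a branch, then merge it.
cd "$workdir/local"
git checkout -q -b feature
echo "print('new feature')" >> analysis.py
git add analysis.py
git commit -q -m "Extend analysis script"
git checkout -q main
git merge -q feature                    # bring the branch's changes into main
git push -q origin main

# The colleague pulls: fetch from the remote, then update the current branch.
cd "$workdir/colleague"
git pull -q --ff-only origin main
git --no-pager log --oneline
```

After the final pull, the colleague's copy contains both commits, showing how push and pull keep separate working copies in sync through the shared remote.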

Slide 15

Slide 15 text

Thank you for your time.
@GiftOjeabulu_
LinkedIn: gift ojeabulu
Medium: @giftojeabulu