Validating your machine learning models with Minimal Effort using DeepChecks

A Talk at MLOps: Machine Learning in production New York City

https://mlopsworld.com/newyork/

What is Deepchecks?
When should I use Deepchecks?
Deepchecks validation structure.
When should I validate?
Use case.
Conclusion.

GiftOjeabulu

March 30, 2022

Transcript

  1. Who am I?

    - Data Scientist.
    - Founder of The African Data Community Newsletter.
    - Technical Writer, Public Speaker, Social Media Content creator, Machine Learning Thought Leader (Global AI Hub), Open-source & Community Advocate.

    Gift Ojeabulu CBBAnalytics @GiftOjeabulu_
  2. Learning Objective

    - What is Deepchecks?
    - When should I use Deepchecks?
    - Deepchecks validation structure.
    - When should I validate?
    - Use case.
    - Conclusion.
  3. Validating the accuracy, clarity, and details of data is necessary to mitigate any project defects. Without validating data, you risk basing decisions on imperfect data that does not accurately represent the situation at hand.
  4. Deepchecks is the leading tool for validating your machine learning models and data, and it enables doing so with minimal effort. Deepchecks accompanies you through various validation needs, such as:

    - Verifying your data integrity.
    - Inspecting its distributions.
    - Validating data splits.
    - Evaluating your model & comparing different models.
  5. Typical Validation Scenarios

    1. When you start working with a new dataset: Validate New Data.
    2. When you split the data (before training / various cross-validation splits / hold-out test set / …): Validate the Split.
    3. When you evaluate a model: Validate Model Performance.
  6. 12 Types of Issues – What Can Go Wrong?

    Data Integrity, Methodological Flaws, Model Performance, Fairness & Bias, Data Distribution
  7. Validation Structure

    Test Suites; Checks (Built-in or Custom, Display and Result); Conditions (Pass / Fail / Warning)
  8. Validation with Deepchecks

    Checks; Conditions (Pass / Fail / Warning); Test Suites

    https://github.com/deepchecks/deepchecks
  9. Check

    Each check enables inspecting a specific aspect of your data and models, such as data drift, duplicate values, etc. Each check can have 2 types of results:

    - A visual result meant for display (e.g. a figure or a table)
    - A return value that can be used for validating the expected check results
  10. Condition

    A condition is a function that can be added to a Check, which returns a pass ✓, fail ✖ or warning ! result, intended for validating the Check’s return value.
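
The check and condition slides above can be sketched in plain Python. This is a toy illustration of the idea, not the Deepchecks API: `CheckResult`, `duplicate_rows_check`, and `ratio_not_greater_than` are hypothetical names, and the thresholds are arbitrary. It shows a check that produces both a display result and a return value, and a condition that maps the return value to pass / warning / fail.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class CheckResult:
    """Hypothetical container for the two kinds of result a check produces."""
    value: float   # return value, consumed by conditions
    display: str   # human-readable result (stands in for a figure or table)


def duplicate_rows_check(rows: List[Tuple]) -> CheckResult:
    """Toy check: what fraction of rows repeat an earlier row?"""
    counts = Counter(rows)
    duplicates = sum(n - 1 for n in counts.values())
    ratio = duplicates / len(rows) if rows else 0.0
    display = f"{duplicates} duplicate row(s) out of {len(rows)} ({ratio:.0%})"
    return CheckResult(value=ratio, display=display)


def ratio_not_greater_than(
    fail_above: float, warn_above: float = 0.0
) -> Callable[[CheckResult], str]:
    """Build a condition that turns a check's return value into a status."""
    def condition(result: CheckResult) -> str:
        if result.value > fail_above:
            return "fail"
        if result.value > warn_above:
            return "warning"
        return "pass"
    return condition


rows = [(1, "a"), (2, "b"), (1, "a"), (3, "c")]
result = duplicate_rows_check(rows)
condition = ratio_not_greater_than(fail_above=0.5)
status = condition(result)

print(result.display)
print(status)
```

Keeping the condition separate from the check means the same check can be reused with stricter or looser thresholds without touching the inspection logic.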
  11. Suite

    A suite is an ordered collection of checks that can have conditions added to them. The Suite enables displaying a concluding report for all of the Checks that ran.
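
A suite as described above can be sketched as an ordered list of checks, each with an optional condition, plus a run step that collects one report. Again, this is a minimal plain-Python sketch and not how Deepchecks implements it: the `Suite` class, the two toy checks, and the sample data are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Optional, Tuple

Data = Dict[str, list]  # column name -> column values


def missing_ratio(data: Data) -> float:
    """Toy check: fraction of cells that are None."""
    values = [v for column in data.values() for v in column]
    return sum(v is None for v in values) / len(values) if values else 0.0


def constant_column_count(data: Data) -> float:
    """Toy check: number of columns holding at most one distinct value."""
    return float(sum(1 for column in data.values() if len(set(column)) <= 1))


class Suite:
    """Hypothetical suite: runs checks in insertion order and reports on all of them."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._checks: List[
            Tuple[str, Callable[[Data], float], Optional[Callable[[float], bool]]]
        ] = []

    def add(self, name, check, condition=None) -> "Suite":
        self._checks.append((name, check, condition))
        return self  # allow chaining

    def run(self, data: Data) -> List[Tuple[str, float, str]]:
        report = []
        for name, check, condition in self._checks:
            value = check(data)
            if condition is None:
                status = "no condition"
            else:
                status = "pass" if condition(value) else "fail"
            report.append((name, value, status))
        return report


data = {"age": [31, None, 45, 27], "country": ["NG", "NG", "NG", "NG"]}
suite = (
    Suite("toy data integrity")
    .add("missing values", missing_ratio, lambda ratio: ratio <= 0.2)
    .add("constant columns", constant_column_count, lambda count: count == 0)
    .add("column count", lambda d: float(len(d)))
)
report = suite.run(data)
for name, value, status in report:
    print(f"{name}: value={value} -> {status}")
```

Every check runs even when an earlier one fails, so the final report covers the whole suite instead of stopping at the first problem.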
  12. How to install Deepchecks

    Deepchecks requires Python 3 and can be installed using pip or conda, depending on the package manager you’re working with for most of your packages.

    - Using pip: pip install deepchecks
    - Using conda: conda install -c conda-forge deepchecks
    - Using Google Colab or Kaggle Kernel: !pip install deepchecks --user
  13. In order to run your first Deepchecks Suite, all you need is the data and model that you wish to validate. More specifically, you need:

    • Your train and test data (in Pandas DataFrames or NumPy arrays)
    • (optional) A supported model (including XGBoost, scikit-learn models, and many more). Required for running checks that need the model’s predictions.
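
With those inputs in hand, running a built-in suite might look like the sketch below. It assumes a Deepchecks release that exposes the `deepchecks.tabular` namespace (older releases used different import paths), and the helper name `run_full_suite` and its parameters are hypothetical, introduced only for this example.

```python
def run_full_suite(train_df, test_df, label, model=None):
    """Run Deepchecks' built-in full suite on a train/test pair of DataFrames.

    Assumptions: `train_df` and `test_df` are pandas DataFrames that both
    contain the column named by `label`; `model` is an optional fitted,
    supported model (for example a scikit-learn estimator).
    """
    # Imported inside the function so this file loads even where
    # deepchecks is not installed.
    from deepchecks.tabular import Dataset
    from deepchecks.tabular.suites import full_suite

    # Wrap each DataFrame so Deepchecks knows which column is the label.
    train_ds = Dataset(train_df, label=label)
    test_ds = Dataset(test_df, label=label)

    # Build the suite and run every check it contains.
    suite = full_suite()
    return suite.run(train_dataset=train_ds, test_dataset=test_ds, model=model)
```

In a notebook, something like `result = run_full_suite(train_df, test_df, label="target", model=clf)` followed by `result.show()` should render the report, and `result.save_as_html("report.html")` should write it to a file. Here `"target"` and `clf` are placeholders for your own label column and fitted model.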
  14. Deepchecks is fantastic and will only serve to make machine learning model validation an easier experience. Running all of those tests by hand would take hundreds of lines of code.