Hypertuning with Keras Tuner: Building Robust Image Classifiers

This talk explores automated hyperparameter tuning using Keras Tuner to optimize a neural network for the Fashion MNIST dataset. I discuss model optimization strategies, experimental tracking, and evaluation with K-Fold Cross-Validation.

📌 Tools & Libraries:

TensorFlow 2.18.0

Keras Tuner 1.4.7

Seaborn, Matplotlib, Scikit-learn

📌 Project Outcomes:

Identified optimal units = 352 and learning rate = 0.001

Achieved ~88.6% test accuracy

Demonstrated consistency across folds (std ~0.0053)

Pinpointed areas for further model refinement

🧠 Focus:
Hyperparameter optimization, search algorithms, reproducible experimentation, and performance diagnostics for deep learning.

📘 Full notebook:
View on Kaggle

🔊 Presented at:
Deep Learning IndabaX Uganda 2025 🇺🇬
Part of the IndabaX Research Fellow Program

Phillip Ssempeebwa

June 12, 2025


Transcript

  1. Project Goal & Workflow
     • The Challenge: Can we teach a machine to recognize clothing?
     • Build the best possible neural network to classify clothing images.
     • Workflow:
       1. Exploratory Data Analysis (EDA)
       2. Data Pipelines
       3. Hyperparameter Tuning
       4. Cross-Validation
       5. Final Training & Evaluation
  2. Dataset Overview: Fashion MNIST
     • 60,000 training images, 10,000 test images
     • Grayscale 28×28 pixel images of clothing items
     • 10 classes: T-shirt/top, Trouser, Pullover, Dress, etc.
     • Balanced dataset: 6,000 examples per class. This is ideal and simplifies our modeling process significantly.
  3. Exploratory Data Analysis (EDA)
     • Visualize samples from the dataset (a minimal sketch follows below)
     • Inspect the class distribution
     • Confirm the dataset is balanced
     • No major preprocessing needed for balance
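
The talk doesn't reproduce the notebook code, but a minimal EDA sketch along these lines covers the steps above; the grid layout and variable names are my own assumptions, not the notebook's.

```python
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf

# Load Fashion MNIST from the built-in Keras loader.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.fashion_mnist.load_data()
class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

# Visualize a 3x3 grid of training samples with their labels.
fig, axes = plt.subplots(3, 3, figsize=(6, 6))
for ax, img, label in zip(axes.flat, x_train, y_train):
    ax.imshow(img, cmap="gray")
    ax.set_title(class_names[label])
    ax.axis("off")
plt.tight_layout()
plt.show()

# Inspect the class distribution: each class should appear exactly 6,000 times.
print(dict(zip(*np.unique(y_train, return_counts=True))))
```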
  4. Data Preprocessing
     • Normalize pixel values to [0, 1]
     • Split 50K for training and 10K for validation (for tuning), as sketched below
     • Build tf.data pipelines for performance
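
A sketch of this preprocessing, continuing from the arrays loaded above; the exact split indices are an assumption, since the talk doesn't show them.

```python
# Normalize pixel values from [0, 255] to [0, 1].
x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0

# Carve a 10K validation set out of the 60K training images.
x_tr, y_tr = x_train[:50000], y_train[:50000]    # 50K for training
x_val, y_val = x_train[50000:], y_train[50000:]  # 10K for validation (for tuning)
```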
  5. Efficient Data Pipelines
     • Used TensorFlow's `tf.data.Dataset` API (see the sketch below)
     • Shuffling, batching, and prefetching enabled
     • Scales well to training & validation usage
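
A minimal version of such a pipeline, using the `x_tr`/`x_val` split from the previous sketch; the batch and shuffle-buffer sizes are assumptions, since the talk doesn't state them.

```python
import tensorflow as tf

BATCH_SIZE = 64  # assumed; not stated in the talk

# Shuffle, batch, and prefetch so the accelerator never waits on input.
train_ds = (tf.data.Dataset.from_tensor_slices((x_tr, y_tr))
            .shuffle(10_000)
            .batch(BATCH_SIZE)
            .prefetch(tf.data.AUTOTUNE))

# Validation data only needs batching and prefetching.
val_ds = (tf.data.Dataset.from_tensor_slices((x_val, y_val))
          .batch(BATCH_SIZE)
          .prefetch(tf.data.AUTOTUNE))
```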
  6. Hyperparameter Tuning
     How do we choose the right settings? We don't guess; we search.
     • The Problem: Our model has "dials" called hyperparameters.
     • How many neurons should be in a layer?
     • How fast should the model learn (learning rate)?
  7. Hyperparameter Tuning (Continued)
     • The Solution: Automated Hypertuning
     • We define a search space for our model's architecture.
     • We use the Keras Tuner library to automatically build and test dozens of different models.
     • The tuner's goal: find the combination with the highest validation accuracy.
  8. Hyperparameter Tuning
     • How do we choose the right settings? We don't guess; we search.
     • Used Keras Tuner with Hyperband
     • Tuned:
       - Number of Dense units (32–512)
       - Learning rate (0.01, 0.001, 0.0001)
     • EarlyStopping to prevent overfitting (a sketch of this search follows)
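
A sketch of this Hyperband search with the search space described above (32–512 units, three learning rates); the `max_epochs` budget and the model skeleton are assumptions on my part.

```python
import keras_tuner as kt
import tensorflow as tf

def build_model(hp):
    # One hidden Dense layer whose width is searched in [32, 512].
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(28, 28)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(hp.Int("units", min_value=32, max_value=512, step=32),
                              activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    # Learning rate searched over the three values from the slide.
    lr = hp.Choice("learning_rate", values=[1e-2, 1e-3, 1e-4])
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=lr),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

tuner = kt.Hyperband(build_model,
                     objective="val_accuracy",
                     max_epochs=10,  # assumed tuning budget
                     directory="tuning",
                     project_name="fashion_mnist")

# EarlyStopping keeps individual trials from overfitting.
stop_early = tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=3)
tuner.search(train_ds, validation_data=val_ds, callbacks=[stop_early])
```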
  9. The "Champion" Architecture
     • The search is complete. We have a winner! (Retrieval is sketched below.)
     • Optimal Hyperparameters Found:
       - Number of Units: 448
       - Learning Rate: 0.001
     • Conclusion: Keras Tuner determined that a model with a single hidden layer of 448 neurons and a learning rate of 0.001 performed best out of all the combinations it tested.
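
Retrieving the winner uses the standard Keras Tuner calls; this is a sketch, not the notebook's exact code.

```python
# Pull the best hyperparameters found by the search.
best_hp = tuner.get_best_hyperparameters(num_trials=1)[0]
print(best_hp.get("units"), best_hp.get("learning_rate"))  # e.g. 448 and 0.001

# Build a fresh, untrained model with the winning settings.
champion = tuner.hypermodel.build(best_hp)
```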
  10. Is the Result a Fluke? Cross-Validation
     • How can we be sure our "champion" model is truly stable?
     • The Problem: A single validation set might have been "lucky". We need a more robust test.
  11. Is the Result a Fluke? Cross-Validation
     • The Solution: 5-Fold Cross-Validation (a sketch follows)
     • We split our entire training set into 5 "folds".
     • We trained 5 separate models on different combinations of these folds.
     • We collected the accuracy score from each run.
     • Result: We expect the accuracy to be very consistent across all 5 folds, proving our model's performance is stable and reliable.
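
The talk doesn't show its cross-validation loop; here is a minimal sketch with scikit-learn's `KFold`, reusing `best_hp` from the search (epochs and batch size assumed).

```python
import numpy as np
from sklearn.model_selection import KFold

scores = []
for train_idx, val_idx in KFold(n_splits=5, shuffle=True, random_state=42).split(x_train):
    # Train a fresh copy of the champion model on each fold.
    model = tuner.hypermodel.build(best_hp)
    model.fit(x_train[train_idx], y_train[train_idx],
              epochs=5, batch_size=64, verbose=0)
    _, acc = model.evaluate(x_train[val_idx], y_train[val_idx], verbose=0)
    scores.append(acc)

# Consistent fold accuracies (low std) indicate a stable model.
print(f"mean accuracy = {np.mean(scores):.4f}, std = {np.std(scores):.4f}")
```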
  12. Process & Results
     • Training: We trained our champion model on all 60,000 training images.
     • Final Exam: We evaluated it on the 10,000 test images it had never seen before (sketched below).
     • Key Insight from the Confusion Matrix: The model is highly accurate but struggles most with "Shirt", which it often confuses with "T-shirt/top" and "Coat". This gives us a clear direction for future improvements.
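
A sketch of the final training and evaluation, with a confusion-matrix heatmap to surface the Shirt vs. T-shirt/top and Coat confusions described above; the number of epochs is assumed.

```python
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
from sklearn.metrics import confusion_matrix

# Retrain the champion configuration on all 60,000 training images.
final_model = tuner.hypermodel.build(best_hp)
final_model.fit(x_train, y_train, epochs=10, verbose=0)

# Evaluate once on the held-out 10,000 test images.
test_loss, test_acc = final_model.evaluate(x_test, y_test, verbose=0)
print(f"Test accuracy: {test_acc:.4f}")

# The confusion matrix reveals which classes the model mixes up.
y_pred = np.argmax(final_model.predict(x_test, verbose=0), axis=1)
sns.heatmap(confusion_matrix(y_test, y_pred), annot=True, fmt="d",
            xticklabels=class_names, yticklabels=class_names)
plt.xlabel("Predicted")
plt.ylabel("True")
plt.show()
```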
  13. Analysis of Individual Predictions
     • Visual Analysis:
       - Green (Correct): The model correctly identifies most items, from ankle boots to trousers.
       - Red (Incorrect): We can see an example of the model mistaking a "Shirt" for a "Coat", confirming our analysis from the confusion matrix.
  14. Conclusion & Next Steps
     • Achieved strong results with a minimal architecture
     • Validated stability with cross-validation
     • Next: We'll try CNNs, regularization, and data augmentation