1. Introduction to AutoGluon
2. AutoGluon's Architecture: How It Works (Stacking and Bagging)
3. Mastering Tabular Prediction: A Deep Dive into TabularPredictor
4. Beyond Tables: Applications in Multimodal and Time Series
5. Advanced Topics & Deployment: Feature Engineering and Cloud Deployment
6. Summary and Best Practices
Code" AutoGluon is an open-source library that automates machine learning tasks, enabling users to train and deploy high-accuracy models with minimal code. Its philosophy is distilled into its iconic "3 lines of code": from autogluon.tabular import TabularPredictor # 1. Initialize predictor and fit on data predictor = TabularPredictor(label="class").fit("train.csv") # 2. Predict on test data predictions = predictor.predict("test.csv") # 3. (Optional) Evaluate model performance predictor.leaderboard("test.csv") This simplicity represents AutoGluon's core design principle: abstracting away immense complexity into a simple, intuitive API. 3
In independent benchmarks, AutoGluon is consistently dominant across a wide range of tasks and competitors.

- Landslide victory over traditional models: a win rate of over 99% against powerful standalone models like LightGBM and XGBoost.
- Outperforms other AutoML systems: a win rate of over 80% against other major AutoML frameworks such as lightautoml and H2OAutoML.

This objective data provides strong evidence for the claim that AutoGluon is "state-of-the-art."
What sets AutoGluon apart from many other AutoML tools is its core philosophy.

- Traditional AutoML (CASH): focuses on Combined Algorithm Selection and Hyperparameter optimization, searching a vast space for the single best model and its hyperparameters.
- AutoGluon's approach: succeeds by ensembling multiple models and stacking them in multiple layers. The belief is that, within a given time budget, wisely combining many good models yields better results than searching for a single perfect model.

This design philosophy is the root of AutoGluon's speed, robustness, and high accuracy.
AutoGluon provides dedicated Predictor classes for major machine learning tasks.

| Data Type | Task | Main Predictor API |
| --- | --- | --- |
| Tabular data | Classification / regression | TabularPredictor |
| Multimodal | Image, text, and table combinations | MultiModalPredictor |
| Time series data | Forecasting future values | TimeSeriesPredictor |
| Image data | Image classification, object detection | ImagePredictor, ObjectDetector |

In recent versions, text and image tasks are increasingly being integrated into MultiModalPredictor, aiming for a single, powerful, and flexible API.
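The table maps to concrete entry points. As a quick orientation, these are the corresponding imports for the three main APIs (ImagePredictor lives in the legacy autogluon.vision package in older releases):

```python
# One Predictor class per task family
from autogluon.tabular import TabularPredictor
from autogluon.multimodal import MultiModalPredictor
from autogluon.timeseries import TimeSeriesPredictor
```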
A GPU is essential for accelerating MultiModalPredictor and deep learning models.

1. Install the NVIDIA drivers and CUDA Toolkit appropriate for your GPU.
2. Install CUDA-enabled PyTorch (example: CUDA 12.1):
   `pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121`
3. Install AutoGluon:
   `pip install autogluon`
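After installation, it is worth verifying that PyTorch can actually see the GPU. A quick sanity check using standard PyTorch calls (nothing AutoGluon-specific):

```python
import torch

# True only if the CUDA build of PyTorch is installed and a GPU is visible
print(torch.cuda.is_available())
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # e.g., "NVIDIA GeForce RTX 4090"
```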
AutoGluon has grown from a single library into a foundational platform for a suite of tools that automate advanced ML tasks.

- autogluon-cloud: automates model training and deployment on AWS SageMaker.
- autogluon-rag: builds domain-specific question-answering systems (RAG) in 3 lines of code.
- autogluon-fair: a post-processing toolkit to ensure fairness in model predictions.
- autogluon-assistant: a multi-agent system that uses LLMs to automate end-to-end workflows from raw data to ML solutions.

This expanding ecosystem shows AutoGluon's long-term vision and its commitment to tackling cutting-edge challenges.
AutoGluon's architecture rests on three key principles:

1. Training diverse models: trains model families with different characteristics, such as gradient-boosted trees and neural networks.
2. Bagging: trains each model multiple times on different subsets (subsamples) of the data to improve stability.
3. Stack ensembling: uses the predictions of trained models as "features" for a new set of models, stacking them in multiple layers to maximize predictive power.
AutoGluon combines bagging with cross-validation to achieve two important goals simultaneously.

The process (e.g., 8-fold CV):

1. Partitioning: the training data is split into 8 disjoint subsets (folds).
2. Model training: for each fold k, a model is trained on the other 7 folds.
3. Result: this yields 8 instances of the same model architecture, which together form the "bagged model."
The value of this procedure lies not only in its primary purpose but also in its by-product.

- Variance reduction (the primary purpose of bagging): averaging the predictions of multiple models makes the final prediction more stable and less prone to overfitting than any single model's prediction.
- Generation of out-of-fold (OOF) predictions: for each data point, we get a "clean" prediction from a model that was not trained on that point. These OOF predictions are the key to making the next step, stack ensembling, work robustly.
What are out-of-fold (OOF) predictions? For each data point in the training set, the OOF prediction is the one made by a model that did not see that data point during training.

Why are they crucial? They prevent data leakage. If a stacker model were trained on the in-sample predictions of a base model (i.e., predictions on data the base model has already seen), the stacker would learn to over-trust those predictions, leading to severe overfitting and poor performance on unseen data. OOF predictions simulate how a model behaves on unseen data, creating a reliable and valid feature set for the next layer of models (see the sketch below).
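To make the mechanics concrete, here is a minimal sketch of k-fold bagging with OOF collection, written with scikit-learn. It illustrates the procedure described above, not AutoGluon's actual implementation; `make_oof_predictions` and `bagged_predict` are hypothetical helper names, and `X`, `y` are assumed to be NumPy arrays for a binary classification task.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import KFold

def make_oof_predictions(X, y, n_folds=8):
    """Train one model per fold; collect a 'clean' prediction for every row."""
    oof = np.zeros(len(y))   # one out-of-fold prediction per training row
    fold_models = []         # the k instances that together form the bagged model
    for train_idx, val_idx in KFold(n_splits=n_folds, shuffle=True, random_state=0).split(X):
        model = RandomForestClassifier(random_state=0)
        model.fit(X[train_idx], y[train_idx])
        # Each row is predicted only by the model that never saw it
        oof[val_idx] = model.predict_proba(X[val_idx])[:, 1]
        fold_models.append(model)
    return oof, fold_models

def bagged_predict(fold_models, X_new):
    """At inference time, the bagged model averages all fold models' predictions."""
    return np.mean([m.predict_proba(X_new)[:, 1] for m in fold_models], axis=0)
```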
Stack ensembling uses model predictions as new features, arranging models in a hierarchical structure.

- Layer 1 (base layer): multiple bagged models (LightGBM, CatBoost, neural networks, etc.) are trained on only the original features and generate OOF predictions.
- Layer 2 and above (stacker layers): the models in each layer receive "original features + OOF predictions from all lower layers" as input.

Why is this powerful? The higher-layer models learn how to combine the predictions of the lower-layer models, while still being able to directly capture patterns in the original data that every lower-layer model missed. This prevents information loss and maximizes predictive power. A minimal sketch follows.
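Continuing the earlier sketch, a layer-2 stacker can be expressed in a few lines: concatenate the original features with the layer-1 OOF predictions and fit a new model on the result. Again, this is illustrative rather than AutoGluon's internal code; `oof_lgbm` and `oof_catboost` stand for OOF arrays produced as in `make_oof_predictions` above.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Layer-2 input = original features + OOF predictions from every layer-1 model
stack_features = np.column_stack([X, oof_lgbm, oof_catboost])

# The stacker sees only "clean" base-model predictions, so no data leaks
stacker = LogisticRegression(max_iter=1000).fit(stack_features, y)
```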
The top of the stack contains a single special model named WeightedEnsemble_L<X>.

- Role: this model does not use the original features. It takes the predictions of the stacker models in the layer below and learns an optimal weighted average of them.
- Algorithm: a greedy algorithm efficiently finds the weights that maximize the validation score.

This final step, a form of "blending," fine-tunes the combination of the most powerful stacker models to squeeze out the last drop of performance.
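The greedy procedure can be sketched in a few lines. This follows the classic ensemble-selection idea (Caruana et al.) that AutoGluon's weighted ensemble builds on, in simplified form; `score_fn` is any higher-is-better validation metric, and all names are illustrative.

```python
import numpy as np

def greedy_weighted_ensemble(val_preds, y_val, score_fn, n_iters=25):
    """val_preds: list of arrays, one validation-prediction vector per model."""
    chosen = []  # indices of selected models, with replacement
    for _ in range(n_iters):
        # Try adding each candidate model; keep the addition that scores best
        best = max(
            range(len(val_preds)),
            key=lambda i: score_fn(
                y_val, np.mean([val_preds[j] for j in chosen + [i]], axis=0)
            ),
        )
        chosen.append(best)
    # A model's final weight is simply how often the greedy loop picked it
    return np.bincount(chosen, minlength=len(val_preds)) / len(chosen)
```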
Model name suffixes in the leaderboard() output reflect AutoGluon's architecture.

| Suffix | Meaning | Example |
| --- | --- | --- |
| _BAG | A bagged ensemble model | LightGBM_BAG_L1 |
| _L<x> | A model trained at stack level x (L1 is the base) | CatBoost_BAG_L2 |
| _FULL | A model retrained on all data (for fast inference) | LightGBM_BAG_L1_FULL |
| _DSTL | A lightweight model created via model distillation | LightGBM_DSTL |
| /T<x> | A model from HPO (hyperparameter optimization) trial x | LightGBM/T8 |

Example: CatBoost_BAG_L2 is instantly recognizable as a "bagged CatBoost ensemble trained at the second layer of the stack."
The overall pipeline: original data → base models (L1) → OOF predictions → stacker models (L2) → ... → weighted ensemble.

The total number of models trained is approximately M × N × K + 1, where:

- M: number of stack layers
- N: number of model types per layer
- K: number of bagging folds
- +1: the final weighted ensemble model

For example, 2 stack layers × 5 model types × 8 folds + 1 = 81 trained model instances. This robust, multi-layered approach is why AutoGluon achieves performance that surpasses other AutoML tools.
TabularPredictor is AutoGluon's API for structured data such as CSV files, Parquet files, and pandas DataFrames.

- Automatic task detection: it identifies the task (binary classification, multiclass classification, or regression) from the data type of the label column.
- End-to-end automation: it manages the entire machine learning pipeline, from data preprocessing and feature engineering to model training and ensembling.

Now, let's explore how to use TabularPredictor in detail.
We'll use the Adult Census dataset to predict whether an individual's income exceeds $50K.

```python
from autogluon.tabular import TabularDataset, TabularPredictor

# Load training data from an S3 bucket
# TabularDataset is a subclass of pandas DataFrame
train_data = TabularDataset('https://autogluon.s3.amazonaws.com/datasets/Inc/train.csv')

# Specify the name of the label column
label = 'class'

# Display the first 5 rows of the data
train_data.head()
```
Training (fit): simply initialize TabularPredictor and call the fit() method to start training.

```python
# Initialize the Predictor with the label column, then run fit
predictor = TabularPredictor(label=label).fit(train_data=train_data)
```

During fit(), the logs report:

- the path where AutoGluon saves the models
- the inferred problem type ('binary')
- information on the train/validation data split
- the training status and validation score of each model
Prediction and evaluation: use the trained predictor to make predictions on unseen test data and evaluate its performance.

```python
# Load the test data
test_data = TabularDataset('https://autogluon.s3.amazonaws.com/datasets/Inc/test.csv')

# Make predictions on the test data (dropping the label column)
predictions = predictor.predict(test_data.drop(columns=[label]))
print(predictions.head())
# > 0    <=50K
# > 1    <=50K
# > 2     >50K
# > ...

# Evaluate the model's performance on the entire test dataset
performance = predictor.evaluate(test_data)
print(performance)
# > {'accuracy': 0.863, 'balanced_accuracy': 0.792, ...}
```
The leaderboard() method is the central tool for reviewing the performance of all trained models.

```python
# Also display performance on the test data
predictor.leaderboard(test_data, silent=True)
```

Example leaderboard output:

| model | score_test | score_val | pred_time_test | fit_time |
| --- | --- | --- | --- | --- |
| WeightedEnsemble_L2 | 0.874 | 0.852 | 2.45 | 150.3 |
| CatBoost_BAG_L1 | 0.871 | 0.849 | 0.12 | 25.7 |
| LightGBM_BAG_L1 | 0.869 | 0.845 | 0.25 | 18.9 |
| XGBoost_BAG_L1 | 0.868 | 0.843 | 0.31 | 22.1 |
| ... | ... | ... | ... | ... |
Each leaderboard column supports evaluating models from a different perspective.

| Column | Description |
| --- | --- |
| model | The model's name; its architecture can be inferred from the naming convention. |
| score_test | Evaluation score on the test data; indicates final generalization performance. |
| score_val | Evaluation score on the validation data; used for model selection and tuning. |
| pred_time_test | Prediction time (seconds) over the entire test set; an indicator of inference speed. |
| fit_time | Training time (seconds); an indicator of training cost. |

Usually, the best-performing model is the WeightedEnsemble model. The leaderboard lets you compare these trade-offs across all models at a glance.
To trade off speed and accuracy, you can control the training process with arguments to fit().

- time_limit: the maximum training time in seconds. Generally, a longer limit allows more models to be trained and tuned, leading to better accuracy.
- presets: a collection of settings that control the trade-off between accuracy and training/inference speed.

Choosing the right preset for your use case is key to mastering AutoGluon.
| Preset | Description |
| --- | --- |
| 'best_quality' | Maximizes accuracy, sparing no computational resources. Ideal for competitions like Kaggle. |
| 'high_quality' | Balances high accuracy with fast inference. Suitable for production environments. |
| 'good_quality' | Good accuracy with very fast inference. Useful when inference speed is a priority. |
| 'medium_quality' | Fast training time. Ideal for initial prototyping and spotting data issues. (Default) |
| 'experimental' | Added in v1.2. Uses pre-trained foundation models and parallel training; pursues maximum accuracy, disregarding inference speed. (No GPU support.) |
For projects where accuracy is the top priority, call fit like this:

```python
# Specify the evaluation metric as ROC AUC
eval_metric = 'roc_auc'

# Set a training time limit of 1 hour
time_limit = 3600

predictor_best = TabularPredictor(
    label=label,
    eval_metric=eval_metric
).fit(
    train_data=train_data,
    time_limit=time_limit,
    presets='best_quality'  # Select the best-quality preset
)

predictor_best.leaderboard(test_data, silent=True)
```

With this setting, AutoGluon uses the available time to its fullest, employing multi-layer stacking and extensive bagging to maximize accuracy.
AutoGluon selects and tunes models according to the evaluation metric specified via eval_metric in the fit call.

Why is this important? The default metric (e.g., 'accuracy' for classification) may not align with business goals. With imbalanced data, for example, 'accuracy' can give a misleadingly high estimate of a model's performance.

| Task | Common evaluation metrics |
| --- | --- |
| Classification | 'accuracy', 'balanced_accuracy', 'roc_auc', 'f1' |
| Regression | 'root_mean_squared_error', 'mean_absolute_error', 'r2' |

Selecting the eval_metric that best matches the business problem is crucial for building a practical model.
The workflow for regression is almost identical to classification. If the label column contains continuous values (e.g., age), AutoGluon automatically infers the problem type as 'regression'.

```python
# Prepare data for regression (reusing the same data for demonstration)
regression_train_data = train_data
regression_label = 'age'  # Change the label to 'age' (a numeric column)

# Specify R-squared as the evaluation metric
regression_predictor = TabularPredictor(
    label=regression_label,
    eval_metric='r2'
).fit(
    train_data=regression_train_data,
    time_limit=600  # 10 minutes
)

regression_predictor.leaderboard(silent=True)
```
Advanced users may want fine-grained control over specific models or hyperparameters. This is achieved with the hyperparameters argument (see the sketch below).

- How to specify: a dictionary keyed by model name.
- Value types:
  - Fixed value: 'learning_rate': 0.05
  - Search space: use autogluon.core.space (ag.space) to define a range:
    - ag.space.Int(lower, upper): integer range
    - ag.space.Real(lower, upper): real-valued range
    - ag.space.Categorical(values): list of categorical values
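A minimal sketch of how these pieces fit together, following the ag.space notation above. The model keys 'GBM' and 'NN_TORCH' are real AutoGluon model names, but the specific values are illustrative, and the exact module path for search spaces varies across AutoGluon versions:

```python
import autogluon.core as ag
from autogluon.tabular import TabularPredictor

hyperparameters = {
    'GBM': {  # LightGBM
        'learning_rate': 0.05,                           # fixed value
        'num_leaves': ag.space.Int(lower=16, upper=96),  # search space
    },
    'NN_TORCH': {},  # train the neural network with default settings
}

predictor_custom = TabularPredictor(label=label).fit(
    train_data=train_data,
    hyperparameters=hyperparameters,
    hyperparameter_tune_kwargs='auto',  # enable HPO over the declared spaces
)
```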
AutoGluon makes it easy to persist trained models for reuse and deployment.

Saving and loading a model: the entire predictor object (including the preprocessing pipeline, all models, and the ensemble logic) is saved as a folder at the specified path.

```python
# Specify the save directory when creating the predictor;
# everything is persisted there automatically during fit
predictor = TabularPredictor(label=label, path='ag_models/').fit(train_data)

# Load the saved model in a new session
predictor_loaded = TabularPredictor.load('ag_models/')

# Make predictions with the loaded model
new_predictions = predictor_loaded.predict(test_data)
```
- refit_full(): retrains the best model from the leaderboard on the entire dataset (train + validation). It may slightly improve performance, and it significantly speeds up inference because the refit model no longer needs bagging.

  `predictor.refit_full()`

- distill(): uses model distillation to condense the knowledge of a large ensemble into a smaller, faster single model. Effective when you need dramatically faster inference and can accept a slight drop in accuracy.

  `distilled_model_names = predictor.distill(time_limit=300)`
Understanding the basis for a model's predictions is essential for interpretability and debugging. AutoGluon computes feature importance via permutation importance, a robust method that measures how much a model's performance drops when a feature's values are randomly shuffled.

```python
# Calculate feature importance using the test data
feature_importance = predictor.feature_importance(test_data)
print(feature_importance)
```

Example output:

| feature | importance |
| --- | --- |
| age | 0.085 |
| capital-gain | 0.062 |
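For intuition, permutation importance can be hand-rolled in a few lines. This is a simplified sketch of the idea, not AutoGluon's implementation (which handles repeats, p-values, and metric choice internally); it assumes a classification predictor so that evaluate() returns an 'accuracy' key, as shown earlier.

```python
import numpy as np

def permutation_importance(predictor, data, feature, n_repeats=5):
    """Average drop in accuracy when `feature` is randomly shuffled."""
    base_score = predictor.evaluate(data)['accuracy']
    drops = []
    for _ in range(n_repeats):
        shuffled = data.copy()
        # Destroy the feature's relationship to the label, keep its distribution
        shuffled[feature] = np.random.permutation(shuffled[feature].values)
        drops.append(base_score - predictor.evaluate(shuffled)['accuracy'])
    return float(np.mean(drops))
```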
A higher importance value means the feature matters more to the model's predictions.

- Negative importance suggests the feature may be actively harmful: performance might improve if the model is retrained without it.

This makes feature importance a powerful diagnostic tool for checking whether the model relies on noisy or irrelevant features.
TabularPredictor is AutoGluon's core API for structured data.

- fit(), predict(), evaluate(), and leaderboard() form the basic workflow.
- time_limit and presets are the primary levers for trading accuracy against resources.
- Advanced usage includes hyperparameters customization and deployment optimization with refit_full() and distill().
- feature_importance() enhances model interpretability.
- MultiModalPredictor: a unified API for complex tasks involving combinations of text, images, and tabular data.
- TimeSeriesPredictor: a dedicated API for forecasting future values from time series data.

Next, we will explore how to use these powerful Predictors.
AutoGluon makes image classification easy by leveraging powerful pre-trained models, such as those from the TIMM library.

Step 1: Data preparation. If images are organized into folders named by class, they can be loaded with ImageDataset.from_folders.

```python
from autogluon.vision import ImageDataset

url = 'https://autogluon.s3.amazonaws.com/datasets/shopee-iet.zip'

# Load training and test data
train_data, _, test_data = ImageDataset.from_folders(url)
print(train_data.head())
# >                                                image  label
# > 0  /tmp/shopee-iet/train/00/000a6c313cf585e50935...      0
# > 1  /tmp/shopee-iet/train/00/0013b2d13de788a29a3a...      0
```
Prediction: predict the class of a new image with the trained model.

```python
image_path = test_data.iloc[0]['image']
prediction = predictor.predict(image_path)
probabilities = predictor.predict_proba(image_path)
```

Feature extraction: extract high-dimensional feature vectors from images, which is very useful for downstream tasks such as similar-image search.

```python
image_features = predictor.extract_embedding(test_data)
print(image_features.shape)
# > (12186, 768)
```
MultiModalPredictor can train powerful NLP models by leveraging foundation models such as Hugging Face Transformers.

Use case 1: sentiment analysis (classification).

```python
import pandas as pd
from autogluon.multimodal import MultiModalPredictor

train_df = pd.DataFrame({
    'sentence': ["it's a charming journey", "It's slow, very slow."],
    'label': [1, 0]
})

predictor_sentiment = MultiModalPredictor(label='label').fit(train_df)
predictor_sentiment.predict({'sentence': ["what a wonderful movie!"]})
# > 1
```
Use case 2: sentence similarity (regression). Predict a score for how semantically similar two sentences are.

```python
train_df_similarity = pd.DataFrame({
    'sentence1': ["A plane is taking off.", "A man is playing a large flute."],
    'sentence2': ["An air plane is taking off.", "A man is playing a flute."],
    'score': [5.00, 3.80]  # Similarity score from 0 to 5
})

# The problem is auto-inferred as regression because the label is a float
predictor_similarity = MultiModalPredictor(label='score').fit(train_df_similarity)

predictor_similarity.predict({
    'sentence1': ["A woman is playing the piano."],
    'sentence2': ["A man is playing the guitar."]
})
# > 2.35
```

The flexibility of one API handling both classification and regression is a key advantage.
TimeSeriesPredictor is a dedicated tool for forecasting future values from historical data.

Step 1: Data preparation (TimeSeriesDataFrame). Time series data must be converted into the special TimeSeriesDataFrame format, which requires three columns:

- item_id: a unique identifier for each time series
- timestamp: the timestamp
- target: the value to forecast

```python
import pandas as pd
from autogluon.timeseries import TimeSeriesDataFrame

# A toy long-format DataFrame with the three required columns
# (illustrative data; any DataFrame with these columns works)
df = pd.DataFrame({
    "item_id": ["A"] * 3 + ["B"] * 3,
    "timestamp": list(pd.date_range("2024-01-01", periods=3)) * 2,
    "target": [10, 12, 11, 5, 6, 7],
})

# Convert a pandas DataFrame into a TimeSeriesDataFrame
data = TimeSeriesDataFrame.from_data_frame(
    df,
    id_column="item_id",
    timestamp_column="timestamp",
)
```
Step 2: Training. The most important argument to fit is prediction_length, which specifies how many steps into the future to forecast.

```python
from autogluon.timeseries import TimeSeriesPredictor

# Train a model to forecast 48 steps into the future
prediction_length = 48

predictor_ts = TimeSeriesPredictor(
    prediction_length=prediction_length,
    path="autogluon-ts-model",
    target="target",
    eval_metric="sMAPE",
)

# predictor_ts.fit(train_data, presets="medium_quality")
```

AutoGluon automatically trains and ensembles a diverse set of time series models, from statistical models like ARIMA to modern foundation models like Chronos.
Step 3: Prediction. predict() generates forecasts for the prediction_length steps immediately following the training data. The output includes not just the point forecast (the mean column) but also prediction quantiles (0.1, 0.9, etc.) that form prediction intervals.

```python
# predictions = predictor_ts.predict(train_data)
# print(predictions)
```

Evaluation: leaderboard() can be used to evaluate each model's performance on a held-out test set.

```python
# leaderboard = predictor_ts.leaderboard(test_data)
# print(leaderboard)
```
AutoGluon's full power is demonstrated in its ability to build predictive models that combine tabular, text, and image data. We'll illustrate this with the PetFinder dataset (predicting pet adoption speed).

What to learn from this example:

- The user's main job becomes data preparation and metadata definition, not model building.
- By correctly describing the data, you unlock the full potential of a highly complex and powerful automation pipeline.
Step 1: Load the data. The dataset contains tabular data (age, breed), text descriptions, and paths to image files.

```python
import os

import pandas as pd

# Load the data (requires prior download and extraction)
dataset_path = './ag_petfinder_tutorial/petfinder_processed'
train_data = pd.read_csv(f'{dataset_path}/train.csv', index_col=0)

# Convert the image column to absolute paths
image_col = 'Images'

def path_expander(path, base_folder):
    # Keep the first of the ';'-separated image paths and make it absolute
    return os.path.abspath(os.path.join(base_folder, path.split(';')[0]))

train_data[image_col] = train_data[image_col].apply(
    lambda ele: path_expander(ele, base_folder=dataset_path)
)
```
Step 2: Define the feature metadata. This is the most crucial step: we explicitly tell AutoGluon which columns contain which type of data (especially images and text).

```python
from autogluon.tabular import FeatureMetadata

# First, infer basic metadata from the DataFrame
feature_metadata = FeatureMetadata.from_df(train_data)

# Next, mark the special types (image path and text)
image_col = 'Images'
text_col = 'Description'
feature_metadata = feature_metadata.add_special_types({
    image_col: ['image_path'],
    text_col: ['text']
})

print(feature_metadata)
```

This allows AutoGluon to internally use appropriate models for each column (e.g., ResNet for images, BERT for text).
Step 3: Get the multimodal configuration. To handle multimodal data, we use the 'multimodal' preset, which includes configurations for image and text models.

```python
from autogluon.tabular.configs.hyperparameter_configs import get_hyperparameter_config

hyperparameters = get_hyperparameter_config('multimodal')
```

Step 4: Run fit. Pass the prepared data, custom metadata, and hyperparameters to TabularPredictor.fit.

```python
from autogluon.tabular import TabularPredictor

label = 'AdoptionSpeed'

predictor_mm = TabularPredictor(label=label).fit(
    train_data=train_data,
    hyperparameters=hyperparameters,
    feature_metadata=feature_metadata,
    time_limit=1800,  # 30 minutes (GPU recommended)
)
```
After training, check the performance of each model on the leaderboard.

```python
# Display the leaderboard on the test data
leaderboard = predictor_mm.leaderboard(test_data)
print(leaderboard)
```

The leaderboard shows the performance of the AG_AUTOMM model (MultiModalPredictor) on its own, as well as the WeightedEnsemble model that combines it with the other tabular models. This lets you see how much integrating information from different modalities contributed to prediction accuracy.
AutoGluon removes much of the burden of manual feature engineering. The following processes run automatically under the hood:

- Datetime: automatically converted into multiple numerical features such as year, month, day, and dayofweek (see the sketch below).
- Text:
  - N-gram features: extracts word and character n-grams to generate high-dimensional sparse feature vectors.
  - Special features: computes text statistics such as word count, character count, and the proportion of uppercase letters.
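As a rough illustration of the datetime expansion, here is what the generated columns amount to in plain pandas (a simplified sketch; AutoGluon's internal feature generators do this, and more, automatically):

```python
import pandas as pd

df = pd.DataFrame({'signup': pd.to_datetime(['2024-01-15', '2024-06-03'])})

# One datetime column becomes several numeric features
df['signup.year'] = df['signup'].dt.year
df['signup.month'] = df['signup'].dt.month
df['signup.day'] = df['signup'].dt.day
df['signup.dayofweek'] = df['signup'].dt.dayofweek
```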
- Categorical data: handled efficiently through internal processes such as embeddings.
- Missing values: imputed intelligently; numerical data is filled with the median, while categorical data is treated as a special "missing" category.

Why is it better not to do this manually? AutoGluon optimizes these steps for the characteristics of the downstream models (tree-based models and neural networks). Manual one-hot encoding, for example, can conflict with AutoGluon's internal logic and may actually degrade performance.
AutoGluon models can be deployed to production, especially within the AWS environment.

- Option 1: Real-time inference endpoint. Build a service that returns predictions instantly for API requests.
- Option 2: Batch transform job. Generate predictions for a large dataset offline in a batch process.
- Option 3: autogluon.cloud. A library that abstracts away the details of SageMaker, enabling cloud-based training and inference in just a few lines of code.
Step 1: Upload the model. Compress the trained predictor folder into model.tar.gz and upload it to S3.

Step 2: Create an inference script (serve.py) that SageMaker uses to load the model and make predictions inside the container.

```python
# Example serve.py
from autogluon.tabular import TabularPredictor

def model_fn(model_dir):
    """Load the model from the model directory."""
    model = TabularPredictor.load(model_dir)
    return model

def transform_fn(model, data, content_type, accept_type):
    """Generate predictions from request data."""
    predictions = model.predict(data)
    return predictions  # Format as needed
```
Step 3: Deploy the model as an endpoint using the SageMaker Python SDK.

```python
from sagemaker.pytorch import PyTorchModel

# Specify the model data on S3
model_data = 's3://your-bucket/path/to/model.tar.gz'

autogluon_model = PyTorchModel(
    model_data=model_data,
    role='arn:aws:iam::...:role/SageMakerRole',
    entry_point='serve.py',    # Inference script
    source_dir='./scripts',    # Directory containing the script
    framework_version='1.12',  # PyTorch version
    py_version='py38'
)

# Deploy to an endpoint
predictor_sm = autogluon_model.deploy(
    initial_instance_count=1,
    instance_type='ml.m5.large',
)
```
To maximize accuracy:

- Use presets='best_quality'.
- Set a sufficiently long time_limit (several hours to a day).
- Consider increasing num_bag_folds or num_stack_levels (see the sketch after this list).

To maximize training/inference speed:

- Use presets='medium_quality' for prototyping.
- If fast inference is required, choose presets='good_quality' or presets='high_quality'.
- Use predictor.refit_full() or predictor.distill() to create a fast single model.
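A minimal sketch of an accuracy-first configuration combining these levers (num_bag_folds and num_stack_levels are real TabularPredictor.fit parameters; the values here are illustrative, not recommendations):

```python
from autogluon.tabular import TabularPredictor

predictor_max = TabularPredictor(label=label, eval_metric='roc_auc').fit(
    train_data=train_data,
    presets='best_quality',
    time_limit=4 * 3600,   # several hours
    num_bag_folds=8,       # k folds per bagged model
    num_stack_levels=2,    # depth of the stack
)
```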
Computational resources: for very large datasets, experiment with a subsample first. A GPU is practically mandatory for multimodal and deep learning models.

Data preparation: while AutoGluon can handle raw data, removing clear outliers or log-transforming highly skewed values can still help. Avoid manual one-hot encoding and missing-value imputation; AutoGluon's internal processing is better optimized.
- Performance: state-of-the-art accuracy, proven by independent benchmarks.
- Ease of use: abstracts highly complex ML pipelines into just a few lines of code.
- Robustness: the multi-layer ensemble approach is resilient to overfitting and single-model failures.
- Versatility: supports tabular, image, text, time series, and their combinations (multimodal).
- Extensibility: provides customization and extension points to meet the needs of experts.
- Baseline creation: quickly build a strong baseline model for any supervised learning problem.
- Data science competitions (e.g., Kaggle): a powerful weapon for achieving high rankings with minimal effort.
- Production ML systems: build robust, high-performance models, especially when combined with cloud services such as AWS SageMaker.
- Benchmarking: use AutoGluon as a "gold standard" against which to evaluate your own custom models.
When using AutoGluon in research, it is recommended to cite the following key papers.

- AutoGluon-Tabular: Erickson, Nick, et al. "AutoGluon-Tabular: Robust and Accurate AutoML for Structured Data." arXiv preprint arXiv:2003.06505 (2020).
- AutoGluon-TimeSeries: Shchur, Oleksandr, et al. "AutoGluon-TimeSeries: AutoML for Probabilistic Time Series Forecasting." International Conference on Automated Machine Learning (AutoML), 2023.
- AutoGluon-Multimodal (AutoMM): Tang, Zhiqiang, et al. "AutoGluon-Multimodal (AutoMM): Supercharging Multimodal AutoML with Foundation Models." International Conference on Automated Machine Learning (AutoML), 2024.