Introduction to Sparse Modeling

Hacarus Inc.
September 19, 2019

Presentation slides from the talk "How to Develop Explainable AI" (如何開發可解釋的AI)
https://www.eventbrite.com/e/ai-aihacarus-tickets-68588170063


Transcript

  1. How to Develop Explainable AI (如何開發可解釋的AI)
    Introduction to Sparse Modeling
    Takashi Someda, CTO, Hacarus Inc.
    September 19th, 2019
    LIGHTWEIGHT & EXPLAINABLE AI
  2. About Me
    Takashi Someda, @tksmd
    Director/CTO at HACARUS Inc.
    Master's degree in Information Science from Kyoto University
    Sun Microsystems K.K. → founded own startup in Kyoto → established Kyoto branch of Nulab Inc. → current role
  3. Company and Team
    RAISED: USD 3.7 million (Series A)
    FOUNDED: January 2014
    EMPLOYEES: 49
    LOCATIONS: Kyoto (HQ), Tokyo, Manila
    CUSTOMER SUPPORT IN: Japanese, English, German, Tagalog, Chinese and Swedish
    ACADEMIC PARTNERS
  4. What HACARUS is doing
    Sparse modeling based AI
    Runs on-premises and in the cloud, on FPGAs and major CPUs
    Application domains: medical and manufacturing
  5. Today's Takeaways
    • Basic concepts of sparse modeling
    • Several concrete use cases / examples
    • Guide to a Python implementation
  6. Blackbox Problem
    Target → Blackbox AI → Result
    Blackbox AI cannot answer questions like:
    • Why do I get this result?
    • When does it succeed/fail?
    • How can I correct the result?
    This makes AI difficult to adopt even when it shows impressive performance.
  7. Approaches to explainable AI
    • Post-hoc explanation of a given AI model
      • Individual prediction explanations
      • Global prediction explanations
    • Building an interpretable model
      • Logistic regression, decision trees and so on
    For more details, refer to Explainable AI in Industry (KDD 2019 Tutorial)
  8. History of Sparse Modeling
    Year  Paper                                             Author
    1996  Regression Shrinkage and Selection via the Lasso  R. Tibshirani
    1996  Sparse Coding                                     B. Olshausen
    2006  Compressed Sensing                                D.L. Donoho
    2018  Multi-Layer Convolutional Sparse Modeling         M. Elad
  9. Linear Regression in sales forecasting
    y (target variable): revenue
    X (input): area (m^2), distance from station (km), regional population,
    competitors, products (one row per shop ID)
    y = w0 + w1 x1 + w2 x2 + w3 x3 + w4 x4 + w5 x5
    ⇒ find w that satisfies the relation above between y and X
  10. If insufficient data exists...
    884 = w0 + 33 w1 + 1.7 w2 + 41 w3 + 0 w4 + 38 w5
    554 = w0 + 36 w1 + 4.7 w2 + 23 w3 + 0 w4 + 36 w5
    677 = w0 + 29 w1 + 3.2 w2 + 46 w3 + 0 w4 + 36 w5
    478 = w0 + 43 w1 + 3.7 w2 + 25 w3 + 5 w4 + 30 w5
    Four observations for six unknowns (w0 ... w5)
    ⇒ an analytical solution cannot be found
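    A minimal numpy sketch of why this system cannot be solved analytically
    (the matrix rows are the four observations above; variable names are
    illustrative):

        import numpy as np

        # design matrix: a leading 1 for the intercept w0, then the five feature values
        A = np.array([[1, 33, 1.7, 41, 0, 38],
                      [1, 36, 4.7, 23, 0, 36],
                      [1, 29, 3.2, 46, 0, 36],
                      [1, 43, 3.7, 25, 5, 30]], dtype=float)
        y = np.array([884, 554, 677, 478], dtype=float)

        print(np.linalg.matrix_rank(A))  # 4, but there are 6 unknowns
        # infinitely many exact solutions exist; lstsq merely picks the
        # minimum-norm one, so the data alone cannot identify w
        w, *_ = np.linalg.lstsq(A, y, rcond=None)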
  11. Assume most of the variables are irrelevant
    • Use the minimum number of input features
    ⇒ set as many weight values to 0 as possible
  12. Problem Settings
    • Output y can be expressed as a linear combination of x with observation
      noise ε, where x is m-dimensional and the sample size of y is n:
      y = w1 x1 + ⋯ + wm xm + ε
    Basic approach to the problem: least squares method
    • Minimize the squared error between y and the linear combination of x
      with the estimated w:
      min_w (1/2) ||y - Xw||^2
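    When enough independent samples are available (n ≥ m), this least squares
    problem has the closed-form solution w = (X^T X)^{-1} X^T y; a minimal
    sketch, assuming X and y are numpy arrays:

        import numpy as np

        def least_squares(X, y):
            # closed-form solution of min_w (1/2) ||y - Xw||^2,
            # valid when X^T X is invertible (X has full column rank)
            return np.linalg.solve(X.T @ X, X.T @ y)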
  13. Introduce Regularization
    What if data is not sufficient?
    • An additional constraint is introduced as a regularization term
    • The objective function changes to the following form:
      min_w (1/2) ||y - Xw||^2 + λ ||w||
    ⇒ the regularization parameter λ controls the strength of the regularization
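    The objective transcribed into Python (a sketch; shown here with the L1
    norm that the next slide motivates, and lam standing in for λ):

        import numpy as np

        def regularized_objective(w, X, y, lam):
            # (1/2) * ||y - Xw||^2 + lambda * ||w||_1
            return 0.5 * np.sum((y - X @ w) ** 2) + lam * np.sum(np.abs(w))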
  14. L0 and L1 norm optimization
    • L0 norm optimization
      • Use the minimum number of x to satisfy the equation
      • Find w that minimizes the number of non-zero elements
      • Combinatorial optimization: NP-hard and not feasible
    Relax the constraint ↓
    • L1 norm optimization (Least Absolute Shrinkage and Selection Operator, Lasso)
      • Find w that minimizes the sum of its absolute values
      • The global solution can (still) be reached
      • Solved within practical time
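    The two norms side by side (a minimal sketch):

        import numpy as np

        w = np.array([0.0, 1.5, 0.0, -2.0])
        l0 = np.count_nonzero(w)  # L0 "norm": number of non-zero elements -> 2
        l1 = np.sum(np.abs(w))    # L1 norm: sum of absolute values -> 3.5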
  15. Algorithm for Lasso: Coordinate Descent
    1. Initialize wj (j = 1, ..., m) with random values
    2. Update wj = S(xj^T r(j) / n, λ),
       where r(j) = y - Σ_{k≠j} wk xk and S is the soft thresholding function
    3. Repeat 2 until a convergence condition is satisfied
  16. Soft thresholding operator
    • Shrinks a given value x by comparing it with the threshold λ:
      S(x, λ) = x - λ   (x ≥ λ)
                0       (-λ < x < λ)
                x + λ   (x ≤ -λ)
  17. Code: Coordinate Descent

    import numpy as np

    # Soft thresholding operator
    def soft_threshold(X, thresh):
        return np.where(np.abs(X) <= thresh, 0, X - thresh * np.sign(X))

    # Coordinate descent: X (n_samples x n_features), y, n_iter and the
    # regularization parameter alpha are assumed to be defined beforehand
    w_cd = np.zeros(n_features)
    for _ in range(n_iter):
        for j in range(n_features):
            w_cd[j] = 0.0              # exclude coordinate j from the residual
            r_j = y - np.dot(X, w_cd)  # r(j) = y - sum_{k != j} w_k x_k
            w_cd[j] = soft_threshold(np.dot(X[:, j], r_j) / n_samples, alpha)
  18. Example: Numerical experiment
    • X is a 1,000-dimensional input feature vector
    • Only 20 features out of 1,000 are relevant to the output y
    • Only 100 samples are available
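    A sketch of this experiment with scikit-learn's Lasso (the slide shows only
    the setting; alpha and the noise level below are assumptions):

        import numpy as np
        from sklearn.linear_model import Lasso

        rng = np.random.RandomState(0)
        n_samples, n_features, n_relevant = 100, 1000, 20

        # ground truth: only 20 of the 1,000 weights are non-zero
        w_true = np.zeros(n_features)
        w_true[:n_relevant] = rng.randn(n_relevant)

        X = rng.randn(n_samples, n_features)
        y = X @ w_true + 0.1 * rng.randn(n_samples)

        model = Lasso(alpha=0.1).fit(X, y)
        print("non-zero coefficients:", np.count_nonzero(model.coef_))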
  19. Other Lasso variants
    • Fused Lasso: constrains the difference of two neighboring values
    • Trend Filtering: constrains the second-order difference over three
      neighboring values
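    The corresponding penalty terms, sketched in Python (lam stands for λ;
    np.diff computes neighboring differences):

        import numpy as np

        def fused_lasso_penalty(w, lam):
            # sum of |w[i+1] - w[i]|: differences of two neighboring values
            return lam * np.sum(np.abs(np.diff(w)))

        def trend_filtering_penalty(w, lam):
            # sum of |w[i+1] - 2*w[i] + w[i-1]|: second-order differences
            # over three neighboring values
            return lam * np.sum(np.abs(np.diff(w, n=2)))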
  20. Dictionary Learning
    1. Extract patches from images
    2. Learn a dictionary that can express every patch
    3. Every patch should be representable as a sparse combination of the
       dictionary basis
    Y: image patches, A: dictionary, X: sparse coefficients (Y ≈ AX)
  21. Example: Image Reconstruction
    [Figure: learned dictionary (8x8, 64 basis) and sparse codes (green is 0);
    pipeline: sparse encode → reconstruction]
  22. Code: Image Reconstruction

    import numpy as np
    # extract_simple_patches_2d / reconstruct_from_simple_patches_2d are helper
    # functions (see the spm-image library linked on slide 27);
    # MiniBatchDictionaryLearning is scikit-learn's online dictionary learner

    # extract patches
    patches = extract_simple_patches_2d(img, patch_size)

    # normalize patches
    patches = patches.reshape(patches.shape[0], -1).astype(np.float64)
    intercept = np.mean(patches, axis=0)
    patches -= intercept
    patches /= np.std(patches, axis=0)

    # dictionary learning
    model = MiniBatchDictionaryLearning(n_components=n_basis, alpha=1,
                                        n_iter=n_iter, n_jobs=1)
    model.fit(patches)

    # sparse encode each patch against the learned dictionary
    code = model.transform(patches)

    # reconstruction
    reconstructed_patches = np.dot(code, model.components_)
    reconstructed_patches = reconstructed_patches.reshape(len(patches), *patch_size)
    reconstructed = reconstruct_from_simple_patches_2d(reconstructed_patches, img.shape)
  23. Advanced use cases of dictionary learning: Inpainting, Super Resolution

  24. Example: Inspection of solar panel cells
    Comparison of the methods proposed in the paper below with SPECTRO by
    HACARUS, on monocrystalline modules (https://arxiv.org/abs/1807.02894)

                    Paper (SVM)   Paper (CNN)   SPECTRO
    Dataset         800           800           60
    Training time   30 mins       5 hours       19 secs
    Inference time  8 mins        20 secs       10 secs
    Accuracy        85%           86%           90%
  25. Summary

  26. Sparse Modeling in a nutshell
    • Works with small datasets; explainable; lightweight
    • Applicable to tabular, image and time-series data
    • Has been developed for over 20 years and is still evolving
  27. More Python resources for Sparse Modeling https://github.com/hacarus/spm-image/

  28. Thank you!! Visit the HACARUS website