Slide 1

Slide 1 text

PyStruct: Structured Prediction in Python
Andreas Mueller (NYU Center for Data Science, scikit-learn)

Slide 2

Slide 2 text

Structured Prediction

Slide 3

Slide 3 text

Why structure?

Slide 4

Slide 4 text

Applications: Multi-Label Classification

Slide 5

Slide 5 text

Applications: Multi-Label Classification

Slide 6

Slide 6 text

Applications: Sequence Tagging

Slide 7

Slide 7 text

Applications: Sequence Tagging

Slide 8

Slide 8 text

Applications: Image Segmentation

Slide 9

Slide 9 text

Careful: Math ahead

Slide 10

Slide 10 text

The Essence of Structured Prediction

Slide 11

Slide 11 text

The Essence of Structured Prediction

Slide 12

Slide 12 text

Pairwise Structured Models

Slide 13

Slide 13 text

Pairwise Structured Models
[Figure: graph over output variables y_1, y_2, y_3, y_4]
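The formulas on the math slides did not survive extraction. As a hedged reminder, the block below writes out the standard linear formulation of structured prediction and its pairwise decomposition, roughly as the PyStruct documentation states it. The symbols (\psi for the joint feature map, \phi_i and \phi_{ij} for node and edge features, w_u and w_p for unary and pairwise weights, V and E for nodes and edges) are assumptions of this sketch and may differ from the notation on the original slides.

```latex
% Prediction: pick the output with the highest linear score
\hat{y} = \operatorname*{arg\,max}_{y \in \mathcal{Y}} \; w^\top \psi(x, y)

% Pairwise model: the score splits into node terms and edge terms
w^\top \psi(x, y) =
    \sum_{i \in V} w_u^\top \phi_i(x, y_i)
  + \sum_{(i,j) \in E} w_p^\top \phi_{ij}(x, y_i, y_j)
```

The learner only has to estimate w; the argmax over y is the inference problem that the later slides discuss.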

Slide 14

Slide 14 text

PyStruct Architecture Estimator = Learner + Model + Inference

Slide 15

Slide 15 text

PyStruct Architecture
Estimator = Learner + Model + Inference

    from pystruct.models import ChainCRF
    from pystruct.learners import OneSlackSSVM

    model = ChainCRF(inference_method="max-product")
    ssvm = OneSlackSSVM(model=model, C=.1, inference_cache=50, tol=0.1, verbose=3)
    ssvm.fit(X_train, y_train)
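To make the "Learner + Model + Inference" split concrete, here is a minimal pure-Python sketch. It is not PyStruct's implementation: `ToyChainModel` and `ToyPerceptron` are hypothetical names, the learner is a plain structured perceptron rather than a one-slack SSVM, and inference is done by brute force, which only works for very short sequences.

```python
from collections import Counter
from itertools import product


class ToyChainModel:
    """Model: defines the joint feature map psi(x, y) for a chain."""

    def __init__(self, n_states):
        self.n_states = n_states

    def joint_feature(self, x, y):
        # Count (symbol, label) emissions and (label, label) transitions.
        feats = Counter(("emit", xi, yi) for xi, yi in zip(x, y))
        feats.update(("trans", a, b) for a, b in zip(y, y[1:]))
        return feats

    def inference(self, x, w):
        # Inference: argmax over y of w . psi(x, y), here by enumeration.
        def score(y):
            return sum(w[f] * v for f, v in self.joint_feature(x, y).items())

        return max(product(range(self.n_states), repeat=len(x)), key=score)


class ToyPerceptron:
    """Learner: updates w using only joint_feature and inference."""

    def __init__(self, model, max_iter=10):
        self.model = model
        self.max_iter = max_iter
        self.w = Counter()  # missing features have weight 0

    def fit(self, X, Y):
        for _ in range(self.max_iter):
            for x, y in zip(X, Y):
                y_hat = self.model.inference(x, self.w)
                if tuple(y) != y_hat:
                    # Move w towards the true labeling, away from the guess.
                    self.w.update(self.model.joint_feature(x, y))
                    self.w.subtract(self.model.joint_feature(x, y_hat))
        return self

    def predict(self, X):
        return [self.model.inference(x, self.w) for x in X]


# Toy data: symbol "a" should get label 0, symbol "b" label 1.
X = [("a", "b", "a"), ("b", "b", "a")]
Y = [(0, 1, 0), (1, 1, 0)]
learner = ToyPerceptron(ToyChainModel(n_states=2)).fit(X, Y)
predictions = learner.predict(X)
```

The learner never looks inside the model, so swapping the model or the inference routine leaves the learner untouched. That is the design idea the slide's equation summarizes.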

Slide 16

Slide 16 text

Sequence Tagging example

    model = ChainCRF(inference_method="max-product")
    ssvm = OneSlackSSVM(model=model, C=.1, inference_cache=50, tol=0.1, verbose=3)
    ssvm.fit(X_train, y_train)

Slide 17

Slide 17 text

Sequence Tagging example

    model = ChainCRF(inference_method="max-product")
    ssvm = OneSlackSSVM(model=model, C=.1, inference_cache=50, tol=0.1, verbose=3)
    ssvm.fit(X_train, y_train)

Slide 18

Slide 18 text

The Devil is in the Inference
[Figure: output variables y_1, y_2, y_3, y_4]
Easy: Dynamic Programming

Slide 19

Slide 19 text

The Devil is in the Inference Easy: Dynamic Programming
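As an illustration of the "easy" case, the following is a minimal sketch of max-product dynamic programming (Viterbi) on a chain. It is a plain-Python illustration, not PyStruct's code; the function name `viterbi` and the score tables `unary` and `pairwise` are assumptions made for this example.

```python
def viterbi(unary, pairwise):
    """Return (labels, score) of the highest-scoring labeling of a chain.

    unary[t][k]    : score of label k at position t
    pairwise[j][k] : score of label j being followed by label k
    """
    n_states = len(unary[0])
    # best[k] = best score of any labeling of the prefix that ends in label k
    best = list(unary[0])
    backpointers = []
    for scores in unary[1:]:
        new_best, pointers = [], []
        for k in range(n_states):
            # Best previous label j given that the current label is k.
            j = max(range(n_states), key=lambda j: best[j] + pairwise[j][k])
            pointers.append(j)
            new_best.append(best[j] + pairwise[j][k] + scores[k])
        best = new_best
        backpointers.append(pointers)
    # Trace back from the best final label.
    last = max(range(n_states), key=lambda k: best[k])
    labels = [last]
    for pointers in reversed(backpointers):
        labels.append(pointers[labels[-1]])
    labels.reverse()
    return labels, best[last]


unary = [[2.0, 0.0], [0.0, 1.5], [0.0, 1.5], [2.0, 0.0]]
pairwise = [[1.0, 0.0], [0.0, 1.0]]  # reward equal neighbouring labels
labels, score = viterbi(unary, pairwise)
print(labels, score)  # [0, 1, 1, 0] 8.0
```

The cost is linear in the chain length and quadratic in the number of labels, which is why chains (and, with the same idea, trees) count as easy.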

Slide 20

Slide 20 text

The Devil is in the Inference
HARD! AD3, QPBO, LP, Loopy BP, …

Slide 21

Slide 21 text

Grid Graphs: Snakes

Slide 22

Slide 22 text

Grid Graphs: Snakes

    from pystruct.models import EdgeFeatureGraphCRF
    from pystruct.learners import OneSlackSSVM

    crf = EdgeFeatureGraphCRF(inference_method='qpbo')
    ssvm = OneSlackSSVM(crf, inference_cache=50, C=.1, tol=.1, switch_to='ad3', n_jobs=1)
    ssvm.fit(X_train_edge_features, Y_train_flat)
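Graph models need an explicit edge list for each sample. As a hedged sketch, the helper below builds the edges of a 4-connected pixel grid with row-major node numbering. `grid_edges` is a hypothetical helper written for this example, not a PyStruct function, and the slide's actual preprocessing (including how edge features are computed) is not shown in the source.

```python
def grid_edges(height, width):
    """Edges of a 4-connected grid; node index = row * width + col."""
    edges = []
    for r in range(height):
        for c in range(width):
            node = r * width + c
            if c + 1 < width:
                edges.append((node, node + 1))      # right neighbour
            if r + 1 < height:
                edges.append((node, node + width))  # lower neighbour
    return edges


edges = grid_edges(2, 3)
print(len(edges))  # 7
```

Each edge is listed once. A grid like this contains cycles, so the chain-style dynamic programming no longer applies and an approximate or relaxed inference method such as 'qpbo' or 'ad3' is used instead.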

Slide 23

Slide 23 text

Grid Graphs: Snakes

Slide 24

Slide 24 text

Implemented Methods
Estimator = Learner + Model + Inference
● Learner: SubgradientSSVM, StructuredPerceptron, OneSlackSSVM, LatentSSVM
● Model: BinaryClf, MultiLabelClf, ChainCRF, GraphCRF, EdgeFeatureGraphCRF
● Inference: Linear Programming, QPBO (PyQPBO), Dual Decomposition (AD3), Message Passing, everything (OpenGM)

Slide 25

Slide 25 text

Classes of Inference Algorithms
Exact Algorithms
● Max-Product (Chains, Trees): 'max-product'
● Exhaustive (usually too expensive)
● Relaxed algorithms + branch & bound: ('ad3', {'branch_and_bound': True})
Relaxed
● Linear Programming (slooow): 'lp'
● Dual Decomposition: 'ad3'
Approximate / heuristics
● Loopy message passing: 'max-product'
● QPBO: 'qpbo'

Slide 26

Slide 26 text

Classes of Inference Algorithms
Exact Algorithms
● Max-Product (Chains, Trees): 'max-product'
● Exhaustive (usually too expensive)
● Relaxed algorithms + branch & bound: ('ad3', {'branch_and_bound': True})
Relaxed
● Linear Programming (slooow): 'lp'
● Dual Decomposition: 'ad3'
Approximate / heuristics
● Loopy message passing: 'max-product'
● QPBO: 'qpbo'
Install OpenGM for many more!
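To show why exhaustive inference is "usually too expensive", here is a small sketch that scores every labeling of a loopy graph. It is an illustration only, not PyStruct's implementation; `exhaustive_inference` and the toy score tables are assumptions of this example.

```python
from itertools import product


def exhaustive_inference(unary, edges, pairwise):
    """Exact MAP labeling by enumerating all n_states ** n_nodes labelings."""
    n_nodes, n_states = len(unary), len(unary[0])

    def score(labels):
        total = sum(unary[i][labels[i]] for i in range(n_nodes))
        total += sum(pairwise[labels[i]][labels[j]] for i, j in edges)
        return total

    return max(product(range(n_states), repeat=n_nodes), key=score)


# A 4-cycle: the loop means chain-style dynamic programming does not apply.
unary = [[1.0, 0.0], [0.0, 0.5], [0.0, 0.5], [1.0, 0.0]]
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
pairwise = [[1.0, 0.0], [0.0, 1.0]]  # reward equal neighbouring labels
best = exhaustive_inference(unary, edges, pairwise)
print(best)  # (0, 0, 0, 0)
```

Four binary nodes give only 16 candidates, but a 100 x 100 binary image would give 2 ** 10000, which is why the relaxed and heuristic methods listed above exist.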

Slide 28

Slide 28 text

Thank you for your attention. @t3kcit @amueller [email protected] http://amueller.github.io