Deep Learning for Fonts

We talk about the motivation for building a font classifier, how we built it, the challenges we faced, and what we aim to achieve.

raghothams

May 02, 2019
Transcript

  1. Deep Learning for Fonts | Fontastic
     Nischal HP | @nischalhp | VP, Engineering, omni:us
     Raghotham S | @raghothams | Senior Data Scientist, Ericsson Research
     Strata Data Conference 2019, London
  2. Fontastic | Motivation

  3. Fontastic | Existing Tools | What The Font

  4. Fontastic | Existing Tools | What Font is?

  5. Fontastic | Existing Tools | What Font is?

  6. Fontastic | Existing Tools | Matcherator

  7. None
  8. Fontastic | What do we aim to do?

  9. Deep Learning for Humans | Fontastic | Upcoming Projects

  10. Fontastic | Agenda: Data acquisition, Model building, Feature visualization

  11. Fontastic | Data Acquisition

  12. Data Acquisition | Pass 1: Scrape Font Squirrel - https://www.fontsquirrel.com/

  13. None
  14. Problems
      1. We have images of different dimensions
      2. Even with normalizing the size, we end up with only 5-10 images per style

  15. Data Acquisition | Pass 2: Scrape DaFont - https://www.dafont.com/

  16. Problems
      1. Old-school fonts only, not updated frequently
      2. Supports only wide dimensions, which might not work well with the intended end use

  17. Data Acquisition | Pass 3: Generate images using PIL
      Steps:
      1. Create 4 sets of random texts
      2. Generate a 4K-resolution image using the TTF for every random text
      3. Take 10 random crops of size 256x256 px from the 4K image
      With this we have the ability to generate a large number of training images.

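The three generation steps above can be sketched with Pillow; this is a minimal sketch, and the font path, sample text, and canvas size are illustrative placeholders, not the values used in the talk.

```python
import random
from PIL import Image, ImageDraw, ImageFont

def render_font_image(ttf_path, text, size=(3840, 2160), font_size=200):
    """Render `text` in the given TTF onto a large white canvas."""
    img = Image.new("L", size, color=255)  # grayscale, white background
    draw = ImageDraw.Draw(img)
    font = ImageFont.truetype(ttf_path, font_size)
    draw.text((50, 50), text, font=font, fill=0)  # black text
    return img

def random_crops(img, n=10, crop=256):
    """Take n random crops of crop x crop px from a large rendered image."""
    w, h = img.size
    crops = []
    for _ in range(n):
        x = random.randint(0, w - crop)
        y = random.randint(0, h - crop)
        crops.append(img.crop((x, y, x + crop, y + crop)))
    return crops
```

Rendering one 4K image per (font, random text) pair and then cropping gives 10 training samples per render, which is how a small set of TTF files can be turned into an arbitrarily large training set.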
  18. Data Acquisition | Pass 3: Generate images using PIL
      Advantages:
      1. We control the input text
      2. We control the font style and size
      3. We control the output image dimension

  19. Data Acquisition | Pass 3: Generate Image using PIL

  20. Data Acquisition | Pass 3: Generate Image using PIL + Random Crop

  21. None
  22. Data Acquisition | Pass 3: Generate Image using PIL + Random Crop

  23. Fontastic | Model Building

  24. Model Building | Phase I | Feasibility Check - FastAI

      In [30]:
      PATH = "data/"
      sz = 225
      arch = resnet50
      bs = 28
      tfms = tfms_from_model(arch, sz, aug_tfms=transforms_side_on, max_zoom=1.1)
      data = ImageClassifierData.from_paths(PATH, tfms=tfms, bs=bs, num_workers=4)
      learn = ConvLearner.pretrained(arch, data, precompute=True, ps=0.5)
      learn.unfreeze()
      lr = np.array([1e-4, 1e-3, 1e-2])
      learn.fit(lr, 6, cycle_len=1)

      [0.  1.05857  0.88758  0.6628 ]
      [1.  0.69731  0.59468  0.77709]
      [2.  0.51771  0.45974  0.84326]
      [3.  0.4064   0.35457  0.86119]
      [4.  0.34457  0.32807  0.87547]
      [5.  0.26355  0.24554  0.91429]

  25. Model Building | Feasibility Check - FastAI

      In [42]:
      cm = confusion_matrix(y, preds)
      plot_confusion_matrix(cm, data.classes)

      [[200   0   3   3   0   3   7]
       [  0  94   0   0   0   0   2]
       [  3   0 423   1   0   5   0]
       [  4   0   0 105   0   0  11]
       [  0   0   0   0 179   0   1]
       [  3   0  18   0   0 195   0]
       [  3   0   0   3   2   2 206]]

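Per-class F1 scores can be read straight out of a confusion matrix like the one above. A minimal stdlib-only sketch (rows are true classes, columns are predictions), independent of the fastai helpers on the slide:

```python
def f1_scores(cm):
    """Per-class F1 from a confusion matrix (rows = true, cols = predicted)."""
    n = len(cm)
    scores = []
    for i in range(n):
        tp = cm[i][i]
        fp = sum(cm[r][i] for r in range(n)) - tp  # predicted class i, but wrong
        fn = sum(cm[i]) - tp                       # true class i, missed
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        scores.append(2 * precision * recall / (precision + recall)
                      if precision + recall else 0.0)
    return scores

# The matrix from the slide above
cm = [[200,   0,   3,   3,   0,   3,   7],
      [  0,  94,   0,   0,   0,   0,   2],
      [  3,   0, 423,   1,   0,   5,   0],
      [  4,   0,   0, 105,   0,   0,  11],
      [  0,   0,   0,   0, 179,   0,   1],
      [  3,   0,  18,   0,   0, 195,   0],
      [  3,   0,   0,   3,   2,   2, 206]]
scores = f1_scores(cm)
```

The diagonal dominance in the matrix is what makes the feasibility check positive: every class's F1 lands well above what random guessing over 7 fonts would give.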
  26. Model Building | 70 Fonts - PyTorch
      Why PyTorch? Easy to customize; flexible to integrate with other visualization projects.

  27. Model Building | 70 Fonts - PyTorch
      Pretrained model: ResNet50. What is a pretrained model? What is transfer learning?

  28. Image Courtesy - https://medium.com/kansas-city-machine-learning-artificial-intelligen/an-introduction-to-transfer-learning-in-machine-learning-7efd104b6026

  29. Model Building | 70 Fonts - PyTorch
      Hyper-parameter tuning: Learning Rate
      In [23]: lrf.plot()

  30. Model Building | 70 Fonts - PyTorch
      Hyper-parameter tuning: Learning Rate Scheduler

      In [ ]:
      # Get pretrained model
      model_ft = models.resnet50(pretrained=True)

      # Customize FC layer
      num_frts = model_ft.fc.in_features
      model_ft.fc = nn.Linear(num_frts, len(class_names))
      model_ft = model_ft.to(device)

      criterion = nn.CrossEntropyLoss()

      # Define optimizer & LR scheduler
      optimizer_ft = optim.SGD(model_ft.parameters(), lr=0.01, momentum=0.9)
      exp_lr_scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=7, gamma=0.1)

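For reference, `StepLR(step_size=7, gamma=0.1)` simply multiplies the learning rate by 0.1 every 7 epochs (assuming the scheduler is stepped once per epoch). The resulting schedule can be checked with a few lines of plain Python; this is a hand-rolled sketch of the decay rule, not PyTorch's implementation:

```python
def step_lr(base_lr, epoch, step_size=7, gamma=0.1):
    """Learning rate StepLR would yield at `epoch`, stepping once per epoch."""
    return base_lr * gamma ** (epoch // step_size)

schedule = [step_lr(0.01, e) for e in range(21)]
# epochs 0-6 train at 0.01, epochs 7-13 at 0.001, epochs 14-20 at 0.0001
```

Dropping the rate in steps like this lets early epochs move quickly through weight space while later epochs fine-tune, which is part of why the scheduled run below reaches a much higher f1-score in the same 40 epochs.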
  31. Model Building | 70 Fonts - PyTorch
      Result without LR finder & scheduler: 0.74 f1-score after 40 epochs

  32. Model Building | 70 Fonts - PyTorch
      Result with LR finder & scheduler: 0.96 f1-score after 40 epochs

  33. Fontastic | Feature Visualization - a mechanism to see through the eyes of the network

  34. Feature Visualization | Grad-CAM Analysis

      In [27]:
      import pickle
      from ipywidgets import interact, interactive, fixed, interact_manual
      import ipywidgets as widgets
      import matplotlib.pyplot as plt

      with open('./fd727d3f-73f4-4ec6-8e89-3e15fd3801b0resnet50_grad_cam', 'rb') as f:
          data = pickle.load(f)

      def show_cam(epoch_slider, image_slider, layer_slider):
          plt.imshow(data[epoch_slider][image_slider][layer_slider])

  35. Feature Visualization | Grad-CAM Analysis

      In [28]:
      interact(show_cam,
               epoch_slider=widgets.IntSlider(min=0, max=len(data)-1, step=1, value=0),
               image_slider=widgets.IntSlider(min=0, max=len(data[0])-1, step=1, value=0),
               layer_slider=widgets.IntSlider(min=0, max=len(data[0][0])-1, step=1, value=0))

      Out[28]: <function __main__.show_cam(epoch_slider, image_slider, layer_slider)>

  36. Feature Visualization | Activation Atlas
      Activation atlases not only reveal visual abstractions within a model, but can also reveal high-level misunderstandings in a model that can be exploited. For example, by looking at an activation atlas we can see why a picture of a baseball can switch the classification of an image from "grey whale" to "great white shark".

  37. Feature Visualization | Activation Atlas - https://distill.pub/2019/activation-atlas/

  38. Extra | Remote Work

  39. Open Source - https://github.com/deep-learning-for-humans/fontastic

  40. Fin. GH / twitter / everywhere: @nischalhp @raghothams
      https://github.com/deep-learning-for-humans/fontastic