Deep Learning for Fonts

We talk about the motivation for building a font classifier, how we built it, the challenges we faced, and what we aim to achieve.

raghothams

May 02, 2019

Transcript

  1. Deep Learning for Fonts | Fontastic
    Nischal HP | @nischalhp | VP, Engineering, omni:us
    Raghotham S | @raghothams | Senior Data Scientist, Ericsson Research
    Strata Data Conference 2019, London

  2. Fontastic
    Motivation

  3. Fontastic
    Existing Tools | What The Font

  4. Fontastic
    Existing Tools | What Font is?

  5. Fontastic
    Existing Tools | What Font is?

  6. Fontastic
    Existing Tools | Matcherator

  7. (image-only slide)

  8. Fontastic
    What do we aim to do?

  9. Deep Learning for Humans
    Fontastic
    Upcoming Projects

  10. Fontastic
    Agenda
    Data acquisition
    Model building
    Feature visualization

  11. Fontastic
    Data Acquisition

  12. Data Acquisition
    Pass 1: Scrape Font Squirrel - https://www.fontsquirrel.com/

  13. (image-only slide)

  14. Problems
    1. We have images of different dimensions
    2. Even after normalizing the size, we end up with only 5-10 images per style

  15. Data Acquisition
    Pass 2: Scrape DaFont - https://www.dafont.com/

  16. Problems
    1. Old-school fonts only, not updated frequently
    2. Supports only wide dimensions, which might not work well with the intended end use

  17. Data Acquisition
    Pass 3: Generate Image using PIL
    Steps
    1. Create 4 sets of random texts
    2. Generate a 4K-resolution image using the TTF for every random text
    3. Take 10 random crops of size 256x256 px from the 4K image
    With this, we can generate a large number of training images
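The three steps above can be sketched with Pillow. Everything here is illustrative: the sample texts and crop logic are ours, and Pillow's built-in bitmap font stands in for a real TTF (the actual pipeline would load one with `ImageFont.truetype`).

```python
import random
from PIL import Image, ImageDraw, ImageFont

def render_text_image(text, font=None, size=(3840, 2160)):
    """Render `text` repeatedly on a white canvas (stand-in for the 4K render)."""
    img = Image.new("L", size, color=255)
    draw = ImageDraw.Draw(img)
    # Real pipeline: font = ImageFont.truetype("SomeFont.ttf", 120)
    font = font or ImageFont.load_default()
    for y in range(0, size[1], 100):
        draw.text((20, y), text, fill=0, font=font)
    return img

def random_crops(img, n=10, side=256):
    """Take n random side x side crops from the rendered image."""
    w, h = img.size
    crops = []
    for _ in range(n):
        x = random.randint(0, w - side)
        y = random.randint(0, h - side)
        crops.append(img.crop((x, y, x + side, y + side)))
    return crops

# 4 random texts x 10 crops each = 40 training images per font
texts = ["quick brown fox", "jumps over", "lazy dog", "0123456789"]
dataset = [c for t in texts for c in random_crops(render_text_image(t))]
```

Repeating this per TTF file yields as many labelled 256x256 samples per font as needed.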

  18. Data Acquisition
    Pass 3: Generate Image using PIL
    Advantages
    1. We control the input text
    2. We control the font style and size
    3. We control the output image dimension

  19. Data Acquisition
    Pass 3: Generate Image using PIL

  20. Data Acquisition
    Pass 3: Generate Image using PIL + Random Crop

  21. (image-only slide)

  22. Data Acquisition
    Pass 3: Generate Image using PIL + Random Crop

  23. Fontastic
    Model Building

  24. Model Building
    Phase I | Feasibility Check - FastAI
    In [30]: PATH = "data/"
    sz = 225         # input image size
    arch = resnet50  # pretrained backbone
    bs = 28          # batch size
    tfms = tfms_from_model(arch, sz, aug_tfms=transforms_side_on, max_zoom=1.1)
    data = ImageClassifierData.from_paths(PATH, tfms=tfms, bs=bs, num_workers=4)
    learn = ConvLearner.pretrained(arch, data, precompute=True, ps=0.5)
    learn.unfreeze()
    lr = np.array([1e-4, 1e-3, 1e-2])  # differential learning rates per layer group
    learn.fit(lr, 6, cycle_len=1)
    # columns: epoch, train loss, validation loss, accuracy
    [0. 1.05857 0.88758 0.6628 ]
    [1. 0.69731 0.59468 0.77709]
    [2. 0.51771 0.45974 0.84326]
    [3. 0.4064 0.35457 0.86119]
    [4. 0.34457 0.32807 0.87547]
    [5. 0.26355 0.24554 0.91429]

  25. Model Building - Feasibility Check - FastAI
    In [42]: cm = confusion_matrix(y, preds)
    plot_confusion_matrix(cm, data.classes)
    [[200 0 3 3 0 3 7]
    [ 0 94 0 0 0 0 2]
    [ 3 0 423 1 0 5 0]
    [ 4 0 0 105 0 0 11]
    [ 0 0 0 0 179 0 1]
    [ 3 0 18 0 0 195 0]
    [ 3 0 0 3 2 2 206]]
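Per-class F1 (and its macro average) can be read straight off a matrix like this one. A small NumPy sketch, with the matrix above hard-coded for illustration; the helper name is ours, and in practice `sklearn.metrics.f1_score` does the same job:

```python
import numpy as np

def per_class_f1(cm):
    """Per-class F1 from a confusion matrix (rows = true, cols = predicted)."""
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)
    precision = tp / cm.sum(axis=0)  # correct / predicted as this class
    recall = tp / cm.sum(axis=1)     # correct / actually this class
    return 2 * precision * recall / (precision + recall)

# The 7-class confusion matrix from the slide above
cm = [[200, 0,  3,   3,   0,   3,   7],
      [0,  94,  0,   0,   0,   0,   2],
      [3,   0, 423,  1,   0,   5,   0],
      [4,   0,  0,  105,  0,   0,  11],
      [0,   0,  0,   0,  179,  0,   1],
      [3,   0, 18,   0,   0,  195,  0],
      [3,   0,  0,   3,   2,   2,  206]]

macro_f1 = per_class_f1(cm).mean()
```

The heavy diagonal translates into a macro F1 around 0.95 for this feasibility run.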

  26. Model Building - 70 Fonts - PyTorch
    Why PyTorch?
    Easy to customize
    Flexible to integrate with other visualization projects

  27. Model Building - 70 Fonts - PyTorch
    Pretrained Model ResNet50
    What is a pretrained model? What is transfer learning?

  28. Image Courtesy - https://medium.com/kansas-city-machine-learning-artificial-intelligen/an-introduction-to-transfer-learning-in-machine-learning-7efd104b6026

  29. Model Building - 70 Fonts - PyTorch
    Hyper Parameter Tuning
    Learning Rate
    In [23]: lrf.plot()

  30. Model Building - 70 Fonts - PyTorch
    Hyper Parameter Tuning
    Learning Rate Scheduler
    In [ ]: # Get pretrained model
    model_ft = models.resnet50(pretrained=True)

    # Customize the FC layer for our number of font classes
    num_ftrs = model_ft.fc.in_features
    model_ft.fc = nn.Linear(num_ftrs, len(class_names))
    model_ft = model_ft.to(device)
    criterion = nn.CrossEntropyLoss()

    # Define optimizer & LR scheduler
    optimizer_ft = optim.SGD(model_ft.parameters(), lr=0.01, momentum=0.9)
    exp_lr_scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=7, gamma=0.1)
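`StepLR` multiplies the learning rate by `gamma` every `step_size` epochs. A minimal sketch of how it is driven from a training loop, using a tiny stand-in model and a placeholder for the real forward/backward pass (the slides do not show the loop itself, so this is our illustration):

```python
import torch.nn as nn
import torch.optim as optim
from torch.optim import lr_scheduler

model = nn.Linear(8, 2)  # tiny stand-in for the ResNet50
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
scheduler = lr_scheduler.StepLR(optimizer, step_size=7, gamma=0.1)

lrs = []
for epoch in range(21):
    # ... forward pass, loss, backward pass over the training set ...
    optimizer.step()   # placeholder for the real per-batch updates
    scheduler.step()   # decay the LR at the end of each epoch
    lrs.append(optimizer.param_groups[0]["lr"])
# LR is 0.01 for the first 7 epochs, then 0.001, then 0.0001
```

The staircase schedule lets the model take large steps early and fine-tune with small ones later, which is what the slides credit for the f1 jump.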

  31. Model Building - 70 Fonts - PyTorch
    Result - without LR finder & scheduler
    0.74 f1-score after 40 epochs

  32. Model Building - 70 Fonts - PyTorch
    Result - with LR finder & scheduler
    0.96 f1-score after 40 epochs

  33. Fontastic
    Feature Visualization
    A mechanism to see through the eyes of the network

  34. Feature Visualization
    Gradcam Analysis
    In [27]: import pickle
    from ipywidgets import interact, interactive, fixed, interact_manual
    import ipywidgets as widgets
    import matplotlib.pyplot as plt

    with open('./fd727d3f-73f4-4ec6-8e89-3e15fd3801b0resnet50_grad_cam', 'rb') as f:
        data = pickle.load(f)

    def show_cam(epoch_slider, image_slider, layer_slider):
        plt.imshow(data[epoch_slider][image_slider][layer_slider])

  35. Feature Visualization
    Gradcam Analysis
    In [28]: interact(show_cam,
        epoch_slider=widgets.IntSlider(min=0, max=len(data)-1, step=1, value=0),
        image_slider=widgets.IntSlider(min=0, max=len(data[0])-1, step=1, value=0),
        layer_slider=widgets.IntSlider(min=0, max=len(data[0][0])-1, step=1, value=0))
    Out[28]:

  36. Feature Visualization
    Activation Atlas
    Activation atlases not only reveal visual abstractions within a model, but they can reveal
    high-level misunderstandings in a model that can be exploited. For example, by looking at
    an activation atlas we will be able to see why a picture of a baseball can switch the
    classification of an image from “grey whale” to “great white shark”.

  37. Feature Visualization
    Activation Atlas - https://distill.pub/2019/activation-atlas/

  38. Extra
    Remote Work

  39. Open Source
    https://github.com/deep-learning-for-humans/fontastic

  40. Fin.
    GH / twitter / everywhere
    @nischalhp
    @raghothams
    https://github.com/deep-learning-for-humans/fontastic
