The Possibilities of FPGA for Deep Learning

k-mats
January 18, 2017

A survey on the use of FPGAs for Deep Learning

Transcript

  1. The Possibilities of FPGA
    for Deep Learning
    Kohei Matsumoto @kmats_

  2. Challenges of Hardware
    for Deep Learning
    • Performance
    • Power efficiency
    • Hardware cost
    • Memory bandwidth
    • Data must be passed from layer to layer
    • Processing bandwidth
    • How much data is processed simultaneously?
    • etc. (Re-programmability, Ease of use, …)
    https://www.altera.com/en_US/pdfs/literature/solution-sheets/efficient_neural_networks.pdf
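To make the memory bandwidth point concrete, here is a rough back-of-the-envelope sketch. It assumes every layer output is written to memory once and read back once by the next layer; the layer shapes, value width, and inference rate are hypothetical and not taken from the deck or its sources.

```python
def feature_map_bytes(channels, height, width, bytes_per_value):
    """Size in bytes of one layer's output feature map."""
    return channels * height * width * bytes_per_value


def required_bandwidth(layer_shapes, bytes_per_value, inferences_per_second):
    """Bytes per second needed to pass data from layer to layer.

    Assumption: each layer output is written once and read once
    (hence the factor of 2). Weight traffic is ignored.
    """
    per_inference = sum(
        2 * feature_map_bytes(c, h, w, bytes_per_value)
        for c, h, w in layer_shapes
    )
    return per_inference * inferences_per_second


# Hypothetical (channels, height, width) shapes for three layers.
layers = [(16, 32, 32), (32, 16, 16), (64, 8, 8)]

bw_float32 = required_bandwidth(layers, bytes_per_value=4, inferences_per_second=100)
bw_int8 = required_bandwidth(layers, bytes_per_value=1, inferences_per_second=100)
print(bw_float32, bw_int8)
```

Under these assumptions, shrinking each value from 4 bytes to 1 byte cuts the layer-to-layer traffic by a factor of four, which is one reason reduced-precision networks are attractive on bandwidth-limited hardware.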

  3. FPGA?
    • Field Programmable Gate Array
    • “Reconfigurable” hardware, configured with a Hardware
    Description Language (HDL)
    • Pros
    • Can be re-programmed to implement any kind of logic
    • Cons
    • Lack of resources (processing elements, memory, etc)
    • Hardware cost (compared to mass-produced devices)

  4. Use-cases of GPU/FPGA
    • GPU: Massively parallel operations
    • Graphics processing, some kinds of scientific
    simulation, etc.
    • FPGA: Prototyping of ASICs; cases where hardware-level
    speed is needed and yet the logic must remain changeable, etc.
    • Search engine accelerator, financial simulation,
    high frequency trading, etc.

  5. http://dea.unsj.edu.ar/sda/FPGA_On_Mars.pdf

  6. GPU: De facto standard of
    Deep Learning… why?
    • Deep Learning ~= a variation of
    Convolutional Neural Network (CNN)
    • CNN ~= Massively parallel multiply-accumulate
    operations → GPU! Yay!
    • The learning phase needs enormous
    computing resources (FPGA cannot provide
    enough resources)
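As a minimal sketch of why a CNN boils down to multiply-accumulate (MAC) operations, the following pure-Python convolution counts every MAC it performs. The function name `conv2d_mac` and the sample data are illustrative assumptions; real frameworks run the same arithmetic in parallel rather than in nested loops.

```python
def conv2d_mac(image, kernel):
    """'Valid' 2D convolution (cross-correlation, as commonly used in CNNs).

    Returns the output feature map and the number of
    multiply-accumulate operations performed.
    """
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = ih - kh + 1, iw - kw + 1
    out = [[0] * ow for _ in range(oh)]
    macs = 0
    for y in range(oh):
        for x in range(ow):
            acc = 0
            for dy in range(kh):
                for dx in range(kw):
                    # One multiply-accumulate: the basic unit of CNN work.
                    acc += image[y + dy][x + dx] * kernel[dy][dx]
                    macs += 1
            out[y][x] = acc
    return out, macs


# Illustrative 4x4 input and 2x2 kernel (made-up values).
image = [
    [1, 2, 3, 0],
    [0, 1, 2, 3],
    [3, 0, 1, 2],
    [2, 3, 0, 1],
]
kernel = [
    [1, 0],
    [0, 1],
]
out, macs = conv2d_mac(image, kernel)
print(out, macs)
```

Every output element is independent of the others, so the two outer loops can run fully in parallel. That independence is what maps so well onto a GPU's many cores.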

  7. FPGA over GPU
    in terms of Deep Learning
    • Pros
    • Power Efficiency (Performance per Watt)
    • Cons
    • Difficult implementation
    • Lack of memory bandwidth
    • Lack of processing elements for training
    • Most papers discuss only the inference phase?
    https://www.tractica.com/automation-robotics/fpgas-challenge-gpus-as-a-platform-for-deep-learning/

  8. Example:
    CNN Accelerator by Microsoft
    • “Single-node deep CNN accelerator on a
    mid-range FPGA” (only the inference phase)
    • “Respectable performance relative to prior
    FPGA designs and high-end GPGPUs at a
    fraction of the power”
    https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/CNN20Whitepaper.pdf

  9. https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/CNN20Whitepaper.pdf

  10. Binarized Neural Network:
    Highly optimized on FPGA?
    • Binarizes input, output and weights deterministically
    • Stored/Updated weights retain precision
    • “At test phase, BDNNs are fully binarized and can be
    implemented in hardware with low circuit
    complexity”
    • which means the learning phase is not yet fully
    binarized
    https://arxiv.org/abs/1602.02505v2
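A small sketch may help show why a fully binarized test phase suits simple hardware. Once values are restricted to +1/-1, a dot product can be computed with XNOR and a bit count instead of multipliers. This is a commonly described trick for binarized networks, shown here as an assumption rather than as the paper's exact implementation; the helper names and sample values are hypothetical.

```python
def binarize(x):
    """Deterministic binarization: sign(x), with 0 mapped to +1."""
    return 1 if x >= 0 else -1


def pack_bits(values):
    """Pack a list of +1/-1 values into an int (bit i is 1 for +1)."""
    word = 0
    for i, v in enumerate(values):
        if v == 1:
            word |= 1 << i
    return word


def binary_dot(a_bits, b_bits, n):
    """Dot product of two length-n +1/-1 vectors via XNOR and popcount."""
    mask = (1 << n) - 1
    agree = ~(a_bits ^ b_bits) & mask   # XNOR: 1 where the signs match
    matches = bin(agree).count("1")     # popcount
    return 2 * matches - n              # matches minus mismatches


# Made-up real-valued activations and weights.
activations = [0.7, -1.2, 0.1, -0.3]
weights = [0.5, 0.4, -0.9, -0.2]

a = [binarize(v) for v in activations]
w = [binarize(v) for v in weights]

reference = sum(x * y for x, y in zip(a, w))
fast = binary_dot(pack_bits(a), pack_bits(w), len(a))
print(reference, fast)
```

XNOR gates and bit counters are cheap to build from FPGA logic cells, whereas floating-point multipliers consume scarce dedicated resources.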

  11. https://arxiv.org/abs/1602.02505v2

  12. Wrap-up
    • FPGA: re-programmable hardware
    • Power-efficient with optimal logic
    • Lack of computing resources
    • CNN is too big to be implemented - needs to be simplified
    • An approach: Binarized Neural Network
    • It is still hard to fully binarize the learning phase
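The following sketch illustrates why the learning phase resists full binarization. Gradient updates are small, so they are accumulated in real-valued stored weights, and only the forward pass uses the binarized copies. It assumes a single linear neuron, a squared-error loss, and a straight-through gradient (the gradient for the binarized weight is applied directly to the stored weight). These choices and all names are illustrative assumptions, not necessarily the paper's exact procedure.

```python
def binarize(x):
    """Deterministic binarization: sign(x), with 0 mapped to +1."""
    return 1.0 if x >= 0 else -1.0


def clip(x, lo=-1.0, hi=1.0):
    """Keep stored weights in [-1, 1] so they cannot drift without bound."""
    return max(lo, min(hi, x))


def train_step(real_weights, inputs, target, lr):
    """One SGD step: binarized forward pass, real-valued weight update."""
    bin_w = [binarize(w) for w in real_weights]
    output = sum(wb * x for wb, x in zip(bin_w, inputs))
    error = output - target
    # d(0.5 * error^2)/d(bin_w_i) = error * x_i; the straight-through
    # assumption applies that gradient to the real-valued stored weight.
    new_weights = [
        clip(w - lr * error * x) for w, x in zip(real_weights, inputs)
    ]
    return new_weights, output


# Made-up starting point: two stored weights, one training sample.
weights = [0.05, -0.3]
inputs = [1.0, -1.0]

weights, out_before = train_step(weights, inputs, target=0.0, lr=0.1)
_, out_after = train_step(weights, inputs, target=0.0, lr=0.1)
print(out_before, out_after, weights)
```

In this example a single small update is enough to flip the sign of the first stored weight. In general many tiny updates must accumulate before a sign changes, and a weight stored as a single bit has nowhere to keep that running total.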

  13. References
    • Efficient Implementation of Neural Network Systems Built on FPGAs, and Programmed with OpenCL
    • https://www.altera.com/en_US/pdfs/literature/solution-sheets/efficient_neural_networks.pdf
    • FPGAs on Mars
    • http://dea.unsj.edu.ar/sda/FPGA_On_Mars.pdf
    • FPGAs Challenge GPUs as a Platform for Deep Learning
    • https://www.tractica.com/automation-robotics/fpgas-challenge-gpus-as-a-platform-for-deep-learning/
    • Accelerating Deep Convolutional Neural Networks Using Specialized Hardware
    • https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/CNN20Whitepaper.pdf
    • Binarized Neural Networks
    • https://arxiv.org/abs/1602.02505v2
