Slide 1

Slide 1 text

Processing Images at Scale with Go
Jyotiska NK, DataWeave
@jyotiska_nk

Slide 2

Slide 2 text

Image Processing
"Any form of signal processing for which the input is an image; the output of image processing may be either an image or a set of characteristics or parameters related to the image." (Wikipedia)
Compression, Segmentation, Histogram, Scaling, Stitching, Enhancement, Extraction, Smoothing, Noise Reduction, Interpolation, Feature Detection, Face Detection, Edge Detection, Background / Foreground Detection

Slide 3

Slide 3 text

Go Libraries
❖ image (github.com/golang/go/src/image)
❖ imagick (github.com/gographics/imagick)
❖ go-opencv
  ❖ github.com/hybridgroup/gobot
  ❖ github.com/lazywei/go-opencv
❖ imaging (github.com/disintegration/imaging)
❖ resize (github.com/nfnt/resize)
❖ go-colorful (github.com/lucasb-eyer/go-colorful)

Slide 4

Slide 4 text

image
❖ Part of the standard library.
❖ Implements a basic 2D image library.
❖ Supports multiple formats: JPEG, GIF, PNG.
❖ image/color implements the basic color library along with color palettes.
❖ image/color/palette provides the Plan9 and WebSafe color palettes.

Slide 5

Slide 5 text

imagick
❖ Go bindings for the ImageMagick C API.
❖ Provides various ImageMagick methods:
  ❖ Resize
  ❖ Grayscale
  ❖ Tiling
  ❖ Rotation
  ❖ Text Effects

Slide 6

Slide 6 text

go-opencv
❖ OpenCV bindings from C APIs.
❖ Basic OpenCV methods:
  ❖ Hooking up a webcam or a camera
  ❖ Accessing frames from camera
  ❖ Face detection
❖ lazywei/go-opencv provides more methods:
  ❖ Canny Edge Detection
  ❖ Cropping
  ❖ Resizing

Slide 7

Slide 7 text

imaging
❖ Basic image manipulation library.
❖ Depends on the standard "image" library.
❖ Provides the following functions:
  ❖ Image encoding, decoding
  ❖ Cropping and overlaying
  ❖ Image flipping, rotating, transforming
  ❖ Image blurring, sharpening
  ❖ Image resizing

Slide 8

Slide 8 text

resize
❖ Image resizing library written in pure Go.
❖ Provides a method to create thumbnails preserving the aspect ratio.
❖ Offers common interpolation methods:
  ❖ NearestNeighbor
  ❖ Bilinear
  ❖ Bicubic
  ❖ MitchellNetravali
  ❖ Lanczos2 / Lanczos3

Slide 9

Slide 9 text

go-colorful
❖ Library for working with colors, written in Go.
❖ Stores colors in RGB and provides methods to convert them to other color spaces: Hex RGB, HSV, Linear RGB, etc.
❖ Can be used to convert between color spaces, generate random colors or create color palettes.

Slide 10

Slide 10 text

Use case at DataWeave

Slide 11

Slide 11 text

Extract Dominant Colors from Images

Slide 12

Slide 12 text

Extract dominant colors from images

Slide 13

Slide 13 text

Generate Insights based on Product Colors

Slide 14

Slide 14 text

Color Distribution for Apparels on Store #1

Slide 15

Slide 15 text

Color Distribution for Pink Tops on Store #1

Slide 16

Slide 16 text

Time Series data over 12 months for Store #1 (chart: y-axis 0 to 300, x-axis January to December)

Slide 17

Slide 17 text

Cluster Similar Products based on Color Histogram

Slide 18

Slide 18 text

Clustering Similar Products across Stores
Product A in Store #1
Product A in Store #2

Slide 19

Slide 19 text

Doing things at scale
❖ 15 million webpages crawled and refreshed globally every day.
❖ 40% of crawls are Apparel and Lifestyle products.
❖ 30% of daily crawls are newly introduced products.
❖ 2 servers shared with a bunch of other services.

Slide 20

Slide 20 text

Existing Architecture

Slide 21

Slide 21 text

Pipeline: API → Decode → Resize → Process → Map → Cache → Response
❖ API: CherryPy + Gunicorn
❖ Decode: OpenCV + NumPy
❖ Resize: OpenCV + NumPy
❖ Process: OpenCV + NumPy
❖ Map: WebColors
❖ Cache: MongoDB + PyMongo

Slide 22

Slide 22 text

Pain Points
❖ High system usage
❖ Servers are shared with other services (Celery, RabbitMQ etc.)
❖ Running on 4 separate processes maxes out the CPU usage
❖ Average memory usage exceeds 70% of the available memory

Slide 23

Slide 23 text

Bringing Gophers to the action…

Slide 24

Slide 24 text

Pipeline: API → Decode → Resize → Process → Map → Cache → Response
❖ API: net/http
❖ Decode: image
❖ Resize: nfnt/resize
❖ Process: image
❖ Map: go-webcolors
❖ Cache: MongoDB + mgo

Slide 25

Slide 25 text

Advantages
❖ Cheap concurrency: the API is now able to serve more requests.
❖ Cooler servers: system usage never exceeds 50-60% with GOMAXPROCS set to 4.
❖ Easier deployment to multiple servers (thanks to binaries).
❖ Awesome standard libraries, reducing external package dependencies.

Slide 26

Slide 26 text

Still a long way to go…
❖ Numerical computing packages.
❖ Not many pure Go libraries for image processing or computer vision work.
❖ Major OpenCV bindings are lacking; using SWIG to hook up C++ methods is a pain!
❖ Need more adopters, more people playing and experimenting :)