traditional filtering technologies by using images:
  • Sextortion, phishing, etc.
• Solution: Apply Computer Vision to extract relevant content:
  • Text, logo, etc.
• Existing technologies: Google Vision, Microsoft Azure Computer Vision
• Limitations: List of logos is fixed
• Decision: Build our own logo detection technology

Figure: Example of phishing with an image attached to the email and no relevant content in the body. Callouts: brand logo, typical phishing text.
draw bounding box and label logo
• Minimal size for a logo is 40x30
• There may be several logos in an image
• There may be variants of a brand logo: different geometry, different color, etc. → Same label

Figure: annotated examples (Wells Fargo, Yahoo!)

Scaling is costly as annotation is manual
Increase resilience of CNN regarding logo position
  → Logos are often in the same position (top, top left), which increases CNN overfitting
• Second purpose: Fit image to 512x512 square CNN input
• How? Move a 512x512 sliding window over the image and keep a crop if at least one logo is visible

Figure: Image with two logos (Office, Microsoft). One crop is rejected, one crop is kept.
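The sliding-window step can be sketched as follows. This is a minimal illustration, not the presenters' implementation: the stride of 256, the `(x_min, y_min, x_max, y_max)` box format, and the rule that a logo counts as "visible" only when it lies fully inside the window are all assumptions made for the example.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), assumed format

WINDOW = 512  # CNN input size stated on the slide


def fully_inside(box: Box, x0: int, y0: int, size: int = WINDOW) -> bool:
    """Assumed visibility rule: the whole logo box lies inside the window."""
    x_min, y_min, x_max, y_max = box
    return (x_min >= x0 and y_min >= y0
            and x_max <= x0 + size and y_max <= y0 + size)


def sliding_crops(width: int, height: int, logos: List[Box],
                  stride: int = 256) -> List[Tuple[Tuple[int, int], List[Box]]]:
    """Return (window origin, logo boxes in window coordinates) for kept crops.

    A window is kept when at least one logo is visible in it; windows
    without any visible logo are rejected. The stride is a hypothetical value.
    """
    kept = []
    for y0 in range(0, max(height - WINDOW, 0) + 1, stride):
        for x0 in range(0, max(width - WINDOW, 0) + 1, stride):
            visible = [
                (b[0] - x0, b[1] - y0, b[2] - x0, b[3] - y0)
                for b in logos
                if fully_inside(b, x0, y0)
            ]
            if visible:
                kept.append(((x0, y0), visible))
    return kept


# A 1024x512 image with one logo in the top-left corner:
crops = sliding_crops(1024, 512, [(10, 10, 110, 60)])
print(crops)
```

Because each kept crop carries its boxes translated into window coordinates, the same logo can appear at different positions across crops, which is the effect the slide aims for. Images smaller than 512 pixels on a side would need padding, which this sketch leaves out.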
diversity of images to reduce CNN overfitting
  → Collected images often have a similar look & feel
• Second purpose: Automate annotation to ease scaling
  → Annotation of collected images is manual: costly, time-consuming
• Generation is based on ‘randomness’:
  • Choice of resources: images, words, fonts
  • Position of logo
  • Alterations of logo: downsampling, color balance, contrast, scaling

Figure: Logos, images, dictionaries and fonts are combined with randomness to generate annotated images.
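The random generation described above can be sketched as parameter sampling that yields the annotation for free. This is an illustrative sketch only: the `SyntheticSample` type, the `generate_sample` function, the contrast range and the example resource names are hypothetical, and actual rendering of the logo onto the background (with an imaging library) is left out.

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Tuple

CANVAS = 512           # CNN input size from the slides
MIN_W, MIN_H = 40, 30  # minimal logo size from the annotation rules


@dataclass
class SyntheticSample:
    background: str                  # chosen background image (hypothetical name)
    label: str                       # brand label, known by construction
    scale: float                     # random scaling applied to the logo
    contrast: float                  # random contrast factor (assumed range)
    box: Tuple[int, int, int, int]   # (x_min, y_min, x_max, y_max)


def generate_sample(rng: random.Random,
                    backgrounds: List[str],
                    logo_sizes: Dict[str, Tuple[int, int]]) -> SyntheticSample:
    """Sample one synthetic image description with its bounding-box annotation."""
    background = rng.choice(backgrounds)
    label = rng.choice(sorted(logo_sizes))
    native_w, native_h = logo_sizes[label]

    # Pick a scale that keeps the logo at least 40x30 and inside the canvas.
    lo = max(MIN_W / native_w, MIN_H / native_h)
    hi = min(CANVAS / native_w, CANVAS / native_h)
    scale = rng.uniform(lo, hi)
    w = min(CANVAS, max(MIN_W, round(native_w * scale)))
    h = min(CANVAS, max(MIN_H, round(native_h * scale)))

    # Random position, so the logo is not always at the top or top left.
    x = rng.randint(0, CANVAS - w)
    y = rng.randint(0, CANVAS - h)
    contrast = rng.uniform(0.5, 1.5)  # hypothetical alteration range
    return SyntheticSample(background, label, scale, contrast, (x, y, x + w, y + h))


rng = random.Random(0)
sizes = {"wells_fargo": (200, 200), "yahoo": (300, 80)}
print(generate_sample(rng, ["bg_001.png", "bg_002.png"], sizes))
```

Since the generator chooses the label and the position itself, every sample comes with an exact bounding box and no manual annotation is needed.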
and prediction
• CNN input size increased to 512x512
• Transfer learning:
  • Models are pre-trained on ImageNet (~14M images, 20K classes)
  • Additional training performed with training corpus

"Transfer learning is the improvement of learning in a new task through the transfer of knowledge from a related task that has already been learned." (Lisa Torrey and Jude Shavlik, University of Wisconsin)
510 Total 635

• Comparison with Google Vision logo detection
• Google Vision logo detection:
  • General purpose (2D and 3D)
  • Number of logos supported unknown (>200)
• Vade Secure logo detection:
  • 2D only (3D logos are irrelevant in the context of threat detection)
  • Number of logos supported: 66
• Only logos supported by both are considered
• Test set is used (independent from training set)

Test set   Precision   Recall   F1 score
Vade       0.95        0.94     0.94
Google     0.98        0.76     0.86

• Vade outperforms Google Vision on F1 score (0.94 vs 0.86)
• High number of FN for Google Vision, hence its lower recall

Metrics for evaluation:

$\text{precision} = \frac{TP}{TP + FP}$

$\text{recall} = \frac{TP}{TP + FN}$

$F_1 = \frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}$
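The standard precision, recall and F1 definitions used for this comparison can be checked with a few lines of code. This is a small sketch; the function names and the example counts are made up for illustration, and only the rounded precision and recall values come from the slide.

```python
def precision(tp: int, fp: int) -> float:
    """Share of detections that are correct: TP / (TP + FP)."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """Share of true logos that are detected: TP / (TP + FN)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Hypothetical counts, only to show how the three metrics relate.
p = precision(tp=8, fp=2)
r = recall(tp=8, fn=2)
print(p, r, f1_score(p, r))

# Consistency check against the rounded values reported on the slide.
print(round(f1_score(0.95, 0.94), 2))  # Vade
print(round(f1_score(0.98, 0.76), 2))  # Google
```

Recomputing F1 from the rounded precision and recall gives 0.94 for Vade and 0.86 for Google, matching the reported table. It also shows how Google's low recall (many false negatives) pulls its F1 down despite the higher precision.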
• Extract images
• Cluster images
• Analyze and label images:
  • QR Code Scanner (QR codes are used for crypto payments)
  • Optical Character Recognition
  • Natural Language Processing
  • Logo Detection
• Classify images
• Images blacklist
• Global Network Intelligence (GNI)
no relevant text in body
• Clue 1: Chase logo
  • How? Extract logo with logo detection API
• Clue 2: Typical phishing text
  • How? Extract text with OCR and classify text with NLP