Train: 29K, Val: 1K, Test: 1K
• Multi30K-half
  • Train: 14,500 (En-Img), 14,500 ({De, Fr}-Img), no overlap
  • Validation: 507 (En-Img), 507 ({De, Fr}-Img), no overlap
  • Test: 1,000 (En-Img-{De, Fr})
• Preprocess
  • Sentence: tokenization, Byte Pair Encoding
  • Image: Faster-RCNN, 36 objects per image
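A minimal sketch of how a "no overlap" split like Multi30K-half could be built. The slide does not say how the two halves were chosen, so the seeded random shuffle here is an assumption, and `split_disjoint_halves` is a hypothetical helper name. The sizes come from the counts above (29K train, 2 × 507 validation).

```python
import random


def split_disjoint_halves(num_examples, seed=0):
    """Shuffle example indices and return two equal-sized, non-overlapping halves.

    Assumption: the halves are picked at random; the slide does not specify this.
    """
    indices = list(range(num_examples))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible
    half = num_examples // 2
    return indices[:half], indices[half:2 * half]


# First half keeps only English captions (En-Img),
# second half keeps only target-language captions ({De, Fr}-Img).
train_en_img, train_tgt_img = split_disjoint_halves(29_000)
val_en_img, val_tgt_img = split_disjoint_halves(2 * 507)

print(len(train_en_img), len(train_tgt_img))  # 14500 14500
print(len(val_en_img), len(val_tgt_img))      # 507 507
```

Because no image index appears in both halves, no image in the training or validation data is paired with both an English and a target-language caption; only the test set keeps the full En-Img-{De, Fr} triples.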