Slide 9
Slide 9 text
Feed the extracted features
into a Neural Network
1. Concatenate all the features into a single dense feature vector.
2. Feed this feature vector into a multimodal neural network that
learns how to weight the components and maps high dimensional
feature vectors into lower dimensional shot embeddings.