YouTube playlists
- These playlists take a long time to watch
- They contain duplicated content
- Having context around a topic enhances learning
- Students populate knowledge bases from multiple sources
- Knowledge graphs in multimodal RAG are not well explored
- Agent-based RAG: explore interactive, self-guided feedback
2. Extract audio embeddings for lecture-style emphasis.

Multi-Modal Encoding
3. Text encoder for transcripts and video titles.
4. Visual encoder for slides and key frames.
5. Audio encoder for emphasis (intonation/important points).
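The three encoders listed under Multi-Modal Encoding could be wired together roughly as follows. This is a minimal sketch, not the project's implementation: the names `LectureSegment`, `encode_text`, `encode_visual`, `encode_audio`, `encode_segment`, and `fuse` are hypothetical, the embedding size `DIM` is arbitrary, and each encoder is a toy stand-in (hashed bag-of-words, average pooling, windowed RMS energy) for whatever pretrained model would actually be used. Fusion by concatenation is also an assumption.

```python
import hashlib
import math
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

DIM = 8  # toy embedding size per modality (assumption, not from the notes)


def _l2_normalize(vec: Sequence[float]) -> List[float]:
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def _pool(values: Sequence[float], dim: int,
          fn: Callable[[Sequence[float]], float]) -> List[float]:
    """Split `values` into `dim` contiguous windows and reduce each with `fn`."""
    n = len(values)
    out = []
    for i in range(dim):
        chunk = values[i * n // dim:(i + 1) * n // dim]
        out.append(fn(chunk) if chunk else 0.0)
    return out


def encode_text(text: str, dim: int = DIM) -> List[float]:
    """Placeholder text encoder: hashed bag-of-words over lowercase tokens."""
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.md5(token.encode("utf-8")).hexdigest()
        vec[int(digest, 16) % dim] += 1.0
    return _l2_normalize(vec)


def encode_visual(frame: Sequence[float], dim: int = DIM) -> List[float]:
    """Placeholder visual encoder: average-pool flattened pixel intensities."""
    return _l2_normalize(_pool(frame, dim, lambda c: sum(c) / len(c)))


def encode_audio(samples: Sequence[float], dim: int = DIM) -> List[float]:
    """Placeholder audio encoder: per-window RMS energy as a crude emphasis proxy."""
    rms = lambda c: math.sqrt(sum(x * x for x in c) / len(c))
    return _l2_normalize(_pool(samples, dim, rms))


@dataclass
class LectureSegment:
    title: str
    transcript: str
    key_frame: Sequence[float]  # flattened grayscale pixels in [0, 1]
    audio: Sequence[float]      # raw waveform samples


def encode_segment(seg: LectureSegment) -> Dict[str, List[float]]:
    """Run one encoder per modality (steps 3 to 5 of the outline)."""
    return {
        "text": encode_text(seg.title + " " + seg.transcript),
        "visual": encode_visual(seg.key_frame),
        "audio": encode_audio(seg.audio),
    }


def fuse(embeddings: Dict[str, List[float]]) -> List[float]:
    """Concatenate modality embeddings in a fixed order (one possible fusion)."""
    return [x for key in ("text", "visual", "audio") for x in embeddings[key]]


# Usage with made-up data: a quiet first half and a louder second half.
segment = LectureSegment(
    title="Intro to Graph RAG",
    transcript="today we build a knowledge graph from lecture videos",
    key_frame=[i / 15 for i in range(16)],
    audio=[0.1] * 4 + [0.9] * 4,
)
embeddings = encode_segment(segment)
fused = fuse(embeddings)
print(len(fused))
```

Keeping one embedding per modality, rather than only the fused vector, leaves room to index or weight the modalities separately later, for example when linking segments into a knowledge graph.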