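Feed Forward Reasoner
• Multi-step Reasoner
• Input: the previous step's query q(t) and the Reader State S
• Output: a newly generated query q(t+1)
• Training: ground-truth queries for the intermediate steps cannot be constructed, so reinforcement learning is used (the signal is how well the Retriever extracts the answer)
  • State = (query, answer, documents, top-k paragraphs)
  • Observation = (query, Reader State S)
  • Action = whether or not each paragraph is selected
  • Reward = Reader-F1
• Query update:
  q′(t+1) = GRU(q(t), S)
  q(t+1) = FFN(q′(t+1))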
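The query update q′(t+1) = GRU(q(t), S), q(t+1) = FFN(q′(t+1)) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the slide author's implementation: the class name `QueryReformulator`, the random weight initialization, the omission of bias terms, the choice to treat q(t) as the GRU hidden state and S as its input, and the one-hidden-layer ReLU form of the FFN are all hypothetical choices made for the example.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def add(a, b):
    """Element-wise sum of two vectors."""
    return [ai + bi for ai, bi in zip(a, b)]


def sigmoid(v):
    """Element-wise logistic function."""
    return [1.0 / (1.0 + math.exp(-a)) for a in v]


def rand_matrix(rows, cols, rng, scale=0.1):
    """Small random matrix; stands in for learned weights (assumption)."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class QueryReformulator:
    """Hypothetical sketch of q(t+1) = FFN(GRU(q(t), S)); biases omitted."""

    def __init__(self, dim, seed=0):
        rng = random.Random(seed)
        # GRU gates: W* act on the reader state S (input),
        # U* act on the current query q(t) (hidden state).
        self.Wz, self.Uz = rand_matrix(dim, dim, rng), rand_matrix(dim, dim, rng)
        self.Wr, self.Ur = rand_matrix(dim, dim, rng), rand_matrix(dim, dim, rng)
        self.Wh, self.Uh = rand_matrix(dim, dim, rng), rand_matrix(dim, dim, rng)
        # FFN: one hidden ReLU layer, then a linear projection back to dim.
        self.W1, self.W2 = rand_matrix(dim, dim, rng), rand_matrix(dim, dim, rng)

    def gru(self, q, s):
        """q'(t+1) = GRU(q(t), S): standard GRU cell without biases."""
        z = sigmoid(add(matvec(self.Wz, s), matvec(self.Uz, q)))  # update gate
        r = sigmoid(add(matvec(self.Wr, s), matvec(self.Ur, q)))  # reset gate
        rq = [ri * qi for ri, qi in zip(r, q)]
        h = [math.tanh(a) for a in add(matvec(self.Wh, s), matvec(self.Uh, rq))]
        return [(1.0 - zi) * qi + zi * hi for zi, qi, hi in zip(z, q, h)]

    def ffn(self, x):
        """q(t+1) = FFN(q'(t+1))."""
        hidden = [max(0.0, a) for a in matvec(self.W1, x)]
        return matvec(self.W2, hidden)

    def step(self, q, s):
        """One reasoning step: previous query and reader state in, new query out."""
        return self.ffn(self.gru(q, s))


# Usage: reformulate a toy 4-dimensional query for three steps.
model = QueryReformulator(dim=4, seed=0)
q = [0.5, -0.2, 0.1, 0.0]
S = [0.3, 0.3, -0.1, 0.2]  # held fixed here; in practice the reader updates S each step
for t in range(3):
    q = model.step(q, S)
```

In an actual system the weights would be trained with the reinforcement-learning signal listed above (Reader-F1 as reward) rather than left at their random initial values, and S would be recomputed by the reader after each retrieval.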