EDM-2017: Few hundred parameters outperform few hundred thousand?

Amar
June 26, 2017

Knowledge Tracing plays a key role in personalizing learning in Intelligent Tutoring Systems, including funtoot. Among the available models, Bayesian Knowledge Tracing (BKT) is the simplest well-studied model and is known to work well. Recently, Deep Knowledge Tracing (DKT), based on deep neural networks, was proposed with great promise. Soon after, however, it was found that the gains achieved by DKT were not significant when compared with Performance Factor Analysis (PFA) and with BKT and its proposed variants. To examine and study these models, we experiment with them on our dataset. We also introduce a logical extension of DKT, Multi-Skill DKT, to handle items that require knowledge of multiple skills. We show that PFA clearly outperforms all of the above models when AUC is averaged over skills, while PFA and DKT perform equally well when AUC is computed over all data points.

Transcript

  1. Few hundred parameters outperform few hundred thousand?
    Amar Lalwani, Sweety Agrawal
  2. Our Study: Goal
    • Knowledge Tracing
    • BKT (Bayesian Knowledge Tracing)
    • Extensions of BKT (Khajah et al., 2016)
    • PFA (Performance Factor Analysis)
    • DKT (Deep Knowledge Tracing)
    • Funtoot data
    • LG (Learning Gap) as skill
  3. Funtoot: Ontology
    [Figure: funtoot ontology. Levels with examples: Subject (Math), Concept (Triangles), Sub-concept (Congruency), Sub-sub-concepts (Rules of Congruency, Applications of Congruency), Learning Gaps (LG 1, LG 2, LG 3). Edge labels: contains, depends on, induces.]
    Image source: From paper “Few hundred parameters outperform few hundred thousand?”
  4. Sub-sub-concept
    [Figure: a sub-sub-concept divided into Difficulty Levels 1 to 5, with the ends of the scale labeled Least Difficult and Most Difficult.]
  5. LG (Learning Gap) as a skill
    • What led the student to make an unsuccessful attempt
    • A possible reason or explanation behind a wrong answer
    • A misunderstanding of a concept
    • Lack of knowledge about a concept
    • Each incorrect pattern/response is tagged with one or more LGs
    • Requires knowing all possible incorrect patterns/responses
  6. LG: committance and avoidance
    • 1: Avoidance
    • 0: Committance
    • Consider a question with 3 LGs

    Attempt No.       LG1   LG2   LG3   Status
    1                 0     1     1     Failure
    2                 0     1     1     Failure
    3                 0     0     1     Failure
    4                 1     1     1     Success
    Overall Outcome   0     0     1
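The Overall Outcome row in the table above is consistent with counting an LG as avoided (1) only when it was avoided in every attempt on the question. The slide does not state this rule explicitly, so the sketch below treats it as an assumption; the function name `overall_outcome` is hypothetical.

```python
def overall_outcome(attempts):
    """Return one 0/1 flag per LG.

    Assumed rule (inferred from the slide's table, not stated in it):
    an LG counts as avoided (1) only if it was avoided in every attempt.
    """
    # zip(*attempts) turns the rows (attempts) into columns (one per LG).
    return [min(column) for column in zip(*attempts)]


# Rows are attempts 1 to 4; columns are LG1, LG2, LG3 (values from the slide).
attempts = [
    [0, 1, 1],
    [0, 1, 1],
    [0, 0, 1],
    [1, 1, 1],
]

print(overall_outcome(attempts))  # [0, 0, 1]
```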
  7. Dataset
    • 6th Grade Math, CBSE Curriculum
    • 22 topics, 69 sub-topics, 119 sub-sub-topics
    • 442 LGs, 1523 problems
    • 7780 students, 176 schools
    • 2.4 million problem attempts
    • 5.6 million data points
    • 76% avoidances (positive class: 1)
  8. Data Distribution
    Image source: From paper “Few hundred parameters outperform few hundred thousand?”
  9. Knowledge Tracing Models
    • BKT
        • BKT
        • BKT+F (Forgetting)
        • BKT+A (Abilities)
        • BKT+S (Skill Discovery)
        • BKT+FA
        • BKT+FSA
    • DKT
        • DKT
        • Multi-Skill DKT
    • PFA
  10. Hypothetical Example
    • Student Alice is working on funtoot
    • Consider LGs: A, B, C

    TimeStamp     Question   A      B   C
    T1            Q1         1      0   0
    T2 (T2>T1)    Q2         N.A.   0   1
  11. BKT

    Skill   Response Series
    A       1
    B       0, 0
    C       0, 1

    Example data: see slide 10.
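As a rough illustration of how BKT consumes one response series per skill, here is a minimal sketch of the standard BKT prediction and update equations, using the four per-skill parameters named later in the deck (pInit, pLearn, pGuess, pSlip). The parameter values are made up for illustration, and the class and function names are hypothetical; nothing here is taken from the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class BKTParams:
    p_init: float   # probability the skill is known before any attempt
    p_learn: float  # probability of learning the skill after an attempt
    p_guess: float  # probability of a positive response when not known
    p_slip: float   # probability of a negative response when known


def predict_positive(p_known, params):
    """Probability that the next response is 1 (an avoidance in this deck)."""
    return p_known * (1 - params.p_slip) + (1 - p_known) * params.p_guess


def update(p_known, response, params):
    """Bayesian posterior given the response, followed by the learning step."""
    if response == 1:
        num = p_known * (1 - params.p_slip)
        den = num + (1 - p_known) * params.p_guess
    else:
        num = p_known * params.p_slip
        den = num + (1 - p_known) * (1 - params.p_guess)
    posterior = num / den
    return posterior + (1 - posterior) * params.p_learn


def trace(responses, params):
    """Return the prediction made before each response in the series."""
    p_known = params.p_init
    predictions = []
    for response in responses:
        predictions.append(predict_positive(p_known, params))
        p_known = update(p_known, response, params)
    return predictions


# Illustrative parameter values only (assumption, shared across skills here).
params = BKTParams(p_init=0.5, p_learn=0.1, p_guess=0.2, p_slip=0.1)

# Response series per skill, as in the slide's table.
series = {"A": [1], "B": [0, 0], "C": [0, 1]}
for skill, responses in series.items():
    print(skill, [round(p, 3) for p in trace(responses, params)])
```

In a fitted model each skill would have its own four parameters; a single shared set is used here only to keep the sketch short.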
  12. PFA

    Skill   # Failures   # Successes   Response
    A       0            0             1
    B       0            0             0
    C       0            0             0
    B       1            0             0
    C       1            0             1

    Example data: see slide 10.
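The PFA rows above can be derived from Alice's interaction log by keeping running counts of prior failures and prior successes per skill. The sketch below is one way to do that; the function name `pfa_rows` and the dictionary-based log format are assumptions made for illustration.

```python
from collections import defaultdict


def pfa_rows(interactions, skills):
    """Build (skill, prior failures, prior successes, response) rows.

    `interactions` is a time-ordered list of dicts mapping skill -> response;
    a skill that does not apply to a question (N.A.) is simply absent.
    """
    failures = defaultdict(int)
    successes = defaultdict(int)
    rows = []
    for responses in interactions:
        for skill in skills:
            if skill not in responses:
                continue
            response = responses[skill]
            # Counts reflect only attempts made before this one.
            rows.append((skill, failures[skill], successes[skill], response))
            if response == 1:
                successes[skill] += 1
            else:
                failures[skill] += 1
    return rows


# Alice's data: Q1 at T1, then Q2 at T2 (skill A does not apply to Q2).
log = [
    {"A": 1, "B": 0, "C": 0},
    {"B": 0, "C": 1},
]

for row in pfa_rows(log, ["A", "B", "C"]):
    print(row)
```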
  13. DKT

    Serial No.   Question   Input              Skill   Response   Output
    1            Q1         0, 0, 0, 0, 0, 0   A       1          1, X, X
    2            Q1         1, 1, 0, 0, 0, 0   B       0          X, 0, X
    3            Q1         0, 0, 1, 0, 0, 0   C       0          X, X, 0
    4            Q2         0, 0, 0, 0, 1, 0   B       0          X, 0, X
    5            Q2         0, 0, 1, 0, 0, 0   C       1          X, X, 1

    Example data: see slide 10.
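The Input column above appears to encode the previous interaction with two slots per skill, (attempted, response), and an all-zero vector for the first step. This reading is inferred from the table rather than stated on the slide, so the sketch below should be taken as an assumption; `encode` and `dkt_rows` are hypothetical names, and `None` stands in for the masked outputs shown as X.

```python
SKILLS = ["A", "B", "C"]  # LGs from the hypothetical example


def encode(skill, response):
    """Assumed encoding: two slots per skill, (attempted, response)."""
    vec = [0] * (2 * len(SKILLS))
    i = SKILLS.index(skill)
    vec[2 * i] = 1
    vec[2 * i + 1] = response
    return vec


def dkt_rows(sequence):
    """Pair each step's input (the previous interaction) with its target."""
    rows = []
    previous = [0] * (2 * len(SKILLS))  # nothing observed before step 1
    for skill, response in sequence:
        target = [None] * len(SKILLS)   # None marks a masked output (X)
        target[SKILLS.index(skill)] = response
        rows.append((previous, target))
        previous = encode(skill, response)
    return rows


# One (skill, response) pair per step, in the order used on this slide.
sequence = [("A", 1), ("B", 0), ("C", 0), ("B", 0), ("C", 1)]
for inputs, target in dkt_rows(sequence):
    print(inputs, target)
```

Passing the same pairs in a different within-question order, for example B, A, C for Q1 and C, B for Q2, yields the inputs shown on slide 14.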
  14. DKT: skills randomly shuffled

    Serial No.   Question   Input              Skill   Response   Output
    1            Q1         0, 0, 0, 0, 0, 0   B       0          X, 0, X
    2            Q1         0, 0, 1, 0, 0, 0   A       1          1, X, X
    3            Q1         1, 1, 0, 0, 0, 0   C       0          X, X, 0
    4            Q2         0, 0, 0, 0, 1, 0   C       1          X, X, 1
    5            Q2         0, 0, 0, 0, 1, 1   B       0          X, 0, X

    Example data: see slide 10.
  15. Multi-Skill DKT

    Serial No.   Input              Output
    1            0, 0, 0, 0, 0, 0   1, 0, 0
    2            1, 1, 1, 0, 1, 0   X, 0, 1

    Example data: see slide 10.
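In the Multi-Skill DKT table, each question becomes a single time step, and the input appears to combine the (attempted, response) slots of every skill involved in the previous question. As before, this encoding is inferred from the table and is an assumption; the function names are hypothetical and `None` stands in for X.

```python
SKILLS = ["A", "B", "C"]  # LGs from the hypothetical example


def encode_question(responses):
    """Assumed encoding: (attempted, response) slots for all skills at once."""
    vec = [0] * (2 * len(SKILLS))
    for skill, response in responses.items():
        i = SKILLS.index(skill)
        vec[2 * i] = 1
        vec[2 * i + 1] = response
    return vec


def multi_skill_rows(questions):
    """One row per question: previous question's encoding, current targets."""
    rows = []
    previous = [0] * (2 * len(SKILLS))  # nothing observed before question 1
    for responses in questions:
        # Skills that do not apply to the question are masked with None (X).
        target = [responses.get(skill) for skill in SKILLS]
        rows.append((previous, target))
        previous = encode_question(responses)
    return rows


# Alice's data: Q1 involves A, B, C; Q2 involves only B and C.
questions = [
    {"A": 1, "B": 0, "C": 0},
    {"B": 0, "C": 1},
]
for inputs, target in multi_skill_rows(questions):
    print(inputs, target)
```

Compared with the single-skill expansion, this removes the need to pick an order for skills within a question, at the cost of a denser input vector.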
  16. Results
    Image source: From paper “Few hundred parameters outperform few hundred thousand?”
  17. AUC over all data points
    • Variance in performance among the algorithms is very low
    • PFA & DKT perform equally well
    • Multi-Skill DKT lags behind DKT by 0.03 AUC units
    • All variants of BKT lag behind DKT/PFA by 0.03-0.05 AUC units
    • BKT+FSA & Multi-Skill DKT perform equally well
  18. AUC averaged over skills
    • The variance in performance among the algorithms is high
    • PFA (0.88 AUC) performs the best
        • Gain of 17.3% over DKT (0.75 AUC)
        • Gain of 35.3% over BKT (0.65 AUC)
    • Multi-Skill DKT lags behind DKT by 0.04 AUC units
    • DKT & BKT+FSA perform equally well
    • BKT+F performs the worst with 0.64 AUC
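The two evaluation views, AUC over all data points and AUC averaged over skills, can rank models differently because the second gives every skill equal weight regardless of how many data points it has. The sketch below illustrates the difference on made-up toy data. It uses a simple pairwise AUC, and skipping skills that contain only one class is an assumption of this sketch, not something the slides specify.

```python
from collections import defaultdict


def auc(labels, scores):
    """Pairwise AUC: fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None  # undefined when only one class is present
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_all_points(records):
    """AUC over the pooled (skill, label, score) records."""
    return auc([y for _, y, _ in records], [s for _, _, s in records])


def auc_per_skill_mean(records):
    """Mean of per-skill AUCs; single-class skills are skipped (assumption)."""
    by_skill = defaultdict(list)
    for skill, label, score in records:
        by_skill[skill].append((label, score))
    values = []
    for pairs in by_skill.values():
        value = auc([y for y, _ in pairs], [s for _, s in pairs])
        if value is not None:
            values.append(value)
    return sum(values) / len(values)


# Toy data (made up): skill X is predicted well, skill Y poorly.
records = [
    ("X", 1, 0.9), ("X", 1, 0.8), ("X", 1, 0.7), ("X", 0, 0.6),
    ("Y", 1, 0.2), ("Y", 0, 0.3),
]
print(auc_all_points(records))      # 0.75
print(auc_per_skill_mean(records))  # 0.5
```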
  19. AUC averaged over skills
    • Forgetting adds no value to BKT
        • BKT: 0.65 AUC, BKT+F: 0.64 AUC
        • BKT+A: 0.68 AUC, BKT+FA: 0.67 AUC
    • Skill Discovery provides reasonable gains
        • BKT+S achieved 9% gain over BKT
        • BKT+FSA achieved 12% gain over BKT+FA
        • 145-175 skills discovered against 442 tagged skills
    • Adding Abilities gave very small gains of 0.03 AUC units
        • (BKT, BKT+A), (BKT+F, BKT+FA)
    • BKT+FSA performed best with 15% gain over BKT
  20. Conclusion
    • DKT outperforms BKT
    • BKT Extensions comparable to DKT
    • PFA outperforms DKT
    • Knowledge Tracing is shallow
  21. Model Parameters
    • DKT: a few hundred thousand
        • Time series data: noisy
    • PFA: 3 × # skills
        • Coefficients for difficulty, # prior successes, # prior failures
        • Abstract, simple features
    • BKT: 4 × # skills
        • pInit, pLearn, pGuess, pSlip
    • Parameters: DKT >> PFA
    • Performance: PFA > DKT
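To make the per-skill parameter counts concrete, here is a minimal sketch of a PFA-style prediction for a single skill: a logistic function of a difficulty coefficient plus weighted counts of prior successes and prior failures. The coefficient names (`beta`, `gamma`, `rho`) and their values are illustrative assumptions, and the counts at the end simply apply the slide's formulas to the 442 LGs in the dataset, assuming every LG gets its own parameters.

```python
import math


def pfa_probability(successes, failures, beta, gamma, rho):
    """Logistic model over one skill's difficulty and prior outcome counts.

    beta:  difficulty coefficient for the skill
    gamma: weight on the number of prior successes
    rho:   weight on the number of prior failures
    """
    logit = beta + gamma * successes + rho * failures
    return 1.0 / (1.0 + math.exp(-logit))


def parameter_count(n_skills, per_skill):
    """Total parameters when every skill has its own `per_skill` parameters."""
    return n_skills * per_skill


# Illustrative coefficients only (assumption, not fitted values).
print(round(pfa_probability(successes=2, failures=1,
                            beta=0.0, gamma=0.3, rho=-0.2), 3))

# Slide formulas applied to the dataset's 442 LGs.
print(parameter_count(442, 3))  # PFA: 1326
print(parameter_count(442, 4))  # BKT: 1768
```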
  22. Future Work
    • 442 skills, 119 sub-sub-topics
    • Skills discovered: 145-175
    • Explore DKT for skill discovery
    • Usage of secondary features
        • Attempts
        • Time durations
        • Hints
    • Item context and hierarchy
  23. Questions??