
Toward the Automatic Assessment of Text Exercises


Presentation of my paper "Toward the Automatic Assessment of Text Exercises" at the 2nd Workshop on Innovative Software Engineering Education in Stuttgart, Germany

Paper: http://ceur-ws.org/Vol-2308/isee2019paper04.pdf

Abstract:
Exercises are an essential part of learning. Manual assessment of exercises requires effort from instructors and can lead to quality problems and inconsistencies between assessments. Especially with growing student populations, it also leads to delayed grading and makes it increasingly difficult to provide individual feedback.

The goal is to provide timely responses to homework submissions in large classes. By reducing the effort required for assessment, instructors can invest more time in supporting students and providing individual feedback.

This paper argues that automated assessment provides students with more individual feedback and quicker feedback and grading cycles. We introduce a concept for the automatic assessment of text exercises using machine learning techniques. We also describe our plans to apply this concept in a case study with 1,900 students.

Jan Philip Bernius, M.Sc.

February 19, 2019

Transcript

  1. Chair for Applied Software Engineering
    Department of Informatics
    Technical University of Munich
    Toward the Automatic Assessment of Text Exercises
    Jan Philip Bernius and Bernd Bruegge
    2nd Workshop on Innovative Software Engineering Education
    19.02.2019 in Stuttgart, Germany


  2. Student population grew by a factor of five
    [Bar chart: participants in the lecture per year. 2010: 363, 2011: 496, 2012: 879, 2013: 747, 2014: 785, 2015: 1,043, 2016: 1,142, 2017: 1,431, 2018: 1,625, 2019: 1,950.]
    ■ The number of participants in the lecture "Introduction to Software Engineering" (EIST) has risen sharply.
    ■ Focus on face-to-face teaching with weekly homework exercises.
    ■ Summer semester 2019: 1,950 students enrolled.

  3. Traditional Homework Workflow
    [Activity diagram: the student gets the homework, solves it, and submits a solution; the instructor assesses the solution manually; the student receives the assessment (grading & feedback).]

  4. Grading Nightmare
    ■ In addition to traditional homework, EIST also offers up to four interactive in-class exercises per lecture [3].
    ■ Expected grading load for summer 2019 with 1,950 students in 13 lectures:
    □ 4 interactive exercises per lecture: 4 × 13 × 1,950 ≈ 100,000
    □ 13 homework exercises: 3 × 13 × 1,950 ≈ 75,000
    ▷ EIST instructors need to assess ≈ 175,000 exercises
    30 tutors need to grade 450 exercises every week. What are the effects on grading quality?
    ■ Goal: Automate the grading (a back-of-the-envelope check of these numbers follows below).
    [3] S. Krusche, A. Seitz, J. Börstler, and B. Bruegge, “Interactive Learning: Increasing Student Participation through Shorter Exercise Cycles,”
    in 19th Australasian Computing Education Conf. ACM, 2017, pp. 17–26.
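    A quick sanity check of the grading-load figures above, as a minimal Python sketch. The per-exercise factors 4 and 3 come from the slide; the calculation assumes one lecture per week over the 13 lectures:

    students = 1950
    lectures = 13                            # one lecture per week in the summer term

    interactive = 4 * lectures * students    # up to 4 in-class exercises per lecture -> 101,400
    homework = 3 * lectures * students       # factor used on the slide for homework  -> 76,050
    total = interactive + homework           # roughly 175,000 assessments

    tutors = 30
    per_tutor_per_week = total / tutors / lectures   # about 455, i.e. roughly 450 per tutor and week

    print(interactive, homework, total, round(per_tutor_per_week))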


  5. Assessment of Interactive Exercises
    Allow students to refine their solution in the spirit of interactive learning [1].
    [Activity diagram (swimlanes: Student, ArTEMiS): the student submits a solution; if automatic assessment is possible, ArTEMiS assesses it automatically and produces Automatic Feedback; submissions and assessments train the Assessment Model («affects»); ArTEMiS calculates the total score; the student reviews the assessment and, if not satisfied, refines the solution and resubmits.]
    [A second version of the diagram adds an Instructor swimlane: if automatic assessment is not possible, the instructor assesses the submission manually and provides Manual Feedback.]
    [1] S. Krusche and A. Seitz, “Increasing the Interactivity in Software Engineering MOOCs - A Case Study,” in 31th Conference on Software Engineering
    Education and Training, 2019.


  6. Toward the Automatic Assessment of Interactive Exercises
    ArTEMiS: An Automatic Assessment Management System for Interactive Learning [7]
    [Assessment workflow activity diagram (swimlanes: Student, ArTEMiS, Instructor): submit solution → automatic assessment possible? If yes, ArTEMiS assesses automatically (Automatic Feedback); if no, the instructor assesses manually (Manual Feedback); assessments train the Assessment Model («affects»); ArTEMiS calculates the total score; the student reviews the assessment and refines the solution until satisfied.]
    [7] S. Krusche and A. Seitz, “ArTEMiS - An Automatic Assessment Management System for Interactive Learning,” in 49th Technical Symposium on
    Computer Science Education. ACM, 2018.
    (Semi-)automatic assessment of modeling and programming exercises


  7. Grading Nightmare
    ■ In addition to traditional homework, EIST also offers up to four interactive in-class exercises per lecture [3].
    ■ Expected grading load for summer 2019 with 1,950 students in 13 lectures:
    □ 4 interactive exercises per lecture: 4 × 13 × 1,950 ≈ 100,000
    □ 13 homework exercises: 3 × 13 × 1,950 ≈ 75,000
    ▷ EIST instructors need to assess ≈ 175,000 exercises
    30 tutors need to grade 450 exercises every week. What are the effects on grading quality?
    ■ Goal: Automate the grading
    [3] S. Krusche, A. Seitz, J. Börstler, and B. Bruegge, “Interactive Learning: Increasing Student Participation through Shorter Exercise Cycles,”
    in 19th Australasian Computing Education Conf. ACM, 2017, pp. 17–26.


  8. Toward the Automatic Assessment of Text Exercises
    ArTEMiS: An Automatic Assessment Management System for Interactive Learning [7]
    [The assessment workflow activity diagram (Student, ArTEMiS, Instructor) is shown twice: once for the existing (semi-)automatic assessment of modeling and programming exercises, and once marked with a "?", asking whether the same workflow can be applied to text exercises.]
    [7] S. Krusche and A. Seitz, "ArTEMiS - An Automatic Assessment Management System for Interactive Learning," in 49th Technical Symposium on Computer Science Education. ACM, 2018.


  9. Automatic Assessment of Text Exercises
    Research Question:
    Can assessments of textual exercises be automated?
    Hypothesis:
    Assessments can be automated using a machine-learning-based approach.
    Core Idea:

  10. Manual Assessment
    ■ A Submission consists of multiple Text Blocks.
    ■ Instructors assess textual Submissions using multiple Text Blocks.
    ■ Instructors can define Text Blocks themselves.
    ■ Focus on the lower spectrum of the revised Bloom's Taxonomy [14].
    [Example screenshot of a manual assessment. Exercise: Strategy Pattern vs. Bridge Pattern. Problem statement: Explain the difference between the bridge pattern and the strategy pattern. Student submission, split into three text blocks: "The bridge pattern is meant to decouple an abstraction from its implementation." / "Both patterns are structural patterns." / "The strategy pattern is a structural pattern and allows providing multiple algorithms at compile time." Score: 2 / 6. Reviewer: Jan Philip Bernius.]
    [14] D. Krathwohl, “A revision of bloom’s taxonomy: An overview,” Theory into Practice, vol. 41, no. 4, pp. 212–218, 2002.


  11. Manual Assessment
    ■ A Submission consists of multiple Text Blocks.
    ■ Instructors assess textual Submissions using multiple Text Blocks (a minimal data-structure sketch follows below).
    ■ Instructors can define Text Blocks themselves.
    ■ Focus on the lower spectrum of the revised Bloom's Taxonomy [14].
    [Example screenshot of a manual assessment with per-block feedback. Exercise: Strategy Pattern vs. Bridge Pattern. Problem statement: Explain the difference between the bridge pattern and the strategy pattern. Text block "The bridge pattern is meant to decouple an abstraction from its implementation.": score 2, feedback "Correct". Text block "The strategy pattern is a structural pattern and allows providing multiple algorithms at compile time.": score 0, feedback "The strategy pattern is a behavioral pattern. It is used to select an algorithm at runtime." Total score: 2 / 6. Reviewer: Jan Philip Bernius.]
    [14] D. Krathwohl, “A revision of bloom’s taxonomy: An overview,” Theory into Practice, vol. 41, no. 4, pp. 212–218, 2002.
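    A minimal sketch of the text-block-based assessment structure shown on the last two slides, written as Python dataclasses. The class and field names (TextBlock, Feedback, Submission, total_score) are illustrative assumptions, not the actual ArTEMiS data model:

    from dataclasses import dataclass, field
    from typing import List, Optional

    @dataclass
    class TextBlock:
        text: str                      # one phrase/sentence of the student's answer

    @dataclass
    class Feedback:
        score: float                   # points awarded for one text block
        comment: str                   # reviewer comment, e.g. "Correct"

    @dataclass
    class Submission:
        blocks: List[TextBlock]
        feedback: List[Optional[Feedback]] = field(default_factory=list)

        def total_score(self) -> float:
            # The submission score is the sum of the per-block scores;
            # blocks without feedback contribute nothing.
            return sum(f.score for f in self.feedback if f is not None)

    In the example above, the two assessed blocks with scores 2 and 0 would yield a total score of 2.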


  12. Assessment Workflow
    [Activity diagram (swimlanes: Student, ArTEMiS, Instructor): the student submits a solution; ArTEMiS checks whether automatic assessment is possible. If yes, it assesses automatically (Automatic Feedback); if no, the instructor assesses manually (Manual Feedback). Assessments train the Assessment Model («affects»), ArTEMiS calculates the total score, and the student reviews the assessment, refining the solution and resubmitting until satisfied.]


  13. Event Flow of Assess Automatically Use Case
    Assumption:
    A single Feedback item can be valid for text blocks from multiple submissions.
    Step 1: Split the Submission into Text Blocks (a minimal sketch follows below).
    [Class diagram: TextExercise (problemStatement, sampleSolution, participate()), Submission (solution, submit()), TextBlock (phrase), Student.]
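    A minimal sketch of the splitting step, assuming a naive sentence-level segmentation; how text blocks are actually derived from a submission is left open here (see also "future work" on slide 18):

    import re

    def split_into_text_blocks(solution: str) -> list[str]:
        # Naive segmentation: split on sentence-ending punctuation followed by whitespace.
        blocks = re.split(r"(?<=[.!?])\s+", solution.strip())
        return [block for block in blocks if block]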


  14. Event Flow of Assess Automatically Use Case
    Assumption:
    A single Feedback item can be valid for text blocks from multiple submissions.
    Step 2: Calculate a Vector Representation for each Text Block (sketched below).
    [Class diagram: TextExercise (problemStatement, sampleSolution, participate()), Submission (solution, submit()), TextBlock (phrase), VectorRepresentation, Student.]
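    A minimal sketch of the vector-representation step. The slides do not specify the representation, so a simple TF-IDF embedding (scikit-learn) is assumed purely for illustration:

    from sklearn.feature_extraction.text import TfidfVectorizer

    def vectorize_text_blocks(blocks: list[str]):
        # Fit a TF-IDF model over the text blocks of an exercise and
        # return the fitted vectorizer plus one vector (row) per block.
        vectorizer = TfidfVectorizer()
        vectors = vectorizer.fit_transform(blocks)
        return vectorizer, vectors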


  15. Event Flow of Assess Automatically Use Case

    Assumption:
    A single Feedback item can be valid for text blocks from multiple submissions.
    Step 3: Find the Similarity Cluster for each Text Block (sketched below).
    [Class diagram: TextExercise (problemStatement, sampleSolution, participate()), Submission (solution, submit()), TextBlock (phrase), VectorRepresentation, SimilarityCluster, AssessmentModel, Student.]
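    A minimal sketch of the clustering step, assuming agglomerative clustering over cosine distances between the block vectors; the clustering algorithm and the distance threshold are assumptions, not details from the paper:

    from sklearn.cluster import AgglomerativeClustering
    from sklearn.metrics.pairwise import cosine_distances

    def find_similarity_clusters(vectors, distance_threshold: float = 0.4):
        # Blocks whose vectors are closer than the threshold end up in the
        # same similarity cluster; returns one cluster label per text block.
        distances = cosine_distances(vectors)
        clustering = AgglomerativeClustering(
            n_clusters=None,
            metric="precomputed",
            linkage="average",
            distance_threshold=distance_threshold,
        )
        return clustering.fit_predict(distances)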


  16. Event Flow of Assess Automatically Use Case
    Assumption:
    A single Feedback item can be valid for text blocks from multiple submissions.
    Step 4: Find existing Feedback in the Similarity Cluster and reuse it as Automatic Feedback (an end-to-end sketch follows below).
    [Class diagram: TextExercise (problemStatement, sampleSolution, participate()), Submission (solution, submit()), TextBlock (phrase), Feedback (score, comment) with subclasses Manual Feedback (provide(), given by the Instructor) and Automatic Feedback (confidence), Assessment (score), VectorRepresentation, SimilarityCluster, AssessmentModel, Student; a TextBlock is associated with at most one Feedback (0..1).]
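    Putting the four steps together, a hedged end-to-end sketch of the Assess Automatically event flow. For brevity, the similarity-cluster lookup is approximated by a nearest-neighbour search over previously assessed blocks; the representation, metric, and threshold are all assumptions:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.neighbors import NearestNeighbors

    def assess_automatically(new_blocks, assessed_blocks, assessed_feedback, max_distance=0.4):
        # assessed_blocks / assessed_feedback: previously (manually) assessed text blocks
        # and their Feedback objects. Returns one Feedback or None per new block;
        # None means the block has to be routed to manual assessment.
        vectorizer = TfidfVectorizer().fit(assessed_blocks + new_blocks)
        index = NearestNeighbors(n_neighbors=1, metric="cosine")
        index.fit(vectorizer.transform(assessed_blocks))
        distances, neighbours = index.kneighbors(vectorizer.transform(new_blocks))
        return [
            assessed_feedback[n] if d <= max_distance else None
            for d, n in zip(distances[:, 0], neighbours[:, 0])
        ]

    The distance to the reused feedback could also serve as a basis for the confidence attribute of Automatic Feedback in the class diagram, though the slides do not specify how confidence is computed.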


  17. Evaluation Plan: Compute the assessment accuracy
    Stage 1: Compare automatic assessments with manual assessments in a simulation.
    Hypothesis 1: Our automatic assessment mechanism produces results with an accuracy greater than 85% compared with manual assessments.
    Stage 2: Evaluate the automatic assessment workflow in the EIST 2019 lecture.
    Hypothesis 2: Employing automatic assessment can save more than 50% of the assessment effort compared with manual assessments.
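    One plausible reading of the Stage 1 accuracy measure, as a minimal sketch; the exact definition is not given on the slide, so this assumes accuracy means the fraction of text blocks for which the automatic score equals the manual score:

    def assessment_accuracy(automatic_scores: list[float], manual_scores: list[float]) -> float:
        # Fraction of text blocks where the automatic score matches the manual one.
        assert len(automatic_scores) == len(manual_scores) and manual_scores
        matches = sum(a == m for a, m in zip(automatic_scores, manual_scores))
        return matches / len(manual_scores)

    # Hypothesis 1 holds if assessment_accuracy(...) > 0.85 on the simulated data.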


  18. Conclusion & Status
    ■ Idea developed for face-to-face lectures; can be generalized to MOOCs.
    ■ Mechanism to reduce the manual grading overhead.
    ■ Aims at increasing the amount and quality of feedback.
    ■ Text-block-based assessment; similarity-based reuse of feedback.
    ■ Future work:
    □ Optimize the manual assessment order
    □ Advanced methods for deriving text blocks from submissions
    ■ Status: Collection of 200 data points.
    ■ Next step: Simulation


  19. Chair for Applied Software Engineering
    Department of Informatics
    Technical University of Munich
    Toward the Automatic Assessment of Text Exercises
    Jan Philip Bernius and Bernd Bruegge
    2nd Workshop on Innovative Software Engineering Education
    19.02.2019 in Stuttgart, Germany
