Toward the Automatic Assessment of Text Exercises

Presentation of my paper "Toward the Automatic Assessment of Text Exercises" at the 2nd Workshop on Innovative Software Engineering Education in Stuttgart, Germany

Paper: http://ceur-ws.org/Vol-2308/isee2019paper04.pdf

Abstract:
Exercises are an essential part of learning. Manual assessment of exercises requires effort from instructors and can lead to quality problems and inconsistencies between assessments. Especially with growing student populations, this also leads to delayed grading, and it becomes increasingly difficult to provide individual feedback.

The goal is to provide timely responses to homework submissions in large classes. By reducing the effort required for assessments, instructors can invest more time in supporting students and providing individual feedback.

This paper argues that automated assessment provides more individual feedback for students, combined with quicker feedback and grading cycles. We introduce a concept for automatic assessment of text exercises using machine learning techniques. Also, we describe our plans to use this concept in a case study with 1900 students.

Jan Philip Bernius, M.Sc.

February 19, 2019

Transcript

  1. Chair for Applied Software Engineering, Department of Informatics, Technical University of Munich

    Toward the Automatic Assessment of Text Exercises. Jan Philip Bernius and Bernd Bruegge. 2nd Workshop on Innovative Software Engineering Education, 19.02.2019 in Stuttgart, Germany.
  2. Student population grew by factor five

    ▪ Participation in the lecture "Introduction to Software Engineering" (EIST) has risen sharply.
    ▪ Focus on face-to-face teaching with weekly homework exercises.
    ▪ Summer semester 2019: 1950 students enrolled.

    [Bar chart: participants per year. Values shown: 363, 496, 879, 747, 785, 1043, 1142, 1431, 1625, 1950. Axes: 0 to 2500 participants, years 2010 to 2021.]

    Jan Philip Bernius | ISEE2019 | Toward the Automatic Assessment of Text Exercises
  3. Traditional Homework Workflow

    [Workflow diagram with the roles Student and Instructor. Steps shown: Gets Homework, Solves Homework, Submits Solution, Assesses Solution manually (Grading & Feedback), Receives Assessment.]
  4. Grading Nightmare

    ▪ In addition to traditional homework, EIST also offers up to four interactive in-class exercises per lecture [3].
    ▪ Expected grading load for summer 2019 with 1950 students in 13 lectures:
      □ each with 4 interactive exercises → 4 × 13 × 1950 ≈ 100,000
      □ 13 homework exercises → 3 × 13 × 1950 ≈ 75,000
      ▷ EIST instructors need to assess ≈ 175,000 exercises.
    ▪ 30 tutors each need to grade about 450 exercises every week. What are the effects on grading quality?
    ▪ Goal: Automate the grading.

    [3] S. Krusche, A. Seitz, J. Börstler, and B. Bruegge, “Interactive Learning: Increasing Student Participation through Shorter Exercise Cycles,” in 19th Australasian Computing Education Conf. ACM, 2017, pp. 17–26.
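
The grading-load estimate on this slide can be checked with a few lines of arithmetic. This is a minimal sketch using only the figures from the slide; the variable names are illustrative, and spreading the load over 13 weeks assumes one lecture per week, which the slide does not state explicitly.

```python
# Figures from the slide; names are illustrative.
students = 1950
lectures = 13
interactive_per_lecture = 4
homework_factor = 3   # the factor in the slide's "3 x 13 x 1950"
tutors = 30

interactive_total = interactive_per_lecture * lectures * students
homework_total = homework_factor * lectures * students
total = interactive_total + homework_total

# Assumption: one lecture per week, so the load spreads over 13 weeks.
per_tutor_per_week = total / lectures / tutors

print(interactive_total)           # 101400, rounded to 100,000 on the slide
print(homework_total)              # 76050, rounded to 75,000 on the slide
print(total)                       # 177450, rounded to 175,000 on the slide
print(round(per_tutor_per_week))   # 455, about 450 on the slide
```

The unrounded totals come out slightly above the slide's figures, which is why the per-tutor load is closer to 455 than 450 exercises per week.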
  5. Assessment of Interactive Exercises

    Allow students to refine their solution in the spirit of interactive learning [1].

    [Activity diagram, shown in two builds. Swimlanes: Student, ArTEMiS, and (second build only) Instructor. Activities: Submit solution, Assess automatically, Assess manually, Train Assessment Model, Calculate Total Score, Review assessment, Refine solution. Decisions: Automatic assessment possible? (yes/no), Satisfied? (yes/no). Objects: Submission, Assessment, Assessment Model, Automatic Feedback, Manual Feedback. One relationship is labeled «affects».]

    [1] S. Krusche and A. Seitz, “Increasing the Interactivity in Software Engineering MOOCs - A Case Study,” in 31st Conference on Software Engineering Education and Training, 2019.
  6. Toward the Automatic Assessment of Interactive Exercises

    ArTEMiS: An Automatic Assessment Management System for Interactive Learning [7]. (Semi-)automatic assessment of modeling and programming exercises.

    [Activity diagram of the assessment workflow with the swimlanes Student, ArTEMiS, and Instructor.]

    [7] S. Krusche and A. Seitz, “ArTEMiS - An Automatic Assessment Management System for Interactive Learning,” in 49th Technical Symposium on Computer Science Education. ACM, 2018.
  7. Grading Nightmare (repeats slide 4)
  8. Toward the Automatic Assessment of Text Exercises?

    ArTEMiS: An Automatic Assessment Management System for Interactive Learning [7].

    [Activity diagram of the assessment workflow with the swimlanes Student, ArTEMiS, and Instructor, shown twice.]

    [7] S. Krusche and A. Seitz, “ArTEMiS - An Automatic Assessment Management System for Interactive Learning,” in 49th Technical Symposium on Computer Science Education. ACM, 2018.
  9. Automatic Assessment of Text Exercises

    Research Question: Can assessments of textual exercises be automated?
    Hypothesis: Assessments can be automated using a machine-learning-based approach.
    Core Idea:
  10. Manual Assessment

    ▪ A submission consists of multiple text blocks.
    ▪ Instructors assess textual submissions using multiple text blocks.
    ▪ Instructors can define text blocks themselves.
    ▪ Focus on the lower spectrum of the revised Bloom's Taxonomy.

    Exercise: Strategy Pattern vs. Bridge Pattern
    Problem Statement: Explain the difference between the bridge pattern and the strategy pattern.
    Score: 2 / 6
    Reviewer: Jan Philip Bernius
    Student Submission: The bridge pattern is meant to decouple an abstraction from its implementation. Both patterns are structural patterns. The strategy pattern is a structural pattern and allows providing multiple algorithms at compile time.
    Assessments: [Assess button]

    [14] D. Krathwohl, “A revision of Bloom’s taxonomy: An overview,” Theory into Practice, vol. 41, no. 4, pp. 212–218, 2002.
  11. Manual Assessment

    ▪ A submission consists of multiple text blocks.
    ▪ Instructors assess textual submissions using multiple text blocks.
    ▪ Instructors can define text blocks themselves.
    ▪ Focus on the lower spectrum of the revised Bloom's Taxonomy.

    Exercise: Strategy Pattern vs. Bridge Pattern
    Problem Statement: Explain the difference between the bridge pattern and the strategy pattern.
    Score: 2 / 6
    Reviewer: Jan Philip Bernius
    Student Submission: The bridge pattern is meant to decouple an abstraction from its implementation. Both patterns are structural patterns. The strategy pattern is a structural pattern and allows providing multiple algorithms at compile time.
    Assessments:
      Score for "The bridge pattern is meant to decouple an abstraction from its implementation."
        Score: 2
        Feedback: Correct
      Score for "The strategy pattern is a structural pattern and allows providing multiple algorithms at compile time."
        Score: 0
        Feedback: The strategy pattern is a behavioral pattern. It is used to select an algorithm at runtime.

    [14] D. Krathwohl, “A revision of Bloom’s taxonomy: An overview,” Theory into Practice, vol. 41, no. 4, pp. 212–218, 2002.
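
The text-block assessment on this slide can be sketched as a small data structure. This is only an illustration under assumptions: the `Feedback` class and `total_score` function are hypothetical names, and summing the per-block scores is an assumed reading of how the slide's "Score: 2 / 6" comes about.

```python
from dataclasses import dataclass


@dataclass
class Feedback:
    """One assessed text block (hypothetical structure, not the ArTEMiS model)."""
    text_block: str
    score: int
    comment: str


def total_score(feedback_items):
    """Assumed aggregation: the total is the sum of the per-block scores."""
    return sum(item.score for item in feedback_items)


feedback_items = [
    Feedback(
        "The bridge pattern is meant to decouple an abstraction from its implementation.",
        2,
        "Correct",
    ),
    Feedback(
        "The strategy pattern is a structural pattern and allows providing "
        "multiple algorithms at compile time.",
        0,
        "The strategy pattern is a behavioral pattern. "
        "It is used to select an algorithm at runtime.",
    ),
]

max_score = 6  # maximum from the slide
print(f"Score: {total_score(feedback_items)} / {max_score}")  # Score: 2 / 6
```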
  12. Assessment Workflow

    [Activity diagram of the assessment workflow with the swimlanes Student, ArTEMiS, and Instructor. Activities: Submit solution, Assess automatically, Assess manually, Train Assessment Model, Calculate Total Score, Review assessment, Refine solution. Decisions: Automatic assessment possible? (yes/no), Satisfied? (yes/no).]
  13. Event Flow of Assess Automatically Use Case

    Assumption: A single Feedback item can be valid for text blocks from multiple submissions.

    Event flow: Split Submission into Text Blocks → Text Blocks.

    [Class diagram. Classes: Student, Text Exercise, Submission, TextBlock. Members shown: problemStatement, sampleSolution, participate(), solution, submit(), phrase.]
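
The "Split Submission into Text Blocks" step could look like the following. The slides do not say how blocks are derived, so this sketch assumes the simplest option, one block per sentence; the function name `split_into_text_blocks` is hypothetical.

```python
import re


def split_into_text_blocks(solution: str) -> list[str]:
    """Split a submission into text blocks.

    Assumption: one block per sentence, split after '.', '!' or '?'
    followed by whitespace. The slides do not specify the actual method.
    """
    parts = re.split(r"(?<=[.!?])\s+", solution.strip())
    return [part for part in parts if part]


submission = (
    "The bridge pattern is meant to decouple an abstraction from its implementation. "
    "Both patterns are structural patterns. "
    "The strategy pattern is a structural pattern and allows providing "
    "multiple algorithms at compile time."
)

blocks = split_into_text_blocks(submission)
for block in blocks:
    print(block)
```

A sentence splitter like this breaks on abbreviations such as "e.g.", which is one reason a more advanced derivation method would be needed in practice.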
  14. Event Flow of Assess Automatically Use Case

    Assumption: A single Feedback item can be valid for text blocks from multiple submissions.

    Event flow: Split Submission into Text Blocks → Text Blocks → Calculate Vector Representation → Vector Representation.

    [Class diagram. Classes: Student, Text Exercise, Submission, TextBlock, VectorRepresentation. Members shown: problemStatement, sampleSolution, participate(), solution, submit(), phrase.]
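
The "Calculate Vector Representation" step maps each text block to a vector. The slides do not name the technique, so the sketch below substitutes a plain bag-of-words count as a stand-in; `vector_representation` is a hypothetical name, and a real system might use learned embeddings instead.

```python
import re
from collections import Counter


def vector_representation(phrase: str) -> Counter:
    """Bag-of-words vector: lowercase word counts.

    Assumption: this is a stand-in; the slides do not specify
    which vector representation is used.
    """
    tokens = re.findall(r"[a-z]+", phrase.lower())
    return Counter(tokens)


vec = vector_representation("Both patterns are structural patterns.")
print(vec["patterns"])    # 2
print(vec["structural"])  # 1
print(vec["bridge"])      # 0 (Counter returns 0 for unseen words)
```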
  15. Event Flow of Assess Automatically Use Case

    Assumption: A single Feedback item can be valid for text blocks from multiple submissions.

    Event flow: Split Submission into Text Blocks → Text Blocks → Calculate Vector Representation → Vector Representation → Find Similarity Cluster of Text Blocks → Similarity Cluster.

    [Class diagram. Classes: Student, Text Exercise, Submission, TextBlock, VectorRepresentation, SimilarityCluster, AssessmentModel. Members shown: problemStatement, sampleSolution, participate(), solution, submit(), phrase. Several associations carry the multiplicity ✱.]
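
The "Find Similarity Cluster of Text Blocks" step groups text blocks whose vectors are close. The slides do not specify the clustering method, so this sketch assumes bag-of-words vectors, cosine similarity, and a greedy threshold rule; all function names and the threshold of 0.6 are illustrative choices.

```python
import math
import re
from collections import Counter


def vectorize(phrase):
    """Bag-of-words stand-in for the vector representation."""
    return Counter(re.findall(r"[a-z]+", phrase.lower()))


def cosine_similarity(a, b):
    """Cosine similarity of two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_similarity_clusters(phrases, threshold=0.6):
    """Greedy clustering: join the first cluster whose first member
    is similar enough, otherwise open a new cluster.
    Returns clusters as lists of indices into `phrases`."""
    vectors = [vectorize(p) for p in phrases]
    clusters = []
    for i, vec in enumerate(vectors):
        for cluster in clusters:
            if cosine_similarity(vec, vectors[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters


phrases = [
    "The strategy pattern is a behavioral pattern.",
    "Strategy is a behavioral pattern.",
    "The bridge pattern decouples an abstraction from its implementation.",
]
clusters = find_similarity_clusters(phrases)
print(clusters)  # [[0, 1], [2]]
```

The two statements about the strategy pattern land in one cluster, while the bridge statement opens its own.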
  16. Event Flow of Assess Automatically Use Case

    Assumption: A single Feedback item can be valid for text blocks from multiple submissions.

    Event flow: Split Submission into Text Blocks → Text Blocks → Calculate Vector Representation → Vector Representation → Find Similarity Cluster of Text Blocks → Similarity Cluster → Find existing Feedback in Similarity Cluster → Feedback.

    [Class diagram. Classes: Student, Instructor, Text Exercise, Submission, TextBlock, VectorRepresentation, SimilarityCluster, AssessmentModel, Assessment, Feedback, Manual Feedback, Automatic Feedback. Members shown: problemStatement, sampleSolution, participate(), solution, submit(), phrase, score, comment, provide(), confidence. Two associations carry the multiplicity 0..1.]
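
The last step, "Find existing Feedback in Similarity Cluster", reuses feedback an instructor already gave to a similar text block. The sketch below borrows the class and attribute names from the diagram (`Feedback`, `AutomaticFeedback`, `score`, `comment`, `confidence`, `phrase`), but the lookup logic is an assumption: it picks the most similar assessed block in the cluster and uses the word-set overlap (Jaccard similarity) as the confidence. The slides do not define how confidence is computed.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Feedback:
    score: int
    comment: str


@dataclass
class AutomaticFeedback(Feedback):
    confidence: float


@dataclass
class TextBlock:
    phrase: str
    feedback: Optional[Feedback] = None


def jaccard(a: str, b: str) -> float:
    """Word-set overlap; an assumed stand-in for the similarity measure."""
    set_a = set(re.findall(r"[a-z]+", a.lower()))
    set_b = set(re.findall(r"[a-z]+", b.lower()))
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def find_existing_feedback(block, cluster):
    """Reuse feedback from the most similar assessed block in the cluster.
    Returns None when no block in the cluster has feedback yet."""
    best, best_similarity = None, -1.0
    for other in cluster:
        if other is block or other.feedback is None:
            continue
        similarity = jaccard(block.phrase, other.phrase)
        if similarity > best_similarity:
            best, best_similarity = other, similarity
    if best is None:
        return None
    return AutomaticFeedback(best.feedback.score, best.feedback.comment, best_similarity)


assessed = TextBlock("The strategy pattern is a behavioral pattern.", Feedback(2, "Correct"))
new_block = TextBlock("Strategy is a behavioral pattern.")
result = find_existing_feedback(new_block, [assessed, new_block])
print(result.score, result.comment, round(result.confidence, 2))  # 2 Correct 0.83
```

When the function returns None, the workflow would fall back to manual assessment by an instructor.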
  17. Evaluation Plan: Compute the assessment accuracy

    Stage 1: Compare automatic assessment with manual assessments in a simulation.
    Hypothesis 1: Our automatic assessment mechanism produces results with an accuracy greater than 85% compared with manual assessments.
    Stage 2: Evaluate the automatic assessment workflow in an EIST 2019 lecture.
    Hypothesis 2: Employing automatic assessment can save more than 50% compared with manual assessments.
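
One way to read "assessment accuracy" in Stage 1 is the share of text blocks for which the automatic score matches the manual score. That definition is an assumption, since the slide does not define the metric, and the scores below are made-up placeholders to show the calculation, not results.

```python
def assessment_accuracy(manual_scores, automatic_scores):
    """Fraction of text blocks where the automatic score equals the manual one.

    Assumption: exact score agreement per text block; the slide
    does not define how accuracy is measured.
    """
    if len(manual_scores) != len(automatic_scores):
        raise ValueError("score lists must have the same length")
    if not manual_scores:
        raise ValueError("at least one assessment is required")
    matches = sum(m == a for m, a in zip(manual_scores, automatic_scores))
    return matches / len(manual_scores)


# Placeholder scores for illustration only.
manual = [2, 0, 1, 2, 0]
automatic = [2, 0, 1, 1, 0]

accuracy = assessment_accuracy(manual, automatic)
print(accuracy)         # 0.8
print(accuracy > 0.85)  # False: this toy example would not meet Hypothesis 1
```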
  18. Conclusion & Status

    ▪ Idea for face-to-face lectures; can be generalized to MOOCs.
    ▪ Mechanism to reduce manual grading overhead.
    ▪ Aim at increasing the amount and quality of feedback.
    ▪ Text-block-based assessment; similarity-based reuse of feedback.
    ▪ Future work:
      □ Optimize manual assessment order
      □ Advanced method for deriving text blocks from a submission
    ▪ Status: Collection of 200 data points.
    ▪ Next step: Simulation
  19. Toward the Automatic Assessment of Text Exercises

    Jan Philip Bernius and Bernd Bruegge. 2nd Workshop on Innovative Software Engineering Education, 19.02.2019 in Stuttgart, Germany.