
Toward the Automatic Assessment of Text Exercises


Presentation of my paper "Toward the Automatic Assessment of Text Exercises" at the 2nd Workshop on Innovative Software Engineering Education in Stuttgart, Germany

Paper: http://ceur-ws.org/Vol-2308/isee2019paper04.pdf

Abstract:
Exercises are an essential part of learning. Manual assessment of exercises requires effort from instructors and can lead to quality problems and inconsistencies between assessments. Especially with growing student populations, it also leads to delayed grading and makes it increasingly difficult to provide individual feedback.

The goal is to provide timely responses to homework submissions in large classes. By reducing the effort required for assessment, instructors can invest more time in supporting students and providing individual feedback.

This paper argues that automated assessment provides students with more individual feedback and quicker feedback and grading cycles. We introduce a concept for the automatic assessment of text exercises using machine learning techniques. We also describe our plans to apply this concept in a case study with 1,900 students.

Jan Philip Bernius, M.Sc.

February 19, 2019

Transcript

  1. Chair for Applied Software Engineering
    Department of Informatics
    Technical University of Munich
    Toward the Automatic Assessment of Text Exercises
    Jan Philip Bernius and Bernd Bruegge
    2nd Workshop on Innovative Software Engineering Education
    19.02.2019 in Stuttgart, Germany


  2. Student population grew by a factor of five
    [Bar chart: participants in the lecture per year. 2010: 363, 2011: 496, 2012: 879, 2013: 747, 2014: 785, 2015: 1,043, 2016: 1,142, 2017: 1,431, 2018: 1,625, 2019: 1,950.]
    ■ The number of participants in the lecture "Introduction to Software Engineering" (EIST) has risen sharply.
    ■ Focus on face-to-face teaching with weekly homework exercises.
    ■ Summer semester 2019: 1,950 students enrolled.

  3. Traditional Homework Workflow
    [Activity diagram: the student gets the homework, solves it, and submits a solution; the instructor assesses the solution manually; the student receives the assessment (grading & feedback).]

  4. Grading Nightmare
    ■ In addition to traditional homework, EIST also offers up to four interactive in-class exercises per lecture [3].
    ■ Expected grading load for summer 2019 with 1,950 students in 13 lectures:
    □ 4 interactive exercises per lecture: 4 × 13 × 1,950 ≈ 100,000
    □ 13 homework exercises: 3 × 13 × 1,950 ≈ 75,000
    ▷ EIST instructors need to assess ≈ 175,000 exercises
    30 tutors need to grade 450 exercises every week. What are the effects on grading quality?
    ■ Goal: Automate the grading (a back-of-the-envelope check of these numbers follows below).
    [3] S. Krusche, A. Seitz, J. Börstler, and B. Bruegge, “Interactive Learning: Increasing Student Participation through Shorter Exercise Cycles,”
    in 19th Australasian Computing Education Conf. ACM, 2017, pp. 17–26.
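    A quick sanity check of the grading-load figures above, as a minimal Python sketch. The per-exercise factors 4 and 3 come from the slide; the calculation assumes one lecture per week over the 13 lectures:

    students = 1950
    lectures = 13                            # one lecture per week in the summer term

    interactive = 4 * lectures * students    # up to 4 in-class exercises per lecture -> 101,400
    homework = 3 * lectures * students       # factor used on the slide for homework  -> 76,050
    total = interactive + homework           # roughly 175,000 assessments

    tutors = 30
    per_tutor_per_week = total / tutors / lectures   # about 455, i.e. roughly 450 per tutor and week

    print(interactive, homework, total, round(per_tutor_per_week))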


  5. Assessment of Interactive Exercises
    Allow students to refine their solution in the spirit of interactive learning [1].
    [Activity diagram (swimlanes: Student, ArTEMiS): the student submits a solution; if automatic assessment is possible, ArTEMiS assesses it automatically and produces Automatic Feedback; submissions and assessments train the Assessment Model («affects»); ArTEMiS calculates the total score; the student reviews the assessment and, if not satisfied, refines the solution and resubmits.]
    [A second version of the diagram adds an Instructor swimlane: if automatic assessment is not possible, the instructor assesses the submission manually and provides Manual Feedback.]
    [1] S. Krusche and A. Seitz, “Increasing the Interactivity in Software Engineering MOOCs - A Case Study,” in 31th Conference on Software Engineering
    Education and Training, 2019.


  6. Toward the Automatic Assessment of Interactive Exercises
    ArTEMiS: An Automatic Assessment Management System for Interactive Learning [7]
    [Assessment workflow activity diagram (swimlanes: Student, ArTEMiS, Instructor): submit solution → automatic assessment possible? If yes, ArTEMiS assesses automatically (Automatic Feedback); if no, the instructor assesses manually (Manual Feedback); assessments train the Assessment Model («affects»); ArTEMiS calculates the total score; the student reviews the assessment and refines the solution until satisfied.]
    [7] S. Krusche and A. Seitz, “ArTEMiS - An Automatic Assessment Management System for Interactive Learning,” in 49th Technical Symposium on
    Computer Science Education. ACM, 2018.
    (Semi-)automatic assessment of modeling and programming exercises


  7. Grading Nightmare
    ■ In addition to traditional homework, EIST also offers up to four interactive in-class exercises per lecture [3].
    ■ Expected grading load for summer 2019 with 1,950 students in 13 lectures:
    □ 4 interactive exercises per lecture: 4 × 13 × 1,950 ≈ 100,000
    □ 13 homework exercises: 3 × 13 × 1,950 ≈ 75,000
    ▷ EIST instructors need to assess ≈ 175,000 exercises
    30 tutors need to grade 450 exercises every week. What are the effects on grading quality?
    ■ Goal: Automate the grading
    [3] S. Krusche, A. Seitz, J. Börstler, and B. Bruegge, “Interactive Learning: Increasing Student Participation through Shorter Exercise Cycles,”
    in 19th Australasian Computing Education Conf. ACM, 2017, pp. 17–26.


  8. Toward the Automatic Assessment of Text Exercises
    ArTEMiS: An Automatic Assessment Management System for Interactive Learning [7]
    [The assessment workflow activity diagram (Student, ArTEMiS, Instructor) is shown twice: once for the existing (semi-)automatic assessment of modeling and programming exercises, and once marked with a "?", asking whether the same workflow can be applied to text exercises.]
    [7] S. Krusche and A. Seitz, "ArTEMiS - An Automatic Assessment Management System for Interactive Learning," in 49th Technical Symposium on Computer Science Education. ACM, 2018.


  9. Automatic Assessment of Text Exercises
    Research Question:
    Can assessments of textual exercises be automated?
    Hypothesis:
    Assessments can be automated using a machine-learning-based approach.
    Core Idea:

  10. Manual Assessment
    ■ A Submission consists of multiple Text Blocks.
    ■ Instructors assess textual Submissions using multiple Text Blocks.
    ■ Instructors can define Text Blocks themselves.
    ■ Focus on the lower spectrum of the revised Bloom's Taxonomy [14].
    [Example screenshot of a manual assessment. Exercise: Strategy Pattern vs. Bridge Pattern. Problem statement: Explain the difference between the bridge pattern and the strategy pattern. Student submission, split into three text blocks: "The bridge pattern is meant to decouple an abstraction from its implementation." / "Both patterns are structural patterns." / "The strategy pattern is a structural pattern and allows providing multiple algorithms at compile time." Score: 2 / 6. Reviewer: Jan Philip Bernius.]
    [14] D. Krathwohl, “A revision of bloom’s taxonomy: An overview,” Theory into Practice, vol. 41, no. 4, pp. 212–218, 2002.


  11. Manual Assessment
    ■ A Submission consists of multiple Text Blocks.
    ■ Instructors assess textual Submissions using multiple Text Blocks (a minimal data-structure sketch follows below).
    ■ Instructors can define Text Blocks themselves.
    ■ Focus on the lower spectrum of the revised Bloom's Taxonomy [14].
    [Example screenshot of a manual assessment with per-block feedback. Exercise: Strategy Pattern vs. Bridge Pattern. Problem statement: Explain the difference between the bridge pattern and the strategy pattern. Text block "The bridge pattern is meant to decouple an abstraction from its implementation.": score 2, feedback "Correct". Text block "The strategy pattern is a structural pattern and allows providing multiple algorithms at compile time.": score 0, feedback "The strategy pattern is a behavioral pattern. It is used to select an algorithm at runtime." Total score: 2 / 6. Reviewer: Jan Philip Bernius.]
    [14] D. Krathwohl, “A revision of bloom’s taxonomy: An overview,” Theory into Practice, vol. 41, no. 4, pp. 212–218, 2002.
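    A minimal sketch of the text-block-based assessment structure shown on the last two slides, written as Python dataclasses. The class and field names (TextBlock, Feedback, Submission, total_score) are illustrative assumptions, not the actual ArTEMiS data model:

    from dataclasses import dataclass, field
    from typing import List, Optional

    @dataclass
    class TextBlock:
        text: str                      # one phrase/sentence of the student's answer

    @dataclass
    class Feedback:
        score: float                   # points awarded for one text block
        comment: str                   # reviewer comment, e.g. "Correct"

    @dataclass
    class Submission:
        blocks: List[TextBlock]
        feedback: List[Optional[Feedback]] = field(default_factory=list)

        def total_score(self) -> float:
            # The submission score is the sum of the per-block scores;
            # blocks without feedback contribute nothing.
            return sum(f.score for f in self.feedback if f is not None)

    In the example above, the two assessed blocks with scores 2 and 0 would yield a total score of 2.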


  12. Assessment Workflow
    [Activity diagram (swimlanes: Student, ArTEMiS, Instructor): the student submits a solution; ArTEMiS checks whether automatic assessment is possible. If yes, it assesses automatically (Automatic Feedback); if no, the instructor assesses manually (Manual Feedback). Assessments train the Assessment Model («affects»), ArTEMiS calculates the total score, and the student reviews the assessment, refining the solution and resubmitting until satisfied.]


  13. Event Flow of Assess Automatically Use Case
    Assumption:
    A single Feedback item can be valid for text blocks from multiple submissions.
    Step 1: Split the Submission into Text Blocks (a minimal sketch follows below).
    [Class diagram: TextExercise (problemStatement, sampleSolution, participate()), Submission (solution, submit()), TextBlock (phrase), Student.]
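    A minimal sketch of the splitting step, assuming a naive sentence-level segmentation; how text blocks are actually derived from a submission is left open here (see also "future work" on slide 18):

    import re

    def split_into_text_blocks(solution: str) -> list[str]:
        # Naive segmentation: split on sentence-ending punctuation followed by whitespace.
        blocks = re.split(r"(?<=[.!?])\s+", solution.strip())
        return [block for block in blocks if block]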


  14. Event Flow of Assess Automatically Use Case
    Assumption:
    A single Feedback item can be valid for text blocks from multiple submissions.
    Step 2: Calculate a Vector Representation for each Text Block (sketched below).
    [Class diagram: TextExercise (problemStatement, sampleSolution, participate()), Submission (solution, submit()), TextBlock (phrase), VectorRepresentation, Student.]
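    A minimal sketch of the vector-representation step. The slides do not specify the representation, so a simple TF-IDF embedding (scikit-learn) is assumed purely for illustration:

    from sklearn.feature_extraction.text import TfidfVectorizer

    def vectorize_text_blocks(blocks: list[str]):
        # Fit a TF-IDF model over the text blocks of an exercise and
        # return the fitted vectorizer plus one vector (row) per block.
        vectorizer = TfidfVectorizer()
        vectors = vectorizer.fit_transform(blocks)
        return vectorizer, vectors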


  15. Event Flow of Assess Automatically Use Case

    Assumption:
    A single Feedback item can be valid for text blocks from multiple submissions.
    Step 3: Find the Similarity Cluster for each Text Block (sketched below).
    [Class diagram: TextExercise (problemStatement, sampleSolution, participate()), Submission (solution, submit()), TextBlock (phrase), VectorRepresentation, SimilarityCluster, AssessmentModel, Student.]
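    A minimal sketch of the clustering step, assuming agglomerative clustering over cosine distances between the block vectors; the clustering algorithm and the distance threshold are assumptions, not details from the paper:

    from sklearn.cluster import AgglomerativeClustering
    from sklearn.metrics.pairwise import cosine_distances

    def find_similarity_clusters(vectors, distance_threshold: float = 0.4):
        # Blocks whose vectors are closer than the threshold end up in the
        # same similarity cluster; returns one cluster label per text block.
        distances = cosine_distances(vectors)
        clustering = AgglomerativeClustering(
            n_clusters=None,
            metric="precomputed",
            linkage="average",
            distance_threshold=distance_threshold,
        )
        return clustering.fit_predict(distances)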


  16. Event Flow of Assess Automatically Use Case
    Assumption:
    A single Feedback item can be valid for text blocks from multiple submissions.
    Step 4: Find existing Feedback in the Similarity Cluster and reuse it as Automatic Feedback (an end-to-end sketch follows below).
    [Class diagram: TextExercise (problemStatement, sampleSolution, participate()), Submission (solution, submit()), TextBlock (phrase), Feedback (score, comment) with subclasses Manual Feedback (provide(), given by the Instructor) and Automatic Feedback (confidence), Assessment (score), VectorRepresentation, SimilarityCluster, AssessmentModel, Student; a TextBlock is associated with at most one Feedback (0..1).]
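    Putting the four steps together, a hedged end-to-end sketch of the Assess Automatically event flow. For brevity, the similarity-cluster lookup is approximated by a nearest-neighbour search over previously assessed blocks; the representation, metric, and threshold are all assumptions:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.neighbors import NearestNeighbors

    def assess_automatically(new_blocks, assessed_blocks, assessed_feedback, max_distance=0.4):
        # assessed_blocks / assessed_feedback: previously (manually) assessed text blocks
        # and their Feedback objects. Returns one Feedback or None per new block;
        # None means the block has to be routed to manual assessment.
        vectorizer = TfidfVectorizer().fit(assessed_blocks + new_blocks)
        index = NearestNeighbors(n_neighbors=1, metric="cosine")
        index.fit(vectorizer.transform(assessed_blocks))
        distances, neighbours = index.kneighbors(vectorizer.transform(new_blocks))
        return [
            assessed_feedback[n] if d <= max_distance else None
            for d, n in zip(distances[:, 0], neighbours[:, 0])
        ]

    The distance to the reused feedback could also serve as a basis for the confidence attribute of Automatic Feedback in the class diagram, though the slides do not specify how confidence is computed.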


  17. Evaluation Plan: Compute the assessment accuracy
    Stage 1: Compare automatic assessments with manual assessments in a simulation.
    Hypothesis 1: Our automatic assessment mechanism produces results with an accuracy greater than 85% compared with manual assessments.
    Stage 2: Evaluate the automatic assessment workflow in the EIST 2019 lecture.
    Hypothesis 2: Employing automatic assessment can save more than 50% of the assessment effort compared with manual assessments.
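    One plausible reading of the Stage 1 accuracy measure, as a minimal sketch; the exact definition is not given on the slide, so this assumes accuracy means the fraction of text blocks for which the automatic score equals the manual score:

    def assessment_accuracy(automatic_scores: list[float], manual_scores: list[float]) -> float:
        # Fraction of text blocks where the automatic score matches the manual one.
        assert len(automatic_scores) == len(manual_scores) and manual_scores
        matches = sum(a == m for a, m in zip(automatic_scores, manual_scores))
        return matches / len(manual_scores)

    # Hypothesis 1 holds if assessment_accuracy(...) > 0.85 on the simulated data.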


  18. Conclusion & Status
    ■ Idea developed for face-to-face lectures; can be generalized to MOOCs.
    ■ Mechanism to reduce the manual grading overhead.
    ■ Aims at increasing the amount and quality of feedback.
    ■ Text-block-based assessment; similarity-based reuse of feedback.
    ■ Future work:
    □ Optimize the manual assessment order
    □ Advanced methods for deriving text blocks from submissions
    ■ Status: Collection of 200 data points.
    ■ Next step: Simulation


  19. Chair for Applied Software Engineering
    Department of Informatics
    Technical University of Munich
    Toward the Automatic Assessment of Text Exercises
    Jan Philip Bernius and Bernd Bruegge
    2nd Workshop on Innovative Software Engineering Education
    19.02.2019 in Stuttgart, Germany
