Katie Bell - The computer science of marking computer science assignments

When writing systems to test whether beginner programmers' code was correct, I didn't expect to need numpy, scipy, a custom C module and a whole lot of cool geometry algorithms. Giving actionable feedback on tasks (in this case logo/turtle vector drawings) is necessary for the learning process and goes some fun places. Take this as a case study of writing efficient geometry number crunching in Python.

https://us.pycon.org/2016/schedule/presentation/2247/

PyCon 2016

May 29, 2016

Transcript

  1. The Computer Science of Marking
    Computer Science Assignments
    Katie Bell
    Software Engineer - Grok Learning

  2. I’ve never heard of this ‘Grok Learning’
    Image: CC0 Public Domain

  3. Background

  4. Let me tell you a story...
    Image: The Princess Bride (1987)

  5. Part 1: Turtles
    “It’s turtles all the way down!”

  6. Example
    from turtle import *
    left(60)
    forward(100)
    right(120)
    forward(100)
    right(120)
    forward(100)
    left(90)
    forward(100)
    left(90)
    forward(100)
    left(90)
    forward(100)
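Before a drawing like this can be marked as vectors, the turtle commands have to become line segments. A minimal sketch of that step, assuming a hypothetical `trace` helper (not Grok's modified turtle.py) that starts at the origin facing east, as the built-in turtle does:

```python
import math

def trace(commands):
    """Replay (name, value) turtle commands and return the drawn segments."""
    x, y, heading = 0.0, 0.0, 0.0  # heading in degrees, 0 = east
    segments = []
    for name, value in commands:
        if name == "left":
            heading += value
        elif name == "right":
            heading -= value
        elif name == "forward":
            nx = x + value * math.cos(math.radians(heading))
            ny = y + value * math.sin(math.radians(heading))
            segments.append(((x, y), (nx, ny)))
            x, y = nx, ny
        else:
            raise ValueError(f"unsupported command: {name}")
    return segments

# The example program from the slide, written as data.
example = [
    ("left", 60), ("forward", 100), ("right", 120), ("forward", 100),
    ("right", 120), ("forward", 100), ("left", 90), ("forward", 100),
    ("left", 90), ("forward", 100), ("left", 90), ("forward", 100),
]
segments = trace(example)
```

The result is six segments: a triangle followed by three sides of a square that shares the triangle's base.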

  7. Example challenges

  8. Why Turtle?
    ○ Simple, graphical, engaging
    ○ Built-in module in Python
    ○ Teachers were asking for it
    ○ Ties in well with maths teaching

  9. True to built-in Python Turtle in the browser
    [Diagram: the browser frontend sends code and stdin to sandboxed Python running a modified turtle.py; realtime animation events come back to the browser and are shown as an SVG animation.]

  10. Part 2: The marking system
    Because feedback is important.

  11. Meaningful feedback goals
    “You’re missing this line: {diagram}”
    “The angle between these two lines should be {angle}° but it’s {angle}°.”
    “This line is the wrong colour. It was {colour1} but it should be {colour2}.”
    “There’s an extra line here {diagram}, that shouldn’t be there.”

  12. Marking Options:
    1. Rendered pixel-level differences
    2. Vector comparison

  13. Comparing pixel by pixel
    angle = 360/7
    angle = int(360/7)
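The two angle expressions on this slide produce drawings that drift apart as the turtle turns. A small sketch that measures the drift, assuming a seven-sided shape with sides of length 100 (those values are illustrative and not taken from the slide):

```python
import math

def end_point(turn_degrees, sides=7, length=100.0):
    """Return where the turtle ends after drawing `sides` edges."""
    x = y = heading = 0.0
    for _ in range(sides):
        x += length * math.cos(math.radians(heading))
        y += length * math.sin(math.radians(heading))
        heading += turn_degrees
    return x, y

# Distance between the start and end of the path for each angle.
exact_gap = math.hypot(*end_point(360 / 7))
truncated_gap = math.hypot(*end_point(int(360 / 7)))
print(exact_gap, truncated_gap)
```

With `360/7` the shape closes to within floating point error, while `int(360/7)` leaves a gap of several units between the first and last point, so the rendered pixels differ even though both programs look reasonable.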

  14. Part 3: Vector Comparison
    Down the rabbit hole...

  15. So we have two drawings: Expected and Actual
    Visually distinguishable differences between drawings should matter.
    ○ Additional lines, missing lines
    ○ Lines of different lengths/positions

  16. Additional Requirements:
    ○ Fuzzy floating point matching
    ○ Drawing order should not matter
    ○ Overlapping lines should not matter
    ○ Translation of the whole drawing should not matter
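Three of these requirements (fuzzy matching, order independence and translation independence) can be sketched in a few lines. This is an illustrative comparison, not the checker from the talk; the names `normalise`, `same_segment` and `drawings_match` and the default tolerance are assumptions, and overlapping lines are not handled here.

```python
import math

def normalise(segments):
    """Shift a drawing so its bounding box starts at (0, 0)."""
    min_x = min(x for seg in segments for x, _ in seg)
    min_y = min(y for seg in segments for _, y in seg)
    return [tuple((x - min_x, y - min_y) for x, y in seg) for seg in segments]

def same_segment(a, b, tol):
    """True if a and b share endpoints, in either direction, within tol."""
    def close(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1]) <= tol
    return ((close(a[0], b[0]) and close(a[1], b[1])) or
            (close(a[0], b[1]) and close(a[1], b[0])))

def drawings_match(expected, actual, tol=1e-6):
    """Compare two drawings ignoring order, direction and translation."""
    if len(expected) != len(actual):
        return False
    if not expected:
        return True
    unmatched = normalise(actual)
    for seg in normalise(expected):
        for i, candidate in enumerate(unmatched):
            if same_segment(seg, candidate, tol):
                del unmatched[i]  # each actual segment may be used once
                break
        else:
            return False
    return True
```

The pairwise search is quadratic in the number of segments. It is used here because sorting or hashing rounded coordinates can separate two values that differ only by floating point noise.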

  17. Connected & Overlapping Lines

  18. Simplifying
    Raw line segments from turtle drawing
    Minimised line segments
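One way to get from raw to minimised segments is to group strokes by the infinite line they lie on and then merge the intervals they cover along that line. The sketch below assumes that approach; the `simplify` name and the rounding-based grouping are illustrative choices, not the talk's implementation.

```python
import math
from collections import defaultdict

def simplify(segments, digits=6):
    """Merge collinear segments that touch or overlap into single segments."""
    tol = 10 ** -digits
    lines = defaultdict(list)
    for (x1, y1), (x2, y2) in segments:
        dx, dy = x2 - x1, y2 - y1
        length = math.hypot(dx, dy)
        if length == 0:
            continue  # a zero-length stroke draws nothing
        ux, uy = dx / length, dy / length
        rx, ry = round(ux, digits), round(uy, digits)
        # Flip the direction if needed so opposite strokes share one key.
        if rx < 0 or (rx == 0 and ry < 0):
            ux, uy, rx, ry = -ux, -uy, -rx, -ry
        offset = x1 * uy - y1 * ux  # signed distance of the line from the origin
        t1, t2 = x1 * ux + y1 * uy, x2 * ux + y2 * uy  # positions along the line
        key = (rx, ry, round(offset, digits))
        lines[key].append((min(t1, t2), max(t1, t2), ux, uy, offset))

    def point(t, ux, uy, offset):
        """Convert a position along a line back to (x, y)."""
        return (t * ux + offset * uy, t * uy - offset * ux)

    result = []
    for spans in lines.values():
        spans.sort()
        start, end, ux, uy, offset = spans[0]
        for t1, t2, *_ in spans[1:]:
            if t1 <= end + tol:
                end = max(end, t2)  # touching or overlapping: extend
            else:
                result.append((point(start, ux, uy, offset), point(end, ux, uy, offset)))
                start, end = t1, t2
        result.append((point(start, ux, uy, offset), point(end, ux, uy, offset)))
    return result
```

Rounding the line key is a simplification: two nearly identical lines whose values fall on either side of a rounding boundary would not be merged.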

  19. Where we’re at:

  20. Part 4: Filled areas!
    “It’s complicated.” - Facebook

  21. Fills with turtle
    from turtle import *
    fillcolor('red')
    begin_fill()
    for i in range(3):
        forward(100)
        left(120)
    end_fill()

  22. The fun of fills

  23. The fun of fills

  24. The fun of fills
    White rectangle drawn over the top

  25. The fun of fills

  26. “Resolving overlapping polygons is a solved problem, right? There should be a ready-made algorithm and library for this.”
    - Me

  27. Types of polygons
    Convex Simple Polygon
    Simple Polygon
    Complex Polygon (Self intersecting)

  28. Triangulate
    Image Source: http://hyperboleandahalf.blogspot.com/

  29. Delaunay Triangulation
    Image: Public Domain

  30. from scipy.spatial import Delaunay
    Conveniently in Scipy

  31. Here’s the plan:
    1. Resolve shapes to non-overlapping triangles
    2. Work out what colour each triangle is
    3. Stitch the triangles together to form an outline of the filled area
    4. Compare the vectors of filled area outlines for each colour

  32. Applying triangulation

  33. Applying triangulation

  34. Applying triangulation

  35. Colour each triangle
    Take the centroid of each triangle.
    Determine the highest (drawing order) shape which contains that point.
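The colouring step can be sketched as follows. The helper names are hypothetical, and the point-in-polygon test assumes the even-odd rule via ray casting, which may not match how the real checker treats self-intersecting fills.

```python
def centroid(triangle):
    """Average of the three corners; always inside the triangle."""
    (x1, y1), (x2, y2), (x3, y3) = triangle
    return ((x1 + x2 + x3) / 3, (y1 + y2 + y3) / 3)

def contains(polygon, point):
    """Even-odd ray casting: count edges crossed by a ray going right."""
    px, py = point
    inside = False
    for (x1, y1), (x2, y2) in zip(polygon, polygon[1:] + polygon[:1]):
        if (y1 > py) != (y2 > py):  # edge straddles the ray's height
            cross_x = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if cross_x > px:
                inside = not inside
    return inside

def triangle_colour(triangle, shapes):
    """shapes is a list of (colour, polygon) in drawing order, oldest first."""
    point = centroid(triangle)
    for colour, polygon in reversed(shapes):  # topmost shape wins
        if contains(polygon, point):
            return colour
    return None  # the triangle is not covered by any fill

shapes = [
    ("red", [(0, 0), (10, 0), (10, 10), (0, 10)]),
    ("white", [(5, 5), (15, 5), (15, 15), (5, 15)]),
]
overlap_colour = triangle_colour([(6, 6), (9, 6), (6, 9)], shapes)
```

Because the triangles come from a triangulation that already respects every shape edge, one sample point per triangle is enough to decide its colour.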

  36. Testing if a point is in a complex polygon

  37. Testing if a point is in a complex polygon

  38. Testing if a point is in a complex polygon

  39. Testing if a point is in a complex polygon

  40. INTERSECTING LINES
    Trickier than originally thought.
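Part of the trickiness is that touching endpoints and collinear overlaps have to be handled alongside ordinary crossings, all with a floating point tolerance. The following is a textbook orientation-based test, offered as a sketch; it is not the `is_intersection` function from the talk, and the `eps` default is an assumption.

```python
def cross(o, a, b):
    """Z component of (a - o) x (b - o); the sign gives the turn direction."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def on_segment(p, a, b, eps):
    """True if p, already known to be collinear with a-b, is within its bounding box."""
    return (min(a[0], b[0]) - eps <= p[0] <= max(a[0], b[0]) + eps and
            min(a[1], b[1]) - eps <= p[1] <= max(a[1], b[1]) + eps)

def segments_intersect(seg1, seg2, eps=1e-9):
    """True if the two closed segments share at least one point."""
    (a, b), (c, d) = seg1, seg2
    d1, d2 = cross(c, d, a), cross(c, d, b)
    d3, d4 = cross(a, b, c), cross(a, b, d)
    # Proper crossing: each segment's endpoints lie on opposite sides of the other.
    if (((d1 > eps and d2 < -eps) or (d1 < -eps and d2 > eps)) and
            ((d3 > eps and d4 < -eps) or (d3 < -eps and d4 > eps))):
        return True
    # Touching or collinear overlap: an endpoint lies on the other segment.
    return ((abs(d1) <= eps and on_segment(a, c, d, eps)) or
            (abs(d2) <= eps and on_segment(b, c, d, eps)) or
            (abs(d3) <= eps and on_segment(c, a, b, eps)) or
            (abs(d4) <= eps and on_segment(d, a, b, eps)))
```

Note that `eps` is compared against a cross product, which scales with segment length, so a production version would probably normalise it.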

  41. In order for the triangulation to work

  42. In order for the triangulation to work

  43. Resolve all the shapes in one triangulation

  44. Resolve all the shapes in one triangulation

  45. Stitching together

  46. Stitching together

  47. Stitching together

  48. Stitching together

  49. Stitching together

  50. Stitching together

  51. Stitching together
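A common way to stitch same-coloured triangles into an outline is to keep only the edges that belong to exactly one triangle, since shared edges are interior. The sketch below assumes that approach, that triangles share exact vertex coordinates (as they would when taken from one triangulation), and that every outline vertex has exactly two boundary edges. The function names are hypothetical.

```python
from collections import Counter

def outline_edges(triangles):
    """Return edges used by exactly one triangle; shared edges are interior."""
    counts = Counter()
    for a, b, c in triangles:
        for p, q in ((a, b), (b, c), (c, a)):
            counts[frozenset((p, q))] += 1
    return [tuple(sorted(edge)) for edge, n in counts.items() if n == 1]

def stitch(edges):
    """Chain boundary edges into closed loops of points."""
    neighbours = {}
    for p, q in edges:
        neighbours.setdefault(p, []).append(q)
        neighbours.setdefault(q, []).append(p)
    loops, visited = [], set()
    for start in neighbours:
        if start in visited:
            continue
        loop, prev, current = [], None, start
        while current not in visited:
            visited.add(current)
            loop.append(current)
            a, b = neighbours[current]  # assumes exactly two boundary edges
            prev, current = current, (b if a == prev else a)
        loops.append(loop)
    return loops

# Two triangles that together form a unit square.
square = [((0, 0), (1, 0), (1, 1)), ((0, 0), (1, 1), (0, 1))]
edges = outline_edges(square)
loops = stitch(edges)
```

Here the diagonal is shared by both triangles and drops out, leaving a single four-point loop around the square.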

  52. Simplify the vectors and we're done!

  53. Part 5: Sweet, it works!
    But there's one small problem...

  54. Birthday Banner

  55. 6-8 seconds
    Per test case, for 6 test cases.

  56. Triangulation of Birthday Banner:

  57. Profiling
    $ nosetests test_checker_logo_diff.py --with-profile
    ................................................
    13327123 function calls (13321330 primitive calls) in 22.131 seconds
    Ordered by: cumulative time
      ncalls  cumtime  percall  filename:lineno(function)
         9/1   22.131   22.131  site-packages/nose/suite.py:176(__call__)
         9/1   22.131   22.131  site-packages/nose/suite.py:197(run)
          48   22.129    0.461  site-packages/nose/case.py:44(__call__)
          48   22.129    0.461  site-packages/nose/case.py:115(run)
          48   22.124    0.461  site-packages/nose/case.py:142(runTest)
          48   22.124    0.461  unittest/case.py:394(__call__)
          48   22.124    0.461  unittest/case.py:297(run)
          38   22.091    0.581  checker_logo_diff.py:1094(check)
         146   21.554    0.148  checker_logo_diff.py:925(resolve_fill_triangles)
       90280   20.810    0.000  checker_logo_diff.py:140(is_intersection)
      538466   14.157    0.000  site-packages/numpy/core/numeric.py:2276(isclose)
      467962   13.619    0.000  site-packages/numpy/core/numeric.py:2212(allclose)
          76   11.475    0.151  checker_logo_diff.py:773(collapse_shapes_to_lines)
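A report ordered by cumulative time, like the one on this slide, can also be produced with the standard library's `cProfile` and `pstats` modules. In this sketch `slow_check` is a made-up stand-in for the real checker call.

```python
import cProfile
import io
import pstats

def slow_check():
    """Stand-in workload; replace with the code you want to profile."""
    return sum(i * i for i in range(100_000))

profiler = cProfile.Profile()
profiler.enable()
slow_check()
profiler.disable()

# Collect the report in a string, sorted by cumulative time, top 10 rows.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(10)
report = stream.getvalue()
print(report)
```

Reading such a report from the top down, the first rows outside the test framework show where the time goes. In the slide that is `is_intersection` and the numpy `isclose`/`allclose` calls beneath it.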

  58. Attempt 1: Bentley–Ottmann algorithm

  59. Attempt 1: Bentley–Ottmann algorithm
    Conditions:
    1. No two line segment endpoints or crossings have the same x-coordinate.
    2. No line segment endpoint lies upon another line segment.
    3. No three line segments intersect at a single point.

  60. Attempt 1: Bentley–Ottmann algorithm
    ○ 2 days of development time
    ○ 84% of test cases passing
    ○ Speed roughly the same as before
    Conclusion: Not working

  61. Attempt 2: CGAL
    The Computational Geometry Algorithms Library (C++)

  62. So uh… Python bindings?
    github.com/CGAL/cgal-swig-bindings
    “This project is still experimental and more packages will be added.”

  63. But it was too precise...

  64. Attempt 2: CGAL
    ○ 305 lines of horrific SWIG and C++ typedefs
    ○ ~1 day of development time
    ○ Test suite even slower
    Conclusion: Not working

  65. is_intersection

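  66. Attempt 3: Write a custom C Module
    import fastlogo  # the custom C module
    # Thin Python wrapper: same signature as before, the work happens in C.
    def is_intersection(seg1, seg2, check_dir=False, ends=True):
        return fastlogo.fast_is_intersection(
            tuple(seg1[0]),
            tuple(seg1[1]),
            tuple(seg2[0]),
            tuple(seg2[1]),
            check_dir,
            ends)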

  67. ~10x speed increase!
    For both birthday banner and the test suite.

  68. Lessons Learned:
    1. Geometry is fun!
    2. Profiling is the best.
    3. If you want more speed, forget algorithms and rewrite it in C.

  69. Where we are now:
    ○ True-to-builtin Python Turtle in the browser
    ○ Good automated feedback to students
    Todo:
    ○ Add yet more detailed feedback
    ○ More algorithmic/speed improvements

  70. Questions?
    @notsolonecoder, @groklearning
    Try a turtle activity (Hour of Code):
    groklearning.com/hoc-2015
