
Using Diagnostic Models to Evaluate Student Learning Hierarchies in a Large-Scale Assessment

In recent years, the focus of large-scale assessments has begun to shift from summative evaluations of student achievement to more actionable and fine-grained information about student proficiency in the skills that make up cognitive learning models. This change in emphasis not only benefits teachers, students, and parents, but also offers opportunities for researchers to better understand student learning and skill acquisition. To meet these challenges and opportunities, we need innovative psychometric models that we can use to report valid results and understand student cognition.

In this presentation, we describe a framework for using diagnostic classification models (DCMs; also called cognitive diagnostic models) within an operational assessment program to measure student proficiency and evaluate hierarchies in skill acquisition. The diagnostic framework for evaluating hierarchies includes multiple methods and applications of DCMs, each providing complementary information for evaluating the theoretically implied skill hierarchy and for identifying potential causes of violations of the hierarchy. We then illustrate the diagnostic framework in practice using data from a large-scale operational assessment, the Dynamic Learning Maps® (DLM®) Alternate Assessment System. Results for DLM assessments are calculated using DCMs to provide instructionally informative results within the proposed cognitive learning model structure. Thus, DCMs are used for both scoring and evaluating skill hierarchies within the DLM assessments.

The findings demonstrate that DCMs can be used to provide actionable assessment results and to evaluate skill hierarchies, thereby deepening our understanding of student learning. Using the proposed DCM framework, we show that DLM skills largely conform to the hierarchy implied by the assessment design. Finally, we discuss how these findings can inform revisions to skill hierarchies and future avenues of research.

Jake Thompson

September 07, 2023

Transcript

  1. Using Diagnostic Models to Evaluate
    Student Learning Hierarchies in a Large-
    Scale Assessment
    W. Jake Thompson, Brooke Nash, & Jeffrey C. Hoover
    Accessible Teaching, Learning, and Assessment Systems (ATLAS)
    University of Kansas


  2. Diagnostic Classification Models
    • DCMs (or CDMs) are confirmatory latent class models
    that probabilistically place students into profiles of
    mastered skills, called attributes (see the sketch below)
    • Facilitate fine-grained reporting of skill mastery to
    support instructional decision-making
    • Enable the examination of skill dependencies or
    hierarchies
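
For readers less familiar with latent class models, classification follows Bayes' rule. A generic statement for dichotomous items (the notation here is standard, not taken from the slides), where ν_c is the base rate of profile α_c and π_jc is the probability of a correct response to item j for students in that profile:

```latex
P(\boldsymbol{\alpha}_c \mid \mathbf{x}_i) =
  \frac{\nu_c \prod_{j=1}^{J} \pi_{jc}^{x_{ij}} (1 - \pi_{jc})^{1 - x_{ij}}}
       {\sum_{c'=1}^{C} \nu_{c'} \prod_{j=1}^{J} \pi_{jc'}^{x_{ij}} (1 - \pi_{jc'})^{1 - x_{ij}}}
```

Students are placed probabilistically by reporting these posterior probabilities or, for mastery decisions, comparing them to a threshold.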


  3. Benefits of DCMs
    • Because the goal is classification, we can get reliable
    results with fewer items than a comparable scale-
    score assessment
    – Summarize mastery across skills (Thompson et al., 2019)
    – Longitudinal extensions (e.g., Madison & Bradshaw, 2018)
    • DCMs allow us to both provide instructionally
    relevant assessment results and answer research
    questions about student learning


  4. DCMs in Practice
    • Despite benefits, DCMs are not widely used in
    applied or operational settings
    – Lack of training and tools
    – Constraints on innovation from (U.S.) regulations and
    guidelines (e.g., Standards; AERA et al., 2014)
    • One application for today's discussion: Dynamic
    Learning Maps (DLM) Alternate Assessment System


  5. Dynamic Learning Maps
    • Grades 3–12 alternate assessment for students with
    significant cognitive disabilities
    – Assessments in English language arts, mathematics, and
    science
    • Academic content available at multiple levels of
    complexity for each standard
    • Results are reported as a profile of mastered skills


  6. Example Science Results
    Student’s performance in high school science Essential Elements is summarized below. This information is based on all of the DLM
    tests Student took during Spring 2023. Student was assessed on 9 out of 9 Essential Elements and 3 out of 3 Domains expected for
    high school science.
    Demonstrating mastery of a Level during the assessment assumes mastery of all prior Levels in the Essential Element. This table
    describes what skills your child demonstrated in the assessment and how those skills compare to grade level expectations.
    Estimated Mastery Level by Essential Element
    SCI.EE.HS.PS1-2
      Level 1: Recognize a change during a chemical reaction
      Level 2: Identify changes during a chemical reaction
      Level 3 (Target): Use evidence to explain patterns in chemical properties
    SCI.EE.HS.PS2-3
      Level 1: Identify safety devices that lessen force
      Level 2: Use data to compare the effect of safety devices
      Level 3 (Target): Evaluate safety devices and minimize force
    SCI.EE.HS.PS3-4
      Level 1: Compare the temperatures of two liquids
      Level 2: Compare the temperatures of liquids before and after mixing
      Level 3 (Target): Investigate and predict the temperatures of liquids before and after mixing
    SCI.EE.HS.LS1-2
      Level 1: Recognize that organs have different functions
      Level 2: Identify which organs have a specific function
      Level 3 (Target): Model the organization and interaction of organs
    SCI.EE.HS.LS2-2
      Level 1: Identify food and shelter needs for wildlife
      Level 2: Recognize the relationship between population size and resources
      Level 3 (Target): Explain the dependence of an animal population on other organisms
    SCI.EE.HS.LS4-2
      Level 1: Match species to their environments
      Level 2: Identify factors that require special traits to survive
      Level 3 (Target): Explain how traits allow a species to survive
    Legend: levels mastered this year · no evidence of mastery on this Essential Element · Essential Element not tested


  7. Example DLM Hierarchy
    • Investigate and predict the change in motion of
    objects based on the forces acting on those objects
    Level 1 (Identify ways to change motion)
    → Level 2 (Investigate and identify ways to change motion)
    → Level 3 (Investigate and predict changes in motion)


  8. Skill Hierarchies in DCMs


  9. Evaluating Skill Hierarchies With DCMs
    • Thompson & Nash (2022): A
    diagnostic framework for the
    empirical evaluation of
    learning maps
    • Three methods for evaluating
    skill hierarchies using DCMs
    • Descriptions and examples
    using Dynamic Learning Maps


  10. Method 1: Patterns of Mastery Profiles
    • Estimate two models
    – LCDM (Henson et al., 2009):
    Saturated; all possible profiles
    – HDCM (Templin & Bradshaw,
    2014): Constrained; only
    hypothesized profiles
    • Evaluate model fit
    – Posterior predictive model
    checks
    – Model comparisons
    Level 1  Level 2  Level 3
       0        0        0
       1        0        0
       0        1        0
       0        0        1
       1        1        0
       1        0        1
       0        1        1
       1        1        1
    Under the linear hierarchy, the HDCM retains only the profiles
    [0,0,0], [1,0,0], [1,1,0], and [1,1,1]; the LCDM estimates all
    eight (its response function is sketched below)
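
For reference, the saturated LCDM (Henson et al., 2009) models the probability of a correct response to item j with a logistic response function; the HDCM (Templin & Bradshaw, 2014) keeps the same response function but drops the latent classes that violate the hypothesized hierarchy:

```latex
P(X_{ij} = 1 \mid \boldsymbol{\alpha}_i) =
  \frac{\exp\left(\lambda_{j,0} + \boldsymbol{\lambda}_j^{\top} \mathbf{h}(\boldsymbol{\alpha}_i, \mathbf{q}_j)\right)}
       {1 + \exp\left(\lambda_{j,0} + \boldsymbol{\lambda}_j^{\top} \mathbf{h}(\boldsymbol{\alpha}_i, \mathbf{q}_j)\right)}
```

Here λ_{j,0} is the item intercept, λ_j collects the main effects and interactions for the attributes item j measures, and h(α_i, q_j) maps student i's attribute profile and item j's Q-matrix entries onto those effects.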


  11. Method 1: Flagging Criteria
    • Model fit evaluated using posterior predictive checks
    of the raw score distribution (Thompson, 2019); a
    sketch of the check follows below
    • Sufficient fit for the HDCM (constrained model)
    indicates support for the hierarchy
    • Flags when the LCDM shows sufficient model fit and
    the HDCM does not
    – Indicates that profiles outside the hypothesized
    hierarchy are needed to fully represent the data
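
A minimal sketch of the raw score check, assuming replicated response data have already been simulated from the model's posterior (the function name and the interval-based flagging criterion are illustrative; the operational criteria in Thompson, 2019, may differ):

```python
import numpy as np

def raw_score_ppmc(observed, replicated, alpha=0.05):
    """Posterior predictive check of the raw score distribution.

    observed:   (n_students,) array of observed raw scores.
    replicated: (n_draws, n_students) array of raw scores simulated
                from the posterior predictive distribution.
    Returns the observed count at each raw score, the central
    (1 - alpha) predictive interval, and a flag for counts that
    fall outside their interval.
    """
    max_score = int(max(observed.max(), replicated.max()))
    scores = np.arange(max_score + 1)

    # Observed number of students at each raw score
    obs_counts = np.array([(observed == s).sum() for s in scores])

    # Replicated counts: one row per posterior draw, one column per score
    rep_counts = np.stack([(replicated == s).sum(axis=1) for s in scores], axis=1)

    lower = np.quantile(rep_counts, alpha / 2, axis=0)
    upper = np.quantile(rep_counts, 1 - alpha / 2, axis=0)
    flagged = (obs_counts < lower) | (obs_counts > upper)
    return scores, obs_counts, lower, upper, flagged
```

Running this check on both the LCDM and the HDCM yields the comparison described above: support for the hierarchy when the constrained model fits, and a flag when only the saturated model does.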


  12. Method 1: Example Output


  13. Method 1: Limitations
    • The number of profiles in the saturated model
    (LCDM) increases exponentially with the number of
    attributes (2^A profiles for A attributes)
    • Need students in all profiles to get reliable
    parameter estimates
    • Extremely small classes can cause estimation
    problems (Templin & Bradshaw, 2014)


  14. Method 2: Patterns of Attribute Mastery
    • Estimate each skill as a
    separate 1-attribute DCM
    • Make mastery
    determinations for each
    assessed skill
    • Look for unexpected
    patterns in attribute
    mastery
    Student  Level 1  Level 2  Level 3
       1        1        1        —
       2        1        0        0
       3        —        1        1
       4        1        0        1
       5        0        0        0
       6        1        —        —
       …        …        …        …
    (— = not assessed; e.g., Student 4 shows a reversal,
    mastering Level 3 without Level 2)


  15. Method 2: Flagging Criteria
    • Multiple flagging methods
    – Total number of students with unexpected pattern across
    assessed attributes
    – Total number of reversals for a specific pair of attributes
    • Actual thresholds for flagging depend on
    characteristics of the assessment
    – Simulation studies to determine an expected number of
    reversals (a counting sketch follows below)
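
A sketch of the counting logic, assuming a linear hierarchy and mastery determinations coded as in the slide 14 table (1 = mastered, 0 = not mastered, None = not assessed); the function is illustrative, not DLM's operational code:

```python
from itertools import combinations

def count_reversals(mastery_profiles):
    """Count unexpected mastery patterns under a linear hierarchy.

    mastery_profiles: iterable of per-student tuples ordered from the
    lowest to the highest level. A reversal is any assessed pair of
    levels where the higher level is mastered but the lower is not.
    Returns the number of flagged students and reversal counts for
    each specific pair of levels (0-indexed).
    """
    flagged_students = 0
    pair_counts = {}
    for profile in mastery_profiles:
        assessed = [(i, m) for i, m in enumerate(profile) if m is not None]
        reversals = [
            (lo, hi)
            for (lo, m_lo), (hi, m_hi) in combinations(assessed, 2)
            if m_lo == 0 and m_hi == 1
        ]
        if reversals:
            flagged_students += 1
        for pair in reversals:
            pair_counts[pair] = pair_counts.get(pair, 0) + 1
    return flagged_students, pair_counts

# Applied to the slide 14 table, Student 4 (1, 0, 1) produces one
# reversal on the (Level 2, Level 3) pair, i.e., 0-indexed pair (1, 2).
students = [(1, 1, None), (1, 0, 0), (None, 1, 1),
            (1, 0, 1), (0, 0, 0), (1, None, None)]
print(count_reversals(students))  # (1, {(1, 2): 1})
```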


  16. Method 2: Example Output
    • Overall, 10% of students had an unexpected pattern
    (threshold: 14%)
    • 30% of students testing on Levels 2 and 3 had an
    unexpected pattern (threshold: 24%)


  17. Method 2: Limitations
    • No direct test of relationships between attributes
    • Number of models to estimate increases with the
    number of attributes
    • Additional analyses needed to set reasonable
    thresholds for flagging unexpected patterns


  18. Method 3: Patterns of Attribute Difficulty
    • Group students into cohorts
    • Measure attribute difficulty
    using average p-values (i.e.,
    proportions correct) of
    items for each cohort
    • Within each cohort, p-
    values should decrease at
    higher hierarchy levels
    (computation sketched below)
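
A minimal sketch of the cohort-level difficulty computation, assuming a scored response matrix with NaN for items a student was not administered (the function name and data layout are illustrative assumptions, not DLM's operational pipeline):

```python
import numpy as np

def cohort_p_values(responses, item_level, cohort):
    """Average item p-values by cohort and hierarchy level.

    responses:  (n_students, n_items) matrix of 0/1 scores, with
                NaN where a student was not administered an item.
    item_level: (n_items,) hierarchy level measured by each item.
    cohort:     (n_students,) cohort label for each student.
    Returns a dict mapping (cohort, level) to the mean p-value.
    """
    out = {}
    for c in np.unique(cohort):
        rows = responses[cohort == c]
        for lvl in np.unique(item_level):
            block = rows[:, item_level == lvl]
            p_items = np.nanmean(block, axis=0)  # p-value per item
            out[(c, int(lvl))] = float(np.nanmean(p_items))
    return out
```

Within each cohort, the mean p-value should fall from Level 1 to Level 3; increases feed the flagging criteria on the next slide.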


  19. Method 3: Flagging Criteria
    • Calculate Cohen’s h effect size for the difference in
    p-values
    – Subtract lower level from higher level (e.g., Level 3 −
    Level 2)
    – Difference should be negative (i.e., Level 3 should have a
    lower p-value)
    • Flag instances where Cohen’s h ≥ 0.2
    – An effect of at least small size, by Cohen’s
    benchmarks, in the unexpected direction (see the
    formula below)
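
For reference, Cohen's h is the difference between arcsine-transformed proportions; applied to the level comparison above:

```latex
h = 2 \arcsin\sqrt{p_{\text{higher}}} - 2 \arcsin\sqrt{p_{\text{lower}}}
```

A positive h means the higher level had the larger p-value, that is, the higher level was unexpectedly easier for that cohort.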


  20. Method 3: Example Output


  21. Method 3: Limitations
    • Not a DCM-based method
    – Relies on p-values as a proxy for attribute difficulty
    • No direct test of attribute relationships
    • Potential violations of the hierarchy must be inferred
    from patterns of flags across student cohorts


  22. Revisiting Our Hierarchy
    • Results indicated that students could be proficient
    on Level 3 without being proficient on Level 2
    Level 1 (Identify ways to change motion)
    → Level 2 (Investigate and identify ways to change motion)
    → Level 3 (Investigate and predict changes in motion)


  23. Alternate Skill Hierarchy
    • Multiple pathways to Level 3 proficiency
    Pathway 1: Level 1 (Identify ways to change motion)
    → Level 2 (Investigate and identify ways to change motion)
    → Level 3 (Investigate and predict changes in motion)
    Pathway 2: Level 1 (Identify ways to change motion)
    → Level 3 (Investigate and predict changes in motion),
    bypassing Level 2


  24. Summary
    • DCM framework provides multiple methods for
    analyzing attribute hierarchies with DCMs
    – Complementary strengths and weaknesses
    – Can be adapted to non-linear relationships
    • DCMs are a powerful tool for understanding student
    learning and providing instructionally relevant results
    for students


  25. Get in Touch!
    ATLAS: atlas.ku.edu · [email protected] · LinkedIn: company/atlas-ku · @atlas4learning
    DLM: https://dynamiclearningmaps.org/
    Jake Thompson: wjakethompson.com · [email protected] · LinkedIn: in/wjakethompson · @wjakethompson


  26. References
    American Educational Research Association (AERA), American Psychological Association, & National Council on
    Measurement in Education. (2014). Standards for educational and psychological testing. American Educational
    Research Association. https://www.testingstandards.net/open-access-files.html
    Henson, R. A., Templin, J. L., & Willse, J. T. (2009). Defining a family of cognitive diagnosis models using log-linear
    models with latent variables. Psychometrika, 74(2), 191–210. https://doi.org/10.1007/s11336-008-9089-5
    Madison, M. J., & Bradshaw, L. P. (2018). Assessing growth in a diagnostic classification model framework.
    Psychometrika, 83(4), 963–990. https://doi.org/10.1007/s11336-018-9638-5
    Templin, J., & Bradshaw, L. (2014). Hierarchical diagnostic classification models: A family of models for estimating and
    testing attribute hierarchies. Psychometrika, 79(2), 317–339. https://doi.org/10.1007/s11336-013-9362-0
    Thompson, W. J. (2019). Bayesian psychometrics for diagnostic assessments: A proof of concept (Research Report No.
    19-01). University of Kansas; Accessible Teaching, Learning, and Assessment Systems.
    https://doi.org/10.35542/osf.io/jzqs8
    Thompson, W. J., Clark, A. K., & Nash, B. (2019). Measuring the reliability of diagnostic mastery classifications at
    multiple levels of reporting. Applied Measurement in Education, 32(4), 298–309.
    https://doi.org/10.1080/08957347.2019.1660345
    Thompson, W. J., & Nash, B. (2022). A diagnostic framework for the empirical evaluation of learning maps. Frontiers in
    Education, 6, 714736. https://doi.org/10.3389/feduc.2021.714736
