


Creating summative scores from instructionally embedded results

As states begin implementing through-year assessment models, assessment providers must consider how to create a comprehensive summative result from multiple discrete testing occasions. In this talk we describe a process for evaluating a wide range of summative scoring models (e.g., item response theory, diagnostic classification, and innovative hybrid models) based on our work in the Pathways for Instructionally Embedded Assessment (PIE), a Competitive Grant for State Assessment project with the Missouri Department of Elementary and Secondary Education. This process involves not just a psychometric evaluation of summative models but also careful consideration of whether the scores support assessment claims made in a theory of action and can be used to generate indicators in a state accountability model (e.g., growth models). We then illustrate how summative scores are produced using diagnostic classification models for the Dynamic Learning Maps Alternate Assessment System, a through-year assessment that has been operational since 2015 and fully meets federal peer review requirements. We discuss the specific assessment claims and how the data collected throughout the year is used to provide both real-time instructional feedback and summative results, including an overall performance level for use in accountability systems.


Jake Thompson

April 03, 2026


Transcript

  1. The UNIVERSITY of KANSAS
     Accessible Teaching, Learning, and Assessment Systems (ATLAS)
     Creating Summative Scores from Instructionally Embedded Results
     W. Jake Thompson and Brooke Nash
  2. What are instructionally embedded assessments?
     • One possible through-year assessment design
     • Short, frequent assessments embedded into teachers’ planned instructional units
     • Teachers flexibly choose which standards to assess and when, according to local curriculum and pacing guides
     • No full-blueprint end-of-year assessment
     • Results provided at a grain size to support instructional decision-making
  3. Theory-of-action-driven psychometrics
     • Item response theory is often treated as the default psychometric model
     • We may define a theory of action for how scores should be used, but then reverse engineer how to get that information (or an approximation) from our IRT model
     • Rather than starting from the psychometric model, we advocate starting from the intended uses and building the model that meets the assessment’s needs
  4. Psychometrics in a through-year assessment
     • Through-year assessments have unique designs with different intended uses than traditional summative assessments
     • If we are measuring something different, we shouldn’t expect our results to look the same
     • We need psychometric models that meet the needs of a through-year assessment
     • Two example use cases:
       • Dynamic Learning Maps
       • Pathways for Instructionally Embedded Assessment
  5. The DLM assessments
     • An operational instructionally embedded assessment design since 2014–2015
     • Alternate assessment for students with significant cognitive disabilities
     • Provides results for use in state accountability systems
     • Has fully met federal peer review requirements
  6. Diagnostic classification for DLM
     • Diagnostic classification models
     • Highly reliable mastery classifications with 3–5 items
     • We define the grain size of the skills that are measured
     • Mastery results are updated as students complete assessments
     • Results are relevant for instructional decision-making
     Thompson & Clark (2024)
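As a rough illustration of how a diagnostic classification model can yield reliable mastery calls from only a few items, here is a minimal latent class (DINA-style) posterior computation. The slip and guess values and the 0.5 prior are illustrative assumptions, not DLM parameters:

```python
import numpy as np

def mastery_posterior(responses, slip, guess, prior=0.5):
    """P(master | responses) for items that all measure one skill.

    responses: 0/1 item scores
    slip:      P(incorrect | master) for each item
    guess:     P(correct | non-master) for each item
    prior:     prior probability of mastery (hypothetical value)
    """
    responses = np.asarray(responses)
    slip, guess = np.asarray(slip), np.asarray(guess)
    # Likelihood of the observed response pattern under each latent class
    lik_master = np.prod((1 - slip) ** responses * slip ** (1 - responses))
    lik_nonmaster = np.prod(guess ** responses * (1 - guess) ** (1 - responses))
    # Bayes rule: posterior probability that the student has mastered the skill
    return prior * lik_master / (prior * lik_master + (1 - prior) * lik_nonmaster)

# Three correct responses on three items already give a confident classification
p = mastery_posterior([1, 1, 1], slip=[0.1, 0.1, 0.1], guess=[0.2, 0.2, 0.2])
```

With only three items, the posterior is already near certainty in either direction, which is the intuition behind "highly reliable mastery classifications with 3–5 items."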
  7. From profiles to performance levels
     • There is no “score”—results are the mastery profile for each student
     • Technical evidence is based on skill-level classifications rather than score precision (e.g., Thompson et al., 2021, 2023)
     • Summative performance levels are determined by the final profile of mastered skills
     • Condensed mastery profiles (Clark et al., 2017) for standard setting
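The mapping from a final mastery profile to a summative performance level can be sketched as follows. The cut points on the number of mastered skills are purely hypothetical stand-ins for panel-based cuts set through standard setting on condensed mastery profiles:

```python
def performance_level(profile, cuts=(3, 6, 9)):
    """Map a final mastery profile to a performance level.

    profile: dict of skill name -> bool (mastered by end of year)
    cuts:    minimum counts of mastered skills for levels 2, 3, 4
             (illustrative values, not from an actual standard setting)
    """
    n_mastered = sum(profile.values())
    level = 1
    # Walk the cuts; the highest cut met determines the level
    for candidate_level, cut in enumerate(cuts, start=2):
        if n_mastered >= cut:
            level = candidate_level
    return level

# A student who mastered 7 of 10 skills falls in the third level here
lvl = performance_level({f"skill_{i}": i < 7 for i in range(10)})
```

A real condensed-profile approach would weight which skills were mastered, not just how many; the count-based rule above is only the simplest possible instance.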
  8. The PIE project
     • CGSA-funded partnership with the Missouri Department of Elementary and Secondary Education
     • Similar approach to DLM, but in the context of general education 5th-grade mathematics
     • Instructionally embedded assessments delivered throughout the year
     • Teacher choice of how to group standards and when to administer
     • Learning pathways around each standard provide information on where students are relative to the grade-level content standard
  9. Summative reporting for PIE
     • Used data collected from the PIE pilot study to investigate several possible models as a proof of concept
     • IRT, DCM, hybrid IRT/DCM
     • Evaluated against the PIE theory of action
     ATLAS (2025)
  10. Scale scores from a diagnostic assessment
     • Results are reported as the final profile of skills mastered by the student
     • Any scale score should be consistent with the profile, not necessarily the item response pattern
     • Solution: mastery classifications become the indicators in the IRT model
       • Or mastery probabilities in a Beta-IRT model
     • Interpretation: an estimate of student ability based on the skills that were mastered by the end of the year
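One way to sketch this idea: treat the end-of-year mastery classifications as the indicators in an IRT model and compute an expected a posteriori (EAP) scale score over a quadrature grid. The Rasch form, the skill difficulties, and the standard normal prior below are illustrative assumptions, not the PIE specification:

```python
import numpy as np

def eap_score(mastery, difficulty, grid=np.linspace(-4, 4, 81)):
    """EAP estimate of ability from 0/1 mastery indicators (Rasch form).

    mastery:    0/1 end-of-year mastery classification per skill
    difficulty: assumed difficulty per skill (hypothetical values)
    grid:       quadrature points for the ability scale
    """
    mastery = np.asarray(mastery)
    difficulty = np.asarray(difficulty)
    # P(skill j mastered | theta) at each quadrature point
    p = 1 / (1 + np.exp(-(grid[:, None] - difficulty[None, :])))
    # Likelihood of the mastery profile (not the item responses) at each theta
    lik = np.prod(np.where(mastery, p, 1 - p), axis=1)
    prior = np.exp(-0.5 * grid**2)  # standard normal prior, unnormalized
    post = lik * prior
    return np.sum(grid * post) / np.sum(post)

# Mastering the three easiest of five skills gives a mid-scale estimate
theta = eap_score([1, 1, 1, 0, 0], difficulty=[-1.0, -0.5, 0.0, 0.5, 1.0])
```

Because the indicators are skill masteries rather than item responses, a score produced this way is consistent with the reported profile by construction, which is the point of the hybrid approach.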
  11. The hybrid model
     [Diagram: Baseline, Midway, and End-of-Unit assessments shown across Level 1, Level 2, and Level 3]
  12. Benefits of the hybrid approach
     • Through-year results based on diagnostic models to inform instructional decision-making
     • Scale score determined by each student’s unique profile of mastered skills
     • Generation ≠ reporting of a scale score
       • Reporting should still be informed by intended uses
     • Methods for standard setting and growth are determined based on summative results
       • If using a scale score, can plug into existing methods
  13. Evaluating summative models
     Claim I: Mastery results represent what students know and can do relative to the learning pathways.
       • IRT model: Not supported
       • Diagnostic model: Results reported directly as the set of mastered KSUs
       • Hybrid model: Mastery results directly inform the summative scale score
     Claim K: Summative results accurately reflect student achievement of grade-level academic content standards.
       • IRT model: Supported with a single scale score
       • Diagnostic model: Supported with a profile of mastered KSUs
       • Hybrid model: Supported with both a scale score and a diagnostic profile
     Claim L: Educators make instructional decisions based on data from the PIE assessments.
       • IRT model: Not well suited to instructional decision-making
       • Diagnostic model: Instructional decision-making based on the mastery profile
       • Hybrid model: Instructional decision-making based on the mastery profile
     Claim M: Students make progress towards mastery of grade-level content standards.
       • IRT model: Supported with existing growth models
       • Diagnostic model: Additional research needed to evaluate profile-based growth
       • Hybrid model: Supported with existing growth models
  14. Final thoughts
     • Our psychometrics must be as flexible and creative as our assessment designs
     • Think carefully about the results that need to be reported to support an assessment’s theory of action and intended uses
       • These may or may not include a “score”
     • Build the model that supports the reporting of the necessary results, not the other way around
  15. Thank you!
     wjakethompson.com
     [email protected]
     ORCID 0000-0001-7339-0300
     in/wjakethompson
     @wjakethompson
     @wjakethompson.com
     @[email protected]
     @wjakethompson
     Slides