Slide 1

Slide 1 text

60th RELC International Conference • 9–11 March 2026 • Singapore
RELC International Hotel, Singapore

Investigating Changes in Self-Assessed Spoken English Proficiency in a Three-Week Study-Abroad Program

Ken Urano
Hokkai-Gakuen University, Japan
[email protected]

Slide 2

Slide 2 text

BACKGROUND
Short-Term Study Abroad Research

Meta-Analysis
• Hirai (2018)
• Program effects vary by duration; short-term gains are limited

Example Study
• Suzuki & Hayashi (2014)
• Pre–post gains in proficiency and self-assessed speaking; no comparison group

Methodological Issue
• Many studies use pre–post only
• Comparison groups are rare

Present study: pre–post design with a comparison group to examine changes in perceived spoken English proficiency during a three-week study-abroad program

Slide 3

Slide 3 text

RATIONALE
Why Include a Comparison Group?
"Did the two groups change differently?"

1 Ruling Out General Development
Students may improve through independent study, practice, or general maturation rather than program exposure.

2 Isolating Program Effects
Comparing trajectories helps attribute change to participation in the overseas program rather than to time alone.

3 Focus: Group × Time Interaction
The key parameter is whether the rate of change differs between groups, not merely whether scores increased.

Slide 4

Slide 4 text

PROGRAM
Program Overview
Duration: 3 Weeks

01 Intensive English Course
Communicative skills focus; structured input and output activities in an English-medium classroom setting.

02 EBP Company Visits
English for Business Purposes (EBP) framework; students prepared and delivered presentations to company staff.

03 Homestay Immersion
Daily English use with host families; authentic communicative situations beyond classroom boundaries.

Key Emphasis: Output-Oriented. Students were required to speak, present, and interact in real time.

Slide 5

Slide 5 text

METHODS
Participants & Design

Study Abroad Group (n = 12): three-week overseas program participants
Control Group (n = 9): Japan-stay students; same time period

Design: 2 × 2 Mixed ANOVA

Factor   Type               Levels
Group    Between-subjects   Study Abroad / Control
Time     Within-subjects    Pre / Post

Primary parameter: Group × Time interaction effect
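With only two levels per factor, the Group × Time interaction in a mixed ANOVA can be checked by comparing gain scores (Post − Pre) between the groups: the squared pooled-variance t statistic equals the interaction F. The sketch below illustrates that equivalence. The function name `interaction_t` and all scores are hypothetical, invented for illustration; they are not the study's data or its analysis script.

```python
from math import sqrt
from statistics import mean, variance


def interaction_t(gains_a, gains_b):
    """Student's t (pooled variance) comparing mean gain scores of two groups.

    In a 2 x 2 mixed design, t**2 equals the Group x Time interaction F
    with (1, n_a + n_b - 2) degrees of freedom.
    """
    n_a, n_b = len(gains_a), len(gains_b)
    df = n_a + n_b - 2
    # Pooled variance of the gain scores across both groups.
    pooled_var = ((n_a - 1) * variance(gains_a) + (n_b - 1) * variance(gains_b)) / df
    se = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (mean(gains_a) - mean(gains_b)) / se
    return t, df


# Hypothetical pre/post scores, invented for illustration only.
sa_pre, sa_post = [10, 12, 11, 13], [14, 15, 13, 18]
cg_pre, cg_post = [11, 10, 12], [12, 10, 13]

# Gain score = Post - Pre for each participant.
sa_gains = [post - pre for pre, post in zip(sa_pre, sa_post)]
cg_gains = [post - pre for pre, post in zip(cg_pre, cg_post)]

t, df = interaction_t(sa_gains, cg_gains)
f_interaction = t ** 2
print(f"t({df}) = {t:.2f}, F(1, {df}) = {f_interaction:.2f}")
```

Framing the interaction as a comparison of gain scores keeps the focus on whether the two groups changed differently, not on whether either group improved.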

Slide 6

Slide 6 text

METHODS
Understanding Group × Time Interaction

Slide 7

Slide 7 text

INSTRUMENT
CEFR and CEFR-J Alignment
CEFR-J provides finer-grained subdivisions of CEFR levels, enabling more sensitive measurement of incremental development among Japanese learners (Council of Europe, 2001, 2020; Negishi et al., 2013).

Slide 8

Slide 8 text

INSTRUMENT
Measuring Spoken Proficiency
Target domains → Listening | Interaction | Production
CEFR-J covers 5 domains (Listening, Reading, Spoken Interaction, Spoken Production, Writing) (Negishi et al., 2013).

Rating Procedure
• 5-point Likert scale per descriptor
• 1 = cannot perform → 5 = fully able
• Higher values = stronger perceived ability

Descriptor Structure
• Two CEFR-J descriptors per level (Pre-A1 to B2)
• C1 and C2: one descriptor each
• Level score = mean of the two ratings

Administration
• Identical instrument for all participants
• Administered pre- and post-program
• Both groups completed the same assessment

Slide 9

Slide 9 text

INSTRUMENT
Examples of CEFR-J Descriptors

Listening (A1.2)
I can understand short conversations about familiar topics (e.g., hobbies, sports, club activities), provided they are delivered in slow and clear speech.

Interaction (A2.2)
I can interact in predictable everyday situations (e.g., a post office, a station, a shop), using a wide range of words and expressions.

Production (B2.1)
I can develop an argument clearly in a debate by providing evidence, provided the topic is of personal interest.

English descriptors shown above: Negishi et al. (2013).

Slide 10

Slide 10 text

ANALYSIS
Analytical Approach and Weighting

Weighting Procedure
• CEFR-J level order (Pre-A1 = 1, A1.1 = 2, A1.2 = 3, … A2.1 = 5, … C2 = 13) used as weighting basis
• Adjacent levels treated as equally spaced intervals; higher levels receive proportionally larger weights
• Two descriptor ratings per level averaged, then multiplied by that level's weight
• Weighted scores summed across all levels to produce one domain score per participant

Worked Example
• A2.1 descriptors rated 3 and 4 → mean = (3 + 4) / 2 = 3.5
• A2.1 is the 5th level in the CEFR-J scale → weight = 5
• Contribution to domain score = 3.5 × 5 = 17.5

Domain score = sum of all such weighted contributions (Pre-A1 through C2)
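The weighting steps above can be sketched in a few lines of Python. This is a minimal illustration, not the author's script: the names `LEVEL_WEIGHTS`, `level_contribution`, and `domain_score` are hypothetical, and the weight table contains only the level weights stated explicitly on the slide. It also assumes that a level with a single descriptor (C1, C2) uses that one rating as its level score.

```python
from statistics import mean

# Level weights stated in the text; the remaining levels are not listed
# there, so they are deliberately left out of this illustration.
LEVEL_WEIGHTS = {"Pre-A1": 1, "A1.1": 2, "A1.2": 3, "A2.1": 5}


def level_contribution(ratings, weight):
    """Mean of a level's descriptor ratings multiplied by the level weight."""
    return mean(ratings) * weight


def domain_score(ratings_by_level, weights=LEVEL_WEIGHTS):
    """Sum of weighted level contributions for one participant and domain."""
    return sum(
        level_contribution(ratings, weights[level])
        for level, ratings in ratings_by_level.items()
    )


# The worked example from the text: A2.1 descriptors rated 3 and 4.
print(level_contribution([3, 4], LEVEL_WEIGHTS["A2.1"]))  # 17.5

# Hypothetical ratings for a partial set of levels (illustration only).
partial = {"Pre-A1": [5, 5], "A1.1": [5, 4], "A2.1": [3, 4]}
print(domain_score(partial))  # 5*1 + 4.5*2 + 3.5*5 = 31.5
```

Because the weight grows with level order, the same rating contributes more at a higher level, so a one-point change at B2 moves the domain score more than a one-point change at A1.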

Slide 11

Slide 11 text

ANALYSIS
Effect Size: Formula and Benchmarks

Cohen's d for Interaction Effect (Group × Time)

$$d = \frac{(M_{\mathrm{SA,Post}} - M_{\mathrm{SA,Pre}}) - (M_{\mathrm{CG,Post}} - M_{\mathrm{CG,Pre}})}{SD_{\mathrm{pooled}}}$$

where $SD_{\mathrm{pooled}}$ is the pooled standard deviation of the gain scores $(\mathrm{Post} - \mathrm{Pre})$ in the SA and CG groups.

Benchmark Criteria for Interpreting d

                          Small   Medium   Large
Cohen (1988)              0.20    0.50     0.80
Plonsky & Oswald (2014)   0.40    0.70     1.00

This study applies Plonsky & Oswald (2014) benchmarks, which are calibrated for L2 research contexts.
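As a rough illustration of the effect-size computation, the sketch below takes the difference in mean gains and divides it by a pooled SD of the gain scores. The slide does not spell out the pooling formula, so the degrees-of-freedom-weighted pooling used here is an assumption. The function names and the gain scores are hypothetical, invented for illustration.

```python
from math import sqrt
from statistics import mean, variance


def interaction_d(sa_gains, cg_gains):
    """Difference in mean gains divided by the pooled SD of the gain scores."""
    n_sa, n_cg = len(sa_gains), len(cg_gains)
    # Assumed pooling: variances weighted by their degrees of freedom.
    pooled_var = (
        (n_sa - 1) * variance(sa_gains) + (n_cg - 1) * variance(cg_gains)
    ) / (n_sa + n_cg - 2)
    return (mean(sa_gains) - mean(cg_gains)) / sqrt(pooled_var)


def label_effect(d, small=0.40, medium=0.70, large=1.00):
    """Label |d| using thresholds (defaults: Plonsky & Oswald, 2014)."""
    size = abs(d)
    if size >= large:
        return "large"
    if size >= medium:
        return "medium"
    if size >= small:
        return "small"
    return "below small"


# Hypothetical gain scores (Post - Pre), invented for illustration only.
sa_gains = [4, 3, 2, 5]
cg_gains = [3, 2, 3]
d = interaction_d(sa_gains, cg_gains)
print(f"d = {d:.2f} ({label_effect(d)})")
```

Passing Cohen's (1988) thresholds instead, for example `label_effect(d, 0.20, 0.50, 0.80)`, shows how the same d can receive a different label depending on the benchmark set.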

Slide 12

Slide 12 text

RESULTS
Descriptive Patterns (Study Abroad Group Only)
Upward movement observed across all three domains, but pre–post improvement alone does not fully address our central question.

Slide 13

Slide 13 text

RESULTS
Including the Control Group
Production shows the clearest divergence; effects appear domain-sensitive, not uniform across all spoken proficiency domains.

Slide 14

Slide 14 text

RESULTS
Effect Sizes (Group × Time Interaction)

Slide 15

Slide 15 text

DISCUSSION
Discussion & Implications

Measurable Short-Term Gains
Even within a brief three-week program, domain-specific perceived development can occur.

Domain-Sensitive Effects
The effect is not uniform across all skills. Divergence was strongest in Production, more modest in Listening, and smallest in Interaction.

The Importance of Evaluation Design
Without a comparison group, all domains appeared to improve equally. Adding a comparison group differentiated general development from program-related impact.

Evaluating relative change across groups yields more informative insights than standard pre–post designs.

Slide 16

Slide 16 text

Summary

Objective
To examine self-assessed spoken English proficiency changes in a short-term study-abroad program.

Main Findings
Gains across all three domains; Production largest; Interaction and Listening smaller and less robust.

Implications
Short-term gains are possible but domain-specific; comparison group design aids accurate interpretation.

Ken Urano
Hokkai-Gakuen University • RELC 2026 • [email protected]

Slide 17

Slide 17 text

References

Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Lawrence Erlbaum.
Council of Europe. (2001). Common European framework of reference for languages. Cambridge University Press.
Council of Europe. (2020). Common European framework of reference for languages: Companion volume. Council of Europe Publishing.
Hirai, A. (2018). The effects of study abroad duration and predeparture proficiency on the L2 proficiency of Japanese university students: A meta-analysis approach. JLTA Journal, 21, 102–123. https://doi.org/10.20622/jltaj.21.0_102
Negishi, M., Takada, T., & Tono, Y. (2013). A progress report on the development of the CEFR-J. In E. D. Galaczi & C. J. Weir (Eds.), Exploring language frameworks (pp. 135–163). Cambridge University Press.
Plonsky, L., & Oswald, F. L. (2014). How big is big? Interpreting effect sizes in L2 research. Language Learning, 64(4), 878–912. https://doi.org/10.1111/lang.12079
Suzuki, R., & Hayashi, C. (2014). Kaigai gogaku tanki ryugaku no kouka [The effects of short-term study abroad programmes on students' English proficiency and affective variables]. KATE Journal, 28, 83–96. https://doi.org/10.20806/katejournal.28.0_83