Categories
Programme Evaluation Secondary School Education

Teaching Quality in STEM Enrichment Programs: Gifted Students’ Perceptions of In-School and Out-of-School Learning Environments

Jaggy et al. (2025) investigate how gifted students perceive teaching quality in specialized STEM enrichment programs compared with their regular school classrooms. Previous research has shown that high-ability students often experience learning environments differently from their peers, yet little is known about how participation in extracurricular enrichment programs influences students’ evaluation of teaching quality in both in-school and out-of-school settings. To address this gap, the study examines students attending the Hector Seminar, a specialized STEM enrichment program for gifted secondary school students in Germany, and compares their perceptions of teaching quality across learning contexts.

The study uses cross-sectional data from a large-scale talent development project including academically advanced sixth- and seventh-grade students in the German state of Baden-Württemberg. Teaching quality was assessed using student reports on six indicators derived from the three-basic-dimensions model of instructional quality: effective classroom management, cognitive activation, student support, adaptivity, interestingness, and motivational climate. Two research questions guided the analysis: whether gifted students evaluate teaching quality in the enrichment program higher than in regular school classes, and whether students attending the program perceive regular classroom teaching differently from comparable students who do not participate in the program.

Results show that students attending the specialized STEM enrichment program rated teaching quality significantly higher in the program than in their regular school classes across most indicators, particularly in interestingness, motivational climate, and adaptivity. These findings suggest that enrichment programs provide highly stimulating and supportive learning environments for gifted students. Importantly, however, participation in the enrichment program did not lead students to evaluate their regular school teaching more negatively compared with non-participants, indicating that potential reference effects between learning contexts were limited.

Overall, the study highlights the importance of specialized enrichment programs as high-quality learning environments for gifted students while showing that such programs do not necessarily undermine students’ perceptions of regular classroom instruction. The findings contribute to research on gifted education and teaching quality by demonstrating that macro-level adaptations, such as structured STEM enrichment programs, can enhance learning experiences without producing negative comparison effects. The results also underscore the need for teacher professional development and more individualized instruction to better support diverse student needs across learning settings.

Source (Open Access): Jaggy, A. K., Wagner, W., Fütterer, T., Göllner, R., & Trautwein, U. (2025). Teaching quality in STEM education: Differences between in- and out-of-school contexts from the perspective of gifted students. International Journal of STEM Education, 12(1), 53.

https://doi.org/10.1186/s40594-025-00576-w

Categories
K-12 Education Social and Motivational Outcomes

The Gap Between Teachers’ Self-Efficacy, Management Strategies, and Actual Classroom Management Behaviors

Shi and colleagues combined questionnaire survey methods with AI-supported classroom behavior analysis to examine the relationships among teachers’ classroom management self-efficacy, self-reported classroom management strategies, and actual classroom management behaviors observed by AI. The study involved 345 Chinese K-12 in-service teachers, collected questionnaire data on their classroom management self-efficacy and strategies, and analyzed 673 valid classroom video recordings, totaling 461.74 hours. The research team developed an AI-supported multimodal classroom management behavior analysis tool that automatically identified teachers’ praise statements, criticism statements, discipline-related statements, positive tone of voice, proportion of proximity to students, and proportion of visual attention to students through text, audio, and image data. This allowed the researchers to test whether teachers’ beliefs, reported strategies, and actual behaviors were consistent.

The results showed that teachers with higher classroom management self-efficacy reported using all types of classroom management strategies more frequently, and these differences were all statistically significant. For example, in praise strategies, the high self-efficacy group had a mean rank of 214.43, significantly higher than the low self-efficacy group’s 128.84 (Z = –8.74, p < .001). In corrective feedback strategies, the high group had a mean rank of 206.27, compared with 137.54 in the low group (Z = –6.67, p < .001). In preventive management strategies, the high group had a mean rank of 221.84, whereas the low group had 120.95 (Z = –9.81, p < .001). In commands/transition strategies, the high group had a mean rank of 220.69, compared with 122.17 in the low group (Z = –9.68, p < .001). However, when actual classroom videos were analyzed through AI, no significant differences emerged between the high and low self-efficacy groups on most observable classroom management behaviors. The only exception was a marginal, non-significant tendency for teachers with lower self-efficacy to use more discipline-related statements (p = .07), suggesting that they may rely more on disciplinary language to maintain order.
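For readers less familiar with rank-based group comparisons, the mean ranks and Z values above are the kind of output a Mann-Whitney U test produces. The sketch below is only an illustration under stated assumptions: the summary does not name the exact test or software, the function name and the toy scores are hypothetical, and the sign of Z depends on which group the software treats as the reference (here a positive value means the first group ranks higher).

```python
import math

def mean_ranks_and_z(high, low):
    """Mann-Whitney style comparison: mean rank per group and a
    tie-corrected normal-approximation z (positive if `high` ranks higher)."""
    pooled = sorted((v, g) for g, vals in (("high", high), ("low", low)) for v in vals)
    n = len(pooled)
    ranks = {"high": [], "low": []}
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2      # average of the 1-based ranks i+1..j
        t = j - i                       # size of this tie group
        tie_term += t**3 - t
        for k in range(i, j):
            ranks[pooled[k][1]].append(avg_rank)
        i = j
    n1, n2 = len(high), len(low)
    rank_sum_high = sum(ranks["high"])
    u_high = rank_sum_high - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1))))
    z = (u_high - mu) / sigma
    return rank_sum_high / n1, sum(ranks["low"]) / n2, z

# Hypothetical questionnaire scores, NOT data from the study.
high_group = [4, 5, 5, 6, 7]
low_group = [1, 2, 3, 3, 4]
mean_high, mean_low, z = mean_ranks_and_z(high_group, low_group)
print(f"mean rank high={mean_high:.2f}, low={mean_low:.2f}, z={z:.2f}")
```

A larger gap between the two mean ranks, relative to the group sizes, yields a larger absolute Z, which is why the preventive management comparison above has the largest absolute Z of the four.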

Regarding the relationship between self-reported strategies and AI-observed behaviors, the results showed that consistency existed only in some domains. Teachers’ self-reported praise strategies were significantly positively related to AI-detected praise statements and positive tone of voice, although the effect sizes were relatively small. In contrast, corrective feedback and preventive management strategies were not significantly associated with their corresponding AI-based behavioral indicators. Notably, self-reported commands/transition strategies were significantly negatively related to AI-observed discipline-related statements, meaning that the more often teachers reported using clear instructions and transition management, the less frequently discipline-related statements appeared in their classrooms. Overall, teachers’ reported use of strategies only partially corresponded to their actual classroom behaviors, and more concrete and observable strategies, such as praise, were more likely to be validated by AI observation.

Overall, this study suggests that teachers’ beliefs, strategies, and behaviors in classroom management are not always highly aligned. Although teachers with higher self-efficacy reported using more effective strategies, they did not necessarily demonstrate clearly different classroom management behaviors in practice. Only specific and easily identifiable strategies, such as praise, were more readily confirmed through AI observation. The study therefore highlights a misalignment between teachers’ beliefs and their actual teaching behaviors, while also demonstrating the considerable potential of AI for large-scale, non-intrusive, and evidence-based research on classroom management. For educational research and teacher development, the study serves as a reminder that relying solely on teachers’ self-report questionnaires may overestimate the consistency between reported strategies and real classroom behavior. Future work should combine self-report data with more objective methods such as AI observation to gain a fuller understanding of actual classroom management practices.

Source (Open Access): Shi, Y., Wang, Z., Chen, Z., Ren, D., Liu, H., & Zhang, J. (2026). Do teachers …

Categories
Language Development Primary School Education

Inference Training for Homonyms: Evidence from Two Randomized Controlled Trials in Primary Schools

A recent study by Booton and colleagues investigated whether a brief lexical inference intervention could support children aged 7–8 years in learning the multiple meanings of homonyms, words that share the same spelling but carry distinct meanings (e.g., bat, bank, bark). Despite the prevalence of homonyms in everyday English and their well-documented challenge for young readers, no effective targeted intervention had previously been identified in the literature.

The researchers conducted two separate randomized controlled trials (RCTs) across English state primary schools. In Study 1, 180 children from six schools were randomly assigned to either an inference training condition (n = 60) or a spatial reasoning active control condition (n = 120). Participants attended four 30-minute intervention sessions delivered in small groups of four over a two-week period. In Study 2, 76 children, including 37 with English as an Additional Language (EAL) and 39 with English as a first language (EL1), were assigned through stratified randomisation to either the inference training (n = 40) or an implicit exposure control involving contextualised reading (n = 36). This second study also incorporated pre-registered methodology and measured metacognitive and inference skills alongside homonym knowledge.

The inference intervention, referred to as “Word Detectives,” trained children to use contextual clues within sentences to deduce the intended meaning of a homonym. Children were taught to notice, question, and infer meanings in a structured, experimenter-led format. The control groups received time-matched activities of a different nature—either spatial reasoning tasks (Study 1) or implicit reading exposure to the same target vocabulary without explicit inference instruction (Study 2). Receptive knowledge of both taught and untaught homonyms was assessed before and after the intervention using a researcher-developed homonym recognition task, while Study 2 additionally employed the York Assessment of Reading for Comprehension (YARC) to measure standardised reading comprehension.

Results from both RCTs consistently demonstrated that children in the inference training conditions made significantly greater gains in receptive homonym knowledge than their counterparts in the control groups. In Study 2, trained children also showed improved performance on the inference task itself. Importantly, while children with EAL displayed a specific baseline disadvantage in receptive homonym knowledge relative to their EL1 peers, the intervention proved equally effective for both language groups, suggesting its broad applicability across diverse classroom populations. Furthermore, receptive knowledge of homonyms and inference ability each predicted unique variance in reading comprehension scores beyond other vocabulary measures, highlighting the educational significance of homonym understanding for broader literacy outcomes.

The study did, however, identify notable limitations. Transfer of learning to untaught homonyms was limited, although error analysis suggested emergent generalisation of the inferencing strategy. The intervention window was brief (approximately two weeks), and follow-up data beyond the immediate post-test were not collected, leaving questions about the durability of gains unanswered. The researchers call for future studies with longer intervention periods, delayed follow-up assessments, and investigations into whether the intervention could be scaled for classroom delivery by teachers rather than trained researchers.

These findings carry meaningful implications for educational practice. Explicitly teaching lexical inference as a skill, rather than relying on incidental vocabulary acquisition through reading alone, may represent an efficient and equitable approach to bolstering both vocabulary and reading comprehension in the primary years, particularly in linguistically diverse classrooms where English language learners are present.

Source (Open Access): Booton, S. A., Birchenough, J. M., Gilligan‐Lee, K., Jelley, F., & Murphy, V. A. (2026). Lexical inference training for homonyms: Two randomized controlled trials for children with English as a first and an additional language. British Journal of Educational Psychology.

https://doi.org/10.1111/bjep.70056

Categories
Effective Teaching Approach K-12 Education

The effect of AI-driven intelligent tutoring systems on K-12 students’ learning and performance: A systematic review

A recent systematic review published in npj Science of Learning examines the effects of intelligent tutoring systems (ITSs) on students’ learning and performance in K-12 education. As artificial intelligence in education (AIEd) has expanded rapidly, ITSs have emerged as a key application with the potential to personalize learning and improve educational outcomes. However, despite their growing adoption, their actual educational value remains uncertain. While some studies suggest that ITSs can enhance learning outcomes and even outperform traditional instruction, others report limited or inconsistent effects. In addition, existing research often conflates different educational contexts or focuses on broader AI applications, leaving a lack of systematic understanding of ITS effectiveness specifically in K-12 settings. This study therefore aims to assess the effects of ITSs on K-12 students’ learning and performance and to examine the experimental designs used to evaluate these systems.

The authors conducted a systematic review of 28 empirical studies involving a total of 4,597 students. Most studies adopted quasi-experimental designs, typically comparing an ITS-based intervention group with a control condition such as traditional teacher-led instruction, a non-intelligent tutoring system, or a modified ITS; some studies had no control group. The studies covered a range of countries, subjects, and school levels, with a strong concentration in middle and high school STEM education. Intervention durations varied considerably, from a single class session to several weeks or months. The review categorized studies based on educational context, experimental design, and intervention characteristics to enable a structured comparison of findings.

The review finds that ITSs generally have a positive effect on students’ learning and performance in K-12 education, particularly when compared to traditional teacher-led instruction, where most studies report medium to large effects. However, when compared with non-intelligent tutoring systems, the results are more mixed, with several studies finding no significant differences. Substantial heterogeneity is observed across studies due to differences in design, duration, and context. Importantly, the effectiveness of ITSs depends on key features such as personalization, adaptivity, and real-time feedback, as well as on implementation conditions. ITSs that are integrated with teacher support, encourage self-regulated learning, and are used over longer periods tend to produce better outcomes. In contrast, short interventions may be influenced by novelty effects, and learner characteristics such as prior knowledge and educational level also shape outcomes.

Taken together, the findings suggest that ITSs can enhance learning and performance in K-12 education, but their effectiveness is contingent upon pedagogical design and implementation conditions rather than technology alone. ITSs are most effective when aligned with sound instructional principles and used in combination with teacher guidance. The study also highlights limitations in the existing literature, including short intervention durations, limited sample diversity, and a lack of attention to ethical considerations. It calls for future research with more robust experimental designs, longer interventions, and greater attention to ethical issues, particularly as AI technologies continue to evolve and play an increasing role in education.

Source (Open Access): Létourneau, A., Deslandes Martineau, M., Charland, P., Karran, J. A., Boasen, J., & Léger, P. M. (2025). A systematic review of AI-driven intelligent tutoring systems (ITS) in K-12 education. npj Science of Learning, 10(1), 29.

https://doi.org/10.1038/s41539-025-00320-7

Categories
Effective Teaching Approach Secondary School Education

LLM-Based Collaborative Programming: Effects on Computational Thinking and Self-Efficacy

Yan et al. (2025) examine whether integrating large language models (LLMs) into collaborative programming can enhance students’ computational thinking, self-efficacy, and learning processes. Recognizing that traditional collaborative programming is often constrained by uneven skill levels among students, the study proposes an LLM-supported collaborative framework in which AI acts as a learning partner, transforming the conventional human–human interaction into a human–human–AI collaboration model. A quasi-experimental study was conducted with 82 sixth- and seventh-grade students in China, who were randomly assigned to either an LLM-supported collaborative programming group (experimental group) or a traditional collaborative programming group (control group).

The intervention lasted five weeks and included 12 programming sessions (90 min each) using C++ as the instructional language. Students in both groups worked in teams, but the experimental group used an LLM-based platform that provided structured, problem-based, and knowledge-based scaffolding throughout the programming process, including problem analysis, coding, debugging, and evaluation. Pre- and post-tests measured students’ computational thinking and self-efficacy, while cognitive load was assessed through questionnaires, complemented by semi-structured interviews.

Results indicate that students in the LLM-supported collaborative programming group achieved significantly higher gains in computational thinking compared to those in the traditional group, though the effect size was relatively small. In addition, students in the experimental group reported significantly lower cognitive load, particularly in mental load, suggesting that LLMs can reduce the cognitive burden associated with complex programming tasks. However, no statistically significant differences were found in self-efficacy between the two groups. Both groups showed a decline in self-efficacy over time, likely due to the transition from graphical programming to more abstract text-based coding, though the decline was less pronounced in the LLM-supported group.

Qualitative findings further reveal that LLM integration enhanced students’ learning experiences by increasing interest, improving problem-solving efficiency, and supporting collaboration. Students reported that LLMs provided immediate feedback, multiple solution strategies, and personalized guidance, enabling more effective engagement in programming tasks. Overall, the study demonstrates that LLMs can function as effective scaffolding tools in collaborative learning, reducing cognitive load and enhancing higher-order thinking. While their impact on self-efficacy remains inconclusive, the findings highlight the potential of AI-supported collaborative learning environments as a promising approach for programming education in K–12 contexts.

Source (Open Access): Yan, Y. M., Chen, C. Q., Hu, Y. B., & Ye, X. D. (2025). LLM-based collaborative programming: Impact on students’ computational thinking and self-efficacy. Humanities and Social Sciences Communications, 12(1), 149.

https://doi.org/10.1057/s41599-025-04471-1

Categories
Primary School Education Social and Motivational Outcomes

Effects of virtual reality exercise on social skills and emotional recognition among children with autism spectrum disorder: a meta-analysis of randomized controlled trials

A meta-analysis by Cui and colleagues assessed the effects of virtual reality (VR) exercise on social skills (SS) and emotional recognition (ER) among children with autism spectrum disorder (ASD). Analysing data from randomized controlled trials published between January 2005 and October 2025 across PubMed, Web of Science, Scopus, and EBSCO databases, the authors investigated the relationship between VR exercise interventions and children’s social-emotional development outcomes.

The authors employed standardized mean difference (SMD) as the effect size, utilizing random-effects models to synthesize results across studies. VR exercise interventions were compared with standard treatment approaches. The methodology included comprehensive database searches using keywords: virtual reality, autism spectrum disorder, and children. Quality assessment followed Cochrane Handbook guidelines, with heterogeneity evaluated through I² statistics. Subgroup analyses examined intervention duration effects (< 14 weeks versus ≥ 14 weeks), and secondary outcomes included cognitive function, anxiety, language function, and depression.
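To make the pooling step concrete, here is a minimal sketch of a random-effects synthesis of SMDs together with the I² statistic. It assumes the DerSimonian-Laird estimator for between-study variance, which is a common default but is not confirmed by the summary; the function name and the three input studies are hypothetical and are not taken from the meta-analysis.

```python
import math

def random_effects_pool(smds, ses, z_crit=1.96):
    """Pool study-level SMDs with DerSimonian-Laird random effects.
    Returns (pooled SMD, (CI low, CI high), I2 in percent, tau^2)."""
    w = [1 / se**2 for se in ses]                     # inverse-variance weights
    fixed = sum(wi * y for wi, y in zip(w, smds)) / sum(w)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, smds))  # Cochran's Q
    df = len(smds) - 1
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0   # between-study variance
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    w_re = [1 / (se**2 + tau2) for se in ses]         # random-effects weights
    pooled = sum(wi * y for wi, y in zip(w_re, smds)) / sum(w_re)
    se_pooled = math.sqrt(1 / sum(w_re))
    ci = (pooled - z_crit * se_pooled, pooled + z_crit * se_pooled)
    return pooled, ci, i2, tau2

# Hypothetical study-level SMDs and standard errors, for illustration only.
pooled, ci, i2, tau2 = random_effects_pool([0.2, 0.9, 1.6], [0.2, 0.2, 0.2])
print(f"SMD={pooled:.2f} [{ci[0]:.2f}, {ci[1]:.2f}], I2={i2:.0f}%, tau2={tau2:.2f}")
```

When the study estimates agree closely, Q falls below its degrees of freedom, tau² and I² are truncated to zero, and the random-effects result coincides with the fixed-effect one.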

The results revealed significant positive effects of VR exercise on SS (SMD = 0.94 [0.71, 1.17], p < 0.05, I² = 74%) and ER (SMD = 0.42 [0.18, 0.65], p < 0.05, I² = 0%). Furthermore, subgroup analysis demonstrated that interventions lasting less than 14 weeks (SMD = 0.63 [0.36, 0.91], p < 0.05, I² = 0%) and those lasting 14 weeks or longer (SMD = 1.70 [1.27, 2.13], p < 0.05, I² = 44%) both substantially improved SS, with longer interventions showing greater effect sizes. Additionally, VR exercise improved cognitive function (SMD = 0.49 [0.06, 0.93], p < 0.05, I² = 0%) and reduced anxiety (SMD = –0.56 [–1.10, –0.02], p < 0.05, I² = 0%). Notably, effects on language function and depression remained unclear due to insufficient evidence.
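As a rough plausibility check on the reported figures, a pooled estimate and its confidence interval can be converted back into a standard error, a z value, and a two-sided p value. The sketch below assumes the bracketed intervals are 95% confidence intervals and that a normal approximation applies; neither is stated explicitly in the summary, and the function name is hypothetical. Only estimates quoted above are used.

```python
import math

def z_and_p_from_ci(estimate, lower, upper, z_crit=1.96):
    """Recover SE, z, and two-sided p from an estimate and its CI
    (assumes a symmetric normal-theory interval)."""
    se = (upper - lower) / (2 * z_crit)
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2))   # two-sided normal tail probability
    return se, z, p

# Values as quoted in the summary above.
reported = {
    "social skills": (0.94, 0.71, 1.17),
    "emotional recognition": (0.42, 0.18, 0.65),
    "cognitive function": (0.49, 0.06, 0.93),
}
for name, (est, lo, hi) in reported.items():
    se, z, p = z_and_p_from_ci(est, lo, hi)
    print(f"{name}: SE={se:.3f}, z={z:.2f}, p={p:.4f}")
```

Under these assumptions all three outcomes come out below the 0.05 threshold, consistent with the p < 0.05 statements, with cognitive function closest to the boundary because its interval nearly touches zero.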

The findings underscore the effectiveness of VR exercise as a technological intervention modality superior to standard treatment approaches in enhancing social-emotional competencies among children with ASD. Therefore, future clinical practice should consider integrating VR exercise interventions into rehabilitation programs for children with ASD, with particular attention to optimizing intervention duration to maximize therapeutic benefits. The moderate effects on cognitive and anxiety outcomes should be interpreted cautiously and require validation through larger-scale longitudinal studies with standardized outcome measures.

Source (Open Access): Cui, T., Ariffin, R. B., Wang, X., & Wang, X. (2026). Effects of virtual reality exercise on social skills and emotional recognition among children with autism spectrum disorder: a meta-analysis of randomized controlled trials. BMC Psychology.

https://doi.org/10.1186/s40359-026-04160-x