Categories
K-12 Education Programme Evaluation

The Effects of Integrated STEM Education on K12 Students’ Achievements: A Meta-Analysis

Integrated STEM education refers to the teaching and learning of scientific, technological, engineering, and mathematical knowledge and skills in integrative ways, emphasizing the connection between abstract knowledge and real-world problems. It is characterized by four core features: multidisciplinary integration, real-world application, authentic inquiry or design-based practice, and active student learning. Drawing on 124 extracted and coded studies (2010–2022), Chen et al.’s (2025) meta-analysis reports the effects of integrated STEM education for three main types of interventions: (1) adopting integrated STEM education, (2) using extra teaching and learning strategies to enhance integrated STEM education, and (3) using specific learning technologies to support integrated STEM education.

All three types of interventions yielded a medium effect on knowledge acquisition and a small effect on student perceptions. In addition, adopting integrated STEM education had a large effect on cognitive skills, using extra teaching and learning strategies in integrated STEM programs produced a medium effect on cognitive skills and problem-solving task performance, and using specific learning technologies had a small effect on problem-solving task performance. Some factors, such as task type (inquiry or design-based task) and program duration, may influence STEM learning outcomes.

To maximize the efficacy of integrated STEM education, practitioners should embed its four core characteristics into curriculum design while favoring short-to-medium duration programs (one month to a semester). Educators must carefully balance hands-on design and minds-on inquiry tasks by providing necessary scaffolding tailored to students’ prior knowledge. Furthermore, deploying targeted instructional strategies and learning technologies can enhance engagement with complex, real-world problems. Ultimately, evaluating these programs requires a multidimensional approach that prioritizes skill development and practical problem-solving performance alongside traditional knowledge acquisition.

Source (Open Access): Chen, B., Chen, J., Wang, M., Tsai, C. C., & Kirschner, P. A. (2026). The effects of integrated STEM education on K12 students’ achievements: A meta-analysis. Review of Educational Research, 96(2), 619-668.

https://doi.org/10.3102/00346543251318297

Categories
Language Development Primary School Education Secondary School Education

The Effects of Interventions for Students With Reading Difficulties in Grades 4–12

Killingly and colleagues conducted a systematic review and meta-analysis of school-based interventions for students with reading difficulties in Grades 4–12 published between 2011 and 2023. The study examined the overall effectiveness of these interventions and further tested whether study characteristics, sample characteristics, and intervention characteristics moderated their effects. A total of 104 publications and 586 effect sizes were included, representing 97,114 participants. Methodologically, the authors used a Correlated and Hierarchical Effects model combined with robust variance estimation to address dependency among multiple outcomes and effect sizes within the same study, while also estimating effects across overall reading performance and specific reading domains.

The results showed that, overall, reading interventions had a small but significant positive effect for students with reading difficulties in Grades 4–12, with an overall effect size of g = 0.212 (95% CI [0.163, 0.261], p < .001). This suggests that although the gains were not large, these interventions did produce reliable improvements in students’ reading performance. Across specific reading domains, the strongest effects were found for vocabulary (g = 0.422), followed by decoding/word recognition (g = 0.199) and reading comprehension (g = 0.187). Fluency showed only a very small but significant effect (g = 0.080), spelling was not significant (g = 0.015), and phonological processing, although showing a larger effect size on the surface (g = 0.531), did not reach significance and was therefore considered unstable. Overall heterogeneity was very high (I² = 89.71%), indicating that differences in study design and sample characteristics had a substantial influence on intervention effectiveness. GRADE assessment further suggested that the overall quality of evidence ranged from moderate to high, with the strongest evidence for fluency, moderate evidence for vocabulary, and moderate-to-low evidence for phonological processing.

Moderator analyses showed that intervention effects varied according to both study and sample conditions. Overall, more recently published studies showed stronger effects (β = 0.015), and journal articles produced significantly larger effects (g = 0.268) than research reports (g = 0.062). In terms of sample characteristics, low socioeconomic status was not significantly related to overall effects, but a higher proportion of students with learning disabilities was associated with slightly stronger effects (β = 0.006). For students from a language background other than English, overall differences were not significant, but in the vocabulary domain, a greater proportion of such students was associated with stronger effects (β = 0.016), suggesting that vocabulary instruction may be particularly important for this group.

Regarding intervention design, intervention focus, duration, and measurement type were all significant moderators. Comprehension-focused interventions showed relatively strong overall effects (g = 0.313), multicomponent interventions showed stable effects (g = 0.178), and word study interventions had smaller effects (g = 0.096), whereas vocabulary-focused interventions, though fewer in number, showed the largest effect (g = 0.716). Shorter interventions were actually associated with stronger effects, with effect sizes of g = 0.405 for 0–5 hours and g = 0.409 for 6–15 hours. In addition, researcher-developed measures yielded significantly larger effects (g = 0.542) than standardized measures (g = 0.127). Although there was no significant overall difference between interventions delivered by teachers and those delivered by researchers, in vocabulary interventions teacher-led delivery produced stronger effects (g = 0.733) than researcher-led delivery (g = 0.249), suggesting that classroom teachers may hold particular advantages in providing vocabulary support.

Overall, this study shows that reading interventions for older students with reading difficulties are indeed effective, although the magnitude of their effects depends on the reading domain being targeted and on the design of the intervention. Vocabulary and reading comprehension appear to be the most promising focuses, while multicomponent interventions also demonstrate stable benefits. By …

Categories
K-12 Education Social and Motivational Outcomes

Social and Emotional Learning Programs and Students’ Prosocial Behavior: A Meta-Analysis

A recent meta-analysis conducted by Hung and colleagues examined the effectiveness of school-based social and emotional learning (SEL) programs for K–12 students’ prosocial behavior. Prosocial behavior is conceptualized as any voluntary behavior intended to benefit others, such as helping, sharing, comforting, and defending others. The researchers analyzed 66 studies and 157 effect sizes involving 52,914 youth.

Effect sizes were calculated using Hedges’ g, which applies a small-sample bias correction to the effect size estimate. Because most studies contributed multiple effect sizes, the authors used a correlated effects (CE) model with robust variance estimation (RVE) to account for within-study dependence among effect sizes. To examine whether the effect of SEL programs on prosocial behavior was moderated by sample, program, methodological, and publication characteristics, the authors conducted a mixed-effects meta-regression analysis with all moderators added to the model simultaneously. In total, they investigated 14 moderating variables such as approach, school level, urbanicity, and dosage. Here, approach refers to whether an SEL program takes a curricular, interactional, structural, or combined approach.
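To make the small-sample correction concrete, the following is a minimal sketch of how Hedges’ g is commonly computed from group summary statistics. It is not the authors’ code: the function name, the two-group design, and the input values are hypothetical, and the correction factor uses the usual approximation J = 1 − 3 / (4·df − 1).

```python
import math


def hedges_g(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Return (g, var_g) for a treatment vs. control comparison.

    Illustrative helper only; names and inputs are assumptions,
    not taken from the meta-analysis summarized above.
    """
    df = n_t + n_c - 2
    # Pooled standard deviation across the two groups.
    sd_pooled = math.sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / df)
    # Cohen's d: the uncorrected standardized mean difference.
    d = (mean_t - mean_c) / sd_pooled
    # Common approximation to the small-sample correction factor.
    j = 1 - 3 / (4 * df - 1)
    g = j * d
    # Large-sample approximation to the sampling variance of d, scaled by J^2.
    var_d = (n_t + n_c) / (n_t * n_c) + d**2 / (2 * (n_t + n_c))
    var_g = j**2 * var_d
    return g, var_g


# Hypothetical study: 20 students per group, a 0.4 SD raw difference.
g, var_g = hedges_g(mean_t=3.4, mean_c=3.0, sd_t=1.0, sd_c=1.0, n_t=20, n_c=20)
print(f"g = {g:.3f}, variance = {var_g:.4f}")
```

With small groups the correction shrinks the estimate slightly below the uncorrected d; as sample sizes grow, J approaches 1 and the two values converge.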

The remaining moderating variables yielded no statistically significant differences. Results indicated that effects for rural, suburban, and combination areas were not statistically different from samples from urban areas. Results also indicated that effects were not statistically significantly different between samples with higher versus lower proportions of students qualifying for free or reduced-price lunch. Effects of curricular and curricular combined with structural or interactional approaches were not significantly different from SEL programs that only used an interactional approach. Findings indicated that effects of studies delivered at Tier 2 were not statistically different from Tier 1 studies. Studies that used a quasi-experimental design and single-group pre–post design yielded similar effects on prosocial behavior compared to studies that used a randomized controlled trial. Effects were similar across different types of prosocial behavior measures. Effects from studies that did not meet baseline equivalence were not statistically significantly different from studies that met baseline equivalence, and effects from studies that did not report implementation fidelity were not statistically significantly different from studies that reported fidelity of implementation. Results indicated that effects from studies conducted across earlier decades were not statistically significantly different from studies conducted more recently, and effects were similar for peer-reviewed and non-peer-reviewed studies.

The authors further noted that most studies were conducted with elementary school children (56%), the majority implemented universal Tier 1 interventions (89%), and a curricular approach was the most common (77%). Additionally, a considerable proportion of studies did not report key demographic data, with 71% failing to report free or reduced-price lunch rates.

A key implication for practice from this meta-analytic review is that school-based SEL programs are effective in promoting K–12 students’ prosocial behavior, and that “more is not necessarily better” — a moderate dosage and moderate duration may be most ideal. Future policy and practice should take into account this “less is more” finding. At the same time, more research is needed involving secondary schools, rural schools, non-curricular approaches, and diverse student populations in order to fully understand the effectiveness of SEL programs.

Source (Open Access): Hung, C., Brass, N. R., Brockmeier, L., Bergin, C., Imler, M., & Luper, S. B. (2026). Social and Emotional Learning Programs and Students’ Prosocial Behavior: A Meta-Analysis. Review of Educational Research, 00346543261438462.

https://doi.org/10.3102/00346543261438462

Categories
Primary School Education Social and Motivational Outcomes

Effects of virtual reality exercise on social skills and emotional recognition among children with autism spectrum disorder: a meta-analysis of randomized controlled trials

A meta-analysis by Cui and colleagues assessed the effects of virtual reality (VR) exercise on social skills (SS) and emotional recognition (ER) among children with autism spectrum disorder (ASD). Analysing data from randomized controlled trials published between January 2005 and October 2025 across PubMed, Web of Science, Scopus, and EBSCO databases, the authors investigated the relationship between VR exercise interventions and children’s social-emotional development outcomes.

The authors employed standardized mean difference (SMD) as the effect size, utilizing random-effects models to synthesize results across studies. VR exercise interventions were compared with standard treatment approaches. The methodology included comprehensive database searches using keywords: virtual reality, autism spectrum disorder, and children. Quality assessment followed Cochrane Handbook guidelines, with heterogeneity evaluated through I² statistics. Subgroup analyses examined intervention duration effects (< 14 weeks versus ≥ 14 weeks), and secondary outcomes included cognitive function, anxiety, language function, and depression.
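As an illustration of what random-effects pooling and the I² statistic involve, here is a minimal sketch using the DerSimonian-Laird estimator. The summary does not say which estimator the authors used, so that choice is an assumption, and the function name and input values are hypothetical.

```python
import math


def random_effects_pool(effects, variances):
    """Pool effect sizes with a DerSimonian-Laird random-effects model.

    Illustrative only: the estimator and the data below are assumptions.
    """
    k = len(effects)
    if k < 2:
        raise ValueError("need at least two studies to estimate heterogeneity")
    # Inverse-variance (fixed-effect) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Cochran's Q measures dispersion around the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - df) / c)
    # I^2: share of total variability attributed to heterogeneity, in percent.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    # Random-effects weights add tau^2 to each study's variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    ci = (pooled - 1.96 * se, pooled + 1.96 * se)
    return {"pooled": pooled, "se": se, "ci": ci, "tau2": tau2, "i2": i2}


# Hypothetical SMDs from three trials with equal sampling variance.
result = random_effects_pool([0.2, 0.5, 0.8], [0.04, 0.04, 0.04])
print(result)
```

An I² near 0% means the studies agree about as closely as sampling error alone would predict, while higher values indicate that the true effects differ across studies.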

The results revealed significant positive effects of VR exercise on SS (SMD = 0.94 [0.71, 1.17], p < 0.05, I² = 74%) and ER (SMD = 0.42 [0.18, 0.65], p < 0.05, I² = 0%). Furthermore, subgroup analysis demonstrated that interventions lasting less than 14 weeks (SMD = 0.63 [0.36, 0.91], p < 0.05, I² = 0%) and those lasting 14 weeks or more (SMD = 1.70 [1.27, 2.13], p < 0.05, I² = 44%) both substantially improved SS, with longer interventions showing greater effect sizes. Additionally, VR exercise improved cognitive function (SMD = 0.49 [0.06, 0.93], p < 0.05, I² = 0%) and reduced anxiety (SMD = −0.56 [−1.10, −0.02], p < 0.05, I² = 0%). Notably, effects on language function and depression remained unclear due to insufficient evidence.
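One rough way to gauge how far apart the two duration subgroups are is to back out standard errors from the reported 95% confidence intervals and compare the estimates with a z-test. The sketch below does this for the SS subgroup values quoted above. It assumes symmetric normal-theory intervals and independent subgroups, and it is not a test reported by the authors; the function names are hypothetical.

```python
import math


def se_from_ci(lower, upper, z_crit=1.96):
    """Approximate a standard error from a symmetric 95% confidence interval."""
    return (upper - lower) / (2 * z_crit)


def subgroup_difference(est_a, ci_a, est_b, ci_b):
    """z-test for the difference between two independent subgroup estimates."""
    se_a = se_from_ci(*ci_a)
    se_b = se_from_ci(*ci_b)
    diff = est_b - est_a
    # Variances add for independent estimates.
    se_diff = math.sqrt(se_a**2 + se_b**2)
    z = diff / se_diff
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return diff, se_diff, z, p


# Social skills: shorter (< 14 weeks) vs. longer (>= 14 weeks) interventions.
diff, se_diff, z, p = subgroup_difference(
    0.63, (0.36, 0.91),
    1.70, (1.27, 2.13),
)
print(f"difference = {diff:.2f}, z = {z:.2f}, p = {p:.5f}")
```

Under these assumptions the gap between the subgroups is large relative to its standard error, which is consistent with the summary’s statement that longer interventions showed greater effects.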

The findings underscore the effectiveness of VR exercise as a technological intervention modality superior to standard treatment approaches in enhancing social-emotional competencies among children with ASD. Therefore, future clinical practice should consider integrating VR exercise interventions into rehabilitation programs for children with ASD, particularly emphasizing intervention duration optimization to maximize therapeutic benefits. The moderate effects on cognitive and anxiety outcomes should be interpreted cautiously and require validation through larger-scale longitudinal studies with standardized outcome measures.

Source (Open Access): Cui, T., Ariffin, R. B., Wang, X., & Wang, X. (2026). Effects of virtual reality exercise on social skills and emotional recognition among children with autism spectrum disorder: a meta-analysis of randomized controlled trials. BMC Psychology.

https://doi.org/10.1186/s40359-026-04160-x

Categories
Achievement K-12 Education Maths and Science Learning

The Impact of Mathematics and Science Professional Development on Teacher Knowledge, Instruction, and Student Achievement

A recent meta-analysis by Lynch and colleagues examined the effectiveness of professional development (PD) interventions for mathematics and science teachers in grades PK-12. Analyzing 200 effect sizes for teacher outcomes and 126 effect sizes for student achievement from 46 experimental studies published from 2001 to 2024, the authors investigated how PD programs affect teachers’ knowledge and classroom instruction, and whether these changes translate into improved student learning.

The authors employed Hedges’s g as the effect size metric and restricted the sample to randomized controlled trials to support causal inference. PD interventions were categorized by their focus areas: improving teacher knowledge (content knowledge and pedagogical content knowledge), content-specific and content-general instructional strategies, and content-specific formative assessment. The researchers also examined contextual factors such as intervention duration, inclusion of curriculum materials, and school demographics.

The results revealed a significant positive impact of PD on teacher outcomes (pooled average: +0.52 SD). Specifically, teacher knowledge improved by +0.52 SD and classroom instruction by +0.49 SD. Importantly, programs with larger impacts on teacher outcomes also demonstrated significantly larger effects on student achievement. A 1 SD improvement in teacher-level outcomes was associated with a +0.18 SD gain in student achievement. Notably, improvements in classroom instruction showed a stronger link to student learning (+0.24 SD) than knowledge gains (+0.08 SD, not statistically significant). PD programs explicitly focusing on teacher knowledge development (effect size difference: +0.18 SD) and content-specific formative assessment (+0.27 SD) showed significantly stronger impacts on classroom instruction. Interestingly, intervention duration and the inclusion of curriculum materials did not significantly moderate outcomes.

The findings underscore that the quality and specific focus of professional development matter more than duration. Schools should prioritize PD programs that explicitly target both teacher knowledge and instructional practices, particularly emphasizing formative assessment strategies. The strong link between improved instruction and student achievement validates investments in high-quality professional development as a lever for enhancing educational outcomes in mathematics and science.

Source (Open Access): Lynch, K., Gonzalez, K., Hill, H., & Merritt, R. (2025). A meta-analysis of the experimental evidence linking mathematics and science professional development interventions to teacher knowledge, classroom instruction, and student achievement. AERA Open, 11, 23328584251335302.

https://doi.org/10.1177/23328584251335302

Categories
K-12 Education Programme Evaluation

Gender Disparity in Computational Thinking Pedagogy and Assessment: A Three-Level Meta-Analysis

Gender disparities in computational thinking (CT) education are widely acknowledged, but few meta-analyses have investigated how particular instructional approaches and assessment settings shape these differences. To address this research gap, Liu et al. (2025) conducted a meta-analysis of 53 empirical studies, covering 100 effect sizes and a total sample of 15,454 participants, to examine the overall magnitude of gender differences in CT education and the factors that may shape them. The findings show a small but statistically significant overall gender difference (g = 0.106, 95% CI [0.024, 0.188], p < .05), suggesting a slight advantage for males.

Regarding moderation effects, neither general study features (e.g., publication type, geographic region, and educational level) nor CT assessment contexts (e.g., the instrument used and the learning outcome measured) significantly altered the effect sizes. In contrast, pedagogical approaches did matter: technology-integrated strategies such as mixed and plugged approaches were linked to larger gender gaps favoring boys, whereas unplugged approaches tended to narrow the gap and sometimes even shifted the advantage toward girls. In terms of assessment, gender differences were not significant when CT concepts were measured, but they became significant when outcomes involved authentic practices (such as programming tasks) and identity-related dimensions (such as motivation, learning interest, and self-efficacy).

The results highlight clear implications for improving equity in CT education. Support should start early in K–12, with a particular focus on developing students’ CT practices and perspectives so that small gender gaps do not become persistent over time. Unplugged activities can serve as a low-barrier entry point, strengthening basic understanding and confidence, especially for girls. In addition, technology use should be introduced progressively: when digital and AI tools are scaffolded within supportive, culturally relevant learning settings, students may feel less anxious about technology and experience more inclusive participation.

Source (Open Access): Liu, S., Dai, Y., Ng, O. L., & Cai, Z. (2025). Gender Disparity in Computational Thinking Pedagogy and Assessment: A Three-Level Meta-Analysis. Educational Psychology Review, 37(4), 114.

https://doi.org/10.1007/s10648-025-10095-3