Categories
Effective Teaching Approach, K-12 Education, Maths and Science Learning

Empowering tutoring expertise with AI

Effective tutoring can significantly improve student learning outcomes, but many students, particularly in underserved communities, lack access to high-quality, expert-guided instruction because of resource limitations and a scarcity of trained educators. Stanford University researchers conducted the first randomized controlled trial of Tutor CoPilot, a Human-AI system designed to provide real-time, expert-like guidance to K-12 tutors, to test whether it enhances tutor effectiveness during live sessions. In collaboration with FEV Tutor and a school district in the southern United States, the researchers ran an intervention involving 900 tutors and 1,800 students from Title I schools participating in an in-school, virtual tutoring program focused on mathematics.

The study showed that students whose tutors used Tutor CoPilot were 4 percentage points more likely to master mathematical lesson topics than students in the control group. The effect was especially pronounced among students taught by lower-rated tutors, for whom mastery improved by 9 percentage points. The system also promoted the use of expert teaching strategies, such as prompting students to explain their reasoning and asking guiding questions rather than giving away answers, fostering deeper student understanding. Despite some challenges, such as occasional mismatches between AI suggestions and student grade levels, tutors reported that Tutor CoPilot helped them better address student needs. At an annual cost of just $20 per tutor, Tutor CoPilot offers a scalable and affordable path to improving tutoring quality in contexts where expert educators are in short supply. This study illustrates the potential of Human-AI systems like Tutor CoPilot to make high-quality learning accessible to all students.
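For readers who want a feel for what an effect of this size looks like statistically, here is a minimal Python sketch of a two-proportion comparison. The arm sizes and the baseline mastery rate are invented for illustration; they are not the study's data, and the study's own estimation almost certainly controlled for more than this sketch does.

```python
# Hedged sketch: a two-proportion z-test on *hypothetical* mastery counts,
# illustrating what a 4-percentage-point effect looks like. The arm sizes and
# the baseline mastery rate are invented, not the study's data.
from statsmodels.stats.proportion import proportions_ztest

n_treat, n_ctrl = 900, 900            # hypothetical students per arm
mastered_ctrl = int(0.62 * n_ctrl)    # assumed 62% baseline mastery
mastered_treat = int(0.66 * n_treat)  # +4 percentage points with Tutor CoPilot

stat, pval = proportions_ztest(count=[mastered_treat, mastered_ctrl],
                               nobs=[n_treat, n_ctrl])
diff = mastered_treat / n_treat - mastered_ctrl / n_ctrl
print(f"effect = {diff:+.1%} points, z = {stat:.2f}, p = {pval:.3f}")
```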


Source (Open Access): Wang, R. E., Ribeiro, A. T., Robinson, C. D., Loeb, S., & Demszky, D. (2024). Tutor CoPilot: A human-AI approach for scaling real-time expertise. (EdWorkingPaper: 24-1056). Retrieved from Annenberg Institute at Brown University: https://doi.org/10.26300/81NH-8262

Categories
Language Development, Maths and Science Learning, Primary School Education

Measuring mathematical language: Lessons from elementary classrooms

Mathematical language plays an important role in helping students understand math concepts, as it serves as both a communication tool and a foundation for problem-solving skills. In a working paper released by the Annenberg Institute at Brown University, researchers provided the first large-scale quantitative analysis of mathematical language use in upper elementary classrooms. Conducted across 1,657 math lessons in 317 classrooms over three years, the research employed natural language processing (NLP) to examine how frequently teachers and students used mathematical vocabulary during lessons.
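The paper's NLP pipeline is not reproduced here, but the core measurement idea, counting occurrences of mathematical vocabulary in lesson transcripts, can be sketched in a few lines of Python. The term list and the transcript below are invented placeholders, not the study's materials.

```python
# Minimal sketch of frequency-based measurement of mathematical language:
# count how often terms from a fixed math vocabulary occur in a transcript.
# The term list and transcript are invented placeholders, not the study's.
import re
from collections import Counter

MATH_TERMS = {"denominator", "numerator", "fraction", "equivalent",
              "quotient", "remainder", "product", "sum", "divisor"}

def math_term_counts(transcript: str) -> Counter:
    """Per-term counts of mathematical vocabulary in one lesson transcript."""
    # lowercase, tokenize, and fold naive plurals ("denominators" -> "denominator")
    tokens = [t.rstrip("s") for t in re.findall(r"[a-z]+", transcript.lower())]
    return Counter(t for t in tokens if t in MATH_TERMS)

lesson = ("Look at the denominator of each fraction. If the denominators "
          "match, we add the numerators to get the sum.")
counts = math_term_counts(lesson)
print(counts)
print("total exposures this lesson:", sum(counts.values()))
```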

The study found wide differences in how much teachers used mathematical language. On average, teachers modeled mathematical language 127 times per lesson, but some used mathematical terms far more frequently than others: teachers at the 75th percentile used 28 more mathematical terms per lesson than those at the 25th percentile, which amounts to approximately 4,480 additional exposures to mathematical terms over the course of a school year. This higher use of mathematical language was associated with improved student test scores, particularly when teachers employed additional strategies to engage students with the terms. However, the study suggests that merely exposing students to mathematical vocabulary does not, by itself, lead to greater student use of those terms.
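The yearly figure is simple arithmetic: a per-lesson gap accumulated over a school year. The sketch below reproduces it, assuming a 160-lesson school year, which is the assumption that makes the reported numbers agree.

```python
# Worked arithmetic behind the exposure gap: a 28-term-per-lesson difference
# between 75th- and 25th-percentile teachers, accumulated over a school year.
# The 160-lesson year is an assumption that reproduces the reported total.
terms_per_lesson_gap = 28
lessons_per_year = 160                       # assumed math lessons per year

extra_exposures = terms_per_lesson_gap * lessons_per_year
print(extra_exposures)                       # 4480 additional exposures
```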

The study’s findings suggest that encouraging student use of mathematical language requires more than just teacher modeling. Instead, effective math instruction stems from a broader set of practices that go beyond vocabulary use. While the study points to a correlation between teacher use of mathematical vocabulary and student achievement, it emphasizes that future research should explore the quality of vocabulary use and its impact on long-term learning outcomes.


Source (Open Access): Himmelsbach, Z., Hill, H. C., Liu, J., & Demszky, D. (2024). A quantitative study of mathematical language in upper elementary classrooms. (EdWorkingPaper: 24-1029). Retrieved from Annenberg Institute at Brown University: https://doi.org/10.26300/1zcm-d071

Categories
Language Development, Maths and Science Learning, Secondary School Education

Assessing the quality of feedback from humans and ChatGPT on students’ writing

Automated writing evaluation (AWE) tools help educators evaluate student writing quickly, but often at the expense of accuracy and clarity, and setting them up requires substantial effort and resources. Generative AI tools like ChatGPT, by contrast, promise timely, targeted, and adaptive feedback that could enhance student writing. A recent study compared the quality of feedback from ChatGPT and human raters, with potential implications for process-based writing evaluation.

The study involved 200 students randomly sampled across grades 6-12 from two school districts in Southern California. Over two 50-minute class periods, students responded to one of two writing prompts, each requiring them to read four primary and secondary sources and craft an argument analyzing historical interpretations using evidence and reasoning. Sixteen experienced secondary educators from various disciplines were recruited and trained to provide formative feedback. This feedback was then compared with feedback from ChatGPT, which was prompted with the same context given to the educators. The researchers coded and analyzed both sets of feedback for comparison.
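The study's exact prompt is not reproduced in this summary, but the general shape of rubric-anchored feedback from a chat model can be sketched with the openai Python client. The rubric wording, model name, and essay below are placeholders, not the researchers' materials.

```python
# Hedged sketch of rubric-anchored formative feedback from a chat model.
# This is NOT the study's prompt: the rubric wording, model name, and essay
# are placeholders illustrating the general approach.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

RUBRIC = ("Evaluate the argument's use of evidence from the provided sources, "
          "its reasoning, and its organization. Give formative feedback: name "
          "one strength, then the two highest-priority improvements, in a "
          "supportive tone. Do not assign a grade or rewrite the essay.")

def formative_feedback(essay: str, grade_level: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model choice
        messages=[
            {"role": "system",
             "content": f"You give writing feedback to a grade {grade_level} student. {RUBRIC}"},
            {"role": "user", "content": essay},
        ],
    )
    return response.choices[0].message.content

print(formative_feedback("Lincoln's speeches show that...", grade_level="8"))
```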

The findings revealed that well-trained, paid, and relatively time-rich human raters offered higher-quality feedback in four of five key areas: clarity of improvement directions, accuracy, prioritization of essential features, and use of a supportive tone (the exception was the fifth area, criteria-based feedback). Even so, ChatGPT's feedback closely matched human feedback in quality without requiring any rater training, and the differences were modest when weighed against the time savings. The researchers concluded that generative AI could be a useful tool in specific situations, especially for formative drafts or when highly trained educators are unavailable.


Source (Open Access): Steiss, J., Tate, T., Graham, S., Cruz, J., Hebert, M., Wang, J., Moon, Y., Tseng, W., Warschauer, M., & Olson, C. B. (2024). Comparing the quality of human and ChatGPT feedback of students’ writing. Learning and Instruction, 91, 101894. https://doi.org/10.1016/j.learninstruc.2024.101894

Categories
K-12 Education, Language Development, Maths and Science Learning

Are mathematics and writing skills related?

A substantial body of research has established links between reading and mathematics skills, as well as between reading and writing. Moreover, previous studies suggest that these skills share abilities such as executive function and higher-order cognition. It is therefore reasonable to hypothesize a connection between mathematics and writing skills.

A meta-analysis by Kim and colleagues investigated the correlation between mathematics and writing skills. The authors categorized both writing and mathematics skills into lower-order and higher-order subskills. In mathematics, skills involving information retrieval and understanding of magnitude (e.g., arithmetic, calculation fluency) are considered lower-order or foundational skills, whereas skills involving reasoning and comprehension (e.g., word-problem solving, data interpretation) are higher-order skills. In writing, transcription skills (spelling and handwriting/keyboarding) are lower-order, while written composition is a higher-order skill.

The meta-analysis included 211 studies with 564 effect sizes, drawn primarily from English-speaking participants. Using robust variance estimation, the authors found a moderate overall correlation between mathematics and writing skills (r = .48), with grade level significantly moderating this correlation. The correlation was strongest among lower primary students (K-Grade 2: r = .52), declined through upper elementary (Grades 3-6: r = .42), and was weakest among college students and adults (r = .30).
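Robust variance estimation itself is beyond a short sketch, but the core step of pooling correlations, Fisher's z transform with inverse-variance weights, is illustrated below on invented correlations and sample sizes.

```python
# Minimal sketch of pooling correlations: Fisher's z transform with
# inverse-variance weights. The correlations and sample sizes are invented,
# and the paper's robust variance estimation (which handles multiple,
# dependent effect sizes per study) is deliberately omitted.
import numpy as np

r = np.array([0.52, 0.42, 0.30, 0.48])   # hypothetical study-level correlations
n = np.array([220, 310, 150, 400])       # hypothetical sample sizes

z = np.arctanh(r)                        # Fisher's z transform
w = n - 3                                # weights = 1 / Var(z), Var(z) = 1/(n - 3)
z_bar = np.sum(w * z) / np.sum(w)
se = 1 / np.sqrt(np.sum(w))
pooled_r = np.tanh(z_bar)                # back to the correlation scale
ci = np.tanh([z_bar - 1.96 * se, z_bar + 1.96 * se])
print(f"pooled r = {pooled_r:.2f}, 95% CI [{ci[0]:.2f}, {ci[1]:.2f}]")
```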

Given the potential overlap between grade level and skill level, the analysis was further disaggregated by grade level. In the primary grades, lower-order writing and lower-order mathematics skills were moderately correlated (r = .59), while the correlation between higher-order writing and higher-order mathematics skills was r = .48. Among university learners, the lower-order correlation was r = .36 and the higher-order correlation was r = .25.

The findings support a substantial link between mathematics and writing skills. However, the correlations decreased as grade level increased, and the higher-order link was comparatively weaker. The authors offered a twofold, speculative explanation: foundational skills may plateau, and different cognitive skills may be involved in higher-order tasks, warranting further exploration.


Source (Open Access): Kim, Y.-S. G., Yang, D., & Hwang, J. (2024). Are mathematics and writing skills related? Evidence from meta-analysis. Educational Psychology Review, 36(4), 125. https://doi.org/10.1007/s10648-024-09960-4

Categories
Maths and Science Learning, Programme Evaluation, Secondary School Education

Generative AI’s impact on student learning: A double-edged sword

Generative AI tools like OpenAI’s GPT-4 are increasingly integrated into educational settings, promising enhanced learning and productivity. However, their long-term impact on skill acquisition remains under scrutiny. A recent research paper from the Wharton School of the University of Pennsylvania details a large-scale randomized controlled trial (RCT) designed to assess how generative AI affects student learning, focusing on high school math classes. The study involved nearly 1,000 students across three grades and evaluated two GPT-4-based tutors: GPT Base, a standard ChatGPT interface, and GPT Tutor, which incorporated safeguards designed to support learning without giving direct answers.
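The paper's actual GPT Tutor prompt is not reproduced here; the sketch below merely illustrates the general pattern of such safeguards, a system prompt that withholds final answers, with all wording and the model name invented.

```python
# Hedged sketch of a "GPT Tutor"-style guardrail: a system prompt that steers
# the model toward hints instead of final answers. The wording is invented and
# is not the study's actual prompt or configuration.
from openai import OpenAI

client = OpenAI()

TUTOR_SYSTEM_PROMPT = (
    "You are a math tutor for high school students. Never state the final "
    "answer to the student's problem. Instead: (1) ask what they have tried, "
    "(2) give one hint at a time, (3) ask them to attempt the next step, and "
    "(4) confirm or gently correct their reasoning. If the student demands "
    "the answer, restate the relevant concept and offer another hint."
)

def tutor_reply(student_message: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder; the study used GPT-4
        messages=[
            {"role": "system", "content": TUTOR_SYSTEM_PROMPT},
            {"role": "user", "content": student_message},
        ],
    )
    return response.choices[0].message.content

print(tutor_reply("Solve 2x + 6 = 14 for me."))
```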

The results revealed that while GPT-4 significantly improved immediate performance on practice problems—by 48% for GPT Base and 127% for GPT Tutor—these gains did not translate into long-term learning. When access to GPT-4 was removed during exams, students who had used GPT Base performed 17% worse than those who never had access, indicating a detrimental effect on learning. However, GPT Tutor mitigated this negative impact, with performance differences becoming statistically insignificant.

The study underscores the potential of generative AI to enhance short-term performance but also highlights the risk of overreliance on these tools, which can inhibit the development of essential problem-solving skills. As educational institutions increasingly turn to AI-driven tools, the findings stress the need for carefully designed safeguards to ensure that students continue to learn and retain critical skills over time.


Source (Open Access): Bastani, H., Bastani, O., Sungu, A., Ge, H., Kabakcı, Ö., & Mariman, R. (2024). Generative AI can harm learning. The Wharton School Research Paper. https://www.ssrn.com/abstract=4895486

Categories
K-12 Education, Maths and Science Learning, Programme Evaluation

Factors influencing the impact of math interventions in K-12 education

It is commonly understood that educational interventions are not universally effective; their impact varies with specific contexts and conditions. A systematic review with meta-analysis by Megan Rojo and her colleagues examined how intervention and study characteristics influence the effects of K-12 interventions for students with mathematical difficulties. For intervention characteristics, they focused on grade level, group size, content area, and dosage; for study characteristics, on research design, implementation fidelity, year of study, type of measure, and study quality.

The review included 286 studies, among them 119 randomized trials, 16 quasi-experiments, and 86 single-subject designs. In the model excluding extreme values, the authors reported the following results for intervention characteristics.

  • Problem-solving interventions and interventions involving operations were found to be less effective than those focusing on fractions. The authors suggested the complex nature of fractions and the specialized knowledge of research teams dedicated to this specific area as potential explanations.
  • No clear pattern was found for duration. Longer interventions (more than 22 hours) seemed to have a larger impact, but further research is needed.
  • Grade level (K-2 vs. 3-5 vs. 6-12) was not found to influence the impact of the interventions.
  • Similar effects were observed in interventions delivered either in small groups or individually, which supports findings from previous reviews.

The authors examined study characteristics because the way studies are designed and conducted can affect their effect sizes. They found that single-subject designs yielded larger effects than quasi-experimental and randomized studies, and that measures created by the researchers produced larger effects than independent measures. They concluded that these features should be addressed in future research.
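Moderator analyses of this kind are commonly run as inverse-variance weighted meta-regressions. The minimal sketch below regresses invented effect sizes on a study-design dummy using statsmodels; the review's actual models are more elaborate.

```python
# Minimal sketch of a moderator analysis as an inverse-variance weighted
# regression: effect sizes regressed on a study-design dummy. All numbers
# are invented; the review's actual models are more elaborate.
import numpy as np
import statsmodels.api as sm

g = np.array([0.35, 0.50, 0.42, 0.90, 1.10, 0.95])   # hypothetical effect sizes
v = np.array([0.02, 0.03, 0.02, 0.05, 0.06, 0.05])   # hypothetical sampling variances
single_subject = np.array([0, 0, 0, 1, 1, 1])        # 1 = single-subject design

X = sm.add_constant(single_subject)                  # intercept + design dummy
fit = sm.WLS(g, X, weights=1 / v).fit()
print(fit.params)   # const ~ mean effect for group designs;
                    # x1 ~ additional effect for single-subject designs
```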

Source: Rojo, M., Gersib, J., Powell, S. R., Shen, Z., King, S. G., Akther, S. S., Arsenault, T. L., Bos, S. E., Lariviere, D. O., & Lin, X. (2024). A meta-analysis of mathematics interventions: Examining the impacts of intervention characteristics. Educational Psychology Review, 36(1), 9. https://doi.org/10.1007/s10648-023-09843-0