Heather C. Hill

Sterling Alic, Dorottya Demszky, Zid Mancenido, Jing Liu, Heather C. Hill, Dan Jurafsky.

Responsive teaching is a highly effective strategy that promotes student learning. In math classrooms, teachers might funnel students towards a normative answer or focus students to reflect on their own thinking, deepening their understanding of math concepts. When teachers focus, they treat students’ contributions as resources for collective sensemaking, and thereby significantly improve students’ achievement and confidence in mathematics. We propose the task of computationally detecting funneling and focusing questions in classroom discourse. We do so by creating and releasing an annotated dataset of 2,348 teacher utterances labeled for funneling and focusing questions, or neither. We introduce supervised and unsupervised approaches to differentiating these questions. Our best model, a supervised RoBERTa model fine-tuned on our dataset, has a strong linear correlation of .76 with human expert labels and with positive educational outcomes, including math instruction quality and student achievement, showing the model’s potential for use in automated teacher feedback tools. Our unsupervised measures show significant but weaker correlations with human labels and outcomes, and they highlight interesting linguistic patterns of funneling and focusing questions. The high performance of the supervised measure indicates its promise for supporting teachers in their instruction.
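As a rough illustration of what a linear correlation between model scores and expert labels measures, the following minimal sketch computes Pearson's r in plain Python. The scores and labels are made-up placeholder values, not data from the paper, and `pearson_r` is a hypothetical helper name rather than anything the authors released.

```python
import math


def pearson_r(xs, ys):
    """Pearson linear correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squares of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Placeholder values, assumed for illustration only.
model_scores = [0.1, 0.4, 0.35, 0.8, 0.9]
expert_labels = [0.0, 0.5, 0.5, 1.0, 1.0]
r = pearson_r(model_scores, expert_labels)
```

A value of r near 1 means the model's scores rise and fall almost in step with the expert labels; a value near 0 means there is little linear relationship.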

Dorottya Demszky, Jing Liu, Heather C. Hill, Dan Jurafsky, Chris Piech.

Providing consistent, individualized feedback to teachers is essential for improving instruction but can be prohibitively resource intensive in most educational contexts. We develop an automated tool based on natural language processing to give teachers feedback on their uptake of student contributions, a high-leverage teaching practice that supports dialogic instruction and makes students feel heard. We conduct a randomized controlled trial as part of an online computer science course, Code in Place (n=1,136 instructors), to evaluate the effectiveness of the feedback tool. We find that the tool improves instructors’ uptake of student contributions by 27% and present suggestive evidence that our tool also improves students’ satisfaction with the course and assignment completion. These results demonstrate the promise of our tool to complement existing efforts in teachers’ professional development.

Heather C. Hill, Virginia S. Lovison.

In recent decades, U.S. education leaders have advocated for more intellectually ambitious mathematics instruction in classrooms. Evidence about whether more ambitious mathematics instruction has filtered into contemporary classrooms, however, is largely anecdotal. To address this issue, we analyzed 93 lessons recorded by a national random sample of middle school mathematics teachers. We find that lesson quality varies, with the typical lesson containing some elements of mathematical reasoning and sense-making, but also teacher-directed instruction with limited student input. Lesson quality correlates with teachers’ use of a textbook and with teachers’ mathematical background. We consider these findings in light of efforts to transform U.S. mathematics instruction.

Kathryn Gonzalez, Kathleen Lynch, Heather C. Hill.

Despite growing evidence that classroom interventions in science, technology, engineering, and mathematics (STEM) can increase student achievement, there is little evidence regarding how these interventions affect teachers themselves and whether these changes predict student learning. We present results from a meta-analysis of 37 experimental studies of preK-12 STEM professional learning and curricular interventions, seeking to understand how STEM classroom interventions affect teacher knowledge and classroom instruction, and how these impacts relate to intervention impacts on student achievement. Compared with control group teachers, teachers who participated in STEM classroom interventions experienced improvements in content and pedagogical content knowledge and classroom instruction, with a pooled average impact estimate of +0.56 standard deviations. Programs with larger impacts on teacher practice yielded larger effects on student achievement, on average. Findings highlight the positive effects of STEM instructional interventions on teachers, and shed light on potential teacher-level mechanisms via which these programs influence student learning.
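To make the idea of a pooled average impact estimate concrete, here is a minimal sketch of one common pooling approach, a fixed-effect inverse-variance weighted mean. The abstract does not say which pooling model the authors used, so the method here is an assumption and may differ from theirs. The function name `pooled_effect` and the example inputs are hypothetical.

```python
import math


def pooled_effect(effects, variances):
    """Fixed-effect inverse-variance pooled estimate and its standard error.

    effects:   per-study standardized effect sizes
    variances: per-study sampling variances (must be positive)
    """
    if len(effects) != len(variances) or not effects:
        raise ValueError("need matching, non-empty effects and variances")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")
    # More precise studies (smaller variance) receive larger weights.
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    estimate = sum(w * d for w, d in zip(weights, effects)) / total_weight
    std_error = math.sqrt(1.0 / total_weight)
    return estimate, std_error


# Placeholder inputs, assumed for illustration only (not values from the studies).
estimate, std_error = pooled_effect([0.4, 0.6], [0.1, 0.1])
```

With equal variances, the pooled estimate reduces to the simple mean of the study effects. Random-effects models add a between-study variance term to each weight.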

Dorottya Demszky, Jing Liu, Zid Mancenido, Julie Cohen, Heather C. Hill, Dan Jurafsky, Tatsunori Hashimoto.

In conversation, uptake happens when a speaker builds on the contribution of their interlocutor by, for example, acknowledging, repeating, or reformulating what they have said. In education, teachers' uptake of student contributions has been linked to higher student achievement. Yet measuring and improving teachers' uptake at scale is challenging, as existing methods require expensive annotation by experts. We propose a framework for computationally measuring uptake by (1) releasing a dataset of student-teacher exchanges extracted from US math classroom transcripts annotated for uptake by experts; (2) formalizing uptake as pointwise Jensen-Shannon Divergence (pJSD), estimated via next utterance classification; (3) conducting a linguistically motivated comparison of different unsupervised measures; and (4) correlating these measures with educational outcomes. We find that although repetition captures a significant part of uptake, pJSD outperforms repetition-based baselines, as it is capable of identifying a wider range of uptake phenomena, such as question answering and reformulation. We apply our uptake measure to three different educational datasets with outcome indicators. Unlike baseline measures, pJSD correlates significantly with instruction quality in all three, providing evidence for its generalizability and for its potential to serve as an automated professional development tool for teachers.
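The pointwise measure described above is estimated with a trained classifier, and that estimator is not reproduced here. As background only, the following minimal sketch computes the standard (non-pointwise) Jensen-Shannon divergence between two discrete probability distributions. The function names and the example distributions are assumptions for illustration.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats; terms where p_i == 0 contribute zero."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions, in nats."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    # Mixture distribution; m_i > 0 whenever p_i > 0 or q_i > 0,
    # so the KL terms below never divide by zero.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Placeholder distributions, assumed for illustration only.
identical = js_divergence([0.5, 0.5], [0.5, 0.5])
disjoint = js_divergence([1.0, 0.0], [0.0, 1.0])
```

The divergence is symmetric, equals zero for identical distributions, and is bounded above by log 2 (in nats), which it reaches when the two distributions share no support.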

Heather C. Hill, Anna Erickson.

Poor program implementation constitutes one explanation for null results in trials of educational interventions. For this reason, researchers often collect data about implementation fidelity when conducting such trials. In this article, we document whether and how researchers report and measure program fidelity in recent cluster-randomized trials. We then create two measures, one describing the level of fidelity reported by authors and another describing whether the study reports null results, and examine the correspondence between the two. We also explore whether fidelity is influenced by study size, type of fidelity measured and reported, and features of the intervention. We find that, as expected, fidelity level relates to student outcomes; we also find that the presence of new curriculum materials positively predicts fidelity level.

Heather C. Hill, Zid Mancenido, Susanna Loeb.

Despite calls for more evidence regarding the effectiveness of teacher education practices, causal research in the field remains rare. One reason is that we lack designs and measurement approaches that appropriately meet the challenges of causal inference in the context of teacher education programs. This article provides a framework for how to fill this gap. We first outline the difficulties of doing causal research in teacher education. We then describe a set of replicable practices for developing measures of key teaching outcomes, and propose causal research designs suited to the needs of the field. Finally, we identify community-wide initiatives that are necessary to advance effectiveness research in teacher education at scale.

Heather C. Hill, Erica Litke, Kathleen Lynch.

For nearly three decades, policy-makers and researchers in the United States have promoted more intellectually rigorous standards for mathematics teaching and learning. Yet, to date, we have limited descriptive evidence on the extent to which reform-oriented instruction has been enacted at scale.

The purpose of the study is to examine the prevalence of reform-aligned mathematics instructional practices in five U.S. school districts. We also seek to describe the range of instruction students experience by presenting case studies of teachers at high, medium and low levels of reform alignment.

We draw on 1,735 video-recorded lessons from 329 elementary teachers in these five U.S. urban districts.

Research Design:
We present descriptive analyses of lesson scores on a mathematics-focused classroom observation instrument. We also draw upon interviews with district personnel, rater-written lesson summaries, and lesson video in order to develop case studies of instructional practice.

We find that teachers in our sample do use reform-aligned instructional practices, but that they do so within the confines of traditional lesson formats. We also find that the implementation of these instructional practices varies in quality. Furthermore, the prevalence and strength of these practices correspond to the coherence of district efforts at instructional reform.

Our findings suggest that, unlike in other studies in which reform-oriented instruction rarely occurred (e.g., Kane & Staiger, 2012), reform practices do appear to some degree in study classrooms. In addition, our analyses suggest that implementation of these reform practices corresponds to the strength and coherence of district efforts to change instruction.

Heather C. Hill, Derek C. Briggs.

Federal policy has both incentivized and supported better use of research evidence by educational leaders. However, the extent to which these leaders are well positioned to understand foundational principles from research design and statistics, including those that underlie the What Works Clearinghouse ratings of research studies, remains an open question. To investigate educational leaders’ knowledge of these topics, we developed a construct map and items representing key concepts, then conducted surveys containing those items with a small pilot sample (n=178) and a larger nationally representative sample (n=733) of educational leaders. We found that leaders’ knowledge was surprisingly inconsistent across topics. We also found most items were answered correctly by fewer than half of respondents, with cognitive interviews suggesting that some of those correct answers derived from guessing or test-taking techniques. Our findings identify a roadblock to policymakers’ contention that educational leaders should use research in decision-making.

Kathleen Lynch, Heather C. Hill, Kathryn Gonzalez, Cynthia Pollard.

More than half of U.S. children fail to meet proficiency standards in mathematics and science in fourth grade. Teacher professional development and curriculum improvement are two of the primary levers that school leaders and policymakers use to improve children’s science, technology, engineering and mathematics (STEM) learning, yet until recently, the evidence base for understanding their effectiveness was relatively thin. In recent years, a wealth of rigorous new studies using experimental designs have investigated whether and how STEM instructional improvement programs work. This article highlights contemporary research on how to improve classroom instruction and subsequent student learning in STEM. Instructional improvement programs that feature curriculum integration, teacher collaboration, and a focus on content knowledge, pedagogical content knowledge, and how students learn are all linked to stronger student achievement outcomes. We discuss implications for policy and practice.
