Classroom discourse is a core medium of instruction; analyzing it can provide a window into teaching and learning and drive the development of new tools for improving instruction. We introduce the largest dataset of mathematics classroom transcripts available to researchers, and demonstrate how this data can help improve instruction. The dataset consists of 1,660 45- to 60-minute 4th and 5th grade elementary mathematics observations collected by the National Center for Teacher Effectiveness (NCTE) between 2010 and 2013. The anonymized transcripts represent data from 317 teachers across 4 school districts that serve largely historically marginalized students. The transcripts come with rich metadata, including turn-level annotations for dialogic discourse moves, classroom observation scores, demographic information, survey responses and student test scores. We demonstrate that our natural language processing model, trained on our turn-level annotations, can learn to identify dialogic discourse moves and that these moves are correlated with better classroom observation scores and learning outcomes. This dataset opens up several possibilities for researchers, educators and policymakers to learn about and improve K-12 instruction.
We examine the labor supply decisions of substitute teachers, who work in a large, on-demand market with broad shortages and inequitable supply. In 2018, Chicago Public Schools implemented a targeted bonus program designed to reduce unfilled teacher absences in largely segregated Black schools with historically low substitute coverage rates. Using a regression discontinuity design, we find that incentive pay substantially improved coverage equity and raised student achievement. Changes in labor supply were concentrated among Black and Hispanic substitutes from nearby neighborhoods with experience in incentive schools. Wage elasticity estimates suggest incentives would need to be 50% of daily wages to close fill-rate gaps.
Recent public discussions and legal decisions suggest that school segregation will remain persistent in the United States, but increased transparency may help monitor spending across schools. These circumstances revive an old question: is it possible to achieve an educational system that is separate but equal, or better, in terms of spending? This question motivates further understanding the measurement of spending progressivity and its association with segregation. Focusing on economic disadvantage, we compare two commonly used measures of spending progressivity: exposure-based and slope-based. We show that each measure is predicated on different assumptions about the progressivity of within-school resource allocations, and that they are theoretically linked through segregation. We empirically examine school spending progressivity and its properties using nationwide school spending data from the 2018-19 school year. Consistent with our theory, the exposure-based measure is the slope-based measure shrunk inversely by economic school segregation. This property makes more segregated school districts look more progressive on the exposure-based measure, representing a seemingly “separate but better” relationship. However, we show that this provocative pattern may be reversed by relatively modest poor-versus-nonpoor differences in unobserved parental contributions. We discuss implications for the measurement of progressivity, and for theory on public educational investments broadly.
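The link between the two progressivity measures in the abstract above can be illustrated with a minimal sketch. The definitions here are assumptions, not taken from the paper: the exposure-based measure is taken to be the gap in average school spending experienced by poor versus nonpoor students, the slope-based measure to be the enrollment-weighted regression slope of school spending on school poverty share, and segregation to be the variance ratio index. The function name and the three-school district are hypothetical. Under these assumed definitions, the exposure gap equals the slope multiplied by the segregation index, so for a fixed slope a more segregated district shows a larger exposure gap.

```python
# Illustrative sketch only: hypothetical data and names, not the paper's code.

def progressivity(enrollment, poor_share, spending):
    """Return (exposure_gap, slope, segregation) for one district.

    enrollment: students per school
    poor_share: share of economically disadvantaged students per school
    spending:   per-pupil spending per school
    """
    total = sum(enrollment)
    weights = [n / total for n in enrollment]

    # District-wide poverty rate and enrollment-weighted mean spending.
    district_poor = sum(w * p for w, p in zip(weights, poor_share))
    mean_spend = sum(w * s for w, s in zip(weights, spending))

    # Assumed exposure-based measure: average spending in the schools
    # attended by poor students minus that for nonpoor students.
    poor_n = [n * p for n, p in zip(enrollment, poor_share)]
    nonpoor_n = [n * (1 - p) for n, p in zip(enrollment, poor_share)]
    poor_exposure = sum(k * s for k, s in zip(poor_n, spending)) / sum(poor_n)
    nonpoor_exposure = (
        sum(k * s for k, s in zip(nonpoor_n, spending)) / sum(nonpoor_n)
    )
    exposure_gap = poor_exposure - nonpoor_exposure

    # Assumed slope-based measure: enrollment-weighted OLS slope of
    # school spending on school poverty share.
    cov = sum(
        w * (p - district_poor) * (s - mean_spend)
        for w, p, s in zip(weights, poor_share, spending)
    )
    var_p = sum(w * (p - district_poor) ** 2 for w, p in zip(weights, poor_share))
    slope = cov / var_p

    # Variance ratio segregation index (0 = even, 1 = complete segregation).
    segregation = var_p / (district_poor * (1 - district_poor))
    return exposure_gap, slope, segregation


# Hypothetical three-school district.
exposure_gap, slope, segregation = progressivity(
    enrollment=[400, 600, 500],
    poor_share=[0.2, 0.5, 0.8],
    spending=[10000.0, 11000.0, 12500.0],
)
print("exposure gap:", exposure_gap)
print("slope x segregation:", slope * segregation)
```

The two printed values agree up to floating-point error, which reflects the algebraic identity under the assumed definitions rather than any empirical result from the paper.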
This simulation study examines the characteristics of the Explanatory Item Response Model (EIRM) for estimating treatment effects, compared with classical test theory (CTT) sum and mean scores and item response theory (IRT)-based theta scores. Results show that the EIRM and IRT theta scores provide generally equivalent bias and false positive rates compared to CTT scores and superior calibration of standard errors under model misspecification. Analysis of the statistical power of each method reveals that the EIRM and IRT theta scores provide a marginal benefit to power and are more robust to missing data than other methods when parametric assumptions are met and provide a substantial benefit to power under heteroskedasticity, but their performance is mixed under other conditions. The methods are illustrated with an empirical application examining the causal effect of an elementary school literacy intervention on reading comprehension test scores, which demonstrates that the EIRM provides a more precise estimate of the average treatment effect than the CTT or IRT theta score approaches. Tradeoffs of model selection and interpretation are discussed.
Districts nationwide have revised their educator evaluation systems, increasing the frequency with which administrators observe and evaluate teacher instruction. Yet, limited insight exists on the role of evaluator feedback for instructional improvement. Relying on unique observation-level data, we examine the alignment between evaluator and teacher assessments of teacher instruction and the potential consequences for teacher productivity and mobility. We show that teachers and evaluators typically rate teacher performance similarly during classroom observations, but with significant variability in teacher-evaluator ratings. While teacher performance improves across multiple classroom observations, evaluator ratings likely overstate productivity improvements among the lowest-performing teachers. Evaluators, but not teachers, systematically rate teacher performance lower in classrooms serving higher concentrations of economically disadvantaged students. And while teacher performance improves when evaluators provide more critical feedback about teacher instruction, teachers receiving critical feedback may seek alternative teaching assignments in schools with less critical evaluation settings. We discuss the implications of these findings for the design, implementation and impact of educator evaluation systems.
Billions of dollars are invested in opt-in educational resources to accelerate students’ learning. Although advertised to support struggling, marginalized students, there is no guarantee these students will opt in. We report results from a school system’s implementation of on-demand tutoring. Take-up was low. At baseline, only 19% of students ever accessed the platform, and struggling students were far less likely to opt in than their more engaged and higher achieving peers. We conducted a randomized controlled trial (N=4,763) testing behaviorally-informed approaches to increase take-up. Communications to parents and students together increased the likelihood that students accessed tutoring by 46%, which led to a four-percentage-point decrease in course failures. Nonetheless, take-up remained low, indicating that concerns that opt-in resources can increase, rather than reduce, inequality are valid. Without targeted investments, opt-in educational resources are unlikely to reach many students who could benefit.
We investigate whether and how Achieve Atlanta’s college scholarship and associated services impact college enrollment, persistence, and graduation among Atlanta Public School graduates experiencing low household income. Qualifying for the scholarship of up to $5,000/year does not meaningfully change college enrollment among those near the high school GPA eligibility thresholds. However, scholarship receipt does have large and statistically significant effects on early college persistence (14%) that continue through BA degree completion within four years (22%). We discuss how the criteria of place-based programs that support economically disadvantaged students may influence results for different types of students.
Books shape how children learn about society and norms, in part through representation of different characters. We introduce new artificial intelligence methods for systematically converting images into data and apply them, along with text analysis methods, to measure the representation of race, gender, and age in award-winning children’s books from the past century. We find that more characters with darker skin color appear over time, but the most influential books persistently depict a greater proportion of light-skinned characters than other books, even after conditioning on race; we also find that children are depicted with lighter skin than adults. Relative to their growing share of the U.S. population, Black and Latinx people are underrepresented in these same books, while White males are overrepresented. Over time, females are increasingly present but appear less often in text than in images, suggesting greater symbolic inclusion in pictures than substantive inclusion in stories. We then report empirical evidence for predictions about the supply of and demand for representation that would generate these patterns. On the demand side, we show that people consume books that center their own identities. On the supply side, we document higher prices for books that center non-dominant social identities and fewer copies of these books in libraries that serve predominantly White communities. Lastly, we show that the types of children’s books purchased in a neighborhood are related to local political beliefs.
Increased exposure to gender-role information affects a girl's educational performance. Using random classroom assignment in Chinese middle schools, we find that the increased presence of stay-at-home peer mothers significantly reduces a girl's performance in mathematics. This exposure also cultivates gendered attitudes towards mathematics and STEM professions. The influence of peer mothers increases with network density and when the girl has a distant relationship with her parents. As falsification tests against unobserved confounding factors, we find that the exposure to stay-at-home peer mothers does not affect boys' performance, nor do we find that stay-at-home peer fathers affect girls' outcomes.
The role of racial diversity on college campuses has been debated for over half a century with limited quasi-experimental evidence from classrooms. To fill this void, I estimate the extent to which classmate racial compositions affect Hispanic and African-American students at a large and over-subscribed California community college where they are minorities. I find that when minority students are exposed to a greater share of same-race classmates, they are more likely to complete the class with a pass and are more likely to enroll in a same-subject course the subsequent term. The findings are robust to using first-time students with the lowest registration priority versus all students, and to different combinations of fixed effects (e.g., student, class, and instructor race).