Methodology, measurement and data

Edward J. Kim, Luke W. Miratrix.

Greater school choice leads to lower demand for private tutoring according to various international studies, but this has not been explicitly tested for the U.S. context. To estimate the causal effect of charter school appearances on neighboring private tutoring prevalence, we employ a comparative event study model combined with a longitudinal matching strategy to accommodate differing treatment years. In contrast to findings from other countries, we estimate that charter schools increase, rather than decrease, tutoring prevalence in the United States. We further find that the effect varies considerably based on the characteristics of the treated neighborhood: areas with the highest income, educational attainment, and proportion Asian show the greatest treatment impacts, while the areas with the least show null effects. Moreover, methodologically this investigation offers a pipeline for flexibly estimating causal effects with observational, longitudinal, geographically located data.



Ishita Ahmed, Masha Bertling, Lijin Zhang, Andrew D. Ho, Prashant Loyalka, Hao Xue, Scott Rozelle, Benjamin W. Domingue.

Researchers use test outcomes to evaluate the effectiveness of education interventions across numerous randomized controlled trials (RCTs). Aggregate test data—for example, simple measures like the sum of correct responses—are compared across treatment and control groups to determine whether an intervention has had a positive impact on student achievement. We show that item-level data and psychometric analyses can provide information about treatment heterogeneity and improve design of future experiments. We apply techniques typically used in the study of Differential Item Functioning (DIF) to examine variation in the degree to which items show treatment effects. That is, are observed treatment effects due to generalized gains on the aggregate achievement measures or are they due to targeted gains on specific items? Based on our analysis of 7,244,566 item responses (265,732 students responding to 2,119 items) taken from 15 RCTs in low-and-middle-income countries, we find clear evidence for variation in gains across items. DIF analyses identify items that are highly sensitive to the interventions—in one extreme case, a single item drives nearly 40% of the observed treatment effect—as well as items that are insensitive. We also show that the variation of item-level sensitivity can have implications for the precision of effect estimates. Of the RCTs that have significant effect estimates, 41% have patterns of item-level sensitivity to treatment that allow for the possibility of a null effect when this source of uncertainty is considered. Our findings demonstrate how researchers can gain more insight regarding the effects of interventions via additional analysis of item-level test data.



Anjali Adukia, Alex Eble, Emileigh Harrison, Hakizumwami Birali Runesha, Teodora Szasz.

Books shape how children learn about society and norms, in part through representation of different characters. We introduce new artificial intelligence methods for systematically converting images into data and apply them, along with text analysis methods, to measure the representation of skin color, race, gender, and age in award-winning children’s books widely read in homes, classrooms, and libraries over the last century. We find that more characters with darker skin color appear over time, but the most influential books persistently depict characters with lighter skin color, on average, than other books, even after conditioning on race; we also find that children are depicted with lighter skin than adults on average. Relative to their growing share of the U.S. population, Black and Latinx people are underrepresented in these same books, while White males are overrepresented. Over time, females are increasingly present but appear less often in text than in images, suggesting greater symbolic inclusion in pictures than substantive inclusion in stories. We then present analysis of the supply of, and demand for, books with different levels of representation to better understand the economic behavior that may contribute to these patterns. On the demand side, we show that people consume books that center their own identities. On the supply side, we document higher prices for books that center non-dominant social identities and fewer copies of these books in libraries that serve predominantly White communities. Lastly, we show that the types of children's books purchased in a neighborhood are related to local political beliefs.



Heewon Jang, Richard W. Disalvo.

How progressive is school spending when spending is measured at the school level instead of the district level? We use the first dataset on school-level spending across schools throughout the United States to ask to what extent progressivity patterns previously examined across districts are amplified, nullified, or reversed upon disaggregation to schools. We find that progressivity is systematically greater in a school-level analysis than in a district-level analysis. This may be surprising, given the traditional view in public economics that local governments cannot effectively redistribute. We thus probe the data for explanations of this pattern, uncovering evidence that federal policies play an important role in driving within-district progressive allocations. In particular, we can explain about 83% of the within-district contribution to progressivity by the federal component of spending plus allocations that are empirically attributable to special education and English language learning programs. Our findings are thus consistent with the traditional view of redistribution being primarily the purview of central governments, operationalized in this context through mandates.



Agustina S. Paglayan, Anja Neundorf, Wooseok Kim.

Challenging the conventional wisdom that the spread of democracy was a leading driver of the expansion of primary schooling, recent studies show that democratization in fact did not lead to an average increase in primary school enrollment rates. One reason for this null effect is that there was already considerable provision of primary education before democratization. Still, it is possible that the spread of democracy did impact other aspects of education systems, such as the content of education and the extent to which teaching jobs are politicized. Studying this possibility cross-nationally has been infeasible due to data limitations. To address this gap, we take advantage of an original dataset covering 160 countries from 1945 to 2021 that contains information about these aspects of education. We document that transitions to democracy tend to be preceded by a decline in the politicization of both education content and teaching jobs. However, soon after democratization occurs, this decline usually halts. Counterfactual estimates suggest that democratization roughly halves the degree to which teacher hiring and firing decisions are politicized, but has a smaller impact on the content of education. The empirical patterns that we uncover have important implications for future research.



Elizabeth Huffaker, Sarah Novicoff, Thomas S. Dee.

A controversial, equity-focused mathematics reform in the San Francisco Unified School District (SFUSD) featured delaying Algebra I until ninth grade for all students. This descriptive study examines student-level longitudinal data on mathematics course-taking across successive cohorts of SFUSD students who spanned the reform’s implementation. We observe large changes in ninth and tenth grades (e.g., delaying Algebra I and Geometry). Participation in Advanced Placement (AP) math initially fell 15% (6 pp.), driven by declines in AP Calculus and among Asian/Pacific-Islander students. However, growing participation in acceleration options attenuated these reductions. Large ethnoracial gaps in advanced math course-taking remained.
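
The abstract reports the AP math decline both as a relative change (15%) and as a percentage-point change (6 pp.). As a rough reading aid, the sketch below shows how the two figures relate. The function name is hypothetical, and the implied baseline is only back-of-the-envelope arithmetic on the rounded numbers quoted above, not a value reported by the authors.

```python
def implied_baseline_rate(pp_drop: float, relative_drop: float) -> float:
    """Baseline rate (in percentage points) implied by a drop that is
    reported both in percentage points and as a relative share.

    relative_drop = pp_drop / baseline  =>  baseline = pp_drop / relative_drop
    """
    if relative_drop <= 0:
        raise ValueError("relative_drop must be positive")
    return pp_drop / relative_drop


# Rounded figures quoted in the abstract: a 6 pp. drop that equals a 15% decline.
baseline = implied_baseline_rate(pp_drop=6.0, relative_drop=0.15)
after = baseline - 6.0

print(f"implied baseline participation: about {baseline:.0f}%")
print(f"implied participation after the drop: about {after:.0f}%")
```

Because both inputs are rounded, the implied baseline of roughly 40% should be treated as approximate.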



Robert C. Carr, Margaret Burchinal, Lynne Vernon-Feagans.

A systematic review of the literature (1965–2022) and meta-analysis were undertaken to compare the school readiness skills of children participating in public pre-kindergarten (pre-K) or Head Start. Seven quasi-experimental studies met the inclusion criteria for the meta-analysis and 38 effect sizes were analyzed. Results indicated no reliable meta-analytic effect in relation to children’s school readiness skills overall nor in relation to language, mathematics, or social-behavioral skills specifically. A small, positive meta-analytic effect favoring public pre-K compared to Head Start participation was found in relation to children’s emergent literacy skills (Hedges’ g = 0.17). Strategies are discussed to further equate the benefits of public pre-K and Head Start programming by facilitating greater cross-sector collaboration.
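
For readers unfamiliar with the effect-size metric quoted above, the following is a minimal sketch of how Hedges' g is commonly computed for two independent groups: a standardized mean difference using the pooled standard deviation, multiplied by a small-sample correction. The function name and the example inputs are illustrative assumptions; they are not data from the meta-analysis, and the authors' exact estimation procedure is not described in the abstract.

```python
import math


def hedges_g(mean_a: float, mean_b: float,
             sd_a: float, sd_b: float,
             n_a: int, n_b: int) -> float:
    """Hedges' g for two independent groups (group a minus group b)."""
    df = n_a + n_b - 2
    if df <= 0:
        raise ValueError("need at least three observations in total")
    # Pooled standard deviation across the two groups.
    pooled_sd = math.sqrt(((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / df)
    # Cohen's d: standardized mean difference.
    d = (mean_a - mean_b) / pooled_sd
    # Common approximation to the small-sample bias correction factor.
    correction = 1.0 - 3.0 / (4.0 * df - 1.0)
    return d * correction


# Hypothetical scores for illustration only (not taken from the studies reviewed).
g = hedges_g(mean_a=10.0, mean_b=9.0, sd_a=2.0, sd_b=2.0, n_a=50, n_b=50)
print(f"Hedges' g = {g:.3f}")
```

A positive value means the first group scored higher on average. In the abstract's framing, g = 0.17 favors public pre-K over Head Start on emergent literacy.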



Alessandro Castagnetti, Derek Rury.

We administer a survey to study students' preferences for relative performance feedback in an introductory economics class. To do so, we elicit students' willingness to pay to learn, or to avoid learning, their rank on a midterm exam. Our results show that 10% of students are willing to pay to avoid learning their rank. We also find that female students are willing to pay $1 more than male students. We confirm that beliefs about academic performance do not predict preferences for information. After access to information about rank is randomized, students report needing more study hours to achieve their desired grade and being less likely to be in the top half of the ability distribution in the class. These effects are driven by students who overestimated their midterm rank, who respond more strongly than those who underestimated their performance. We do not find an overall effect of learning about rank on final course grade. We also confirm that students' preferences for feedback do not interfere with their belief updating.



Andrew Kwok, Joseph Waddington, Jenna Davis, Sara Halabi, Debbee Huston, Rita Hemsley.

Our study examines roughly 2,000 novice teachers’ responses about how they account for students’ cultural, ethnic/racial, and linguistic diversity. We qualitatively analyze robust open-ended survey responses to explore teachers’ reported strategies for how they integrate asset-based pedagogy (ABP). We identify codes related to these strategies and then investigate them by participant demographics. This both illuminates the predictive validity of our qualitative analyses and provides initial evidence as to whether certain characteristics are associated with critical techniques. Our findings inform practitioners of a suite of ABP strategies and inform districts and policymakers about how novice teachers are processing asset-based instruction and whom to target for support in this vital pedagogical area.



Hernando Grueso.

Given the spike in homicides in conflict zones of Colombia after the 2016 peace agreement, I study the causal effect of violence on college test scores. Using a difference-in-differences design with heterogeneous effects, I show how this increase in violence had a negative effect on college learning, and how this negative effect is mediated by factors such as poverty, college major, degree type, and study mode. A 10% increase in the homicide rate per 100,000 people in conflict zones of Colombia had a negative impact on college test scores equivalent to 0.07 standard deviations in the English section of the test. This negative effect is larger for poor and female students, who saw a negative effect of approximately 0.16 standard deviations, equivalent to 3.4 percentage points out of the final score. Online and short-cycle students suffer larger negative effects of 0.14 and 0.19 standard deviations, respectively. This study provides some of the first evidence of the negative effect of armed conflict on college learning and offers policy recommendations based on the heterogeneous effects of violence.
