Standards, accountability, assessment, and curriculum

Lauren Sartain, Silvana Freire, John Q. Easton, Briana Diaz.

Across an array of educational outcomes, evidence suggests that girls outperform boys on average. For example, in Chicago, ninth-grade girls earn math GPAs that are 0.29 points higher than boys' on average. This paper examines explanations for this gap, such as girl-boy differences in academic preparation, behaviors and habits, and experiences in math classes. After accounting for these factors, the gender gap in math grades persists. We then examine the classroom-level conditions that reduce the gender gap in grades. The gap is smaller in more advanced courses like honors classes and geometry. Further, boys perform more similarly to girls in classes with male teachers. These findings highlight classroom conditions that are more conducive to the academic success of boys.


Kirsten Slungaard Mumma.

Most public schools have a library on site, but little is known about the quality or content of school library programs. I use web-scraping techniques to collect original data on hundreds of titles in over 6,600 school libraries to identify patterns in library resources and content. Three primary findings emerge. First, gaps exist in library resources and collection quality, particularly between schools in low- and high-income areas. Second, although books with “controversial content” are widely available, the prevalence of these titles is related to local politics, state laws, and social environments. Libraries in conservative areas are less likely to have books that deal with LGBTQ+ issues, race/racism, or abortion and more likely to have discontinued Dr. Seuss and Christian fiction titles. Third, book challenges in the 2021-22 school year have had “chilling effects” on the acquisition of new LGBTQ+ content.


NaYoung Hwang, Cory Koedel.

We evaluate the effects of grade retention on students’ academic, attendance, and disciplinary outcomes in Indiana. Using a regression discontinuity design, we show that third grade retention increases achievement in English Language Arts (ELA) and math immediately and substantially, and the effects persist into middle school. We find no evidence of grade retention effects on student attendance or disciplinary incidents, again into middle school. Our findings combine to show that Indiana’s third grade retention policy improves achievement for retained students without adverse impacts along (measured) non-academic dimensions.


Mark Murphy, Angela Johnson.

This study examines the effects of English Learner (EL) status on subsequent Special Education (SPED) placement. Through a research-practice partnership, we link student demographic data and initial English proficiency assessment data across seven cohorts of test takers and observe EL and SPED programmatic participation for these students over seven years. Our regression discontinuity (RD) estimates at the English proficiency margin consistently differ substantively from positive associations generated through regression analyses. RD evidence indicates that EL status had no effect on SPED placement at the English proficiency threshold. Grade-by-grade and subgroup RD analyses at this margin suggest that ELs were modestly under-identified for SPED during grade 5 and that ELs whose primary language was Spanish were under-identified for SPED.


Aaron Phipps, Alexander Amaya.

Given the simultaneous rise in time-to-graduation and college GPA, it may be that students reduce their course load to improve their performance. Yet evidence to date shows only that increased course loads increase GPA. We provide a mathematical model showing that many unobservable factors -- beyond student ability -- can generate a positive relationship between course load and GPA unless researchers control student schedules. West Point regularly implements the ideal experiment by randomly modifying student schedules with additional training courses. Using 19 years of administrative data, we provide the first causal evidence that taking more courses reduces GPA and increases course failure rates, sometimes substantially.


Bobby W. Chung, Jian Zou.

States increasingly require prospective teachers to pass exams for program completion and initial licensure, including the recent controversial roll-out of the educative Teacher Performance Assessment (edTPA). We leverage the quasi-experimental setting of different adoption timing by states and analyze multiple data sources containing a national sample of prospective teachers and students of new teachers in the US. With extensive controls for concurrent policies, we find that the edTPA reduced the number of prospective teachers in traditional-route programs and at less-selective and minority-concentrated universities. Contrary to the policy intention, we do not find evidence that edTPA increased student test scores.


Jonathan E. Collins.

The George Floyd protests of the summer of 2020 initiated public conversations around the need for antiracist teaching. Yet over time, the discussion evolved into policy debates around the use of critical race theory in civics courses. The rapid transition masked the fact that we know little about Americans' policy preferences. Do Americans support antiracist teaching? What factors best explain support or opposition? How does critical race theory factor in? Using a series of original survey experiments, this study shows that Americans maintain strong support for antiracist teaching, but that support is drastically weakened when the curriculum features the term "critical race theory."


Ann Mantil, John Papay, Preeya Pandya Mbekeani, Richard J. Murnane.

Preparing K-12 students for careers in science, technology, engineering and mathematics (STEM) fields is an ongoing challenge confronting state policymakers. We examine the implementation of a science graduation testing requirement for high-school students in Massachusetts, beginning with the graduating class of 2010. We find that the design of the new requirement was quite complicated, reflecting the state’s previous experiences with test-based accountability, a broad consensus on policy goals among key stakeholders, and the desire to afford flexibility to local schools and districts. The consequences for both students and schools, while largely consistent with the goals of increasing students’ skills and interest in STEM fields, were in many cases unexpected. We find large differences by demographic subgroup in the probabilities of passing the first science exam and of succeeding on retest, even when conditioning on previous test-score performance. Our results also show impacts of science exit-exam performance for students scoring near the passing threshold, particularly on the high-school graduation rates of females and on college outcomes for higher-income students. These findings demonstrate the importance of equity considerations in designing and evaluating ambitious new policy initiatives.


Ishtiaque Fazlul, Cory Koedel, Eric Parsons.

Measures of student disadvantage—or risk—are critical components of equity-focused education policies. However, the risk measures used in contemporary policies have significant limitations, and despite continued advances in data infrastructure and analytic capacity, there has been little innovation in these measures for decades. We develop a new measure of student risk for use in education policies, which we call Predicted Academic Performance (PAP). PAP is a flexible, data-rich indicator that identifies students at risk of poor academic outcomes. It blends concepts from emerging “early warning” systems with principles of incentive design to balance the competing priorities of accurate risk measurement and suitability for policy use. PAP is more effective than common alternatives at identifying students who are at risk of poor academic outcomes and can be used to target resources toward these students—and students who belong to several other associated risk categories—more efficiently.


Kate Antonovics, Sandra E. Black, Julie Berry Cullen, Akiva Yonah Meiselman.

Schools often track students to classes based on ability. Proponents of tracking argue it is a low-cost tool to improve learning since instruction is more effective when students are more homogeneous, while opponents argue it exacerbates initial differences in opportunities without strong evidence of efficacy. In fact, little is known about the pervasiveness or determinants of ability tracking in the US. To fill this gap, we use detailed administrative data from Texas to estimate the extent of tracking within schools for grades 4 through 8 over the years 2011-2019. We find substantial tracking; tracking within schools overwhelms any sorting by ability that takes place across schools. The most important determinant of tracking is heterogeneity in student ability, and schools operationalize tracking through curricular differentiation and the classification of students into categories such as gifted and disabled. When we examine how tracking changes in response to educational policies, we see that schools decrease tracking in response to accountability pressures. Finally, when we explore how exposure to tracking correlates with student mobility in the achievement distribution, we find positive effects on high-achieving students with no negative effects on low-achieving students, suggesting that tracking may increase inequality by raising the ceiling.
