Search EdWorkingPapers

Program and policy effects

Sara White, Leiah Groom-Thomas, Susanna Loeb.

This study synthesizes existing research on the implementation of tutoring programs, which we define as one-to-one or small-group instruction in which a human tutor supports students in grades K-12 in an academic subject area. Tutoring has emerged as an especially promising strategy for supporting students’ academic success, with strong causal evidence finding large, positive effects on students’ math and reading test scores across grade levels. Prior studies have reviewed this causal evidence of effects, but none have summarized the evidence on implementation. We iteratively developed search and selection criteria to identify studies addressing key research questions and synthesized the 40 resulting studies, which employ a range of research methodologies to describe how tutoring is implemented and experienced. We find that existing research provides rich descriptions of tutoring implementation within specific programs of focus, with most studies describing after-school tutoring and small-scale programs run by university professors. While few elements of implementation are studied in depth across multiple studies, common patterns emerge. Tutoring program launch is often facilitated by strategic relationships between schools and external tutoring providers and strengthened by transparent assessments of program quality and effectiveness. Successful tutoring implementation often hinges on the support of key school leaders with the power to direct the use of school funding, space, and time. Tutoring setting and schedule, tutor recruitment and training, and curriculum identification influence whether students are able to access tutoring services and the quality of the instruction provided. Ultimately, the evidence points to strong tutoring being driven by positive student-tutor relationships through which tutors provide instruction strategically targeted to students’ strengths and needs, driving towards a long-term academic goal.

Matthew A. Kraft, Sarah Novicoff.

Policymakers have renewed calls for expanding instructional time in the wake of the COVID-19 pandemic. We establish a set of empirical facts about time in school, synthesize the literature on the causal effects of instructional time, and conduct a case study of time use in an urban district. On average, instructional time in U.S. public schools is comparable to that in most high-income countries, with longer days but shorter years. However, instructional time varies widely across U.S. public schools, with a 90th-10th percentile difference of 190 total hours. Empirical literature confirms that additional time can increase student achievement, but how this time is structured matters. Our case study suggests schools might also recover substantial lost learning time within the existing school day.

Christina Weiland, Meghan McCormick, Jennifer Duer, Allison Friedman-Kraus, Mirjana Pralica, Samantha Xia, Milagros Nores, Shira Mattera.

Nearly all states with public prekindergarten programs use mixed-delivery systems, with classrooms in both public schools and community-based settings. However, experts have long raised concerns about systematic inequities by setting within these public systems. We used data from five such large-scale systems that have taken steps to improve equity by setting (Boston, New York City, Seattle, New Jersey, and West Virginia) to conduct the most comprehensive descriptive study of prekindergarten setting differences to date. Our public school sample included 2,395 children in 383 classrooms in 152 schools, while our community-based sample comprised 1,541 children in 201 classrooms in 103 community-based organizations (CBOs). We examined how child and teacher demographic characteristics, structural and process quality features, and child gains differed by setting within each of these systems. We found evidence of sorting of children and teachers by setting within each locality, including of children with higher baseline skills and more educated teachers into public schools. Where there were differences in quality and children’s gains, these tended to favor public schools. The localities with fewer policy differences by setting – NJ and Seattle – showed fewer differences in quality and child gains. Our findings suggest that inequities by setting are common, appear consequential, and deserve more research and policy attention.

Kai Hong, Syeda Sana Fatima, Sherry Glied, Leanna Stiefel, Amy Ellen Schwartz.

There is increasing concern about risky behaviors and poor mental health among school-aged youth. A critical factor in youth well-being is school attendance. This study evaluates how school organization and structure affect health outcomes by examining the impacts of a popular urban high school reform, “small schools,” on youth risky behaviors and mental health, using data from New York City. To estimate the causal effect of attending small versus large high schools, we use a two-sample instrumental-variable approach with the distance between student residence and school as the instrument for school enrollment. We consider two types of small schools – “old small schools,” which opened prior to a system-wide 2003 reform aimed at increasing educational achievement, and “new small schools,” which opened in the wake of that reform. We find that girls enrolled in older small schools are less likely to become pregnant, and boys are less likely to be diagnosed with mental health disorders, than their counterparts in large schools. Both girls and boys enrolled in more recently opened small schools, however, are more likely to be diagnosed with violence-associated injuries and (for girls only) with mental health disorders. These disparate results suggest that improving a school’s organization and inputs together is likely more effective in addressing youth risky behaviors than simply reducing school size.

Ann Mantil, John Papay, Preeya Pandya Mbekeani, Richard J. Murnane.

Preparing K-12 students for careers in science, technology, engineering and mathematics (STEM) fields is an ongoing challenge confronting state policymakers. We examine the implementation of a science graduation testing requirement for high-school students in Massachusetts, beginning with the graduating class of 2010. We find that the design of the new requirement was quite complicated, reflecting the state’s previous experiences with test-based accountability, a broad consensus on policy goals among key stakeholders, and the desire to afford flexibility to local schools and districts. The consequences for both students and schools, while largely consistent with the goals of increasing students’ skills and interest in STEM fields, were in many cases unexpected. We find large differences by demographic subgroup in the probabilities of passing the first science exam and of succeeding on retest, even when conditioning on previous test-score performance. Our results also show impacts of science exit-exam performance for students scoring near the passing threshold, particularly on the high-school graduation rates of females and on college outcomes for higher-income students. These findings demonstrate the importance of equity considerations in designing and evaluating ambitious new policy initiatives.

Ishtiaque Fazlul, Cory Koedel, Eric Parsons.

Measures of student disadvantage—or risk—are critical components of equity-focused education policies. However, the risk measures used in contemporary policies have significant limitations, and despite continued advances in data infrastructure and analytic capacity, there has been little innovation in these measures for decades. We develop a new measure of student risk for use in education policies, which we call Predicted Academic Performance (PAP). PAP is a flexible, data-rich indicator that identifies students at risk of poor academic outcomes. It blends concepts from emerging early warning systems with principles of incentive design to balance the competing priorities of accurate risk measurement and suitability for policy use. In proof-of-concept policy simulations using data from Missouri, we show PAP is more effective than common alternatives at identifying students who are at risk of poor academic outcomes and can be used to target resources toward these students—and students who belong to several other associated risk categories—more efficiently.

Benjamin W. Arold.

Anti-scientific attitudes can impose substantial costs on societies. Can schools be an important agent in mitigating the propagation of such attitudes? This paper investigates the effect of the content of science education on anti-scientific attitudes, knowledge, and choices. The analysis exploits staggered reforms that reduce or expand the coverage of evolution theory in US state science education standards. I compare adjacent cohorts in models with state and cohort fixed effects and conduct fine-grained placebo tests to rule out scientific, religious and political confounders. There are three main results. First, expanded evolution coverage increases students’ knowledge about evolution. Second, the reforms translate into greater evolution belief in adulthood, but do not crowd out religiosity or affect political attitudes. Third, the reforms affect high-stakes life decisions, namely the probability of working in life sciences.

Kate Antonovics, Sandra E. Black, Julie Berry Cullen, Akiva Yonah Meiselman.

Schools often track students to classes based on ability. Proponents of tracking argue it is a low-cost tool to improve learning since instruction is more effective when students are more homogeneous, while opponents argue it exacerbates initial differences in opportunities without strong evidence of efficacy. In fact, little is known about the pervasiveness or determinants of ability tracking in the US. To fill this gap, we use detailed administrative data from Texas to estimate the extent of tracking within schools for grades 4 through 8 over the years 2011-2019. We find substantial tracking; tracking within schools overwhelms any sorting by ability that takes place across schools. The most important determinant of tracking is heterogeneity in student ability, and schools operationalize tracking through curricular differentiation and the classification of students into categories such as gifted and disabled. When we examine how tracking changes in response to educational policies, we see that schools decrease tracking in response to accountability pressures. Finally, when we explore how exposure to tracking correlates with student mobility in the achievement distribution, we find positive effects on high-achieving students with no negative effects on low-achieving students, suggesting that tracking may increase inequality by raising the ceiling.

Shaun M. Dougherty, Mary M. Smith.

Career and technical education (CTE) has existed in the United States for over a century, but only in recent years have there been opportunities to assess the causal impact of participating in these programs while in high school. To date, no work has assessed whether the relative costs of these programs meet or exceed the benefits as described in recent evaluations. In this paper, we use available cost data to compare average costs per pupil in standalone high school CTE programs in Connecticut and Massachusetts to the most likely counterfactual schools. Under a variety of conservative assumptions about the monetary value of known educational and social benefits, we find that programs in Massachusetts offer clear positive returns on investment, whereas programs in Connecticut offer smaller, though mostly non-negative, expected returns. We also consider the potential cost effectiveness of CTE programs offered in other contexts to address questions of generalizability.

Jonathan L. Presler.

Using daily lunch transaction data from NYC public schools, I determine which students frequently stand next to one another in the lunch line. I use this ‘revealed’ friendship network to estimate academic peer effects in elementary school classrooms, improving on previous work by defining not only where social connections exist, but the relative strength of these connections. Equally weighting all peers in a reference group assumes that all peers are equally important and may bias estimates by underweighting important peers and overweighting unimportant peers. I find that students who eat together are important influencers of one another's academic performance, with stronger effects in math than in reading. Further exploration of the mechanisms supports my claim that these are friendship networks. I also compare the influence of friends from different periods in the school year and find that connections occurring around standardized testing dates are most influential on test scores.
