EdWorkingPapers

Matthew A. Kraft, Sarah Novicoff.

Policymakers have renewed calls for expanding instructional time in the wake of the COVID-19 pandemic. We establish a set of empirical facts about time in school, synthesize the literature on the causal effects of instructional time, and conduct a case study of time use in an urban district. On average, instructional time in U.S. public schools is comparable to that in most high-income countries, with longer days but shorter years. However, instructional time varies widely across U.S. public schools, with a 90th-10th percentile difference of 190 total hours. The empirical literature confirms that additional time can increase student achievement, but how this time is structured matters. Our case study suggests schools might also recover substantial lost learning time within the existing school day.


Jonathan E. Collins.

The George Floyd protests of the summer of 2020 initiated public conversations around the need for antiracist teaching. Yet over time, the discussion evolved into policy debates around the use of critical race theory in civics courses. The rapid transition masked the fact that we know little about Americans' policy preferences. Do Americans support antiracist teaching? What factors best explain support or opposition? How does critical race theory factor in? Using a series of original survey experiments, this study shows that Americans maintain strong support for antiracist teaching, but that support is drastically weakened when the curriculum features the term "critical race theory."


Christina Weiland, Meghan McCormick, Jennifer Duer, Allison Friedman-Kraus, Mirjana Pralica, Samantha Xia, Milagros Nores, Shira Mattera.

Nearly all states with public prekindergarten programs use mixed-delivery systems, with classrooms in both public schools and community-based settings. However, experts have long raised concerns about systematic inequities by setting within these public systems. We used data from five such large-scale systems that have taken steps to improve equity by setting (Boston, New York City, Seattle, New Jersey, and West Virginia) to conduct the most comprehensive descriptive study of prekindergarten setting differences to date. Our public school sample included 2,395 children in 383 classrooms in 152 schools, while our community-based sample comprised 1,541 children in 201 classrooms in 103 community-based organizations (CBOs). We examined how child and teacher demographic characteristics, structural and process quality features, and child gains differed by setting within each of these systems. We found evidence of sorting of children and teachers by setting within each locality, including of children with higher baseline skills and more educated teachers into public schools. Where there were differences in quality and children’s gains, these tended to favor public schools. The localities with fewer policy differences by setting (New Jersey and Seattle) showed fewer differences in quality and child gains. Our findings suggest that inequities by setting are common, appear consequential, and deserve more research and policy attention.


Kai Hong, Syeda Sana Fatima, Sherry Glied, Leanna Stiefel, Amy Ellen Schwartz.

There is increasing concern about risky behaviors and poor mental health among school-aged youth. A critical factor in youth well-being is school attendance. This study evaluates how school organization and structure affect health outcomes by examining the impacts of a popular urban high school reform, “small schools,” on youth risky behaviors and mental health, using data from New York City. To estimate the causal effect of attending small versus large high schools, we use a two-sample instrumental-variable approach with the distance between student residence and school as the instrument for school enrollment. We consider two types of small schools: “old small schools,” which opened prior to a system-wide 2003 reform aimed at increasing educational achievement, and “new small schools,” which opened in the wake of that reform. We find that girls enrolled in older small schools are less likely to become pregnant, and boys are less likely to be diagnosed with mental health disorders, than their counterparts in large schools. Both girls and boys enrolled in more recently opened small schools, however, are more likely to be diagnosed with violence-associated injuries and (for girls only) with mental health disorders. These disparate results suggest that improving a school’s organization and inputs together is likely more effective in addressing youth risky behaviors than simply reducing school size.


Kelli A. Bird, Benjamin L. Castleman, Yifeng Song, Renzhe Yu.

Data science applications are increasingly entwined in students’ educational experiences. One prominent application of data science in education is to predict students’ risk of failing a college course or dropping out of college. There is growing interest among higher education researchers and administrators in whether learning management system (LMS) data, which capture very detailed information on students’ engagement in and performance on course activities, can improve model performance. We systematically evaluate whether incorporating LMS data into course performance prediction models improves model performance. We conduct this analysis within an entire state community college system. Among students with prior academic history in college, administrative data-only models substantially outperform LMS data-only models and are quite accurate at predicting whether students will struggle in a course. Among first-time students, LMS data-only models outperform administrative data-only models. We achieve the highest performance for first-time students with models that include data from both sources. We also show that models achieve similar performance with a small and judiciously selected set of predictors, and that models trained on system-wide data achieve performance similar to that of models trained on individual courses.


Olivia L. Chi, Matthew A. Lenard.

Improving teacher selection is an important strategy for strengthening the quality of the teacher workforce. As districts adopt commercial teacher screening tools, evidence is needed to understand these tools’ predictive validity. We examine the relationship between Frontline Education’s TeacherFit instrument and newly hired teachers’ outcomes. We find that a one standard deviation (SD) increase on an index of TeacherFit scores is associated with a 0.06 SD increase in evaluation scores. However, we also find evidence that teachers with higher TeacherFit scores are more likely to leave their hiring schools the following year. Our results suggest that TeacherFit is not necessarily a substitute for more rigorous screening processes conducted by human resources officials, such as those documented in recent studies.


Jackie Eunjung Relyea, Patrick Rich, James S. Kim, Joshua B. Gilbert.

The current study aimed to explore the impact of COVID-19 on the reading achievement growth of Grade 3-5 students in a large urban school district in the U.S. and whether the impact differed by students’ demographic characteristics and instructional modality. Specifically, using administrative data from the school district, we investigated to what extent students made gains in reading during the 2020-2021 school year relative to the typical pre-COVID-19 school year of 2018-2019. We further examined whether the effects of students’ instructional modality on reading growth varied by demographic characteristics. Overall, students had lower average reading achievement gains over the 9-month 2020-2021 school year than over the 2018-2019 school year, with learning loss effect sizes of 0.54, 0.27, and 0.28 standard deviation units for Grades 3, 4, and 5, respectively. Substantially reduced reading gains were observed among Grade 3 students, students from high-poverty backgrounds, English learners, and students with reading disabilities. Additionally, findings indicate that among students with similar demographic characteristics, higher-achieving students tended to choose the fully remote instruction option, while lower-achieving students appeared to opt for in-person instruction at the beginning of the 2020-2021 school year. However, students who received in-person instruction most likely demonstrated continuous growth in reading over the school year, whereas initially higher-achieving students who received remote instruction showed stagnation or decline, particularly in the spring 2021 semester. Our findings support the notion that in-person schooling during the pandemic may serve as an equalizer for lower-achieving students, particularly those from historically marginalized or vulnerable student populations.


Seth Gershenson, Stephen B. Holt, Adam Tyner.

Teachers are among the most important inputs in the education production function. One mechanism by which teachers might affect student learning is through the grading standards they set for their classrooms. However, the effects of grading standards on student outcomes are relatively understudied. Using administrative data that links individual students and teachers in 8th and 9th grade Algebra I classrooms from 2006 to 2016, we examine the effects of teachers’ grading standards on student learning and attendance. High teacher grading standards in Algebra I increase student learning both in Algebra I and in subsequent math classes. The effect on student achievement is positive and similar in size across student characteristics and levels of ability, students’ relative rank within the classroom, and school context. High teacher grading standards also lead to a modest reduction in student absences.


Ann Mantil, John Papay, Preeya Pandya Mbekeani, Richard J. Murnane.

Preparing K-12 students for careers in science, technology, engineering and mathematics (STEM) fields is an ongoing challenge confronting state policymakers. We examine the implementation of a science graduation testing requirement for high-school students in Massachusetts, beginning with the graduating class of 2010. We find that the design of the new requirement was quite complicated, reflecting the state’s previous experiences with test-based accountability, a broad consensus on policy goals among key stakeholders, and the desire to afford flexibility to local schools and districts. The consequences for both students and schools, while largely consistent with the goals of increasing students’ skills and interest in STEM fields, were in many cases unexpected. We find large differences by demographic subgroup in the probabilities of passing the first science exam and of succeeding on retest, even when conditioning on previous test-score performance. Our results also show impacts of science exit-exam performance for students scoring near the passing threshold, particularly on the high-school graduation rates of females and on college outcomes for higher-income students. These findings demonstrate the importance of equity considerations in designing and evaluating ambitious new policy initiatives.


Ishtiaque Fazlul, Cory Koedel, Eric Parsons.

Measures of student disadvantage—or risk—are critical components of equity-focused education policies. However, the risk measures used in contemporary policies have significant limitations, and despite continued advances in data infrastructure and analytic capacity, there has been little innovation in these measures for decades. We develop a new measure of student risk for use in education policies, which we call Predicted Academic Performance (PAP). PAP is a flexible, data-rich indicator that identifies students at risk of poor academic outcomes. It blends concepts from emerging “early warning” systems with principles of incentive design to balance the competing priorities of accurate risk measurement and suitability for policy use. PAP is more effective than common alternatives at identifying students who are at risk of poor academic outcomes and can be used to target resources toward these students—and students who belong to several other associated risk categories—more efficiently.
