Methodology, measurement and data
A huge portion of what we know about how humans develop, learn, behave, and interact is based on survey data. Researchers use longitudinal growth modeling to understand the development of students on psychological and social-emotional learning constructs across elementary and middle school. In these designs, students are typically administered a consistent set of self-report survey items across multiple school years, and growth is measured using either sum scores or scale scores produced with item response theory (IRT) methods. While there is a great deal of guidance on scaling and linking IRT-based large-scale educational assessments to facilitate the estimation of examinee growth, little of this expertise is brought to bear in the scaling of psychological and social-emotional constructs. Through a series of simulation and empirical studies, we produce scores in a single-cohort repeated-measures design using sum scores as well as multiple IRT approaches, and we compare the recovery of growth estimates from longitudinal growth models using each set of scores. Results indicate that using scores from multidimensional IRT approaches that account for latent variable covariances over time in growth models leads to better recovery of growth parameters relative to models using sum scores and other IRT approaches.
Survey respondents use different response styles when they use the categories of the Likert scale differently despite having the same true score on the construct of interest. For example, respondents may be more likely to use the extremes of the response scale independent of their true score. Research already shows that differing response styles can create a construct-irrelevant source of bias that distorts fundamental inferences made based on survey data. While some initial studies examine the effect of response styles on survey scores in longitudinal analyses, the issue of how response styles affect estimates of growth is underexamined. In this study, we conducted empirical and simulation analyses in which we scored surveys using item response theory (IRT) models that do and do not account for response styles, and then used those different scores in growth models and compared results. Generally, we found that response styles can affect estimates of growth parameters including the slope, but that the effects vary by psychological construct, response style, and model used.
This report reviews findings from 35 major studies that speak to the question of principal turnover. Within these studies, researchers have examined principal turnover nationally and within states and districts, primarily investigating the relationships between principal turnover and various characteristics of principals, schools, students, and policies. While there is some consistency across studies, there is a good deal of variation in research questions, methods, and measurement of turnover. Further, few studies consider all the possible pathways out of the principalship, and few isolate the ways in which specific conditions or features of the principalship impact principals’ decisions to leave or districts’ decisions to retain principals. Despite these limitations, we found that, when examined together, these studies provided important information to help policymakers, education leaders, and other stakeholders understand and address principal turnover.
Much is known about how to attract, develop, and retain a strong and stable teacher workforce, and states across the country are taking action to address their teacher shortages in ways that strengthen their overall teacher workforce. This report highlights research on six evidence-based policies that have been used to address teacher shortages and boost teacher recruitment and retention: service scholarships and loan forgiveness, high-retention pathways into teaching, mentoring and induction for new teachers, developing high-quality school principals, competitive compensation, and recruitment policies to expand the pool of qualified educators.
Research showing that high-quality preschool benefits children’s early learning and later life outcomes has led to increased state engagement in public preschool. However, mixed results from evaluations of two programs—Tennessee’s Voluntary Pre-K program and Head Start—have left many policymakers unsure about how to ensure productive investments. This report presents the most rigorous evidence on the effects of preschool and clarifies how the findings from Tennessee and Head Start relate to the larger body of research. That research shows that high-quality preschool enhances children’s school readiness by supporting substantial early learning gains relative to children who do not experience preschool, and that it can have lasting impacts far into children’s later years of school and life. Therefore, the issue is not whether preschool “works,” but how to design and implement programs that ensure public preschool investments consistently deliver on their promise.
Estimates of teacher “value-added” suggest teachers vary substantially in their ability to promote student learning. Prompted by this finding, many states and school districts have adopted value-added measures as indicators of teacher job performance. In this paper, we conduct a new test of the validity of value-added models. Using administrative student data from New York City, we apply commonly estimated value-added models to an outcome teachers cannot plausibly affect: student height. We find the standard deviation of teacher effects on height is nearly as large as that for math and reading achievement, raising obvious questions about validity. Subsequent analysis finds these “effects” are largely spurious variation (noise), rather than bias resulting from sorting on unobserved factors related to achievement. Given the difficulty of differentiating signal from noise in real-world teacher effect estimates, this paper serves as a cautionary tale for their use in practice.
Despite wide achievement gaps across California between students from different racial and socioeconomic backgrounds, some school districts have excelled at supporting the learning of all their students. This analysis identifies these positive outlier districts—those in which students of color, as well as White students, consistently achieve at higher levels than students from similar racial/ethnic backgrounds and from families of similar income and education levels in most other districts. These results are predicted, in significant part, by the qualifications of districts’ teachers, as measured by their certification and experience. In particular, the proportion of underprepared teachers—those teaching on emergency permits, waivers, and intern credentials—is associated with decreased achievement for all students, while teaching experience is associated with increased achievement, especially for students of color.
We show that grit, a skill that has been shown to be highly predictive of achievement, is malleable in childhood and can be fostered in the classroom environment. We evaluate a randomized educational intervention implemented in two independent elementary school samples. Outcomes are measured via a novel incentivized real effort task and performance on standardized tests. We find that treated students are more likely to exert effort to accumulate task-specific ability, and hence, more likely to succeed. In a follow-up 2.5 years after the intervention, we estimate an effect of about 0.2 standard deviations on a standardized math test.
Despite large schooling and learning gains in many developing countries, children in highly deprived areas are often unlikely to achieve even basic literacy and numeracy. We study how much of this problem can be resolved using a multi-pronged intervention combining several distinct interventions known to be effective in isolation. We conducted a cluster-randomized trial in The Gambia evaluating a literacy and numeracy intervention designed for primary-aged children in remote parts of poor countries. The intervention combines para teachers delivering after-school supplementary classes, scripted lesson plans, and frequent monitoring focusing on improving teacher practice (coaching). A similar intervention previously demonstrated large learning gains in a cluster-randomized trial in rural India. After three academic years, Gambian children receiving the intervention scored 46 percentage points (3.2 SD) better on a combined literacy and numeracy test than control children. This intervention holds great promise to address low learning levels in other poor, remote settings.
Up to three-fourths of college students can be classified as “non-traditional”, yet whether typical policy interventions improve their education and labor market outcomes is understudied. I use a regression discontinuity design to estimate the impacts of a state financial aid program aimed at non-traditional students. Eligibility has no impacts on degree completion for students intending to enroll in community colleges or four-year colleges but increases bachelor’s degrees for students interested in large, for-profit colleges by four percentage points. I find no impacts on employment or earnings for all applicants. This research highlights challenges in promoting human capital investment for adults.