Books shape how children learn about society and norms, in part through representation of different characters. We introduce new artificial intelligence methods for systematically converting images into data and apply them, along with text analysis methods, to measure the representation of skin color, race, gender, and age in award-winning children’s books widely read in homes, classrooms, and libraries over the last century. We find that more characters with darker skin color appear over time, but the most influential books persistently depict characters with lighter skin color, on average, than other books, even after conditioning on race; we also find that children are depicted with lighter skin than adults on average. Relative to their growing share of the U.S. population, Black and Latinx people are underrepresented in these same books, while White males are overrepresented. Over time, females are increasingly present but appear less often in text than in images, suggesting greater symbolic inclusion in pictures than substantive inclusion in stories. We then present an analysis of the supply of, and demand for, books with different levels of representation to better understand the economic behavior that may contribute to these patterns. On the demand side, we show that people consume books that center their own identities. On the supply side, we document higher prices for books that center non-dominant social identities and fewer copies of these books in libraries that serve predominantly White communities. Lastly, we show that the types of children’s books purchased in a neighborhood are related to local political beliefs.
Keywords: representation, images as data, curriculum, children, education, libraries, race, gender