September 2023
Bayesian nonparametric mixture modeling for temporal dynamics of gender stereotypes
Maria De Iorio, Stefano Favaro, Alessandra Guglielmi, Lifeng Ye
Ann. Appl. Stat. 17(3): 2256-2278 (September 2023). DOI: 10.1214/22-AOAS1717


The study of the temporal dynamics of gender and ethnic stereotypes is an important topic in many disciplines at the intersection of statistics and the social sciences. In this paper we use word embeddings, a common tool in natural language processing, together with Bayesian nonparametric mixture modeling to analyze the temporal dynamics of gender stereotypes in adjectives and occupations over the 20th and 21st centuries in the United States. Our Bayesian nonparametric approach relies on a novel dependent Dirichlet process prior, and it allows for both dynamic density estimation and dynamic clustering of adjective embedding and occupation embedding biases in a hierarchical setting. Posterior inference is performed through a particle Markov chain Monte Carlo algorithm, which is simple and computationally efficient. An application to time-dependent data for adjective embedding bias and for occupation embedding bias shows that our approach enables the quantification of historical trends in gender stereotypes and hence makes it possible to identify how specific adjectives and occupations have become more closely associated with women than with men over time.
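To make the notion of a time-dependent "embedding bias" concrete, the following is a minimal sketch and not the authors' definition, which the abstract does not state. It assumes, purely for illustration, that the bias of a word is the distance from its embedding to the centroid of a set of male-associated word vectors minus the distance to the centroid of a set of female-associated word vectors. The function name `embedding_bias`, the toy 2-dimensional vectors, and the decade labels are all hypothetical.

```python
import numpy as np


def embedding_bias(word_vec, female_vecs, male_vecs):
    """Hypothetical bias score (an assumption, not the paper's formula).

    Returns distance-to-male-centroid minus distance-to-female-centroid,
    so positive values mean the word lies closer to the female group.
    """
    female_centroid = np.mean(female_vecs, axis=0)
    male_centroid = np.mean(male_vecs, axis=0)
    dist_to_male = np.linalg.norm(word_vec - male_centroid)
    dist_to_female = np.linalg.norm(word_vec - female_centroid)
    return float(dist_to_male - dist_to_female)


# Toy 2-d "embeddings"; real embeddings would be trained per decade
# and have hundreds of dimensions.
female_vecs = np.array([[1.0, 0.0], [1.0, 0.2]])
male_vecs = np.array([[-1.0, 0.0], [-1.0, -0.2]])

# One made-up word whose embedding drifts toward the female group over time.
word_by_decade = {
    1910: np.array([-0.5, 0.0]),
    1960: np.array([0.0, 0.0]),
    2000: np.array([0.5, 0.0]),
}

# A bias time series: one score per decade for the same word.
bias_series = {
    decade: embedding_bias(vec, female_vecs, male_vecs)
    for decade, vec in word_by_decade.items()
}

for decade, bias in sorted(bias_series.items()):
    print(decade, round(bias, 3))
```

A collection of such series, one per adjective or occupation, is the kind of time-dependent data to which dynamic density estimation and clustering could be applied; the sketch covers only the data-construction step, not the dependent Dirichlet process model or the particle MCMC sampler.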

Funding Statement

The second author is also affiliated with IMATI-CNR “Enrico Magenes” (Milan, Italy).


Maria De Iorio is also affiliated with the Department of Statistical Science, University College London. The authors are grateful to the Editor (Professor Brendan Murphy) and two anonymous referees for their comments and corrections, which allowed us to improve the paper substantially. Stefano Favaro received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme under grant agreement No 817257. Stefano Favaro gratefully acknowledges the financial support from the Italian Ministry of Education, University and Research (MIUR), “Dipartimenti di Eccellenza” grant 2018-2022.


Received: 1 January 2021; Revised: 1 October 2022; Published: September 2023
First available in Project Euclid: 7 September 2023

MathSciNet: MR4637666
Digital Object Identifier: 10.1214/22-AOAS1717

Keywords: autoregressive models, Bayesian nonparametrics, dependent Dirichlet processes, dynamic density estimation and clustering, gender stereotypes, mixture modeling, particle Markov chain Monte Carlo, word embeddings

Rights: Copyright © 2023 Institute of Mathematical Statistics


Vol. 17 • No. 3 • September 2023
