November 2023

IZA DP No. 16593: Using Domain-Specific Word Embeddings to Examine the Demand for Skills

forthcoming in: Research in Labor Economics

We study the demand for skills by applying text analysis methods to job descriptions in a large volume of ads posted on an online Indian job portal. We use domain-specific unlabeled data to obtain word vector representations (i.e., word embeddings) and discuss how these can be leveraged for labor market research. We start by carrying out a data-driven categorization of required skill words and use word embeddings to construct gender associations for the different skill categories. Next, we examine how the required skill categories correlate with log posted wages and explore how skill demand varies with firm size. We find that female skills are associated with lower posted wages, potentially contributing to observed gender wage gaps. We also find that large firms require a more extensive range of skills, implying that complementarity between female and male skills is greater among these firms.
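To make the idea of a gender association concrete, the following minimal Python sketch scores a skill word by how much closer its embedding lies to a set of female anchor words than to a set of male anchor words, using cosine similarity. This is an illustration under stated assumptions, not necessarily the procedure used in the paper: the abstract does not specify the measure, and the function names, anchor words, and three-dimensional toy vectors are all hypothetical. In practice the vectors would come from embeddings trained on the job-ad text.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def gender_association(word, embeddings, female_terms, male_terms):
    """Mean similarity to female anchors minus mean similarity to male anchors.

    Positive values mean the word sits closer to the female anchors,
    negative values closer to the male anchors.
    """
    vec = embeddings[word]
    female_sim = sum(cosine(vec, embeddings[t]) for t in female_terms) / len(female_terms)
    male_sim = sum(cosine(vec, embeddings[t]) for t in male_terms) / len(male_terms)
    return female_sim - male_sim


# Hypothetical toy vectors, made up for illustration only.
toy_embeddings = {
    "she": [1.0, 0.0, 0.2],
    "woman": [0.9, 0.1, 0.3],
    "he": [0.0, 1.0, 0.2],
    "man": [0.1, 0.9, 0.3],
    "skill_a": [0.8, 0.2, 0.5],
    "skill_b": [0.2, 0.8, 0.5],
}

female_anchors = ["she", "woman"]  # assumed anchor words
male_anchors = ["he", "man"]       # assumed anchor words

scores = {
    word: gender_association(word, toy_embeddings, female_anchors, male_anchors)
    for word in ("skill_a", "skill_b")
}

for word, score in scores.items():
    print(f"{word}: {score:+.3f}")
```

In this toy setup, skill_a receives a positive score and skill_b a negative one. Averaging such word-level scores within a skill category would give one possible category-level association, which could then be related to log posted wages in a regression.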