Bolukbasi, T., Chang, K. W., Zou, J. Y., Saligrama, V., & Kalai, A. T. (2016). Man is to computer programmer as woman is to homemaker? Debiasing word embeddings. In Advances in Neural Information Processing Systems (pp. 4349-4357).
The blind application of machine learning runs the risk of amplifying biases present in data. Such a danger is facing us with word embedding, a popular framework to represent text data as vectors which has been used in many machine learning and natural language processing tasks. We show that even word embeddings trained on Google News articles exhibit female/male gender stereotypes to a disturbing extent. This raises concerns because their widespread use, as we describe, often tends to amplify these biases. Geometrically, gender bias is first shown to be captured by a direction in the word embedding. Second, gender neutral words are shown to be linearly separable from gender definition words in the word embedding. Using these properties, we provide a methodology for modifying an embedding to remove gender stereotypes, such as the association between the words receptionist and female, while maintaining desired associations such as between the words queen and female. Using crowd-worker evaluation as well as standard benchmarks, we empirically demonstrate that our algorithms significantly reduce gender bias in embeddings while preserving its useful properties such as the ability to cluster related concepts and to solve analogy tasks. The resulting embeddings can be used in applications without amplifying gender bias.
Identifying the Dataset
Word embeddings map words to vectors such that the closer two vectors are to one another, the more similar the corresponding words are in meaning. Using a 300-dimensional word2vec embedding trained on Google News text, with a vocabulary of 3 million English words, the authors test for bias; they call this dataset w2vNEWS. Along with w2vNEWS, the authors also extend their experiments to other publicly available embeddings.
Because word embeddings encode meaning through vector groupings and distances, the authors measure distances between word vectors (closer vectors represent similar semantics) to test for the existence of bias. To establish a point of reference, they test both gender-specific words (e.g. brother, sister) and gender-neutral words.
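This distance-based probe can be sketched with cosine similarity. The vectors below are toy 2-d illustrations, not values from w2vNEWS; the idea is that a gap between a word's similarity to "she" versus "he" signals a gendered association.

```python
import math

def cosine(u, v):
    # Cosine similarity: 1.0 for identical directions, 0.0 for orthogonal.
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

# Toy 2-d "embedding": illustrative values only, not taken from w2vNEWS.
emb = {
    "she":          [1.0, 0.0],
    "he":           [-1.0, 0.0],
    "receptionist": [0.8, 0.6],   # hypothetically closer to "she"
    "programmer":   [-0.7, 0.7],  # hypothetically closer to "he"
}

# A large gap between the two similarities signals a gendered association.
bias = cosine(emb["receptionist"], emb["she"]) - cosine(emb["receptionist"], emb["he"])
print(round(bias, 3))  # → 1.6
```

On real embeddings the same comparison is run over thousands of occupation words, which is how the paper surfaces pairings like receptionist/female.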
Beyond identifying subspaces and vector dimensions that exhibit bias, the authors propose ways to target these subspaces: neutralising gender-neutral words and equalising vector relations both within and outside a given subspace.
These methods rely heavily on careful definitions and categorisations of gender-neutral and gender-specific words, as well as linear-algebraic and statistical measures to detect direct and indirect bias.
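The core neutralising step can be sketched as removing a word vector's component along the gender direction. The paper aggregates several definitional pairs via PCA to find that direction; the sketch below simplifies to the single pair she - he, and all vectors are toy values.

```python
import math

def sub(u, v):   return [a - b for a, b in zip(u, v)]
def scale(u, c): return [c * a for a in u]
def dot(u, v):   return sum(a * b for a, b in zip(u, v))
def norm(u):     return math.sqrt(dot(u, u))
def unit(u):     n = norm(u); return [a / n for a in u]

# Gender direction, approximated here by she - he (a one-pair simplification
# of the paper's PCA over several definitional pairs).
she, he = [0.6, 0.8], [0.6, -0.8]
g = unit(sub(she, he))

def neutralize(w, g):
    # Subtract the projection of w onto g, then renormalize, so the
    # word carries no component along the gender direction.
    proj = scale(g, dot(w, g))
    return unit(sub(w, proj))

receptionist = [0.3, 0.9]  # toy vector, not from w2vNEWS
receptionist_db = neutralize(receptionist, g)
print(round(dot(receptionist_db, g), 6))  # → 0.0
```

The companion equalise step (not shown) then adjusts pairs like brother/sister so both sit equidistant from every neutralised word.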
Key Assumptions Stated by Authors
While acknowledging the numerous studies written on the practical applications of word embeddings, the authors explicitly note that these papers fail to recognise the sexism inherent in word embeddings themselves. Moreover, Bolukbasi et al. express concern that these inherent biases will reproduce themselves in real-world tools and systems. In addition to calling out assumptions in previous works, the authors are mindful of the biases in their own dataset: they hypothesised that Google News would present fewer gender biases because most of its content comes from professional journalists; however, this was not the case. In attempting to debias word embeddings, the authors admit that this alone is insufficient: these norms must be deconstructed societally as well as computationally to reduce the incidence of inadvertent biases circulating.
The authors did an excellent job of highlighting the qualitative implications of a quantitative study. Where most scholars neglect the gendered realities of their work, these authors kept them at the forefront of their objectives. However, I would encourage the authors to elaborate on the concept of societal gender stereotypes. Future work could analyse how related works missed opportunities to employ a gendered lens, and how doing so might enrich them.
The main goal of this experiment, in addition to emphasising the extent of gender bias within various algorithmic systems, is to evaluate de-biasing mechanisms. To evaluate the neutralised subspaces, the authors ran analogy generation tasks (as they did with the initial raw embedding) to generate word pairs completing "she" and "he" analogies. Originally, 19% of the analogies generated were judged stereotypical by crowd workers, whereas only 6% of those generated by the debiased embedding were judged to conform to gender conventions.
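The analogy generation task the authors evaluate can be sketched with the classic vector-offset method: to solve a : b :: c : ?, take the offset b - a + c and return the nearest vocabulary word. The embedding below is a toy example with made-up values, not w2vNEWS.

```python
import math

def cosine(u, v):
    d = sum(a * b for a, b in zip(u, v))
    return d / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

# Toy embedding (illustrative values only, not from w2vNEWS).
emb = {
    "he":    [1.0, 0.0],
    "she":   [0.0, 1.0],
    "king":  [1.0, 1.0],
    "queen": [0.0, 2.0],
    "apple": [-1.0, -1.0],
}

def analogy(a, b, c, emb):
    # Solve a : b :: c : ? via the vector offset b - a + c,
    # returning the nearest word other than the query words.
    target = [bi - ai + ci for ai, bi, ci in zip(emb[a], emb[b], emb[c])]
    candidates = {w: v for w, v in emb.items() if w not in (a, b, c)}
    return max(candidates, key=lambda w: cosine(candidates[w], target))

print(analogy("he", "she", "king", emb))  # he : she :: king : ? → queen
```

In the paper, crowd workers judge the generated completions, which is how the 19% versus 6% stereotype rates are obtained.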