Data abundance is now the norm. With the proliferation of digital platforms, content generated by or from users has grown exponentially, and with it has come a growing recognition that such large data sets can provide insight into complex real-world problems. Demographic inference, the prediction of population characteristics such as gender, age, or geography from big data sources, is emerging as a significant component of big data systems, with the promise of informing decision making in a variety of sectors and industries.
The detection of a person’s gender and/or sexuality is often a foundational component of demographic inference. This survey was undertaken to provide an approachable overview of the methods that researchers and other actors use to infer gender and sexuality from large data sets.
Please find below links to a brief explainer on methods of inference and the types of data used for inference, and to a glossary of key terms, followed by the list of annotated papers.
Authored by Tasneem Mewa and Saman Goudarzi
Edited by Sumandro Chattapadhyay
Supported by the Big Data for Development network established by the International Development Research Centre, Canada
Published on PubPub, developed and hosted by MIT’s Knowledge Futures Group
Shared under a Creative Commons Attribution 4.0 International license