Data-driven systems globally continue to treat gender as a binary, which has a significant impact on policy making, access to services, and advocacy for non-binary groups.
Goal 5 of the Sustainable Development Goals, pertaining to gender equality, treats gender as a binary, even though some initiatives linked to the SDGs focus on non-binary groups.
Official statistics largely treat gender as binary, erasing the identities and bodies of non-binary people in data.
It has been challenging to classify gender identities along the spectrum of “non-binary”, given different self-identifications, historical groupings, and gender fluidity.
Gender classification in big data and algorithmic systems has been binary, leading to misclassification and possible discrimination against non-binary groups.
Individuals can self-identify with gender identities that differ from their biological sex, or sex assigned at birth, which may differ again from gender expression, or the gender they present to society. Official statistics (and therefore open data) globally largely hold binary conceptions of gender, that is, they allow respondents to identify only as either men or women. Further, biological sex is usually conflated with gender identity, as identification systems, data collection instruments, or both may not allow individuals to self-identify their gender identities. As of 2015, only Malta, Ireland, and Japan allowed individuals to self-identify their gender.1 As a result, most gender data treats gender as a binary, with non-binary people either misrepresented or excluded entirely.
Coding gender as a binary erases non-conforming genders and leads to socio-technical problems, since it is technically impossible for persons outside of the binary to register for services and respond accurately to data collection exercises. This is especially the case when processes demand mandatory gender identification without self-identification, resulting in exclusion from statistics, benefits, and services on labour and livelihood, housing, nutrition and health, among others.2
Unlike the other articles in this handbook, this entry does not deal primarily with debates within the field of open data, but rather with non-binary gender data in the SDGs, official statistics, and social media data, the last as an example of a private industry collecting gendered data on a large scale. This is because there is a dearth of literature and initiatives in the open data field focusing on non-binary gender data.
The SDGs do not explicitly acknowledge the relationship between sexuality, gender identity and development.3 The underlying goal of the SDGs, however, is to ‘leave no one behind’, which should require meaningful engagement with the LGBTQIA community. Yet Goal 5 focuses exclusively on equality between men and women or girls and boys, which has an adverse impact on the overall achievement of the SDGs.4 The experiences of sexual minorities whose sexual orientation, gender identity and expression, or sex characteristics do not conform to cultural norms or expectations are nowhere to be seen in the targets and indicators of Goal 5 or in the other 16 goals.5
An outcome document released in 2015 applies the language of non-discrimination to persons of ‘other status’, which includes sexual orientation and gender identity, deriving from two resolutions passed by the Human Rights Council in 2009.6 Therefore, while the interpretation of the term ‘gender’ in the SDGs has come under severe criticism, the inclusion of ‘other status’ has been appreciated as a partial success emerging from enormous mobilisation by civil society organisations that have advocated for the inclusion of members of the LGBTQIA community.7 Notably, 12 UN agencies endorsed a separate UN statement8 on ending violence and discrimination based on gender identity and sexual orientation.9 This finds no explicit mention in the SDG on gender equality itself.10
The SDGs have, however, been instrumental in advocacy and research on gender and sexuality, for example by enabling the comparative mapping of structural marginalisation of LGBTQI+ communities in different countries through different goals.11 By addressing gender inequality in all its manifestations, any effort to implement the SDGs would be better positioned to accomplish not only Goal 5, but the SDGs overall, and thus genuinely aid in fulfilling their promise of ‘leaving no one behind’.12
“What gets counted counts,” notes feminist geographer Joni Seager: “without the right categories for recording gender, the right kind of gender data cannot be collected, and without the right kind of data there can be no social change.”13 People falling outside of the gender binary tend to form the minority in most populations, which has been used as an excuse to exclude them from data collection exercises entirely.14 Maria Munir, a trans rights activist based in the UK, asserts that “If you refuse to register non-binary people like me with birth certificates, and exclude us in everything from creating bank accounts to signing up for mailing lists, you do not have the right to turn around and say that there are not enough of us to warrant change”.15
Data collection systems need to produce better practices to deal with outliers and minority populations and ensure their representation in relevant datasets, while respecting their self-identification and privacy.16 Quantitative data collection is important to support advocacy measures for policy change and to capture structural inequality at scale without making marginalisation seem individualised and anecdotal.17 Feminist sociologist Ann Oakley argues that without quantitative research, it can be difficult to distinguish between personal experiences and collective oppression.18 For instance, trans communities in Asia and the Pacific have historically faced gross human rights violations, invisibility and isolation in education and employment, and exclusion from health care systems.19 Structural oppression at this scale can be captured through quantitative data, which can in turn support advocacy against it.
There is also substantial literature on the surveillance and criminalisation of non-binary people, which includes elaborate and discriminatory forms of data collection and usage.20 Data on gender and sexual minorities is thus captured by some government programmes, but may result in greater marginalisation and persecution of such groups.
There is a need for states, the private sector, and civil society to develop contextual and localised best practices for the collection of non-binary gender data, in consultation with people from local LGBTQI+ groups. This entails addressing challenges in the collection of this kind of data. To begin with, there is a spectrum of identities that fall within the scope of “non-binary”, such that it is very difficult to arrive at any exhaustive definition of what the term constitutes.21 For example, Indian law treats transgender identity as an absence, comprising people who are neither male nor female, or a combination of the two.22 However, India is also one of the few states that has publicly released census data that included a ‘third gender’ category for transgender people.23
Determining the right set of categories for recording non-binary people has proven to be challenging. For example, the term ‘transgender’ is employed to refer to gender diverse groups, but many of these groups don’t identify with the term.24 Transsexual is generally considered a subset of transgender, but some transsexual people reject being labelled as transgender.25 Many forms of self-identification are now entering the literature, including pangender, genderqueer, polygender, and agender. Gender identities are also extremely complex. Apart from the difficulty of classification, the non-static nature of gender as a category makes it difficult to capture transitioning identities. Some individuals feel that their gender identity shifts from time to time, a concept known as gender fluidity.26
In the context of international development, donors from the Global North may aim to drive LGBTQI+ rights agendas in the South by employing conventional human rights frameworks and Northern LGBTQ social movement frameworks. However, ‘LGBTQ’ terminology does not adequately or inclusively represent local understandings of sexuality and gender identity.27 It may leave out some groups of people, and may also be seen as an invasion of Western ideology in local contexts. A study of gender and sexuality in several countries highlighted this in the Rwandan, Ethiopian, and Indian cases; however, in the same Rwandan case study and in Nepal, activists adopted the LGBT label and identities to enable political action or to supersede negative, localised stereotypes of sexual and gender minorities.28
Practices of data collection and survey design can also hugely impact the results of collection. Proxy-survey methods and fear of state reprisal mean that state data collection exercises can yield false data.29,30 Further, disclosure of gender identity cannot be made mandatory, due to concerns around privacy and consent.31 The design of data collection exercises needs to address both the complexity of gender identification and concerns around privacy.
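To make these design concerns concrete, the following is a minimal sketch of how a survey record might store gender so that it is optional, self-described, and able to change over time. It is an illustration only, not a recommended standard: the names `GenderResponse`, `RespondentRecord`, `record_gender`, and `current_gender` are hypothetical, and any real instrument would need to be designed in consultation with local communities and with appropriate privacy safeguards.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class GenderResponse:
    """One answer to an optional gender question (hypothetical schema)."""
    recorded_on: date
    # The respondent's own words; None means they declined to answer.
    self_description: Optional[str] = None


@dataclass
class RespondentRecord:
    """A respondent whose gender answers are kept as a dated history."""
    respondent_id: str
    gender_history: List[GenderResponse] = field(default_factory=list)

    def record_gender(self, recorded_on: date,
                      self_description: Optional[str] = None) -> None:
        # Blank or missing answers are stored as "declined", never guessed.
        text = (self_description or "").strip()
        self.gender_history.append(GenderResponse(recorded_on, text or None))

    def current_gender(self) -> Optional[str]:
        # The most recent answer wins, so a later response (including a
        # later decision not to answer) replaces an earlier one.
        if not self.gender_history:
            return None
        latest = max(self.gender_history, key=lambda r: r.recorded_on)
        return latest.self_description


if __name__ == "__main__":
    record = RespondentRecord("r-001")  # hypothetical identifier
    record.record_gender(date(2020, 1, 15), "genderqueer")
    record.record_gender(date(2021, 6, 1), "agender")
    print(record.current_gender())
```

Storing free text rather than a fixed list avoids deciding categories in advance, at the cost of making aggregation harder. Keeping a dated history is one way to represent shifting identities, though it also increases the sensitivity of the stored data.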
There are several initiatives involving government and civil society that have been instituted to address some of these issues. Examples include the National LGBTQ Taskforce32 and the Gender Identity in US group, a collaboration of scientists, scholars, and transgender leaders dedicated to increasing knowledge about gender-related measurement and promoting the inclusion of this data in surveys, especially publicly funded ones.33 Such initiatives are yet to emerge substantively in open data spaces. Urgent work is required to conceptualise, design, and advocate for inclusive gender data beyond the binary.
In response to pressure from their users, platforms such as Facebook, Google+, and OKCupid, among others, have expanded beyond the gender binary at the stage of registration. Facebook, for example, has expanded from 2 to 58 gender options for English speakers in the UK and US. These categories have been created in consultation with the LGBTQIA community, and an option to input a custom gender is also provided.34 Despite this, Facebook still reprograms and collapses all genders into a binary system at the backend as part of the categories available to advertisers.35 Self-identification, fluidity, and user agency, even though supported at a surface level, are severed in favor of the binary.36 This design strategy simultaneously satisfies both the user and the marketing-advertisement institutions, thus maintaining public-facing progressive politics while bolstering hegemonic regimes of gender control.37
Computational techniques, artificial intelligence, and big data applications that deal with gender have invariably treated it as a binary. Many applications have attempted to design predictive modelling and analytics in order to predict or detect the gender of a user, and are unable to recognise genders beyond the binary.38 Similarly, facial recognition systems do not perform well on people identifying as gender and sexual minorities.39 The very basis of gender prediction, the assumption that users’ genders can be predicted through digital behaviour and physical appearance, needs to be critiqued. Gendrendr, for example, describes itself as addressing concerns around assigning gender to an individual, containing a simple set of functions designed to highlight the inaccuracy and violence of assigning genders to others.40
This article summarises the major debates around the collection of non-binary gender data by state and private entities. These debates are not addressed in mainstream literature around open data and gender, indicating a significant gap in research. Literature dealing with the SDGs and official statistics indicates the exclusion of non-binary gender data from such databases, leading to the exclusion of gender and sexual minorities, deprivation of services, and weakened advocacy efforts. Conversely, some non-binary gender groups are included in databases that result in surveillance and criminalisation. Finally, literature on big data and social media data systems illustrates the commercialisation of gendered data, which can risk forcing it back into binary forms through the essentialisation of statistical representations. These trends regress our collective understanding of gender back into binary categories despite decades of local, national, regional, and global advocacy efforts to force states to recognise non-binary gender identities and the rights of sexual minority groups.