Dwork, C., Hardt, M., Pitassi, T., Reingold, O., & Zemel, R. (2012, January). Fairness through awareness. In Proceedings of the 3rd innovations in theoretical computer science conference (pp. 214-226).
We study fairness in classification, where individuals are classified, e.g., admitted to a university, and the goal is to prevent discrimination against individuals based on their membership in some group, while maintaining utility for the classifier (the university). The main conceptual contribution of this paper is a framework for fair classification comprising (1) a (hypothetical) task-specific metric for determining the degree to which individuals are similar with respect to the classification task at hand; (2) an algorithm for maximizing utility subject to the fairness constraint, that similar individuals are treated similarly. We also present an adaptation of our approach to achieve the complementary goal of “fair affirmative action,” which guarantees statistical parity (i.e., the demographics of the set of individuals receiving any classification are the same as the demographics of the underlying population), while treating similar individuals as similarly as possible. Finally, we discuss the relationship of fairness to privacy: when fairness implies privacy, and how tools developed in the context of differential privacy may be applied to fairness.
Identifying the Dataset
This study examines classification itself and how the act of classifying people can be made fairer, for example in school admissions or credit ratings. Rather than testing a dataset, the authors set out to establish a fairness framework built on a task-specific similarity metric and an optimisation problem.
When formulating the framework, many factors are taken into account. Fairness is understood as the requirement that two similar individuals, for example two who are equally qualified, be classified similarly. The concept of awareness arises from measuring individual-based fairness via the distance between two individuals. This works in conjunction with the authors' view of a classifier as a randomised mapping from individuals to probability distributions over outcomes. The factors taken into consideration, as listed in the paper, are as follows: setting up an optimisation problem, establishing a relationship between individual and group fairness, integrating statistical parity, taking privacy concerns into account, and trying to minimise any harms or deficiencies in the information at hand once fairness considerations have been applied.
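To make the "similar individuals are treated similarly" constraint concrete, the sketch below checks it on a toy example: for every pair of individuals, the distance between their outcome distributions should not exceed the task-specific distance between the individuals themselves. Total variation distance is used for comparing outcome distributions, which is one of the options the paper discusses. The individuals, feature values, metric, and function names are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
from itertools import combinations


def total_variation(p, q):
    """Total variation distance between two distributions over the same outcomes."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def lipschitz_violations(mapping, metric, tol=1e-9):
    """Return the pairs (x, y) whose outcome distributions differ by more than
    the task-specific distance metric(x, y) allows."""
    violations = []
    for x, y in combinations(sorted(mapping), 2):
        if total_variation(mapping[x], mapping[y]) > metric(x, y) + tol:
            violations.append((x, y))
    return violations


# Hypothetical one-dimensional "qualification" score per individual.
features = {"a": 0.0, "b": 0.1, "c": 0.9}


def metric(x, y):
    """Assumed task-specific metric: absolute difference in scores."""
    return abs(features[x] - features[y])


# Each individual maps to a distribution over two outcomes (e.g. admit, reject).
fair_mapping = {"a": [0.8, 0.2], "b": [0.75, 0.25], "c": [0.1, 0.9]}
unfair_mapping = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.2, 0.8]}

print(lipschitz_violations(fair_mapping, metric))
print(lipschitz_violations(unfair_mapping, metric))
```

In the second mapping, individuals "a" and "b" are close under the assumed metric yet receive opposite outcomes, so that pair is reported as a violation.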
Key Assumptions Stated by Authors
While attempting to capture ground truth, the authors recognise that classifiers often allow non-membership in a certain group to exclude qualified individuals from certain tasks. The authors also ask the reverse question: can classifiers hide information? Here there is a debate over the level of disaggregation necessary to guarantee fairness. Across several case studies in finance and health, disaggregated data and dissimilar treatment are in some instances the fairer outcome, whereas in other situations disaggregating data can lead to discrimination against certain groups, particularly in the context of gendered groups within the LGBTQ+ community.
This study acknowledges the inequality attached to the biases within algorithmic classifiers and machine learning, which stems from the nature of classifiers and their inability to detect bias successfully and comprehensively. In doing so, fairness and the awareness of fairness become central tenets of the research. To further this line of thought, I would encourage the researchers to consider equity beyond fairness: how can statistical parity capture the fact that some individuals may face more barriers than others when being classified in a certain group? When this is considered in a statistical or metric fashion, how might it change the classifier's mechanism for preferential treatment and/or individual fairness?
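For reference when weighing the equity question, statistical parity as described in the abstract can be measured by comparing the average outcome distribution of two groups. The sketch below computes that gap using total variation distance; the group memberships, outcome probabilities, and function names are hypothetical, and the code is an illustration under those assumptions rather than the paper's own procedure.

```python
def group_outcome_distribution(mapping, group):
    """Average the outcome distributions of all members of a group."""
    n_outcomes = len(next(iter(mapping.values())))
    return [
        sum(mapping[x][i] for x in group) / len(group)
        for i in range(n_outcomes)
    ]


def parity_bias(mapping, group_s, group_t):
    """Total variation distance between the two groups' average outcome
    distributions; 0 means exact statistical parity."""
    mu_s = group_outcome_distribution(mapping, group_s)
    mu_t = group_outcome_distribution(mapping, group_t)
    return 0.5 * sum(abs(a - b) for a, b in zip(mu_s, mu_t))


# Hypothetical mapping from individuals to distributions over two outcomes.
mapping = {
    "s1": [0.6, 0.4],
    "s2": [0.4, 0.6],
    "t1": [0.5, 0.5],
    "t2": [0.5, 0.5],
}

bias = parity_bias(mapping, ["s1", "s2"], ["t1", "t2"])
print(bias)
```

Note that the two groups reach parity here even though "s1" and "s2" are treated differently from each other, which illustrates why a group-level measure alone cannot show whether particular individuals face additional barriers.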
The authors validate their findings by proposing that this framework can enforce fairness, detect instances of unfairness, and certify fair actions through classification algorithms. Classifiers can therefore be evaluated even if they are not attached to a dataset. The authors also acknowledge that distance as a metric (which is central to formulating the fairness optimisation) may not always be a relevant variable in some scenarios; however, they suggest ways in which this metric can potentially remain consistent across a variety of sectors and contexts.