You are browsing the site of a past edition of the AIES conference (2021). Navigate to present edition here.

## Poster session 2

Wed 19 15:30 PDT Wed 19 16:30 PDT

### #137 Ethical Implementation of Artificial Intelligence to Select Embryos in In Vitro Fertilization

Michael Anis Mihdi Afnan, Cynthia Rudin, Vincent Conitzer, Julian Savulescu, Abhishek Mishra, Yanhe Liu, Masoud Afnan

AI has the potential to revolutionize many areas of healthcare. Radiology, dermatology, and ophthalmology are some of the areas most likely to be impacted in the near future, and they have received significant attention from the broader research community. But AI techniques are now also starting to be used in in vitro fertilization (IVF), in particular for selecting which embryos to transfer to the woman. The contribution of AI to IVF is potentially significant, but must be done carefully and transparently, as the ethical issues are significant, in part because this field involves creating new people.We first give a brief introduction to IVF and review the use of AI for embryo selection. We discuss concerns with the interpretation of the reported results from scientific and practical perspectives. We then consider the broader ethical issues involved. We discuss in detail the problems that result from the use of black-box methods in this context and advocate strongly for the use of interpretable models. Importantly, there have been no published trials of clinical effectiveness, a problem in both the AI and IVF communities, and we therefore argue that clinical implementation at this point would be premature. Finally, we discuss ways for the broader AI community to become involved to ensure scientifically sound and ethically responsible development of AI in IVF.

### #122 Automating Procedurally Fair Feature Selection in Machine Learning

Clara Belitz, Lan Jiang, Nigel Bosch

In recent years, machine learning has become more common in everyday applications. Consequently, numerous studies have explored issues of unfairness against specific groups or individuals in the context of these applications. Much of the previous work on unfairness in machine learning has focused on the fairness of outcomes rather than process. We propose a feature selection method inspired by fair process (procedural fairness) in addition to fair outcome. Specifically, we introduce the notion of unfairness weight, which indicates how heavily to weight unfairness versus accuracy when measuring the marginal benefit of adding a new feature to a model. Our goal is to maintain accuracy while reducing unfairness, as defined by six common statistical definitions. We show that this approach demonstrably decreases unfairness as the unfairness weight is increased, for most combinations of metrics and classifiers used. A small subset of all the combinations of datasets (4), unfairness metrics (6), and classifiers (3), however, demonstrated relatively low unfairness initially. For these specific combinations, neither unfairness nor accuracy were affected as unfairness weight changed, demonstrating that this method does not reduce accuracy unless there is also an equivalent decrease in unfairness. We also show that this approach selects unfair features and sensitive features for the model less frequently as the unfairness weight increases. As such, this procedure is an effective approach to constructing classifiers that both reduce unfairness and are less likely to include unfair features in the modeling process.

### #74 Uncertainty as a Form of Transparency: Measuring, Communicating, and Using Uncertainty

Umang Bhatt, Javier Antoran, Yunfeng Zhang, Vera Liao, Prasanna Sattigeri, Riccardo Fogliato, Gabrielle Gauthier Melançon, Ranganath Krishnan, Jason Stanley, Omesh Tickoo, Lama Nachman, Rumi Chunara, Madhulika Srikumar, Adrian Weller, Alice Xiang

Algorithmic transparency entails exposing system properties to various stakeholders for purposes that include understanding, improving, and contesting predictions. Until now, most research into algorithmic transparency has predominantly focused on explainability. Explainability attempts to provide reasons for a machine learning model’s behavior to stakeholders. However, understanding a model’s specific behavior alone might not be enough for stakeholders to gauge whether the model is wrong or lacks sufficient knowledge to solve the task at hand. In this paper, we argue for considering a complementary form of transparency by estimating and communicating the uncertainty associated with model predictions. First, we discuss methods for assessing uncertainty. Then, we characterize how uncertainty can be used to mitigate model unfairness, augment decision-making, and build trustworthy systems. Finally, we outline methods for displaying uncertainty to stakeholders and recommend how to collect information required for incorporating uncertainty into existing ML pipelines. This work constitutes an interdisciplinary review drawn from literature spanning machine learning, visualization/HCI, design, decision-making, and fairness. We aim to encourage researchers and practitioners to measure, communicate, and use uncertainty as a form of transparency.

### #72 AI Alignment and Human Reward

Patrick Butlin.

According to a prominent approach to AI alignment, AI agents should be built to learn and promote human values. However, humans value things in several different ways: we have desires and preferences of various kinds, and if we engage in reinforcement learning, we also have reward functions. One research project to which this approach gives rise is therefore to say which of these various classes of human values should be promoted. This paper takes on part of this project by assessing the proposal that human reward functions should be the target for AI alignment. There is some reason to believe that powerful AI agents which were aligned to values of this form would help us to lead good lives, but there is also considerable uncertainty about this claim, arising from unresolved empirical and conceptual issues in human psychology.

### #252 What’s Fair about Individual Fairness?

Will Fleisher

One of the main lines of research in algorithmic fairness involves individual fairness (IF) methods. Individual fairness is motivated by an intuitive principle, similar treatment, which requires that similar individuals be treated similarly. IF offers a precise account of this principle using distance metrics to evaluate the similarity of individuals. Proponents of individual fairness have argued that it gives the correct definition of algorithmic fairness, and that it should therefore be preferred to other methods for determining fairness. I argue that individual fairness cannot serve as a definition of fairness. Moreover, IF methods should not be given priority over other fairness methods, nor used in isolation from them. To support these conclusions, I describe four in-principle problems for individual fairness as a definition and as a method for ensuring fairness: (1) counterexamples show that similar treatment (and therefore IF) are insufficient to guarantee fairness; (2) IF methods for learning similarity metrics are at risk of encoding human implicit bias; (3) IF requires prior moral judgments, limiting its usefulness as a guide for fairness and undermining its claim to define fairness; and (4) the incommensurability of relevant moral values makes similarity metrics impossible for many tasks. In light of these limitations, I suggest that individual fairness cannot be a definition of fairness, and instead should be seen as one tool among several for ameliorating algorithmic bias.

### #36 Learning to Generate Fair Clusters from Demonstrations

Sainyam Galhotra, Sandhya Saisubramanian, Shlomo Zilberstein

Fair clustering is the process of grouping similar entities together, while satisfying a mathematically well-defined fairness metric as a constraint. Due to the practical challenges in precise model specification, the prescribed fairness constraints are often incomplete and act as proxies to the intended fairness requirement. Clustering with proxies may lead to biased outcomes when the system is deployed. We examine how to identify the intended fairness constraint for a problem based on limited demonstrations from an expert. Each demonstration is a clustering over a subset of the data. We present an algorithm to identify the fairness metric from demonstrations and generate clusters using existing off-the-shelf clustering techniques, and analyze its theoretical properties. To extend our approach to novel fairness metrics for which clustering algorithms do not currently exist, we present a greedy method for clustering. Additionally, we investigate how to generate interpretable solutions using our approach. Empirical evaluation on three real-world datasets demonstrates the effectiveness of our approach in quickly identifying the underlying fairness and interpretability constraints, which are then used to generate fair and interpretable clusters.

### #205 Computing Plans that Signal Normative Compliance

Alban Grastien, Claire Benn, Sylvie Thiebaux

There has been increasing acceptance that agents must act in a way that is sensitive to ethical considerations. These considerations have been cashed out as constraints, such that some actions are permissible, while others are impermissible. In this paper, we claim that, in addition to only performing those actions that are permissible, agents should only perform those courses of action that are _unambiguously_ permissible. By doing so they signal normative compliance: they communicate their understanding of, and commitment to abiding by, the normative constraints in play. Those courses of action (or plans) that succeed in signalling compliance in this sense, we term `acceptable’. The problem this paper addresses is how to compute plans that signal compliance, that is, how to find plans that are acceptable as well as permissible. We do this by identifying those plans such that, were an observer to see only part of its execution, that observer would infer the plan enacted was permissible. This paper provides a formal definition of compliance signalling within the domain of AI planning, describes an algorithm for computing compliance signalling plans, provides preliminary experimental results and discusses possible improvements. The signalling of compliance is vital for communication, coordination and cooperation in situations where the agent is partially observed. It is equally vital, therefore, to solve the computational problem of finding those plans that signal compliance. This is what this paper does.

### #2 An AI Ethics Course Highlighting Explicit Ethical Agents

Nancy Green

This is an experience report describing a pilot AI Ethics course for undergraduate computer science majors. In addition to teaching students about different ethical approaches and using them to analyze ethical issues, the course covered how ethics has been incorporated into the implementation of explicit ethical agents, and required students to implement an explicit ethical agent for a simple application. This report describes the course objectives and design, the topics covered, and a qualitative evaluation with suggestions for future offerings of the courses.

### #147 The Dangers of Drowsiness Detection: Differential Performance, Downstream Impact, and Misuses

Jakub Grzelak, Martim Brandao

Drowsiness and fatigue are important factors in driving safety and work performance. This has motivated academic research into detecting drowsiness, and sparked interest in the deployment of related products in the insurance and work-productivity sectors. In this paper we elaborate on the potential dangers of using such algorithms. We first report on an audit of performance bias across subject gender and ethnicity, identifying which groups would be disparately harmed by the deployment of a state-of-the-art drowsiness detection algorithm. We discuss some of the sources of the bias, such as the lack of robustness of facial analysis algorithms to face occlusions, facial hair, or skin tone. We then identify potential downstream harms of this performance bias, as well as potential misuses of drowsiness detection technology—focusing on driving safety and experience, insurance cream-skimming and coverage-avoidance, worker surveillance, and job precarity.

### #67 Who Gets What, According to Whom? An Analysis of Fairness Perceptions in Service Allocation

Jacqueline Hannan, Huei-Yen Winnie Chen, Kenneth Joseph

Algorithmic fairness research has traditionally been linked to the disciplines of philosophy, ethics, and economics, where notions of fairness are prescriptive and seek objectivity. Increasingly, however, scholars are turning to the study of what different people perceive to be fair, and how these perceptions can or should help to shape the design of machine learning, particularly in the policy realm. The present work experimentally explores five novel research questions at the intersection of the "Who," "What," and "How" of fairness perceptions. Specifically, we present the results of a multi-factor conjoint analysis study that quantifies the effects of the specific context in which a question is asked, the framing of the given question, and who is answering it. Our results broadly suggest that the "Who" and "What," at least, matter in ways that are 1) not easily explained by any one theoretical perspective, 2) have critical implications for how perceptions of fairness should be measured and/or integrated into algorithmic decision-making systems.

### #104 The Earth Is Flat and the Sun Is Not a Star: The Susceptibility of GPT-2 to Universal Adversarial Triggers

Hunter Heidenreich, Jake Williams

This work considers universal adversarial triggers, a method of adversarially disrupting natural language models, and questions if it is possible to use such triggers to affect both the topic and stance of conditional text generation models. In considering four ”controversial” topics, this work demonstrates success at identifying triggers that cause the GPT-2 model to produce text about targeted topics as well as influence the stance the text takes towards the topic. We show that, while the more fringe topics are more challenging to identify triggers for, they do appear to more effectively discriminate aspects like stance. We view this both as an indication of the dangerous potential for controllability and, perhaps, a reflection of the nature of the disconnect between conflicting views on these topics, something that future work could use to question the nature of filter bubbles and if they are reflected within models trained on internet content. In demonstrating the feasibility and ease of such an attack, this work seeks to raise the awareness that neural language models are susceptible to this influence–even if the model is already deployed and adversaries lack internal model access–and advocates the immediate safeguarding against this type of adversarial attack in order to prevent potential harm to human users.

### #48 Situated Accountability: Ethical Principles, Certification Standards, and Explanation Methods in Applied AI

Anne Henriksen, Simon Enni, Anja Bechmann

Artificial intelligence (AI) has the potential to benefit humans and society by its employment in important sectors. However, the risks of negative consequences have underscored the importance of accountability for AI systems, their outcomes, and the users of such systems. In recent years, various accountability mechanisms have been put forward in pursuit of the responsible design, development, and use of AI. In this article, we provide an in-depth study of three such mechanisms, as we analyze Scandinavian AI developers’ encounter with (1) ethical principles, (2) certification standards, and (3) explanation methods. By doing so, we contribute to closing a gap in the literature between discussions of accountability on the research and policy level, and accountability as a responsibility put on the shoulders of developers in practice. Our study illustrates important flaws in the current enactment of accountability as an ethical and social value which, if left unchecked, risks undermining the pursuit of responsible AI. By bringing attention to these flaws, the article signals where further work is needed in order to build effective accountability systems for AI.

### #254 Towards Equity and Algorithmic Fairness in Student Grade Prediction

Weijie Jiang, Zachary Pardos

Equity of educational outcome and fairness of AI with respect to race have been topics of increasing importance in education. In this work, we address both with empirical evaluations of grade prediction in higher education, an important task to improve curriculum design, plan interventions for academic support, and offer course guidance to students. With fairness as the aim, we trial several strategies for both label and instance balancing to attempt to minimize differences in algorithm performance with respect to race. We find that an adversarial learning approach, combined with grade label balancing, achieved by far the fairest results. With equity of educational outcome as the aim, we trial strategies for boosting predictive performance on historically underserved groups and find success in sampling those groups in inverse proportion to their historic outcomes. With AI-infused technology supports increasingly prevalent on campuses, our methodologies fill a need for frameworks to consider performance trade-offs with respect to sensitive student attributes and allow institutions to instrument their AI resources in ways that are attentive to equity and fairness.

### #198 The Ethical Gravity Thesis: Marrian Levels and the Persistence of Bias in Automated Decision-Making Systems

Computers are used to make decisions in an increasing number of domains. There is widespread agreement that some of these uses are ethically problematic. Far less clear is where ethical problems arise, and what might be done about them. This paper expands and defends the Ethical Gravity Thesis: ethical problems that arise at higher levels of analysis of an automated decision-making system are inherited by lower levels of analysis. Particular instantiations of systems can add new problems, but not ameliorate more general ones. We defend this thesis by adapting Marr’s famous 1982 framework for understanding information-processing systems. We show how this framework allows one to situate ethical problems at the appropriate level of abstraction, which in turn can be used to target appropriate interventions.

### #190 Exciting, Useful, Worrying, Futuristic: Public Perception of Artificial Intelligence in 8 Countries

Patrick Gage Kelley, Yongwei Yang, Courtney Heldreth, Christopher Moessner, Aaron Sedley, Andreas Kramm, David T. Newman, Allison Woodruff

As the influence and use of artificial intelligence (AI) have grown and its transformative potential has become more apparent, many questions have been raised regarding the economic, political, social, and ethical implications of its use. Public opinion plays an important role in these discussions, influencing product adoption, commercial development, research funding, and regulation. In this paper we present results of an in-depth survey of public opinion of artificial intelligence conducted with 10,005 respondents spanning eight countries and six continents. We report widespread perception that AI will have significant impact on society, accompanied by strong support for the responsible development and use of AI, and also characterize the public’s sentiment towards AI with four key themes (exciting, useful, worrying, and futuristic) whose prevalence distinguishes response to AI in different countries.

### #212 Age Bias in Emotion Detection: Analysis of Facial Emotion Recognition Performance on Varying Age Groups

Eugenia Kim, De’Aira Bryant, Deepak Srikanth, Ayanna Howard

The growing potential for facial emotion recognition (FER) technology has encouraged expedited development at the cost of rigorous validation. Many of its use-cases may also impact the diverse global community as FER becomes embedded into domains ranging from education to security to healthcare. Yet, prior work has highlighted that FER can exhibit both gender and racial biases like other facial analysis techniques. As a result, bias-mitigation research efforts have mainly focused on tackling gender and racial disparities, while other demographic related biases, such as age, have seen less progress. This work seeks to examine the performance of state of the art commercial FER technology on expressive images of men and women from three distinct age groups. We utilize four different commercial FER systems in a black box methodology to evaluate how six emotions – anger, disgust, fear, happiness, neutrality, and sadness – are correctly detected by age group. We further investigate how algorithmic changes over the last year have affected system performance. Our results found that all four commercial FER systems most accurately perceived emotion in images of young adults and least accurately in images of older adults. This trend was observed for analyses conducted in 2019 and 2020. However, little to no gender disparities were observed in either year. While older adults may not have been the initial target consumer of FER technology, statistics show the demographic is quickly growing more keen to applications that use such systems. Our results demonstrate the importance of considering various demographic subgroups during FER system validation and the need for inclusive, intersectional algorithmic developmental practices.

### #246 AI and Shared Prosperity

Katya Klinova, Anton Korinek

Future advances in AI that automate away human labor may have stark implications for labor markets and inequality. This paper proposes a framework to analyze the effects of specific types of AI systems on the labor market, based on how much labor demand they will create versus displace, while taking into account that productivity gains also make society wealthier and thereby contribute to additional labor demand. This analysis enables ethically-minded companies creating or deploying AI systems as well as researchers and policymakers to take into account the effects of their actions on labor markets and inequality, and therefore to steer progress in AI in a direction that advances shared prosperity and an inclusive economic future for all of humanity.

### #161 Towards Unifying Feature Attribution and Counterfactual Explanations: Different Means to the Same End

Ramaravind Kommiya Mothilal, Divyat Mahajan, Chenhao Tan, Amit Sharma

Feature attributions and counterfactual explanations are popular approaches to explain a ML model. The former assigns an importance score to each input feature, while the latter provides input examples with minimal changes to alter the model’s predictions. To unify these approaches, we provide an interpretation based on the actual causality framework and present two key results in terms of their use. First, we present a method to generate feature attribution explanations from a set of counterfactual examples. These feature attributions convey how important a feature is to changing the classification outcome of a model, especially on whether a subset of features is \textit{necessary} and/or \textit{sufficient} for that change, which attribution-based methods are unable to provide. Second, we show how counterfactual examples can be used to evaluate the goodness of an attribution-based explanation in terms of its necessity and sufficiency. As a result, we highlight the complementarity of these two approaches.Our evaluation on three benchmark datasets — \adult, \lclub, and \german~— confirms the complementarity. Feature attribution methods like LIME and SHAP and counterfactual explanation methods like \wachtershort and DiCE often do not agree on feature importance rankings. In addition, by restricting the features that can be modified for generating counterfactual examples, we find that the top-k features from LIME or SHAP are often neither necessary nor sufficient explanations of a model’s prediction. Finally, we present a case study of different explanation methods on a real-world hospital triage problem.

### #172 Becoming Good at AI for Good

Meghana Kshirsagar, Caleb Robinson, Siyu Yang, Shahrzad Gholami, Ivan Klyuzhin, Sumit Mukherjee, Md Nasir, Anthony Ortiz, Felipe Oviedo Perhavec, Darren Tanner, Anusua Trivedi, Yixi Xu, Ming Zhong, Bistra Dilkina, Rahul Dodhia, Juan Lavista Ferres

AI for good (AI4G) projects involve developing and applying artificial intelligence (AI) based solutions to further goals in areas such as sustainability, health, humanitarian aid, and social justice. Developing and deploying such solutions must be done in collaboration with partners who are experts in the domain in question and who already have experience in making progress towards such goals. Based on our experiences, we detail the different aspects of this type of collaboration broken down into four high-level categories: communication, data, modeling, and impact, and distill eleven takeaways to guide such projects in the future. We briefly describe two case studies to illustrate how some of these takeaways were applied in practice during our past collaborations.

### #141 A Framework for Understanding AI-Induced Field Change: How AI Technologies Are Legitimized and Institutionalized

Benjamin Larsen

Artificial intelligence (AI) systems operate in increasingly diverse areas, from healthcare to facial recognition, the stock market, autonomous vehicles, and so on. While the underlying digital infrastructure of AI systems is developing rapidly, each area of implementation is subject to different degrees and processes of legitimization. By combining elements from institutional theory and information systems-theory, this paper presents a conceptual framework to analyze and understand AI-induced field-change. The introduction of novel AI-agents into new or existing fields creates a dynamic in which algorithms (re)shape organizations and institutions while existing institutional infrastructures determine the scope and speed at which organizational change is allowed to occur. Where institutional infrastructure and governance arrangements, such as standards, rules, and regulations, still are unelaborate, the field can move fast but is also more likely to be contested. The institutional infrastructure surrounding AI-induced fields is generally little elaborated, which could be an obstacle to the broader institutionalization of AI-systems going forward.

### #171 Ethical Data Curation for AI: An Approach Based on Feminist Epistemology and Critical Theories of Race

Susan Leavy, Eugenia Siapera, Barry O’Sullivan

The potential for bias embedded in data to lead to the perpetuation of social injustice though Artificial Intelligence (AI) necessitates an urgent reform of data curation practices for AI systems, especially those based on machine learning. Without appropriate ethical and regulatory frameworks there is a risk that decades of advances in human rights and civil liberties may be undermined. This paper proposes an approach to data curation for AI, grounded in feminist epistemology and informed by critical theories of race and feminist principles. The objective of this approach is to support critical evaluation of the social dynamics of power embedded in data for AI systems. We propose a set of fundamental guiding principles for ethical data curation that address the social construction of knowledge, call for inclusion of subjugated and new forms of knowledge, support critical evaluation of theoretical concepts within data and recognise the reflexive nature of knowledge. In developing this ethical framework for data curation, we aim to contribute to a virtue ethics for AI and ensure protection of fundamental and human rights.

### #79 Risk Identification Questionnaire for Unintended Bias in Machine Learning Development Lifecycle

Michelle Seng Ah Lee, Jatinder Singh

Unintended biases in machine learning (ML) models have the potential to introduce undue discrimination and exacerbate social inequalities. The research community has proposed various technical and qualitative methods intended to assist practitioners in assessing these biases. While frameworks for identifying the risks of harm due to unintended biases have been proposed, they have not yet been operationalised into practical tools to assist industry practitioners.In this paper, we link prior work on bias assessment methods to phases of a standard organisational risk management process (RMP), noting a gap in measures for helping practitioners identify bias- related risks. Targeting this gap, we introduce a bias identification methodology and questionnaire, illustrating its application through a real-world, practitioner-led use case. We validate the need and usefulness of the questionnaire through a survey of industry practitioners, which provides insights into their practical requirements and preferences. Our results indicate that such a questionnaire is helpful for proactively uncovering unexpected bias concerns, particularly where it is easy to integrate into existing processes, and facilitates communication with non-technical stakeholders.Ultimately, the effective end-to-end management of ML risks requires a more targeted identification of potential harm and its sources, so that appropriate mitigation strategies can be formulated. Towards this, our questionnaire provides a practical means to assist practitioners in identifying bias-related risks.

### #114 Feeding the Beast: Superintelligence, Corporate Capitalism and the End of Humanity

Dominic Leggett

Scientists and philosophers have warned of the possibility that humans, in the future, might create a ‘superintelligent’ machine that could, in some scenarios, form an existential threat to humanity. This paper argues that such a machine may already exist, and that, if so, it does, in fact, represent such a threat.

### #177 How Do the Score Distributions of Subpopulations Influence Fairness Notions?

Carmen Mazijn, Jan Danckaert, Vincent Ginis

Automated decisions based on trained algorithms influence human life in an increasingly far-reaching way. In recent years, it has become clear that these decisions are often accompanied by bias and unfair treatment of different subpopulations.Meanwhile, several notions of fairness circulate in the scientific literature, with trade-offs between profit and fairness and between fairness metrics among themselves. Based on both analytical calculations and numerical simulations, we show in this study that some profit-fairness trade-offs and fairness-fairness trade-offs depend substantially on the underlying score distributions given to subpopulations and we present two complementary perspectives to visualize this influence. We further show that higher symmetry in scores of subpopulations can significantly reduce the trade-offs between fairness notions within a given acceptable strictness, even when sacrificing expressiveness. Our exploratory study may help to understand how to overcome the strict mathematical statements about the statistical incompatibility of certain fairness notions.

### #99 More Similar Values, More Trust? The Effect of Value Similarity on Trust in Human-Agent Interaction

Siddharth Mehrotra, Catholijn M. Jonker, Myrthe L. Tielman

As AI systems are increasingly involved in decision making, it also becomes important that they elicit appropriate levels of trust from their users. To achieve this, it is first important to understand which factors influence trust in AI. We identify that a research gap exists regarding the role of personal values in trust in AI. Therefore, this paper studies how human and agent Value Similarity (VS) influences a human’s trust in that agent. To explore this, 89 participants teamed up with five different agents, which were designed with varying levels of value similarity to that of the participants. In a within-subjects, scenario-based experiment, agents gave suggestions on what to do when entering the building to save a hostage. We analyzed the agent’s scores on subjective value similarity, trust and qualitative data from open-ended questions. Our results show that agents rated as having more similar values also scored higher on trust, indicating a positive effect between the two. With this result, we add to the existing understanding of human-agent trust by providing insight into the role of value-similarity.

### #134 Causal Multi-Level Fairness

Algorithmic systems are known to impact marginalized groups severely, and more so, if all sources of bias are not considered. While work in algorithmic fairness to-date has primarily focused on addressing discrimination due to individually linked attributes, social science research elucidates how some properties we link to individuals can be conceptualized as having causes at macro (e.g. structural) levels, and it may be important to be fair to attributes at multiple levels. For example, instead of simply considering race as a causal, protected attribute of an individual, the cause may be distilled as perceived racial discrimination an individual experiences, which in turn can be affected by neighborhood-level factors. This multi-level conceptualization is relevant to questions of fairness, as it may not only be important to take into account if the individual belonged to another demographic group, but also if the individual received advantaged treatment at the macro-level. In this paper, we formalize the problem of multi-level fairness using tools from causal inference in a manner that allows one to assess and account for effects of sensitive attributes at multiple levels. We show importance of the problem by illustrating residual unfairness if macro-level sensitive attributes are not accounted for, or included without accounting for their multi-level nature. Further, in the context of a real-world task of predicting income based on macro and individual-level attributes, we demonstrate an approach for mitigating unfairness, a result of multi-level sensitive attributes.

### #127 Epistemic Reasoning for Machine Ethics with Situation Calculus

Maurice Pagnucco, David Rajaratnam, Raynaldio Limarga, Abhaya Nayak, Yang Song

With the rapid development of autonomous machines such as selfdriving vehicles and social robots, there is increasing realisation that machine ethics is important for widespread acceptance of autonomous machines. Our objective is to encode ethical reasoning into autonomous machines following well-defined ethical principles and behavioural norms. We provide an approach to reasoning about actions that incorporates ethical considerations. It builds on Scherl and Levesque’s [29, 30] approach to knowledge in the situation calculus. We show how reasoning about knowledge in a dynamic setting can be used to guide ethical and moral choices, aligned with consequentialist and deontological approaches to ethics. We apply our approach to autonomous driving and social robot scenarios, and provide an implementation framework.

### #215 Quantum Fair Machine Learning

Elija Perrier

In this paper, we inaugurate the field of quantum fair machine learning. We undertake a comparative analysis of differences and similarities between classical and quantum fair machine learning algorithms, specifying how the unique features of quantum computation alter measures, metrics and remediation strategies when quantum algorithms are subject to fairness constraints. We present the first results in quantum fair machine learning by demonstrating the use of Grover’s search algorithm to satisfy statistical parity constraints imposed on quantum algorithms. We provide lower-bounds on iterations needed to achieve such statistical parity within $\epsilon$-tolerance. We extend canonical Lipschitz-conditioned individual fairness criteria to the quantum setting using quantum metrics. We examine the consequences for typical measures of fairness in machine learning context when quantum information processing and quantum data are involved. Finally, we propose open questions and research programmes for this new field of interest to researchers in computer science, ethics and quantum computation.

### #187 Measuring Model Fairness under Noisy Covariates: A Theoretical Perspective

Flavien Prost, Pranjal Awasthi, Alex Beutel, Jilin Chen, Nick Blumm, Ed H. Chi, Li Wei, Aditee Kumthekar, Trevor Potter, Xuezhi Wang

In this work we study the problem of measuring the fairness of a machine learning model under noisy information. Focusing on group fairness metrics, we investigate the particular but common situation when the evaluation requires controlling for the confounding effect of covariate variables. In a practical setting, we might not be able to jointly observe the covariate and group information, and a standard workaround is to then use proxies for one or more of these variables. Prior works have demonstrated the challenges with using a proxy for sensitive attributes, and strong independence assumptions are needed to provide guarantees on the accuracy of the noisy estimates. In contrast, in this work we study using a proxy for the covariate variable and present a theoretical analysis that aims to characterize weaker conditions under which accurate fairness evaluation is possible.Furthermore, our theory identifies potential sources of errors and decouples them into two interpretable parts $\epsobs$ and $\epsun$. The first part $\epsobs$ depends solely on the performance of the proxy such as precision and recall, whereas the second part $\epsun$ captures correlations between all the variables of interest.We show that in many scenarios the error in the estimates is dominated by $\epsobs$ via a linear dependence, whereas the dependence on the correlations $\epsun$ only constitutes a lower order term. As a result we expand the understanding of scenarios where measuring model fairness via proxies can be an effective approach. Finally, we compare, via simulations, the theoretical upper-bounds to the distribution of simulated estimation errors and show that assuming some structure on the data, even weak, is key to significantly improve both theoretical guarantees and empirical results.

### #268 Face Mis-ID: Interrogating Facial Recognition Harms with Community Organizers Using an Interactive Demo

Daniella Raz, Corinne Bintz, Vivian Guetler, Michael Katell, Dharma Dailey, Bernease Herman, P. M. Krafft, Meg Young

This paper reports on the making of an interactive demo to illustrate algorithmic bias in facial recognition. Facial recognition technology has been demonstrated to be more likely to misidentify women and minoritized people. This risk, among others, has elevated facial recognition into policy discussions across the country, where many jurisdictions have already passed bans on its use. Whereas scholarship on the disparate impacts of algorithmic systems is growing, general public awareness of this set of problems is limited in part by the illegibility of machine learning systems to non-specialists. Inspired by discussions with community organizers advocating for tech fairness issues, we created the Face Mis-ID Demo to reveal the algorithmic functions behind facial recognition technology and to demonstrate its risks to policymakers and members of the community. In this paper, we share the design process behind this interactive demo, its form and function, and the design decisions that honed its accessibility, toward its use for improving legibility of algorithmic systems and awareness of the sources of their disparate impacts.

### #233 GAEA: Graph Augmentation for Equitable Access via Reinforcement Learning

Govardana Sachithanandam Ramachandran, Ivan Brugere, Lav R. Varshney, Caiming Xiong

Disparate access to resources by different subpopulations is a prevalent issue in societal and sociotechnical networks. For example, urban infrastructure networks may enable certain racial groups to more easily access resources such as high-quality schools, grocery stores, and polling places. Similarly, social networks within universities and organizations may enable certain groups to more easily access people with valuable information or influence. Here we introduce a new class of problems, Graph Augmentation for Equitable Access (GAEA), to enhance equity in networked systems by editing graph edges under budget constraints. We prove such problems are NP-hard, and cannot be approximated within a factor of $(1-\tfrac{1}{3e})$. We develop a principled, sample- and time- efficient Markov Reward Process (MRP)-based mechanism design framework for GAEA. Our algorithm outperforms baselines on a diverse set of synthetic graphs. We further demonstrate the method on real-world networks, by merging public census, school, and transportation datasets for the city of Chicago and applying our algorithm to find human-interpretable edits to the bus network that enhance equitable access to high-quality schools across racial groups. Further experiments on Facebook networks of universities yield sets of new social connections that would increase equitable access to certain attributed nodes across gender groups.

### #159 The Theory, Practice, and Ethical Challenges of Designing a Diversity-Aware Platform for Social Relations

Laura Schelenz, Ivano Bison, Matteo Busso, Amalia de Götzen, Daniel Gatica-Perez, Fausto Giunchiglia, Lakmal Meegahapola, Salvador Ruiz-Correa

Diversity-aware platform design is a paradigm that responds to the ethical challenges of existing social media platforms. Available platforms have been criticized for minimizing users’ autonomy, marginalizing minorities, and exploiting users’ data for profit maximization. This paper presents a design solution that centers the well-being of users. It presents the theory and practice of designing a diversity-aware platform for social relations. In this approach, the diversity of users is leveraged in a way that allows like-minded individuals to pursue similar interests or diverse individuals to complement each other in a complex activity. The end users of the envisioned platform are students, who participate in the design process. Diversity-aware platform design involves numerous steps, of which two are highlighted in this paper: 1) defining a framework and operationalizing the "diversity" of students, 2) collecting "diversity" data to build diversity-aware algorithms. The paper further reflects on the ethical challenges encountered during the design of a diversity-aware platform.

### #226 FaiR-N: Fair and Robust Neural Networks for Structured Data

Shubham Sharma, Alan Gee, David Paydarfar, Joydeep Ghosh

This paper presents a fairness principle that can be used to evaluate decision-making based on predictions. We propose that a decision rule for decision-making based on predictions is fair when the individuals directly subjected to the implications of the decision enjoy fair equality of chances. We define fair equality of chances as obtaining if and only if the individuals who are equal with respect to the features that justify inequalities in outcomes have the same statistical prospects of being benefited or harmed, irrespective of their morally irrelevant traits. The paper characterizes – in a formal way – the way in which luck is allowed to impact outcomes in order for its influence to be fair. This fairness principle can be used to evaluate decision-making based on predictions, a kind of decision- making that is becoming increasingly important to theorize around in light of the growing prevalence of algorithmic decision-making in healthcare, the criminal justice system, and the insurance industry, among other areas. It can be used to evaluate decision-making rules based on different normative theories and is compatible with the broadest range of normative views according to which inequalities due to brute luck can be fair.

### #32 Machine Learning and the Meaning of Equal Treatment

Approaches to non-discrimination are generally informed by two principles: striving for equality of treatment, and advancing various notions of equality of outcome. We consider when and why there are trade-offs in machine learning between respecting formalistic interpretations of equal treatment and advancing equality of outcome. Exploring a hypothetical discrimination suit against Facebook, we argue that interpretations of equal treatment which require blindness to difference may constrain how machine learning can be deployed to advance equality of outcome. When machine learning models predict outcomes that are unevenly distributed across racial groups, using those models to advance racial justice will often require deliberately taking race into account.We then explore the normative stakes of this tension. We describe three pragmatic policy options underpinned by distinct interpretations and applications of equal treatment. A status quo approach insists on blindness to difference, permitting the design of machinelearning models that compound existing patterns of disadvantage. An industry-led approach would specify a narrow set of domains in which institutions were permitted to use protected characteristics to actively reduce inequalities of outcome. A government-led approach would impose positive duties that require institutions to consider how best to advance equality of outcomes and permit the use of protected characteristics to achieve that goal. We argue that while machine learning offers significant possibilities for advancing racial justice and outcome-based equality, harnessing those possibilities will require a shift in the normative commitments that underpin the interpretation and application of equal treatment in non-discrimination law and the governance of machine learning.

### #259 Differentially Private Normalizing Flows for Privacy-Preserving Density Estimation

Chris Waites, Rachel Cummings

Normalizing flow models have risen as a popular solution to the problem of density estimation, enabling high-quality synthetic data generation as well as exact probability density evaluation. However, in contexts where individuals are directly associated with the training data, releasing such a model raises privacy concerns. In this work, we propose the use of normalizing flow models that provide explicit differential privacy guarantees as a novel approach to the problem of privacy-preserving density estimation. We evaluate the efficacy of our approach empirically using benchmark datasets, and we demonstrate that our method substantially outperforms previous state-of-the-art approaches. We additionally show how our algorithm can be applied to the task of differentially private anomaly detection.

### #119 A Human-in-the-Loop Framework to Construct Context-Aware Mathematical Notions of Outcome Fairness

Mohammad Yaghini, Hoda Heidari, Andreas Krause

Existing mathematical notions of fairness fail to account for the \textbf{context} of decision-making. We argue that moral consideration of contextual factors is an inherently \emph{human} task. So we present a framework to learn \emph{context-aware} mathematical formulations of fairness by eliciting people’s \emph{situated fairness assessments}. Our family of fairness notions corresponds to a new interpretation of economic models of \emph{Equality of Opportunity (EOP)}, and it includes most existing notions of fairness as special cases. Our \emph{human-in-the-loop} approach is designed to learn the appropriate parameters of the EOP family by utilizing human responses to pair-wise questions about decision subjects’ \emph{circumstance} and \emph{deservingness}, and the \emph{harm/benefit} imposed on them. We illustrate our framework in a hypothetical criminal risk assessment scenario by conducting a series of human-subject experiments on Amazon Mechanical Turk. Our work takes an important initial step toward empowering stakeholders to have a voice in the formulation of fairness for Machine Learning.

### #46 RelEx: A Model-Agnostic Relational Model Explainer

Yue Zhang, David Defazio, Arti Ramesh

In recent years, considerable progress has been made on improving the interpretability of machine learning models. This is essential, as complex deep learning models with millions of parameters produce state of the art performance, but it can be nearly impossible to explain their predictions. While various explainability techniques have achieved impressive results, nearly all of them assume each data instance to be independent and identically distributed (iid). This excludes relational models, such as Statistical Relational Learning (SRL), and the recently popular Graph Neural Networks (GNNs), resulting in few options to explain them. While there does exist work on explaining GNNs, GNN-Explainer, they assume access to the gradients of the model to learn explanations, which is restrictive in terms of its applicability across non-differentiable relational models and practicality. In this work, we develop \emph{RelEx}, a \textit{model-agnostic} relational explainer to explain black-box relational models with only access to the outputs of the black-box. \emph{RelEx} is able to explain any relational model, including SRL models and GNNs. We compare \emph{RelEx} to the state-of-the-art relational explainer, GNN-Explainer, and relational extensions of iid explanation models and show that \emph{RelEx} achieves comparable or better performance, while remaining model-agnostic.

### #240 Skilled and Mobile: Survey Evidence of AI Researchers’ Immigration Preferences

Remco Zwetsloot, Baobao Zhang, Noemi Dreksler, Lauren Kahn, Markus Anderljung, Allan Dafoe, Michael C. Horowitz

Countries, companies, and universities are increasingly competing over top-tier artificial intelligence (AI) researchers. Where are these researchers likely to immigrate and what affects their immigration decisions? We conducted a survey $(n = 524)$ of the immigration preferences and motivations of researchers that had papers accepted at one of two prestigious AI conferences: the Conference on Neural Information Processing Systems (NeurIPS) and the International Conference on Machine Learning (ICML). We find that the U.S. is the most popular destination for AI researchers, followed by the U.K., Canada, Switzerland, and France. A country’s professional opportunities stood out as the most common factor that influences immigration decisions of AI researchers, followed by lifestyle and culture, the political climate, and personal relations. The destination country’s immigration policies were important to just under half of the researchers surveyed, while around a quarter noted current immigration difficulties to be a deciding factor. Visa and immigration difficulties were perceived to be a particular impediment to conducting AI research in the U.S., the U.K., and Canada. Implications of the findings for the future of AI talent policies and governance are discussed.