"It is important to remember that transparency as a principle of AI ethics _differs a bit in meaning from the everyday use of the term. The common dictionary understanding of transparency defines it as _either _(1) the quality an object has when one can see clearly through it or (2) the quality of a situation or process that can be clearly justified and explained because it is open to inspection and free from secrets.Transparency as a principle of AI ethics encompasses _both _of these meanings:On the one hand, transparent AI involves the interpretability of a given AI system, i.e. the ability to know how and why a model performed the way it did in a specific context and therefore to understand the rationale behind its decision or behaviour. This sort of transparency is often referred to by way of the metaphor of ‘opening the black box’ of AI. It involves _content clarification and intelligibility _or explicability.On the other hand, transparent AI involves the justifiability both of the processes that go into its design and implementation and of its outcome. It therefore involves the _soundness of the justification of its use. In this more normative meaning, transparent AI is practically justifiable _in an unrestricted way if one can demonstrate that both the design and implementation processes that have gone into the particular decision or behaviour of a system and the decision or behaviour itself are ethically permissible, non-discriminatory/fair, and worthy of public trust/safety-securing.Three critical tasks for designing and implementing transparent AIThis two-pronged definition of transparency as a principle of AI ethics asks that you to think about transparent AI both in terms of the _process _behind it (the design and implementation practices that lead to an algorithmically supported outcome) and in terms of its _product _(the content and justification of that outcome). 
Such a process/product distinction is crucial, because it clarifies the three tasks that your team will be responsible for in safeguarding the transparency of your AI project:

- Process Transparency, Task 1: Justify Process. In offering an explanation to affected stakeholders, you should be able to demonstrate that considerations of ethical permissibility, non-discrimination/fairness, and safety/public trustworthiness were operative end-to-end in the design and implementation processes that led to an automated decision or behaviour. This task will be supported both by following the best practices outlined herein throughout the AI project lifecycle and by putting in place robust auditability measures through an accountability-by-design framework.
- Outcome Transparency, Task 2: Clarify Content and Explain Outcome. In offering an explanation to affected stakeholders, you should be able to show, in plain language that is understandable to non-specialists, how and why a model performed the way it did in a specific decision-making or behavioural context. You should therefore be able to clarify and communicate the rationale behind its decision or behaviour. This explanation should be _socially meaningful_ in the sense that the terms and logic of the explanation should not simply reproduce the formal characteristics or the technical meanings and rationale of the mathematical model but should rather be translated into the everyday language of human practices and therefore be understandable in terms of the societal factors and relationships that the decision or behaviour implicates.
- Outcome Transparency, Task 3: Justify Outcome. In offering an explanation to affected stakeholders, you should be able to demonstrate that a specific decision or behaviour of your system is ethically permissible, non-discriminatory/fair, and worthy of public trust/safety-securing.
This outcome justification should take the content clarification/explicated outcome from Task 2 as its starting point and weigh that explanation against the justifiability criteria adhered to throughout the design and use pipeline: ethical permissibility, non-discrimination/fairness, and safety/public trustworthiness. Undertaking an optimal approach to process transparency from the start should support and safeguard this demand for normative explanation and outcome justification." (Leslie, 2019, pp. 30-31)
## "Process Transparency: Establishing a Process-Based Governance FrameworkThe central importance of the end-to-end operability of good governance practices should guide your strategy to build out responsible AI project workflow processes. Three components are essential to creating a such a responsible workflow: (1) Maintaining strong regimes of professional and institutional transparency; (2) Having a clear and accessible Process-Based Governance Framework (PBG Framework); (3) Establishing a well-defined auditability trail in your PBG Framework through robust activity logging protocols that are consolidated digitally in a process log.1.Professional and Institutional Transparency: At every stage of the design and implementation of your AI project, team members should be held to rigorous standards of conduct that secure and maintain professionalism and institutional transparency. These standards should include the core values of integrity, honesty, sincerity, neutrality, objectivity and impartiality. All professionals involved in the research, development, production, and implementation of AI technologies are, first and foremost, acting as fiduciaries of the public interest and must, in keeping with these core values of the Civil Service, put the obligations to serve that interest above any other concerns.Furthermore, from start to finish of the AI project lifecycle, the design and implementation process should be as transparent and as open to public scrutiny as possible with restrictions on accessibility to relevant information limited to the reasonable protection of justified public sector confidentiality and of analytics that may tip off bad actors to methods of gaming the system of service provision.2.Process-Based Governance Framework: So far, this guide has presented some of the main steps that are necessary for establishing responsible innovation practices in your AI project. 
Perhaps the most vital of these measures is the effective operationalisation of the values and principles that underpin the development of ethical and safe AI. By organising all of your governance considerations and actions into a PBG Framework, you will be better able to accomplish this task.

The purpose of a PBG Framework is to provide a template for the integration of the norms, values, and principles that motivate and steer responsible innovation with the actual processes that characterise the AI design and development pipeline. While the accompanying Guide has focused primarily on the Cross Industry Standard Process for Data Mining (CRISP-DM), keep in mind that such a structured integration of values and principles with innovation processes is just as applicable in other related workflow models like Knowledge Discovery in Databases (KDD) and Sample, Explore, Modify, Model, and Assess (SEMMA).

Your PBG Framework should give you a landscape view of the governance procedures and protocols that organise the control structures of your project workflow. Constructing a good PBG Framework will provide you and your team with a big picture of:

- The relevant team members and roles involved in each governance action.
- The relevant stages of the workflow in which intervention and targeted consideration are necessary to meet governance goals.
- Explicit timeframes for any necessary follow-up actions, re-assessments, and continual monitoring.
- Clear and well-defined protocols for logging activity and for instituting mechanisms to assure end-to-end auditability." (Leslie, 2019, pp. 31-33)
3. Enabling Auditability with a Process Log: With your controls in place and your governance framework organised, you will be better able to manage and consolidate the information necessary to assure end-to-end auditability. This information should include both the records and activity-monitoring results that are yielded by your PBG Framework and the model development data gathered across the modelling, training, testing, verifying, and implementation phases.

By centralising your information digitally in a process log, you are preparing the way for optimal process transparency. A process log will enable you to make available, in one place, information that may assist you in demonstrating to concerned parties and affected decision subjects both the responsibility of design and use practices and the justifiability of the outcomes of your system’s processing behaviour.

Such a log will also allow you to differentially organise the accessibility and presentation of the information yielded by your project. Not only is this crucial to preserving and protecting data that legitimately should remain unavailable for public view, it will also afford your team the capacity to tailor the presentation of results to different tiers of stakeholders with different interests and levels of expertise. This ability to curate your explanations with the user-receiver in mind will be vital to achieving the goals of interpretable and justifiable AI.
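To make the idea of a digitally consolidated process log concrete, here is a minimal sketch in Python. The field names, the tamper-evident hash chain, and the example entries are all illustrative assumptions, not a format prescribed by the guide.

```python
import hashlib
import json
from datetime import datetime, timezone


class ProcessLog:
    """Append-only log of governance actions across the AI project lifecycle.

    Each entry records who did what, at which workflow stage, and when; the
    chained SHA-256 hashes make after-the-fact tampering detectable, which
    supports end-to-end auditability.
    """

    def __init__(self):
        self.entries = []

    def record(self, stage, actor, action, details=None):
        prev_hash = self.entries[-1]["hash"] if self.entries else ""
        entry = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "stage": stage,          # e.g. "data-preparation", "model-training"
            "actor": actor,          # team member or role responsible
            "action": action,        # governance action taken
            "details": details or {},
            "prev_hash": prev_hash,  # chains this entry to the previous one
        }
        entry["hash"] = hashlib.sha256(
            json.dumps(entry, sort_keys=True).encode()
        ).hexdigest()
        self.entries.append(entry)
        return entry


# Hypothetical example entries from two lifecycle stages.
log = ProcessLog()
log.record("data-preparation", "data-steward", "bias-audit-completed",
           {"protected_attributes_checked": ["age", "sex"]})
log.record("model-training", "ml-engineer", "model-card-updated")
```

Because every entry carries a role, a stage, and a timestamp, such a log can later be filtered and presented differently to different tiers of stakeholders, as the passage above suggests.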
## Outcome Transparency: Explaining Outcome and Clarifying Content

Beyond enabling process transparency through your PBG Framework, you must also put in place standards and protocols to ensure that clear and understandable explanations of the outcomes of your AI system’s decisions, behaviours, and problem-solving tasks can:

1. Properly inform the evidence-based judgments of the implementers that they are designed to support;
2. Be offered to affected stakeholders and concerned parties in an accessible way.

This is a multifaceted undertaking that will demand careful forethought and participation across your entire project team. There is no simple technological solution for how to effectively clarify and convey the rationale behind a model’s output in a particular decision-making or behavioural context. Your team will have to use sound judgement and common sense in order to bring together the technical aspects of choosing, designing, and using a sufficiently interpretable AI system and the delivery aspects of being able to clarify and communicate, in plain, non-technical, and socially meaningful language, how and why that system performed the way it did in a specific decision-making or behavioural context.

Having a good grasp of the rationale and criteria behind the decision-making and problem-solving behaviour of your system is essential for producing safe, fair, and ethical AI. If your AI model is not sufficiently interpretable—if you aren’t able to draw from it humanly understandable explanations of the factors that played a significant role in determining its behaviours—then you may not be able to tell how and why things go wrong in your system when they do.

This is a crucial and unavoidable issue for reasons we have already explored. Ensuring the safety of high-impact systems in transportation, medicine, infrastructure, and security requires human verification that these systems have properly learned the critical tasks they are charged to complete.
It also requires confirmation that, when confronted with unfamiliar circumstances, anomalies, and perturbations, these systems will not fail or make unintuitive errors. Moreover, ensuring that these systems operate without causing discriminatory harms requires effective ways to detect and to mitigate sources of bias and inequitable influence that may be buried deep within their feature spaces, inferences, and architectures. Without interpretability, each one of these tasks necessary for delivering safe and morally justifiable AI will remain incomplete.

## Defining Interpretable AI

To gain a foothold in both the technical and delivery dimensions of AI interpretability, you will first need a solid working definition of what interpretable AI is. To this end, it may be useful to recall once again the definition of AI offered in the accompanying Guide: ‘Artificial Intelligence is the science of _making computers do things that require intelligence when done by humans_.’

This characterisation is important, because it brings out an essential feature of the explanatory demands of interpretable AI: to do things that require intelligence when done by humans means to do things that require reasoning processes and cognitive functioning. This cognitive dimension has a direct bearing on how you should think about offering suitable explanations of algorithmically generated outcomes:

Explaining an algorithmic model’s decision or behaviour should involve making explicit how the particular set of factors which determined that outcome can play the role of evidence in supporting the conclusion reached. It should involve making intelligible to affected individuals the rationale behind that decision or behaviour as if it had been produced by a reasoning, evidence-using, and inference-making person.

What makes this explanation-giving task so demanding when it comes to AI systems is that reasoning processes do not occur, for humans, at just one level.
Rather, human-scale reasoning and interpreting includes:

1. Aspects of logic (applying the basic principles of validity that lie behind and give form to sound thinking): this aspect aligns with the need for formal or logical explanations of AI systems.
2. Aspects of semantics (gaining an understanding of how and why things work the way they do and what they mean): this aspect aligns with the need for explanations of the technical rationale behind the outcomes of AI systems.
3. Aspects of the social understanding of practices, beliefs, and intentions (clarifying the content of interpersonal relations, societal norms, and individual objectives): this aspect aligns with the need for the clarification of the socially meaningful content of the outcomes of AI systems.
4. Aspects of moral justification (making sense of what should be considered right and wrong in our everyday activities and choices): this aspect aligns with the justifiability of AI systems.

There are good reasons why _all four_ of these dimensions of human reasoning processes must factor into explaining the decisions and behaviours of AI systems. First and most evidently, understanding the logic and technical inner workings (i.e. semantic content) of these systems is a precondition for ensuring their safety and fairness. Secondly, because they are designed and used to achieve human objectives and to fulfil surrogate cognitive functions in the everyday social world, we need to make sense of these systems in terms of the consequential roles that their decisions and behaviours play in that human reality. The social context of these outcomes matters greatly.
Finally, because they actually affect individuals and society in direct and morally consequential ways, we need to be able to understand and explain their outcomes not just in terms of their mathematical logic, technical rationale, and social context but also in terms of the justifiability of their impacts on people.

Delving more deeply into the technical and delivery aspects of interpretable AI will show how these four dimensions of human reasoning directly line up with the different levels of demand for explanations of the outcomes of AI systems. In particular, the logical and semantic dimensions will weigh heavily in technical considerations, whereas the social and moral dimensions will be significant at the point of delivery.

Note here, though, that these different dimensions of human reasoning are not mutually exclusive but build off and depend upon each other in significant and cascading ways. Explanations of interpretable AI should therefore be approached holistically and inclusively. Technical explanation of the logic and rationale of a given model, for instance, should be seen as a support for the context-based clarification of its socially meaningful content, just as that socially meaningful content should be viewed as forming the basis of explaining an outcome’s moral justifiability. When considering how to make the outcomes of decision-making and problem-solving AI systems maximally transparent to affected stakeholders, you should take this rounded view of human reasoning into account, because it will help you address more effectively the spectrum of concerns that these stakeholders may have.

## Technical aspects of choosing, designing, and using an interpretable AI system

Keep in mind that, while on the face of it the task of choosing between the numerous AI and machine learning algorithms may seem daunting, it need not be so.
By sticking to the priority of outcome transparency, you and your team will be able to follow some straightforward and simple guidelines for selecting sufficiently interpretable but optimally performing algorithmic techniques.

Before exploring these guidelines, it is necessary to provide some background information to help you better understand what facets of explanation are actually involved in technically interpretable AI. A good grasp of what is actually needed from such an explanation will enable you to effectively target the interpretability needs of your AI project.

Facets of explanation in technically interpretable AI: A good starting point for understanding how the technical dimension of explanation works in interpretable AI systems is to remember that these systems are largely mathematical models that carry out step-by-step computations in transforming sets of statistically interacting or independent inputs into sets of target outputs. Machine learning is, at bottom, applied statistics and probability theory fortified with several other mathematical techniques. As such, it is subject to the same methodologically rigorous requirements of logical validation as other mathematical sciences.

Such a demand for rigour informs the facet of formal and logical explanation of AI systems that is sometimes called the mathematical glass box. This characterisation refers to the transparency of strictly formal explanation: no matter how complicated it is (even in the case of a deep neural net with a hundred million parameters), an algorithmic model is a closed system of effectively computable operations in which rules and transformations are mechanically applied to inputs to determine outputs.
In this restricted sense, all AI and machine learning models are fully intelligible and mathematically transparent, if only formally and logically so. This is an important characteristic of AI systems, because it makes it possible for supplemental and eminently interpretable computational approaches to model, approximate, and simplify even the most complex and high-dimensional among them. In fact, such a possibility fuels some of the technical approaches to interpretable AI that will soon be explored.

This formal way of understanding the technical explanation of AI and machine learning systems, however, has immediate limitations. It can tell us that a model is mathematically intelligible because it operates according to a collection of fixed operations and parameters, but it cannot tell us much about how or why the components of the model transformed a specified group of inputs into their corresponding outputs. It cannot tell us anything about the _rationale_ behind the algorithmic generation of a given outcome.

This second dimension of technical explanation has to do with the _semantic facet_ of interpretable AI. A semantic explanation offers an interpretation of the functions of the individual parts of the algorithmic system in the generation of its output. Whereas formal and logical explanation presents an account of the stepwise application of the procedures and rules that comprise the formal framework of the algorithmic system, semantic explanation helps us to understand the meaning of those procedures and rules in terms of their purpose in the input-output mapping operation of the system, i.e. what role they play in determining the outcome of the model’s computation.

The difficulties surrounding the interpretability of algorithmic decisions and behaviours arise in this semantic dimension of technical explanation.
It is easiest to illustrate this by starting from the simplest case. When a machine learning model is very basic, the task of following the rationale of how it transforms a given set of inputs into a given set of outputs can be relatively unproblematic. For instance, in the simple linear regression _y = a + bx + e_, with a single predictor variable _x_ and a response variable _y_, the predictive relationship of _x_ to _y_ is directly expressed in the regression coefficient _b_, representing the rate and direction at which _y_ is predicted to change as _x_ changes. This hypothetical model is completely interpretable from the technical perspective for the following reasons:

- Linearity: Any change in the value of the predictor variable is directly reflected in a change in the value of the response variable at a constant rate _b_. The interpretable prediction yielded by the model can therefore be directly inferred. This linearity dimension of predictive models has been an essential feature of automated decision-making systems in many heavily regulated and high-impact sectors, because the predictions yielded have high inferential clarity and strength.
- Monotonicity: When the value of the predictor changes in a given direction, the value of the response variable changes consistently either in the same or the opposite direction. The interpretable prediction yielded by the model can thus be directly inferred. This monotonicity dimension is also a highly desirable interpretability condition of predictive models in many heavily regulated sectors, because it incorporates reasonable expectations about the consistent application of sector-specific selection constraints into automated decision-making systems.
So, for example, if the selection criteria to gain employment at an agency or firm include taking an exam, a reasonable expectation of outcomes would be that if candidate A scored better than candidate B, then candidate B, all other things being equal, would not be selected for employment when A is not. A monotonic predictive model that uses the exam score as the predictor variable and application success as the response variable would, in effect, guarantee that this expectation is met by disallowing situations where A scores better than B but B gets selected and A does not.

- Non-Complexity: The number of features (dimensionality) and feature interactions is low enough, and the mapping function simple enough, to enable a clear ‘global’ understanding of the function of each part of the model in relation to its outcome.

While all three of these desirable interpretability characteristics of the imagined model allow for direct and intuitive reasoning about the relation of the predictor and response variables, the model itself is clearly too minimal to capture the density of relationships and interactions between attributes in complex real-world situations, where some degree of noisiness is unavoidable and the task of apprehending the subtleties of underlying data distributions is tricky. In fact, one of the great strides forward enabled by the contemporary convergence of expanding computing power and big data availability with more advanced machine learning models has been exactly this capacity to better capture and model the intricate and complicated dynamics of real-world situations.
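As a minimal illustration of how directly interpretable the simple linear model discussed above is, the sketch below fits y = a + bx by ordinary least squares and reads the slope b straight off as the predicted change in y per unit change in x. The data points are invented purely for illustration.

```python
# Made-up data: a single predictor x (e.g. an exam score, rescaled)
# and a response y. In a real project these would come from your dataset.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 4.0, 6.2, 7.9, 10.1]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Closed-form ordinary-least-squares estimates for slope b and intercept a.
b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
    sum((x - mean_x) ** 2 for x in xs)
a = mean_y - b * mean_x

# Interpretation: each one-unit increase in x is predicted to change y
# by b units -- linear, monotonic, and globally interpretable.
print(f"fitted model: y = {a:.2f} + {b:.2f}x")
```

Because the whole rationale of the model lives in two numbers, the explanation offered to a decision subject is exactly the model itself; this is the inferential clarity that the linearity and monotonicity conditions above describe.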
Still, this incorporation of the complexity of scale into the models themselves has also meant significant challenges to the semantic dimension of the technical explanation of AI systems.

As machine learning systems have come to possess both ever greater access to big data and increasing computing power, their designers have correspondingly been able both to enlarge the feature spaces (the number of input variables) of these systems and to turn to gradually more complex mapping functions. In many cases, this has meant vast improvements in the predictive and classificatory performance of more accurate and expressive models, but it has also meant the growing prevalence of non-linearity, non-monotonicity, and high-dimensional complexity in an expanding array of so-called ‘black box’ models.

Once high-dimensional feature spaces and complex functions are introduced into machine learning systems, the effects of changes in any given input become so entangled with the values and interactions of other inputs that understanding how individual components are transformed into outputs becomes extremely difficult. The complex and unintuitive curves of the decision functions of many of these models preclude linear and monotonic relations between their inputs and outputs. Likewise, the high dimensionality of their optimisation techniques—frequently involving millions of parameters and complex correlations—ranges well beyond the limits of human-scale cognition and understanding." (Leslie, 2019, pp. 33-39)

"These rising tides of computational complexity and algorithmic opacity consequently pose a key challenge for the responsible design and deployment of safe, fair, and ethical AI systems: how should the potential to advance the public interest through the implementation of high-performing but increasingly uninterpretable machine learning models be weighed against the tangible risks posed by the lack of interpretability of such systems?
A careful answer to this question is, in fact, not so simple. While the trade-off between performance and interpretability may be real and important in _some_ domain-specific applications, in others there exist increasingly sophisticated developments of standard interpretable techniques, such as regression extensions, decision trees, and rule lists, that may prove just as effective for use cases where the need for transparency is paramount. Furthermore, supplemental interpretability tools, which function to make ‘black box’ models more semantically and qualitatively explainable, are rapidly advancing day by day.

These are all factors that you and your team should consider as you work together to decide which models to use for your AI project. As a starting point for those considerations, let us now turn to some basic guidelines that may help you steer that dialogue toward points of relevance and concern.

_Guidelines for designing and delivering a sufficiently interpretable AI system_

You should use the table below to begin thinking about how to integrate interpretability into your AI project. While aspects of this topic can become extremely technical, it is important to make sure that dialogue about making your AI system interpretable remains multidisciplinary and inclusive. Moreover, it is crucial that key stakeholders be given adequate consideration when deciding upon the delivery mechanisms of your project. These should include policy or operational design leads, the technical personnel in charge of operating the trained models, the implementers of the models, and the decision subjects who are affected by their outcomes.

Note that the first three guidelines focus on the big-picture issues you will need to consider in order to incorporate interpretability needs into your project planning and workflow, whereas the last two guidelines shift focus to the user-centred requirements of designing and implementing a sufficiently interpretable AI system."
(Leslie, 2019, pp. 39-40)
## Guidelines for designing and delivering a sufficiently interpretable AI system

### Guideline 1: Look first to context, potential impact, and domain-specific need when determining the interpretability requirements of your project

There are several related factors that should be taken into account as you formulate your project’s approach to interpretability:

1. Type of application: Start by assessing both the kind of tool you are building and the environment in which it will apply. Clearly there is a big difference between a computer vision system that sorts handwritten employee feedback forms and one that sorts safety risks at a security checkpoint. Likewise, there is a big difference between a random forest model that triages applicants at a licensing agency and one that triages sick patients in an emergency department. Understanding your AI system’s purpose and context of application will give you a better idea of the stakes involved in its use and hence also a good starting point to think about the scope of its interpretability needs. For instance, low-stakes AI models that are not safety-critical, do not directly impact the lives of people, and do not process potentially sensitive social and demographic data will likely have a lower need for extensive resources to be dedicated to a comprehensive interpretability platform.

2. Domain specificity: By acquiring solid domain knowledge of the environment in which your AI system will operate, you will gain better insight into any potential sector-specific standards of explanation or benchmarks of justification which should inform your approach to interpretability.
Through such knowledge, you may also obtain useful information about organisational and public expectations regarding the scope, content, and depth of explanations that have been previously offered in relevant use cases.

3. Existing technology: If one of the purposes of your AI project is to replace an existing algorithmic technology that may not offer the same sort of expressive power or performance level as the more advanced machine learning techniques you are planning to deploy, you should carry out an assessment of the performance and interpretability levels of the existing technology. Acquiring this knowledge will provide you with an important reference point when you are considering possible trade-offs between performance and interpretability that may occur in your own prospective system. It will also allow you to weigh the costs and benefits of building a more complex system with higher interpretability-support needs against the costs and benefits of using a simpler model.

### Guideline 2: Draw on standard interpretable techniques when possible

In order to actively integrate the aim of sufficient interpretability into your AI project, your team should approach the model selection and development process with the goal of finding the right fit between (1) domain-specific risks and needs, (2) available data resources and domain knowledge, and (3) task-appropriate machine learning techniques. Effectively assimilating these three aspects of your use case requires open-mindedness and practicality.

Often, high-impact, safety-critical, or other potentially sensitive environments will heighten demands for the thoroughgoing accountability and transparency of AI projects. In some of these instances, such demands may make choosing standard but sophisticated non-opaque techniques an overriding priority.
These techniques may include decision trees, linear regression and its extensions like generalised additive models, decision/rule lists, case-based reasoning, or logistic regression. In many cases, reaching for the ‘black box’ model first may not be appropriate and may even lead to inefficiencies in project development, because more interpretable models, which perform very well but do not require supplemental tools and techniques for facilitating interpretable outcomes, are also available.

Again, solid domain knowledge and context awareness are key here. In use cases where data resources lend themselves to well-structured, meaningful representations and domain expertise can be incorporated into model architectures, interpretable techniques may often be more desirable than opaque ones. Careful data pre-processing and iterative model development can, in these cases, hone the accuracy of such interpretable systems in ways that make the advantages gained by the combination of their performance and transparency outweigh the benefits of more semantically opaque approaches.

In other use cases, however, data processing needs may disqualify the deployment of these sorts of straightforward interpretable systems. For instance, when AI applications are sought for classifying images, recognising speech, or detecting anomalies in video footage, the most effective machine learning approaches will likely be opaque. The feature spaces of these kinds of AI systems grow exponentially to hundreds of thousands or even millions of dimensions. At this scale of complexity, conventional methods of interpretation no longer apply.
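To make the appeal of the transparent techniques mentioned above concrete, here is a toy sketch of a decision/rule list for a hypothetical licensing-triage task. The feature names, thresholds, and priority bands are all invented for illustration; a real system would derive its rules from data and domain expertise rather than hard-code them.

```python
# Hypothetical rule list for triaging licence applications. Every output
# is produced by exactly one human-readable rule, so the rationale behind
# each decision can be reported verbatim to the affected applicant.

def triage(applicant):
    """Return a priority band together with the rule that produced it."""
    if not applicant["documents_complete"]:
        return "defer", "application incomplete"
    if applicant["years_experience"] >= 5 and applicant["exam_score"] >= 70:
        return "fast-track", "experienced and passed exam threshold"
    if applicant["exam_score"] >= 70:
        return "standard", "passed exam threshold"
    return "review", "below exam threshold, manual review required"


band, reason = triage({"documents_complete": True,
                       "years_experience": 6,
                       "exam_score": 81})
```

Because the model is the explanation, this kind of system satisfies the linearity-of-rationale that the guide associates with outcome transparency; the trade-off, as the passage notes, is that such rule lists cannot capture the high-dimensional structure of tasks like image or speech recognition.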
Indeed, it is the unavoidability of hitting such an interpretability wall for certain important applications of supervised, unsupervised, and reinforcement learning that has given rise to an entire subfield of machine learning research which focuses on providing technical tools to facilitate interpretable and explainable AI.When the use of ‘black box’ models best fits the purpose of your AI project, you should proceed diligently and follow the procedures recommended in Guideline
1. For clarity, let us define a ‘black box’ model as any AI system whose inner workings and rationale are opaque or inaccessible to human understanding. These systems may include neural networks (including recurrent, convolutional, and deep neural nets), ensemble methods (an algorithmic technique, such as the random forest method, that strengthens an overall prediction by combining and aggregating the results of several or many different base models), and support vector machines (a classifier that uses a special type of mapping function to build a divider between two sets of features in a high-dimensional feature space).

Guideline 3: When considering the use of ‘black box’ AI systems, you should:

1. Thoroughly weigh up impacts and risks;
2. Consider the options available for supplemental interpretability tools that will ensure a level of semantic explanation which is both domain appropriate and consistent with the design and implementation of safe, fair, and ethical AI;
3. Formulate an interpretability action plan, so that you and your team can put adequate forethought into how explanations of the outcomes of your system’s decisions, behaviours, or problem-solving tasks can be optimally provided to users, decision subjects, and other affected parties.

It may be helpful to explore each of these three suggested steps of assessing the viability of the responsible design and implementation of a ‘black box’ model in greater detail.

(1) Thoroughly weigh up impacts and risks: Your first step in evaluating the feasibility of using a complex AI system should be to focus on issues of ethics and safety.
As a general policy, you and your team should utilise ‘black box’ models only:

· where their potential impacts and risks have been thoroughly considered in advance, and you and your team have determined that your use case and domain-specific needs support the responsible design and implementation of these systems;
· where supplemental interpretability tools provide your system with a domain-appropriate level of semantic explainability that is reasonably sufficient to mitigate its potential risks and that is therefore consistent with the design and implementation of safe, fair, and ethical AI.

(2) Consider the options available for supplemental interpretability tools: Next, you and your team should assess whether there are technical methods of explanation-support that both satisfy the specific interpretability needs of your use case, as determined by the deliberations suggested in Guideline 1, and are appropriate for the algorithmic approach you intend to use. You should consult closely with your technical team at this stage of model selection. The exploratory processes of trial-and-error, which often guide this discovery phase in the innovation lifecycle, should be informed and constrained by a solid working knowledge of the technical art of the possible in the domain of available and usable interpretability approaches. The task of aligning the model selection process with the demands of interpretable AI requires a few conceptual tools that will enable thoughtful evaluation of whether proposed supplemental interpretability approaches sufficiently meet your project’s explanatory needs. First and most importantly, you should be prepared to ask the right questions when evaluating any given interpretability approach.
This involves establishing with as much clarity as possible how the explanatory results of that approach can contribute to the user’s ability to offer solid, coherent, and reasonable accounts of the rationale behind any given algorithmically generated output. Relevant questions to ask that can serve this end are:

· What sort of explanatory resources will the interpretability tool provide users and implementers in order (1) to enable them to exercise better-informed, evidence-based judgments and (2) to assist them in offering plausible, sound, and reasonable accounts of the logic behind algorithmically generated output to affected individuals and concerned parties?
· Will the explanatory resources that the interpretability tool offers be useful for providing affected stakeholders with a sufficient understanding of a given outcome?
· How, if at all, might the explanatory resources offered by the tool be misleading or confusing?

You and your team should take these questions as a starting point for evaluating prospective interpretability tools. These tools should be assessed in terms of their capacities to render the reasoning behind the decisions and behaviours of uninterpretable ‘black box’ systems sufficiently intelligible to users and affected stakeholders, given use case and domain-specific interpretability needs. Keeping this in mind, there are two technical dimensions of supplemental interpretability approaches that should be systematically incorporated into evaluation processes at this stage of the innovation workflow. The first involves the possible explanatory strategies you choose to pursue over the course of the design and implementation lifecycle. Such strategies will largely determine the paths to understanding you will be able to provide for your system’s users and decision subjects.
They will largely define how you explain your model and its outcomes, and hence what kinds of explanation you are able to offer. The second involves the coverage and scope of the actual explanations themselves. The choices you make about explanatory coverage will determine the extent to which the kinds of explanations you are planning to pursue will address single instances of the model’s outputs or range more broadly to cover the underlying rationale of its behaviour in general and across instances. Choices you make about explanatory coverage will largely govern the extent to which your AI system is locally and/or globally interpretable. The very broad-brushed overview of these two dimensions that follows is just meant to orient you to some of the basic concepts in an expanding field of research, so that you are better prepared to work with your technical team in thinking through the strengths and weaknesses of various approaches. Note, additionally, that this is a rapidly developing area. Relevant members of your team should keep abreast of the latest developments in the field of interpretable AI, or XAI (explainable AI).

Two technical dimensions of supplemental interpretability approaches:

1. Determining explanatory strategies: To achieve the goal of securing a sufficiently interpretable AI system, you and your team will need to get clear on how to explain your model and its outcomes. The explanatory strategies you decide to pursue will shape the paths to understanding you are able to provide for the users of your model and for its decision subjects. There are four such explanatory strategies to which you should pay special attention:

a) Internal explanation: Pursuing the internal explanation of an opaque model involves making intelligible how the components and relationships within it function. There are two ways that such a goal of internal explanation can be interpreted.
On the one hand, it can be seen as an endeavour to explain the operation of the model by considering it globally, as a comprehensible whole. Here, the aspiration is to ‘pry open the black box’ by building an explanatory model that enables a full grasp of the opaque system’s internal contents. The strengths and weaknesses of such an approach will be discussed in the next section on global interpretability. On the other hand, the search for internal explanation can indicate the pursuit of a kind of engineering insight. In this sense, internal explanation can be seen as attempting to shed descriptive and inferential light on the parts and operation of the system as a whole in order to try to make it work better. Acquiring this sort of internal understanding of the more general relationships that the working parts of a trained model have with patterns of its responses can allow researchers to advance step-by-step in gaining a better data-scientific grasp on why it does what it does and how to improve it. Similarly, this type of internal explanation can be seen as attempting to shed light on an opaque model’s operation by breaking it down into more understandable, analysable, and digestible parts (for instance, in the case of a DNN: into interpretable characteristics of its vectors, features, layers, parameters, etc.). From a practical point of view, this kind of aspiration to engineering insight in the service of data-scientific advancement should inform the goals of your technical team throughout the model selection and design workflow. Numerous methods exist to help provide informative representations of the inner workings of various ‘black box’ systems.
Gaining a clearer descriptive understanding of the internal composition of your system will contribute greatly to your project’s ability to achieve a higher degree of outcome transparency and to its capacity to foster best practices in the pursuit of responsible data science in general.

b) External or post-hoc explanation: External or post-hoc explanation attempts to capture essential attributes of the observable behaviour of a ‘black box’ system by subjecting it to a number of different techniques that reverse-engineer explanatory insight. Some post-hoc approaches test the sensitivity of the outputs of an opaque model to perturbations in its inputs; others allow for the interactive probing of its behavioural characteristics; others, still, build proxy-based models that utilise simplified interpretable techniques to gain a better understanding of particular instances of its predictions and classifications. This external or post-hoc approach has, at present, established itself in machine learning research as a go-to explanatory strategy, and for good reason. It allows data scientists to pose mathematical questions to their opaque systems by testing them and by building supplemental models which enable greater insight through the inferences drawn from their experimental interventions. Such a post-hoc approach allows them, moreover, to seek out evidence for the reasoning behind a given opaque model’s prediction or classification by utilising maximally interpretable techniques such as linear regression, decision trees, rule lists, or case-based reasoning. Several examples of post-hoc explanation will be explored below in the section on local interpretability. Take note, though, that, as some critics have rightly pointed out, because they are approximations or simplified supplemental models of the more complex originals, many post-hoc explanations can fail to accurately represent certain areas of the opaque model’s feature space.
This deterioration of accuracy in parts of the original model’s domain can frequently produce misleading and uncertain results in the post-hoc explanations of concern.

c) Supplemental explanatory infrastructure: A different kind of explanatory strategy involves actually incorporating secondary explanatory facilities into the system you are building. For instance, an image recognition system could have a primary component, such as a convolutional neural net, that extracts features from its inputs and classifies them, while a secondary component, such as a built-in recurrent neural net with an ‘attention-directing’ mechanism, translates the extracted features into a natural language representation that produces a sentence-long explanation of the result for the user. In other words, a system like this is designed to provide simple explanations of its own data processing results. Research into integrating ‘attention-based’ interfaces like this into AI systems is continuing to advance toward making their implementations more sensitive to user needs, more explanation-forward, and more human-understandable. For instance, multimodal methods of combining visualisation tools and textual interfaces are being developed that may make the provision of explanations more interpretable for both implementers and decision subjects. Furthermore, the incorporation of domain knowledge and logic-based or convention-based structures into the architectures of complex models is increasingly allowing for better and more user-friendly representations and prototypes to be built into them.
This is gradually enabling more sophisticated explanatory infrastructures to be integrated into opaque systems and makes it essential to think about building explanation-by-design into your AI projects.

d) Counterfactual explanation: While counterfactual explanation is a kind of post-hoc approach, it deserves special attention insofar as it moves beyond other post-hoc explanations to provide affected stakeholders with clear and precise options for actionable recourse and practical remedy. Counterfactual explanations are contrastive explanations: they offer succinct computational reckonings of how specific factors that influenced an algorithmic decision can be changed so that better alternatives can be realised by the subject of that decision. Incorporating counterfactual explanations into your AI system at its point of delivery would allow stakeholders to see which input variables of the model can be modified so that the outcome could be altered to their benefit. Additionally, from a responsible design perspective, incorporating counterfactual explanation into the development and testing phases of your system would allow your team to build a model that incorporates actionable variables, i.e. input variables that afford decision subjects concise options for making practical changes that would improve their chances of obtaining the desired outcome. Counterfactual explanatory strategies can be used as a way to incorporate reasonableness and the encouragement of agency into the design and implementation of your AI project. All that said, it is important to recognise that, while counterfactual explanation does offer an innovative way to contrastively explore how feature importance may influence an outcome, it is not a complete solution to the problem of AI interpretability.
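The counterfactual idea can be made concrete with a toy sketch. Everything below (the scoring rule, its threshold, and the choice of ‘income’ as the actionable variable) is invented purely for illustration and is not a real credit model:

```python
# Hypothetical loan-approval rule; coefficients and threshold are invented.
def approve(income, debt):
    return 0.5 * income - 0.8 * debt >= 20.0

def counterfactual_income(income, debt, step=1.0, max_steps=1000):
    """Smallest increase in the actionable variable 'income' that flips
    a rejection into an approval, holding debt fixed."""
    if approve(income, debt):
        return 0.0  # already approved, no change needed
    for k in range(1, max_steps + 1):
        if approve(income + k * step, debt):
            return k * step
    return None  # no counterfactual found within the search range

# A rejected applicant: 0.5*30 - 0.8*10 = 7, below the threshold of 20.
delta = counterfactual_income(30.0, 10.0)
print(f"Increase income by {delta} to obtain approval.")  # delta == 26.0
```

This is the contrastive form described above: rather than explaining the model's internals, it tells the decision subject what minimal change to an actionable input would have produced the preferred outcome.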
In certain cases, for instance, the sheer number of potentially significant features that could be at play in counterfactual explanations of a given result can make a clear and direct explanation difficult to obtain and can make selected sets of explanations seem potentially arbitrary. Moreover, there are as yet limitations on the types of datasets and functions to which these kinds of explanations are applicable. Finally, because this kind of explanation concedes the opacity of the algorithmic model outright, it is less able to address concerns about potentially harmful feature interactions and multivariate relationships that may be buried deep within the model’s architecture.

(Figure: an at-a-glance typology of these explanatory strategies, not reproduced in this excerpt.)

2. Coverage and scope: The main questions you will need to broach in the dimension of the coverage and scope of your supplemental interpretability approach are: To what extent does our interpretability approach cover the explanation of single predictions or classifications of the model, and to what extent does it cover the explanation of the inner workings and rationale of the model as a whole and across predictions? To what extent does it cover both? This distinction between single-instance and total-model explanation is often characterised as the difference between local interpretability and global interpretability. Both types of explanation offer potentially helpful support for the provision of significant information about the rationale behind an algorithmic decision or behaviour, but both, in their own ways, also face difficulties.

Local interpretability: A local semantic explanation aims to enable the interpretability of individual cases.
The general idea behind attempts to explain a ‘black box’ system in terms of specific instances is that, regardless of how complex the architecture or decision function of that system may be, it is possible to gain interpretive insight into its inner workings by focusing on single data points or neighbourhoods in its feature space. In other words, even if the high dimensionality and curviness of a model makes it opaque as a whole, there is an expectation that insight-generating interpretable methods can be applied locally to smaller sections of the model, where changes in isolated or grouped variables are more manageable and understandable. This general explanatory perspective has yielded several different interpretive strategies that have been successfully applied in significant areas of ‘black box’ machine learning. One family of such strategies has zeroed in on neural networks (DNNs, in particular) by identifying which features of an input vector’s data points make it representative of the target concept that a given model is trying to classify. So, for example, if we have a digital image of a dog that is converted into a vector of pixel values and then processed through a dog-classifying deep neural net, this interpretive approach will endeavour to tell us why the system yielded a ‘dog-positive’ output by isolating the slices of this set of data points that are most relevant to its successful classification by the model. This can be accomplished in several related ways. What is called sensitivity analysis identifies the most relevant features of an input vector by calculating local gradients to determine how a data point has to be moved to change the output label. Here, an output’s sensitivity to such changes in input values identifies the most relevant features.
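A rough sketch of sensitivity analysis can be given with finite-difference gradients; the ‘model’ below is an invented stand-in for a trained black box, chosen so that one feature visibly dominates:

```python
# Sensitivity analysis by numerical local gradients: features whose small
# perturbation changes the output most are deemed most relevant locally.

def black_box(x):
    # Invented opaque function standing in for a trained model.
    return 3.0 * x[0] ** 2 + 0.1 * x[1] + 0.0 * x[2]

def local_gradient(f, x, eps=1e-6):
    """Forward-difference estimate of the gradient of f at point x."""
    grads = []
    for i in range(len(x)):
        bumped = list(x)
        bumped[i] += eps
        grads.append((f(bumped) - f(x)) / eps)
    return grads

x = [1.0, 1.0, 1.0]
g = local_gradient(black_box, x)
most_relevant = max(range(len(g)), key=lambda i: abs(g[i]))
print(g, most_relevant)  # feature 0 dominates at this point
```

Note that the ranking is local: at x[0] = 0 the gradient of the first feature vanishes and the ordering would change, which is exactly why such explanations apply to neighbourhoods rather than to the model as a whole.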
Another method to identify feature relevance that is downstream from sensitivity analysis is called salience mapping, where a strategy of moving backward through the layers of a neural net graph allows for the mapping of patterns of high activation in the nodes and ultimately generates interpretable groupings of salient input variables that can be visually represented in a heat or pixel attribution map.A second local interpretive strategy also seeks to explain feature importance in a single prediction or classification by perturbing input variables. However, instead of using these nudges in the feature space to highlight areas of saliency, it uses them to prod the opaque model in the area around the relevant prediction, so that a supplemental interpretable model can be constructed which establishes the relative importance of features in the black box model’s output.The most well-known example of this strategy is called LIME (Local Interpretable Model-Agnostic Explanation). LIME works by fitting an interpretable model to a specific prediction or classification produced by the opaque system of concern. It does this by sampling data points at random around the target prediction or classification and then using them to build a local approximation of the decision boundary that can account for the features which figure prominently in the specific prediction or classification under scrutiny.The way this works is relatively uncomplicated: LIME generates a simple linear regression model by weighting the values of the data points, which were produced by randomly perturbing the opaque model, according to their proximity to the original prediction or classification. The closest of these values to the instance being explained are weighted the heaviest, so that the supplemental model can produce an explanation of feature importance that is locally faithful to that instance. 
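The perturb, weight, and fit procedure just described can be sketched in miniature. Everything below (the black-box function, the kernel width, the sample count) is invented for illustration, and the example is radically simplified to a single feature; real LIME implementations handle many features, categorical data, and sparse linear surrogates:

```python
import math
import random

# LIME-style local surrogate in one dimension: sample points around the
# instance, weight them by proximity, and fit a weighted linear model
# whose slope reports the local effect of the feature.

def black_box(x):
    return math.sin(x)  # treated as opaque by the "explainer"

def lime_1d(f, x0, n=200, sigma=0.1, seed=0):
    rng = random.Random(seed)
    xs = [x0 + rng.gauss(0.0, sigma) for _ in range(n)]   # perturbations
    ys = [f(x) for x in xs]                               # black-box outputs
    # Proximity kernel: samples near x0 get the heaviest weights.
    ws = [math.exp(-((x - x0) ** 2) / (2 * sigma ** 2)) for x in xs]
    # Weighted least squares for slope and intercept.
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    slope = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys)) / \
            sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    return slope, my - slope * mx

slope, intercept = lime_1d(black_box, x0=0.0)
print(slope)  # close to cos(0) = 1, the true local derivative at x0
```

The surrogate is locally faithful: near x0 the fitted slope tracks the black box's true behaviour, even though a single linear model could never describe the sine function globally.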
Note that the type of model LIME uses most prominently is a sparse linear regression function, for reasons of semantic transparency that were discussed above. Other interpretable models, such as decision trees, can likewise be employed. While LIME does indeed appear to be a step in the right direction for the future of interpretable AI, a host of issues that present challenges to the approach remain unresolved. For instance, the crucial question of how to properly define the proximity measure for the ‘neighbourhood’ or ‘local region’ where the explanation applies remains unclear, and small changes in the scale of the chosen measure can lead to greatly diverging explanations. Likewise, the explanation produced by the supplemental linear model can quickly become unreliable even with small and virtually unnoticeable perturbations of the system it is attempting to approximate. This challenges the basic assumption that there is always some simplified linear model that successfully approximates the underlying model reasonably well near any given data point. LIME’s creators have largely acknowledged these shortcomings and have recently offered a new explanatory approach that they call ‘anchors’. These ‘high-precision rules’ incorporate into their formal structures ‘reasonable patterns’ that are operating within the underlying model (such as the implicit linguistic conventions at work in a sentiment prediction model), so that they can establish suitable and faithful boundaries for their explanatory coverage of its predictions or classifications. A related and equally significant local interpretive strategy is called SHAP (SHapley Additive exPlanations). SHAP uses concepts from game theory to define a ‘Shapley value’ for a feature of concern that provides a measurement of its influence on the underlying model’s prediction.
Broadly, this value is calculated for a feature by averaging its marginal contribution to every possible prediction for the instance under consideration. This might seem impossible, but the strategy is straightforward. SHAP calculates the marginal contribution of the relevant feature for all possible combinations of inputs in the feature space of the instance. So, if the opaque model that it is explaining has 15 features, SHAP would calculate the marginal contribution of the feature under consideration 32,768 times (i.e. one calculation for each of the possible combinations of features: 2¹⁵, or 2ᵏ where k = 15). This method then allows SHAP to estimate the Shapley values for all input features in the set to produce the complete distribution of the prediction for the instance. In our example, this would entail 491,520 calculations. While such a procedure is computationally burdensome and becomes intractable beyond a certain threshold, it means that locally, that is, for the calculation of the specific instance, SHAP can axiomatically guarantee the consistency and accuracy of its reckoning of the marginal effect of the feature. (Note that the SHAP platform does offer methods of approximation to avoid this excessive computational expense.) Despite this calculational robustness, SHAP also faces some of the same kinds of difficulties that LIME does. SHAP calculates a marginal contribution by constructing two instances: the first instance includes the feature being measured, while the second leaves it out. After calculating the prediction for each of these instances by plugging their values into the underlying model, the result of the second is subtracted from that of the first to determine the marginal contribution of the feature.
This procedure is then repeated for all possible combinations of features so that the weighted average of all of the marginal contributions of the feature of concern can be computed. The contestable part of this process comes with how SHAP defines the absence of variables under consideration. To leave out a feature—whether it is the one being directly measured or one of the others not included in the combination under consideration—SHAP replaces it with a stand-in feature value drawn from a selected donor sample (that is itself drawn from the existing dataset). This method of sampling values assumes feature independence (i.e. that the values sampled are not correlated in ways that might significantly affect the output for a particular calculation). As a consequence, the interaction effects engendered by and between stand-in variables are necessarily unaccounted for when conditional contributions are approximated. The result is the introduction of uncertainty into the explanation that is produced, because the complexity of multivariate interactions in the underlying model may not be sufficiently captured by the simplicity of this supplemental interpretability technique. This drawback in sampling (as well as a certain degree of arbitrariness in domain definition) can cause SHAP to become unreliable even with minimal perturbations of the model it is approximating. Despite these limitations in the existing tools of local interpretability, it is important that you think ‘local-first’ when considering the issue of the coverage and scope of the explanatory approaches you plan to incorporate into your project.
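The exhaustive coalition procedure described above can be reproduced exactly for a tiny model. The three-feature linear model and baseline below are invented for illustration, and replacing ‘absent’ features with a fixed baseline is a simplification of SHAP's donor-sampling; for a linear model the Shapley value of each feature should come out as its coefficient times the feature's deviation from the baseline:

```python
from itertools import combinations
from math import factorial

# Exact Shapley values by enumerating all 2^k feature coalitions.
# With k = 15 this loop would visit 32,768 coalitions per feature,
# which is why practical tools rely on approximation.

def model(x):
    # Invented linear model: f(x) = 1 + 2*x0 - 1*x1 + 0.5*x2
    return 1.0 + 2.0 * x[0] - 1.0 * x[1] + 0.5 * x[2]

def shapley(f, x, baseline):
    n = len(x)

    def value(coalition):
        # Features in the coalition take their actual values;
        # all others are set to the baseline ("absent").
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return f(z)

    phis = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            for s in combinations(others, size):
                # Classic Shapley weight |S|! (n-|S|-1)! / n!
                w = factorial(size) * factorial(n - size - 1) / factorial(n)
                phi += w * (value(set(s) | {i}) - value(set(s)))
        phis.append(phi)
    return phis

x, base = [1.0, 2.0, 4.0], [0.0, 0.0, 0.0]
phis = shapley(model, x, base)
print(phis)  # ≈ [2.0, -2.0, 2.0]: coefficient × (value − baseline) each
# Efficiency axiom: contributions sum to f(x) − f(baseline).
assert abs(sum(phis) - (model(x) - model(base))) < 1e-9
```

The efficiency check at the end illustrates the axiomatic guarantee mentioned above: the per-feature contributions always add up exactly to the difference between the prediction being explained and the baseline prediction.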
Being able to provide explanations of specific predictions and classifications is of paramount importance both to securing optimal outcome transparency and to ensuring that your AI system will be implemented responsibly and reasonably.

Global interpretability: The motivation behind the creation of local interpretability tools like LIME or SHAP (as well as many others not mentioned here) has derived, at least in part, from a need to find a way of avoiding the kind of difficult double bind faced by the alternative approach to the coverage and scope of interpretable AI: global interpretability. On the prevailing view, providing a global explanation of a ‘black box’ model entails offering an alternative interpretable model that captures the inner workings and logic of the ‘black box’ model in sum and across predictions or classifications. The difficulty faced by global interpretability arises in the seemingly unavoidable trade-off between the need for the global explanatory model to be sufficiently simple that it is understandable by humans and the need for that model to be sufficiently complex that it can capture the intricacies of how the mapping function of a ‘black box’ model works as a whole. While this is clearly a real problem that appears to be theoretically inevitable, it is important to keep in mind that, from a practical standpoint, a serviceable notion of global interpretability need not be limited to such a conceptual puzzle. There are at least two less ambitious but more constructive ways to view global interpretability as a potentially meaningful contributor to the responsible design and implementation of interpretable AI. First, many useful attempts have already been made at building explanatory models that employ interpretable methods (such as decision trees, rule lists, and case-based classification) to globally approximate neural nets, tree ensembles, and support vector machines.
These results have enabled a deeper understanding of the way human-interpretable logics and conventions (like if-then rules and representationally generated prototypes) can be measured against or mapped onto high-dimensional computational structures, and even allow for some degree of targeted comprehensibility of the logic of their parts. This capacity to ‘peek into the black box’ is of great practical importance in domains where trust, user confidence, and public acceptance are critical for the realisation of optimal outcomes. Moreover, this ability to move back and forth between interpretable architectures and high-dimensional processing structures can enable knowledge discovery as well as insights into the kinds of dataset-level and population-level patterns that are crucial for well-informed macroscale decision-making in areas ranging from public health and economics to the science of climate change. Being able to uncover global effects and relationships between complex model behaviour and data distributions at the demographic and ecological level may prove vital for establishing valuable and practically useful knowledge about unobservable but significant biophysical and social configurations. Hence, although these models have not solved the understandability-complexity puzzle as such, they have opened up new pathways for innovative thinking in the applied data sciences that may be of immense public benefit in the future. Secondly, as mentioned above, under the auspices of the aspiration to engineering insight, a descriptive and analytical kind of global interpretability can be seen as a driving force of data-scientific advancement. When seen through a practitioner-centred lens, this sort of global interpretability allows data scientists to take a wide-angled and discovery-oriented view of a ‘black box’ model’s relationship to patterns that arise across the range of its predictions.
Figuring out how an opaque system works and how to make it work better by more fully understanding these patterns is a continuous priority of good research. So too is understanding the relevance of features and of their complex interactions through dataset-level measurement and analysis. These dimensions of incorporating the explanatory aspirations of global interpretability into best practices of research and innovation should be encouraged in your AI project.

(3) Formulate an interpretability action plan: The final step you will need to take to ensure a responsible approach to using ‘black box’ models is to formulate an interpretability action plan, so that you and your team can put adequate forethought into how explanations of the outcomes of your system’s decisions, behaviours, or problem-solving tasks can be optimally provided to users, decision subjects, and other affected parties. This action plan should include the following:

· A clear articulation of the explanatory strategies your team intends to use and a detailed plan that indicates the stages in the project workflow when the design and development of these strategies will need to take place.
· A succinct formulation of your explanation delivery strategy, which addresses the special provisions for clear, simple, and user-centred explication that are called for when supplemental interpretability tools for ‘black box’ models are utilised. See more about delivery and implementation in Guideline
1.

· A detailed timeframe for evaluating your team’s progress in executing its interpretability action plan and a role responsibility list, which maps in detail the various task-specific responsibilities that will need to be fulfilled to execute the plan.

Guideline 4: Think about interpretability in terms of the capacities of human understanding

When you begin to deliberate about the specific scope and content of your interpretability platform, it is important to reflect on what exactly you are aiming to do in making your model sufficiently interpretable. A good initial step to take in this process is to think about what makes even the simplest explanations clear and understandable. In other words, you should begin by thinking about interpretability in terms of the capacities and limitations of human cognition. From this perspective, it becomes apparent that even the most straightforward model, like a linear regression function or a decision tree, can become uninterpretable when its dimensionality presses beyond the cognitive limits of a thinking human. Recall our example of the simple linear regression: y = a + bx + ε. In this instance, only one feature x relates to the response variable y, so understanding the predictive relationship is easy. The model is parsimonious. However, if we started to add more features as covariates, even though the model would remain linear and hence intuitively predictable, being able to understand the relationship between the response variable and all the predictors and their coefficients (feature weights) would quickly become difficult. So, say we added ten thousand features and trained the model: y = a + b₁x₁ + b₂x₂ + ⋯ + b₁₀₀₀₀x₁₀₀₀₀ + ε. Understanding how this model’s prediction comes about—what role each of the individual parts plays in producing the prediction—would become difficult because of a certain cognitive limit in the quantity of entities that human thinking can handle at any given time.
This model would lose a significant degree of interpretability.

Seeing interpretability as a continuum of comprehensibility that depends on the capacities and limits of the individual human interpreter should clue you in to what is needed in order to deliver an interpretable AI system. The limits to consider should include not only cognitive boundaries but also varying levels of access to relevant vocabularies of explanation; an explanation of the results of a trained model that uses a support vector machine to divide a 26-dimensional feature space with a planar separator, for instance, may be easy to understand for a technical operator or auditor but entirely inaccessible to a non-specialist. Offering good explanations should take expertise level into account. Your interpretability platform should be cognitively equitable." (Leslie, 2019, pp. 40-53)
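Leslie's dimensionality point can be made concrete in a few lines: a one-feature linear model is explainable term by term, while the identical linear form with ten thousand coefficients defeats any line-by-line reading, even though nothing about the mathematics has changed. The sketch below is illustrative only; the feature names, weights, and the top-k summary strategy are invented for the example, not taken from Leslie's text:

```python
import random

def predict(intercept, weights, features):
    """Linear model: y = a + b0*x0 + b1*x1 + ... + bn*xn."""
    return intercept + sum(w * x for w, x in zip(weights, features))

def explain(weights, features, names, top_k=3):
    """List each term's contribution to the prediction, largest first.

    With one feature, the full list is a complete, human-readable
    explanation; with thousands, only a top-k summary is digestible.
    """
    contributions = sorted(
        zip(names, (w * x for w, x in zip(weights, features))),
        key=lambda item: abs(item[1]),
        reverse=True,
    )
    return contributions[:top_k]

# Parsimonious model: one feature, fully interpretable at a glance.
print(explain([2.0], [3.0], ["x"]))  # [('x', 6.0)] -- the whole story

# Ten thousand features: the same linear form, but no human can read
# every term, so any explanation must compress.
random.seed(0)
n = 10_000
weights = [random.uniform(-1, 1) for _ in range(n)]
features = [random.uniform(0, 1) for _ in range(n)]
names = [f"x{i}" for i in range(n)]
print(len(weights), "coefficients; top contributors only:")
print(explain(weights, features, names, top_k=3))
```

The point of the sketch is that the second model's "explanation" is necessarily a summary, which is exactly the loss of interpretability the passage describes.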
## IEEE report

"for transparency from implementation to deployment
## Background

When A/IS become part of social communities and behave according to the norms of their communities, people will want to understand the A/IS' decisions and actions, just as they want to understand each other's decisions and actions. This is particularly true for morally significant actions or omissions: an ethical reasoning system should be able to explain its own reasoning to a user on request. Thus, transparency, or "explainability", of A/IS is paramount (Chaudhuri 2017; Wachter, Mittelstadt, and Floridi 2017), and it will allow a community to understand, predict, and modify the A/IS (see Section 1, Issue 2; for a nuanced discussion see Selbst and Barocas). Moreover, as the norms embedded in A/IS are continuously updated and refined (see Section 1, Issue 2), transparency allows for appropriate trust to be developed (Grodzinsky, Miller, and Wolf 2011), and, where necessary, allows the community to modify a system's norms, reasoning, and behavior.

Transparency can occur at multiple levels, e.g., ordinary language or coder verification, and for multiple stakeholders, e.g., user, engineer, and attorney. (See IEEE P7001™, IEEE Standards Project for Transparency of Autonomous Systems.) It should be noted that transparency to all parties may not always be advisable, such as in the case of security programs that prevent a system from being hacked (Kroll et al. 2016). Here we briefly illustrate the broad range of transparency by reference to four ways in which systems can be transparent: traceability, verifiability, honest design, and intelligibility. We then apply these considerations to the implementation of norms in A/IS.

Transparency as traceability: Most relevant for the topic of implementation is the transparency of the software engineering process during implementation (Cleland-Huang, Gotel, and Zisman 2012). It allows the originally identified norms (Section 1, Issue 1) to be traced through to the final system.
This allows technical inspection of which norms have been implemented, for which contexts, and how norm conflicts are resolved, e.g., by the priority weights given to different norms. Transparency in the implementation process may also reveal biases that were inadvertently built into systems, such as racism and sexism in search engine algorithms (Noble 2013). (See Section 3, Issue 1.) Such traceability in turn calibrates a community's trust about whether A/IS are conforming to the norms and values relevant in their use contexts (Fleischmann and Wallace 2005).

Transparency as verifiability: Transparency concerning how normative reasoning is approached in the implementation is important, as we wish to verify that the normative decisions the system makes match the required norms and values. Explicit and exact representations of these normative decisions can then provide the basis for a range of strong mathematical techniques, such as formal verification (Fisher, Dennis, and Webster 2013). Even if a system cannot explain every single reasoning step in understandable human terms, a log of its ethical reasoning should be available for inspection for later evaluation purposes (Hind et al. 2018).

Transparency as honest design: German designer Dieter Rams coined the term "honest design" to refer to design that "does not make a product more innovative, powerful or valuable than it really is" (Vitsoe 2018; see also Donelli 2015; Jong 2017). Honest design of A/IS is one aspect of their transparency, because it allows the user to "see through" the outward appearance and accurately infer the A/IS' actual capacities. At times, however, the physical appearance of a system does not accurately represent what the system is capable of doing; for example, the agent may display signs of a certain human-like emotion while its internal state does not represent that emotion. Humans are quick to make strong inferences from outward appearances of human-likeness to the mental and social capacities the A/IS might have. Demands for transparency in design therefore put a responsibility on the designer to "not attempt to manipulate the consumer with promises that cannot be kept" (Vitsoe 2018).

Transparency as intelligibility: As mentioned above, humans will want to understand the A/IS' decisions and actions, especially the morally significant ones.
A clear requirement for an ethical A/IS is that the system be able to explain its own reasoning to a user when asked, or, ideally, also when it suspects the user's confusion; and the system should do so at the level of ordinary human reasoning, not with incomprehensible technical detail (Tintarev and Kutlak 2014). Furthermore, when the system cannot explain some of its actions, technicians or designers should be available to make those actions intelligible. Along these lines, the European Union's General Data Protection Regulation (GDPR), in effect since May 2018, states that, for automated decisions based on personal data, individuals have a right to "an explanation of the [algorithmic] decision reached after such assessment and to challenge the decision". (See boyd [sic] 2016 for a critical discussion of this regulation.)
## Recommendation

A/IS, especially those with embedded norms, must have a high level of transparency, shown as traceability in the implementation process, mathematical verifiability of their reasoning, honesty in appearance-based signals, and intelligibility of the systems' operation and decisions.
## Further Resources
A. Chaudhuri, "Philosophical Dimensions of Information and Ethics in the Internet of Things (IoT) Technology," The EDP Audit, Control, and Security Newsletter, vol. 56, no. 4, pp. 7-18, 2017. doi: 10.1080/….1380474.

J. Cleland-Huang, O. Gotel, and A. Zisman, eds., Software and Systems Traceability. London: Springer, 2012. doi: 10.1007/978-1-4471-2239-5.

G. Donelli, "Good design is honest" (blog), March 13, 2015. Accessed Oct. 22, 2018.

M. Fisher, L. A. Dennis, and M. P. Webster, "Verifying Autonomous Systems," Communications of the ACM, vol. 56, no. 9, pp. 84-93, 2013.

K. R. Fleischmann and W. A. Wallace, "A Covenant with Transparency: Opening the Black Box of Models," Communications of the ACM, vol. 48, no. 5, pp. 93-97, 2005.

F. S. Grodzinsky, K. W. Miller, and M. J. Wolf, "Developing Artificial Agents Worthy of Trust: Would You Buy a Used Car from This Artificial Agent?" Ethics and Information Technology, vol. 13, pp. 17-27, 2011.

M. Hind et al., "Increasing Trust in AI Services through Supplier's Declarations of Conformity," arXiv e-prints, Aug. 2018. [Online]. Available: https://arxiv.org/abs/1808.07261 [Accessed October 28, 2018].

C. W. de Jong, ed., Dieter Rams: Ten Principles for Good Design. New York, NY: Prestel Publishing, 2017.

J. A. Kroll, J. Huey, S. Barocas, et al., "Accountable Algorithms," University of Pennsylvania Law Review, vol. 165, 2017.

S. U. Noble, "Google Search: Hyper-Visibility as a Means of Rendering Black Women and Girls Invisible," InVisible Culture, no. 19, 2013.

A. D. Selbst and S. Barocas, "The Intuitive Appeal of Explainable Machines," 87 Fordham Law Review 1085, Feb. 19, 2018. Available at SSRN: https://ssrn.com/abstract=3126971 or http://dx.doi.org/10.2139/ssrn.3126971.

N. Tintarev and R. Kutlak, "Demo: Making Plans Scrutable with Argumentation and Natural Language Generation," Proceedings of the Companion Publication of the 19th International Conference on Intelligent User Interfaces, pp. 29-32, 2014.

Vitsoe, "The Power of Good Design," 2018. Retrieved Oct. 22, 2018, from https://www.vitsoe.com/us/about/good-design.

S. Wachter, B. Mittelstadt, and L. Floridi, "Transparent, Explainable, and Accountable AI for Robotics," Science Robotics, vol. 2, no. 6, eaan6080, 2017. doi: 10.1126/scirobotics.aan6080.
"p.177-179