COURSE DESCRIPTION
Concern about the harmful effects of machine learning algorithms and big-data AI models (bias and more) has drawn greater attention to the fundamentals of data ethics. News stories appear regularly about credit algorithms that discriminate against women, medical algorithms that discriminate against African Americans, hiring algorithms that base decisions on gender, and more. In most cases, the data scientists who developed and deployed these decision-making algorithms and data processes had no such intentions and were unaware of the harmful impact of their work.
This data science ethics course, the second in the data science ethics program for both practitioners and managers, provides guidance and practical tools to build better models, perform better data analysis, and avoid these problems. You’ll learn about:
- Tools for model interpretability
- Global versus local model interpretability methods
- Metrics for model fairness
- Auditing your model for bias and fairness
- Remedies for biased models
The course offers real-world problems and datasets, a framework data scientists can use to develop their projects, and an audit process to follow in reviewing them. Case studies with ethical considerations, along with Python code, are provided.
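As a taste of the global interpretability methods covered here, the sketch below implements permutation importance: shuffle one feature column and measure how much the model's accuracy drops. The toy model, data, and helper names are invented for illustration, not taken from the course materials.

```python
# Minimal sketch of permutation importance, a global interpretability method.
# Model and data here are toy placeholders.
import random

def accuracy(model, X, y):
    """Fraction of rows the model classifies correctly."""
    return sum(model(row) == t for row, t in zip(X, y)) / len(y)

def permutation_importance(model, X, y, feature_idx, seed=0):
    """Accuracy drop after shuffling one feature column.

    A larger drop means the model relies more on that feature.
    """
    rng = random.Random(seed)
    col = [row[feature_idx] for row in X]
    rng.shuffle(col)  # break the feature's relationship to the target
    X_perm = [row[:feature_idx] + [v] + row[feature_idx + 1:]
              for row, v in zip(X, col)]
    return accuracy(model, X, y) - accuracy(model, X_perm, y)

# Toy "black box" that only looks at feature 0
model = lambda row: int(row[0] > 0.5)
X = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3], [0.1, 0.9]]
y = [1, 0, 1, 0]

print(permutation_importance(model, X, y, 0))  # drop for the feature the model uses
print(permutation_importance(model, X, y, 1))  # 0.0: the model ignores feature 1
```

Because shuffling an ignored column never changes predictions, its importance is exactly zero; a feature the model depends on shows a non-negative accuracy drop.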
LEARNING OUTCOMES
- How to evaluate predictor impact in black box models using interpretability methods
- How to explain the average contribution of features to predictions and the contribution of individual feature values to individual predictions
- How to assess the performance of models using metrics that measure bias and unfairness
- How to describe potential ethical issues that can arise with image and text data, and how to address them
- How to conduct an audit of a data science project from an ethical standpoint to identify possible harms and potential areas for bias mitigation or harm reduction
In this course we will mostly address steps the data scientist can take to ensure that their projects and solutions are designed and implemented responsibly. We will focus primarily on issues of bias and unfairness across protected groups.
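Two group-fairness metrics of the kind covered in the course can be sketched in a few lines: the demographic parity difference (gap in positive-prediction rates between groups) and the equal-opportunity gap (difference in true-positive rates). The function names and toy data below are invented for the sketch.

```python
# Sketch of two common group-fairness metrics over binary predictions.
# group[i] is 0 or 1, indicating which protected group row i belongs to.

def demographic_parity_diff(y_pred, group):
    """Gap in positive-prediction rates between group 1 and group 0."""
    def rate(g):
        preds = [p for p, grp in zip(y_pred, group) if grp == g]
        return sum(preds) / len(preds)
    return rate(1) - rate(0)

def equal_opportunity_diff(y_true, y_pred, group):
    """Gap in true-positive rates (recall on actual positives) between groups."""
    def tpr(g):
        pos = [p for t, p, grp in zip(y_true, y_pred, group)
               if grp == g and t == 1]
        return sum(pos) / len(pos)
    return tpr(1) - tpr(0)

# Toy predictions for eight individuals, four in each group
y_true = [1, 0, 1, 1, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 1, 1]
group  = [0, 0, 0, 0, 1, 1, 1, 1]

print(demographic_parity_diff(y_pred, group))          # 0.25
print(equal_opportunity_diff(y_true, y_pred, group))   # ≈ 0.33
```

A value near zero on either metric suggests parity between the groups on that criterion; the two metrics can disagree, which is why auditing typically reports several of them.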
Syllabus
This course is arranged in 4 modules. We estimate that you will need to spend at least 5 hours per week. The course is self-paced, so you have the flexibility to complete modules on your own schedule.
Week 1 – Audit and Remediation
- Videos:
- Introduction
- Audit and Remediation
- Confusion Matrix
- Beyond Classic Bias
- Regression
- Knowledge Checks
- Lab 1 (for verified users only)
- Discussion Prompt (for verified users only)
Week 2 – Interpretability in Practice
- Videos:
- Interpretability
- Global Interpretability
- Fidelity, Robustness, Caveats
- Local Interpretability Methods
- Knowledge Checks
- Reading
- Lab 2 (for verified users only)
- Discussion Prompt (for verified users only)
Week 3 – Image and Text Data
- Videos:
- Image and Text Data
- Neural Net Interpretability
- Knowledge Checks
- Readings
- Lab 3 (for verified users only) – requires a Gmail account
- Discussion Prompt (for verified users only)
Week 4 – Tools and Documentation
- Videos:
- Tools and Documentation
- Readings
- Knowledge Checks
- Quiz (for verified users only)
Please note:
- There are 4 modules in total.
- Labs are for verified users only. They are ‘open book’ and have no set time limit. You will need a Gmail account for the Week 3 lab, which runs on Colab (Google Colaboratory).
- The exercises involve hands-on work with Python (we will provide useful hints).
- You will only have one attempt to answer each exercise.
- You can complete the exercises at any time while the course is open; however, we recommend completing them sequentially, after you finish the relevant module.
Course Features
- Duration 4 weeks
- Skill level All levels
- Language English
- Assessments Yes