Season 1 Ep. 2 Dr. Olga Russakovsky explains AI data's diversity problem
Summary

In this insightful episode of Dr. Pawd, guest Dr. Olga Russakovsky, a professor at Princeton University, discusses the critical issue of biased data in AI systems and its effect on decision-making. Dr. Russakovsky emphasizes that biases in AI have real-world consequences in areas such as the criminal justice system and loan-prediction models. Addressing these biases can lead to better AI systems, enable large-scale deployment of solutions, and help avoid replicating undesirable human biases.

The episode highlights geographic bias in visual datasets, with examples of object detectors failing to recognize bar soap as it appears in countries outside the U.S. Dr. Russakovsky and her team have developed REVISE, a tool that uncovers biases in visual datasets and suggests interventions, either improving the data or adjusting the AI algorithm. The conversation also turns to the crucial need for diverse voices in AI research to address bias and underrepresentation: face recognition systems, for instance, often perform poorly on women and on people with darker skin.
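The kind of geographic skew the episode describes can be surfaced with a simple audit of a dataset's metadata. The sketch below is illustrative only, not the REVISE tool itself; the annotation data and country codes are made up for the example:

```python
from collections import Counter

# Toy metadata: (object_label, country) pairs, as one might extract
# from a visual dataset's annotations. All values are illustrative.
annotations = [
    ("soap", "US"), ("soap", "US"), ("soap", "US"), ("soap", "US"),
    ("soap", "IN"), ("stove", "US"), ("stove", "US"), ("stove", "NG"),
]

def geographic_skew(annotations, label):
    """Share of images for `label` coming from each country."""
    countries = Counter(c for obj, c in annotations if obj == label)
    total = sum(countries.values())
    return {c: n / total for c, n in countries.items()}

print(geographic_skew(annotations, "soap"))
# A heavily skewed distribution (here, 80% of soap images from the US)
# signals that a detector trained on this data may fail on soap as it
# appears elsewhere in the world.
```

A real audit would work over far richer metadata (geolocation, scene context, co-occurring objects), but the underlying idea is the same: measure where the data comes from before trusting what the model learns.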

Dr. Russakovsky shares her personal experiences with bias in AI systems and her co-founding of AI4ALL, a nonprofit that aims to bring diversity into AI leadership by focusing on high school students with leadership potential. AI4ALL partners with multiple universities and offers summer programs focused on various aspects of AI, such as AI technology and policy, AI governance, and AI safety. The programs prioritize accessibility, offering free tuition and additional subsidies where needed. Alumni of these programs go on to share their knowledge, contribute to AI research, and make an impact in the field.

The importance of diversity and inclusion in AI research is discussed further, with Dr. Russakovsky arguing that increasing demographic diversity is not enough; genuinely inclusive environments are needed as well. She describes her efforts to promote inclusion within her own lab, including surveys of lab members and distributed leadership roles.

Finally, the episode turns to the ImageNet competition (the ImageNet Large Scale Visual Recognition Challenge), which measures progress in visual recognition by comparing the performance of different algorithms on the same test set. The discussion covers how the competition is organized and managed, the importance of the held-out test set, and the crowdsourcing of image labels, with crowd workers enabling much faster annotation. This captivating episode offers a comprehensive perspective on addressing bias and promoting diversity in AI, contributing to the development of more accurate and inclusive AI systems.
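The competition's headline metric is top-5 error: a prediction counts as correct if the true label appears among the model's five highest-scoring classes. A minimal sketch of that check, with illustrative class names and scores:

```python
def top5_correct(scores, true_label):
    """True if true_label is among the five highest-scoring classes."""
    top5 = sorted(scores, key=scores.get, reverse=True)[:5]
    return true_label in top5

# Hypothetical per-class scores from a classifier for one image.
scores = {"cat": 0.40, "dog": 0.30, "fox": 0.10, "car": 0.08,
          "cup": 0.07, "bus": 0.05}

print(top5_correct(scores, "cup"))   # "cup" is ranked fifth -> True
print(top5_correct(scores, "bus"))   # "bus" is ranked sixth -> False
```

Averaging this check over a shared, held-out test set is what lets different algorithms be compared on equal footing, which is the point of the competition's design.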