Computer scientists at Princeton and Stanford universities are working to address problems of bias in artificial intelligence. To that end, they have developed methods for building fairer data sets containing images of people. The researchers work closely with ImageNet, a database of more than 14 million images that has helped advance computer vision over the past decade.
ImageNet comprises images of objects, landscapes and people. It serves as a source of training data for researchers who create machine learning algorithms that classify images. Its unprecedented scale required automated image collection and crowd-sourced image annotation. Although the database’s person categories have rarely been used by the research community, the ImageNet team has been working to address biases and other issues with images of people that are unintended consequences of ImageNet’s construction.
According to Olga Russakovsky, an assistant professor of computer science at Princeton, computer vision now works really well, which means it’s being deployed all over the place in all kinds of contexts. “This means that now is the time for talking about what kind of impact it’s having on the world and thinking about these kinds of fairness issues,” she said.
In new research, the ImageNet team systematically identified non-visual concepts and offensive categories, including racial and sexual characterizations, among ImageNet’s person categories and proposed removing them from the database. The team has also developed a tool that lets users specify and retrieve image sets of people that are balanced by age, gender expression, and skin color. The goal is to support the development of algorithms that classify people’s faces and activities in images more fairly.
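To make the idea of a balanced image set concrete, here is a minimal Python sketch. It is not the ImageNet team's tool: the function name `balanced_sample`, the record layout, and the attribute labels are all hypothetical, and the sketch balances on a single annotated attribute by drawing the same number of records from every group.

```python
import random
from collections import defaultdict


def balanced_sample(records, attribute, per_group, seed=0):
    """Draw an equal number of records for each value of `attribute`.

    `records` is a list of dicts with hypothetical annotation keys
    (for example "skin_color"). This is an illustrative assumption,
    not the actual ImageNet data format.
    """
    rng = random.Random(seed)

    # Group the records by the value of the chosen attribute.
    groups = defaultdict(list)
    for rec in records:
        groups[rec[attribute]].append(rec)
    if not groups:
        return []

    # Cap the quota at the size of the smallest group so that
    # every group contributes exactly the same number of records.
    quota = min(per_group, min(len(g) for g in groups.values()))

    sample = []
    for value in sorted(groups):
        sample.extend(rng.sample(groups[value], quota))
    return sample


# Toy, made-up annotations: 5 "light", 3 "medium", 2 "dark".
records = (
    [{"id": i, "skin_color": "light"} for i in range(5)]
    + [{"id": 5 + i, "skin_color": "medium"} for i in range(3)]
    + [{"id": 8 + i, "skin_color": "dark"} for i in range(2)]
)
subset = balanced_sample(records, "skin_color", per_group=3)
```

In this toy example the smallest group has only two records, so the sketch returns two per group rather than the three requested. A real tool would also have to balance several attributes at once and decide what to do when a group is too small.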
The researchers presented their work in late January this year at the Association for Computing Machinery’s Conference on Fairness, Accountability and Transparency in Barcelona, Spain.
“There is very much a need for researchers and labs with core technical expertise in this to engage in these kinds of conversations,” Russakovsky said. “Given the reality that we need to collect the data at scale, given the reality that it’s going to be done with crowdsourcing because that’s the most efficient and well-established pipeline, how do we do that in a way that’s fairer – that doesn’t fall into these kinds of prior pitfalls? The core message of this paper is around constructive solutions.”
ImageNet was launched in 2009 by a group of computer scientists at Princeton and Stanford as a resource for academic researchers and educators. Its creation was led by Princeton alumna and faculty member Fei-Fei Li. To encourage researchers to build better computer vision algorithms using ImageNet, the scientists also created the ImageNet Large Scale Visual Recognition Challenge, which focused largely on object recognition using 1,000 image categories, only three of which featured people.