Is AI sexist? Only as much as the data that goes into it

It's important to check biases of those entering the data

Visitors talking to the social robot 'Sophia', on display at the Etisalat stand during GITEX Technology Week at the Dubai World Trade Centre, October 10, 2017. Pawan Singh / The National

Too many of our public conversations over the past few years have become polarised. Take any topic, be it fake news, conspiracy theories or artificial intelligence (AI). We might think that we can retreat from the culture wars to a safer, more neutral place where subjectivity, privilege and vested interests have no place, but we would be wrong.

While women around the world are protesting for more rights, more safety, better health, pay and representation, in the more digital, less tangible space of machines and technology, AI is reinforcing some of the problems women face. In fact, in some cases, when it comes to sexism, AI is exacerbating the problem.

Take the example of an experiment run by Dora Vargha, a senior lecturer in medical humanities at the University of Exeter in the UK, the results of which she posted on Twitter last week. She used Google to translate phrases from Hungarian, a language with gender-neutral pronouns, into English. The output read like a 1950s gender-roles textbook: “She is beautiful. He is clever. He reads. She washes the dishes. He builds. She sews. He teaches. She cooks. He’s researching. She’s raising a child,” read the translation.

As Ms Vargha explained, since Hungarian has no gendered pronouns, Google Translate chooses the gender for you. “Here is how everyday sexism is consistently encoded in 2021," she added.

The problem was with the data. The algorithm pairs each phrase with the pronoun it is most strongly associated with in the existing data. So the gender stereotypes embedded in that data are what inform the algorithm, which then reproduces them as stereotypical gender pronouns.
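To see how that happens, consider a deliberately simplified sketch. The toy corpus, the pick_pronoun function and the counting rule below are hypothetical illustrations, not Google Translate's actual system; the point is only that a model with nothing but skewed examples to learn from will pick whichever pronoun co-occurs most often with a phrase.

```python
from collections import Counter

# A toy "training corpus": the only evidence this sketch has about how a
# gender-neutral source pronoun should be rendered in English.
corpus = [
    "he is a doctor", "he is a doctor", "she is a nurse", "she is a nurse",
    "she washes the dishes", "he reads", "he builds", "she sews",
    "he is clever", "she is beautiful",
]

def pick_pronoun(phrase: str) -> str:
    """Choose 'he' or 'she' for a gender-neutral source phrase by counting
    which pronoun most often starts sentences containing that phrase."""
    counts = Counter()
    for sentence in corpus:
        if phrase in sentence:
            counts[sentence.split()[0]] += 1
    if not counts:  # unseen phrase: fall back to the majority pronoun overall
        counts = Counter(sentence.split()[0] for sentence in corpus)
    return counts.most_common(1)[0][0]

print(pick_pronoun("washes the dishes"))  # 'she' - purely because of the data
print(pick_pronoun("is clever"))          # 'he'
```

Nothing in the code "decides" to be sexist; the skew comes entirely from the examples it is given, which is the point Ms Vargha's experiment makes at far greater scale.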

When data sets are built, the male pronoun is often treated as the default, which means that outputs relating to women can be missing or, worse, inappropriate or dangerous.

For example, according to a 2011 study by the University of Virginia Centre for Applied Biomechanics, women were less likely than men to be in a car accident, but 47 per cent more likely to be seriously injured in one. A newer paper published by the same university in 2019 found that, although newer vehicles have reduced the overall risk, women's risk of serious injury in a crash could still be as much as 73 per cent higher than men's. One theory to explain results like these is that until 2003 in the US, only male crash test dummies were used. Even now, the only female dummy in use is five feet tall and weighs 110 pounds (50kg), which researchers believe is not an accurate representation of female bodies.

Since current data is built on social bias and stereotypes, that is what AI perpetuates

Again, a lot of these issues stem from the data being entered. AI models are trained on existing data and base their predictions on it. And since current data is built on social bias and stereotypes, that is what AI perpetuates. Sexism is being woven into the fabric of our future in these ways, while people labour under the misapprehension that, with time, we are freeing ourselves from the sexism of the past. It is a dangerous paradox, and one that could lead to complacency that these problems have been solved when, in fact, they have worsened.

Take another example. A study at the University of Melbourne this year found that AI used for recruitment could discriminate against women because of the criteria used to identify "good" candidates – that is, those in continuous employment rather than those who take career breaks. Because of maternity laws in Australia and existing social structures, women are more likely to take career breaks to raise children. That reality is, of course, not a verdict on how valuable they were as employees before the break. But once realities such as maternity leave are reduced to data points, judgments about who is a good employee and who is not can easily, however unwittingly, become discriminatory.
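The mechanism is easy to picture with a minimal sketch. The candidate fields, the scoring weight and the penalty for a career break below are hypothetical, not the system examined in the Melbourne study, but they show how a single "continuity of employment" criterion quietly downranks anyone who has taken parental leave.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    years_experience: float
    months_of_career_break: int  # e.g. parental leave

def screening_score(c: Candidate) -> float:
    """A naive score of the kind the study warns about: it rewards experience
    but treats any gap in continuous employment as a negative signal."""
    return c.years_experience - 0.5 * c.months_of_career_break

applicants = [
    Candidate("A", years_experience=8.0, months_of_career_break=0),
    Candidate("B", years_experience=8.0, months_of_career_break=12),  # a year raising a child
]

for c in sorted(applicants, key=screening_score, reverse=True):
    print(c.name, screening_score(c))
# Identical experience, yet candidate B is ranked lower purely because of the break.
```

The bias never has to be written down as a rule about gender; encoding the career break as a penalty is enough.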

This leads to the second part of the discrimination built into AI: it is shaped both by what goes into the system and by who sets up the models, entrenching their own perspectives and biases in the process. Which is to say, AI is only as good as the data that goes into it, and Unesco says only 22 per cent of AI professionals worldwide are women. So it is not surprising that most of that data is selected and entered by men.

Such facts make one wonder why digital personal assistants – Apple's Siri, Amazon's Alexa and Microsoft's Cortana – are built on female templates, a choice that reinforces the obsolete notion that administrative and secretarial tasks should be carried out only by women. A UN report in 2019, in fact, made the same point: that such female virtual assistants reinforce negative stereotypes about women.

When we think about the larger role that AI will go on to play in our societies, it is important to be aware of the biases that can proliferate because of what is fed into the system. The way to address these problems is to be vigilant about the data that is entered, and to ensure that the people who input it, regardless of gender, are trained to check their own prejudices.

If we don’t do this, then a world free of sexism will remain a mere dream. Let's not squander a chance to make sure the world evolves in a balanced manner, free of all gender bias.

Shelina Janmohamed is an author and a culture columnist for The National