Invisible Women: Data Bias in a World Designed for Men

In Muriel Rukeyser’s poem ‘Myth’, an old, blind Oedipus asks the Sphinx, ‘Why didn’t I recognize my mother?’ The Sphinx replies that Oedipus answered her question (what walks on four legs in the morning, two in the afternoon and three in the evening) incorrectly. ‘[Y]ou answered, Man. You didn’t say anything about woman.’ But, replies Oedipus, when you say man, ‘you include women too. Everyone knows that.’

But in fact the Sphinx was right and Oedipus is wrong. When you say man you don’t ‘include women too’, even if everyone does technically ‘know that’. Numerous studies in a variety of languages over the past forty years have consistently found that what is called the ‘generic masculine’ (using words like ‘he’ in a gender-neutral way) is not in fact read generically.20 It is read overwhelmingly as male.

When the generic masculine is used people are more likely to recall famous men than famous women;21 to estimate a profession as male-dominated;22 to suggest male candidates for jobs and political appointments.23 Women are also less likely to apply, and less likely to perform well in interviews, for jobs that are advertised using the generic masculine.24 In fact the generic masculine is read so overwhelmingly as male that it even overrides otherwise powerful stereotypes, so that professions such as ‘beautician’, which are usually stereotyped female, are suddenly seen as male.25 It even distorts scientific studies, creating a kind of meta gender data gap: a 2015 paper looking at self-report bias in psychological studies found that the use of the generic masculine in questionnaires affected women’s responses, potentially distorting ‘the meaning of test scores’.26 The authors concluded that its use ‘may portray unreal differences between women and men, which would not appear in the gender-neutral form or in natural gender language versions of the same questionnaire’.

And yet in the face of decades of evidence that the generic masculine is anything but clear, official language policy in many countries continues to insist that it is purely a formality whose use must continue for the sake of . . . clarity. As recently as 2017, the Académie française, France’s ultimate authority on the French language, was thundering against ‘the aberration of “inclusive writing”’, claiming that ‘the French language finds itself in mortal danger’ from workarounds for the generic masculine. Other countries including Spain27 and Israel28 have faced similar rows.

Because English is not a grammatically gendered language, the generic masculine is fairly restricted in modern usage. Terms like ‘doctor’ and ‘poet’ used to be generic masculine (with specifically female doctors and poets referred to – usually derisively – as poetesses and doctoresses), but are now considered gender neutral. But while the formal use of the generic masculine only really clings on in the writings of pedants who still insist on using ‘he’ to mean ‘he or she’, it has made something of a comeback in the informal usage of Americanisms such as ‘dude’ and ‘guys’, and, in the UK, ‘lads’ as supposedly gender-neutral terms. A recent row in the UK also showed that, for some, male default still matters an awful lot: when in 2017 the first female head of London’s Fire Brigade, Dany Cotton, suggested that we should replace ‘fireman’ with the now standard (and let’s face it, much cooler) ‘firefighter’, she received a deluge of hate mail.29

Languages such as French, German and Spanish, however, are what is called ‘gender-inflected’, and here the concept of masculine and feminine is woven into the language itself. All nouns are gendered either masculine or feminine. A table is feminine, but a car is masculine: la mesa roja (the red table); el coche rojo (the red car). When it comes to nouns that refer to people, while both male and female terms exist, the standard gender is always masculine. Try searching Google for ‘lawyer’ in German. It comes back ‘Anwalt’, which literally means male lawyer, but is also used generically as just ‘lawyer’. If you want to refer to a female lawyer specifically you would say ‘Anwältin’ (incidentally, the way female terms are often, as here, modified male terms is another subtle way we position the female as a deviation from male type – as, in de Beauvoir’s terms, ‘Other’). The generic masculine is also used when referring to groups of people: when the gender is unknown, or if it’s a mixed group, the generic masculine is used. So a group of one hundred female teachers in Spanish would be referred to as ‘las profesoras’ – but as soon as you add a single male teacher, the group suddenly becomes ‘los profesores’. Such is the power of the default male.

In gender-inflected languages the generic masculine remains pervasive. Job vacancies are still often announced with masculine forms – particularly if they are for leadership roles.30 A recent Austrian study of the language used in leadership job ads found a 27:1 ratio of masculine to ‘gender-fair forms’ (using both the male and female term).31 The European Parliament believes it has found a solution to this problem, and since 2008 has recommended that ‘(m/f)’ be added on the end of job ads in gender-inflected languages. The idea is that this makes the generic masculine more ‘fair’ by reminding us that women exist. It’s a nice idea – but it wasn’t backed up by data. When researchers did test its impact they found that it made no difference to the exclusionary impact of using the generic masculine on its own – illustrating the importance of collecting data and then creating policy.32

Does all this arguing over words make any real world difference? Arguably, yes. In 2012, a World Economic Forum analysis found that countries with gender-inflected languages, which have strong ideas of masculine and feminine present in almost every utterance, are the most unequal in terms of gender.33 But here’s an interesting quirk: countries with genderless languages (such as Hungarian and Finnish) are not the most equal. Instead, that honour belongs to a third group, countries with ‘natural gender languages’ such as English. These languages allow gender to be marked (female teacher, male nurse) but largely don’t encode it into the words themselves. The study authors suggested that if you can’t mark gender in any way you can’t ‘correct’ the hidden bias in a language by emphasising ‘women’s presence in the world’. In short: because men go without saying, it matters when women literally can’t get said at all.

It’s tempting to think that the male bias that is embedded in language is simply a relic of more regressive times, but the evidence does not point that way. The world’s ‘fastest-growing language’,34 used by more than 90% of the world’s online population, is emoji.35 This language originated in Japan in the 1980s and women are its heaviest users:36 78% of women versus 60% of men frequently use emoji.37 And yet, until 2016, the world of emojis was curiously male.

The emojis we have on our smartphones are chosen by the rather grand-sounding ‘Unicode Consortium’, a Silicon Valley-based group of organisations that work together to ensure universal, international software standards. If Unicode decides a particular emoji (say ‘spy’) should be added to the current stable, they will decide on the code that should be used. Each phone manufacturer (or platform such as Twitter and Facebook) will then design their own interpretation of what a ‘spy’ looks like. But they will all use the same code, so that when users communicate between different platforms, they are broadly all saying the same thing. An emoji face with heart eyes is an emoji face with heart eyes.
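
The split between a shared code and a platform-specific picture can be seen in a few lines of Python. This is only an illustrative sketch: the helper name describe is made up for the example, and the two code points are the ones the Unicode standard assigns to the heart-eyes face and the spy. The standard fixes the number and the official name; the drawing is left to each vendor.

```python
import unicodedata

# Code points assigned by the Unicode standard. How each one is drawn
# is decided separately by every phone maker or platform.
HEART_EYES = "\U0001F60D"
SPY = "\U0001F575"


def describe(char: str) -> str:
    """Return the code point and official Unicode name of one character.

    `describe` is a hypothetical helper written for this illustration.
    """
    return f"U+{ord(char):04X} {unicodedata.name(char)}"


if __name__ == "__main__":
    print(describe(HEART_EYES))
    print(describe(SPY))
```

Whatever device runs this, the code point for the heart-eyes face is the same, U+1F60D; only the picture shown for it changes from one platform to the next.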
