Algorithms against linguistic insecurity


Or against ‘non-prestigious’ languages?

Photo by Wayhomestudio Photos on Freepik

Linguistic insecurity is a problem faced primarily by speakers of minority dialects. Their speech is often full of elements that differ from, or are absent in, the standard literary language. For example, the pronunciation of the letter G in Southern Russian and Ukrainian is not always considered acceptable. A person with this kind of accent who wants to work in radio or television is likely to make a special effort to correct their pronunciation.

Speaking of English, a number of such issues particularly affect African-American English (also known as African-American Vernacular English, Black English or, my favourite, Ebonics), a variety of English combining features of a dialect and a sociolect (speech common to a specific social group, such as a profession or subculture). AAVE mixes English vocabulary with grammatical elements traced to Niger-Congo languages, and it faces some of the greatest social misunderstanding. In schools, children from African-American families may be perceived as 'developmentally delayed' because the grammar of AAVE differs from standard English in several ways that can easily be mistaken for illiterate errors. Another example: African Americans seeking psychological help may be embarrassed to speak 'wrong' English, and a medical professional may perceive such behaviour as reticence. The list goes on, but here we will close with an eloquent scene from the film Sorry to Bother You devoted to the benefits of a 'white' pronunciation.

What do the algorithms mentioned in the title have to do with it? In this age of concern over fake news and online insults, moderation is becoming a kind of art. And, as you have probably guessed, not all dialectal differences make it through. Back to African-American English: some time ago, the Center for Civic Media at MIT launched a social media aggregator called Gobo. Gobo was supposed to filter social media posts by a few categories and exclude those that contained, for example, insults towards certain groups of people or obscurantist statements such as the promotion of homoeopathy. Users could adjust the filters to their preferences and view the excluded posts on request, with an explanation of why the filter had blocked them.

Like many other big products at the start of their journey, Gobo immediately showed some imperfections, such as failing to distinguish intrusive brands from NGOs. As for AAVE, a report on its filtering, accompanied by a few screenshots, was published on the Center for Civic Media's page. The poor algorithm simply failed to realise that the word f*** not only carries dubious connotations but can also express overflowing admiration for someone dear. A tiny aside about Russian: we have much the same.

A gloomier illustration, in my humble opinion, is the investigation by Twitter user @Jessamyn into the toxicity of the anti-toxicity service Perspective API. The service rated mentions of women (especially minors) as more toxic, as well as mentions of hidden impairments (such as deafness). Does this suggest a global linguistic conspiracy against those who are different and belong to a minority (well, the female sex is not exactly a minority)? I hope not.
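The probing method behind investigations like @Jessamyn's is simple enough to sketch: feed a toxicity scorer template sentences that differ only in a single identity term and compare the scores. Below is a minimal Python sketch assuming the request and response shapes of Google's Perspective API (`commentanalyzer` v1alpha1); the probe terms are illustrative examples, not @Jessamyn's exact inputs, and the actual network call (with an API key) is left to the caller.

```python
# Sketch of probing a toxicity model with templated identity terms.
# Assumes Perspective API v1alpha1 request/response shapes.
import json

API_URL = "https://commentanalyzer.googleapis.com/v1alpha1/comments:analyze"

def build_request(text):
    """Build the JSON body Perspective expects for a TOXICITY query."""
    return {
        "comment": {"text": text},
        "languages": ["en"],
        "requestedAttributes": {"TOXICITY": {}},
    }

def extract_score(response):
    """Pull the summary toxicity score (0..1) out of a Perspective response."""
    return response["attributeScores"]["TOXICITY"]["summaryScore"]["value"]

# Template probes: identical sentences except for one identity term.
probes = [f"I am a {term}." for term in ("man", "woman", "deaf woman")]

for text in probes:
    body = build_request(text)
    # POST json.dumps(body) to API_URL with your API key, then call
    # extract_score() on the parsed JSON reply to compare the probes.
    print(text, "->", json.dumps(body["comment"]))
```

If the scores diverge sharply between sentences that differ only in the identity term, the model has learned a spurious association, which is exactly the pattern the investigation surfaced.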