Toxic language detection is totally inadequate

Alphabet (née Google) is working on Perspective, a machine learning tool that tries to detect “toxic” language.

Perspective’s got a ways to go, and some of the errors the program has been shown to make are enough to make one question whether machines will ever be able to parse meaning-laden human expression with any accuracy. Bottom line: when it comes to slipping nasty invective past detection, humans are way, way ahead of the machines.

This is why our own moderation team shines, and why leaving moderation to computers with no people involved looks set to keep failing.

The algorithm is a bit too dependent on hot-button keywords, and not enough on the surrounding contextual clues in the statement, especially a word like “not,” which tends to reverse the polarity of what’s being said. “Rape,” “Jews,” “terrorist,” and “Hitler” are all likely to increase your toxicity score, even in comments that are mostly placating or unobjectionable.

Auerbach supplies a hilarious account of the ways Perspective gets it wrong:

“Trump sucks” scored a colossal 96 percent, yet neo-Nazi codeword “14/88” only scored 5 percent. “Few Muslims are a terrorist threat” was 79 percent toxic, while “race war now” scored 24 percent. “Hitler was an anti-Semite” scored 70 percent, but “Hitler was not an anti-Semite” scored only 53 percent, and “The Holocaust never happened” scored only 21 percent. And while “gas the joos” scored 29 percent, rephrasing it to “Please gas the joos. Thank you.” lowered the score to a mere 7 percent. (“Jews are human,” however, scores 72 percent. “Jews are not human”? 64 percent.)
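
To see why leaning on hot-button keywords goes wrong, here is a deliberately crude sketch. It is not Perspective's model, which is not described here; the word list, the weights and the `naive_toxicity` function are all invented for illustration. It scores a comment by its single "worst" word and ignores everything around it, including "not".

```python
import re

# Hypothetical keyword weights, invented for this illustration only.
# They are NOT taken from Perspective or any real system.
HOT_BUTTON_WEIGHTS = {
    "hitler": 0.6,
    "terrorist": 0.6,
    "rape": 0.7,
    "jews": 0.5,
    "sucks": 0.8,
}


def naive_toxicity(text: str) -> float:
    """Return the weight of the highest-scoring keyword, ignoring context."""
    # Lowercase and split into crude tokens; "14/88" stays a single token.
    words = re.findall(r"[a-z0-9/]+", text.lower())
    # Words like "not" are not in the list, so negation changes nothing.
    return max((HOT_BUTTON_WEIGHTS.get(w, 0.0) for w in words), default=0.0)


if __name__ == "__main__":
    comments = [
        "Trump sucks",
        "14/88",
        "Few Muslims are a terrorist threat",
        "race war now",
        "Hitler was an anti-Semite",
        "Hitler was not an anti-Semite",
    ]
    for comment in comments:
        print(f"{naive_toxicity(comment):.2f}  {comment}")
```

A scorer like this gives both Hitler sentences the same score, flags the harmless sentence about Muslims, and lets "14/88" and "race war now" through at zero because neither contains a listed word. The real system is clearly more sophisticated than this toy, but the failures quoted above have the same shape.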

Humans are highly subtle when it comes to language, and machines find it hard to keep up. A particularly chilling example from the MIT Technology Review article is the sentence “You should be made into a lamp,” which is a direct allusion to Nazi atrocities and has been directed at several journalists in recent months. Perspective gives that a toxicity rating of 4.


It’s hard enough to parse language for hateful intent; imagine how much harder when you toss in a factor like juxtaposition with an image. A sentence like “You can trust me to do the right thing” has a completely different meaning when placed next to a picture of Pepe the Frog, wouldn’t you think?

A reader also put a few Bible passages through, starting with the Ten Commandments:

And somehow having God’s blessing isn’t seen as a good thing either.

When it comes to humans communicating with each other, only other humans can be relied on to truly know what the context is, especially when the words are surrounded by images or placed in a location whose meaning is totally unknowable to the computer.

Want an example? How about an ad for a self-cleaning oven photoshopped on the fence at Dachau?

As much at home writing editorials as being the subject of them, Cam has won awards, including the Canon Media Award for his work on the Len Brown/Bevan Chuang story. And when he’s not creating the news, he tends to be in it, with protagonists using the courts, media and social media to deliver financial threats as well as death threats.

They say that news is something that someone, somewhere, wants kept quiet. Cam Slater doesn’t do quiet, and as a result he is a polarising, controversial but highly effective journalist who takes no prisoners.

He is fearless in his pursuit of a story.

Love him or loathe him, you can’t ignore him.